feat(test): Add integration tests for status server#3373
feat(test): Add integration tests for status server#3373digvijay-y wants to merge 6 commits intokubeflow:masterfrom
Conversation
|
🎉 Welcome to the Kubeflow Trainer! 🎉 Thanks for opening your first PR! We're happy to have you as part of our community 🚀 Here's what happens next:
Join the community:
Feel free to ask questions in the comments if you need any help or clarification! |
There was a problem hiding this comment.
Pull request overview
Adds Ginkgo integration coverage for the status server’s handleTrainJobRuntimeStatus endpoint, exercising real API-server validation behavior in the envtest-based integration suite.
Changes:
- Introduces a new integration spec that boots a TLS-enabled status server and posts status updates against it.
- Covers success, invalid payload validation (e.g., progressPercentage > 100), and not-found TrainJob updates.
robert-bell
left a comment
There was a problem hiding this comment.
Hey @digvijay-y, thanks for your patience and the reminder.
It's looking good so far, but I have a few comments. We should look at the copilot suggestions too.
There's a couple of extra scenarios we could check - would you be happy to take a look?
- testing that an update overwrites an existing trainer status. It shouldn't merge the status (e.g. if a progress percent is there initially, but not in the update, then it should be removed by the update).
- testing that an empty request
UpdateTrainJobStatusRequest{}is valid but doesn't overwrite an existing trainer status.
I think the current implementation should already handle both those cases correctly.
| return true, nil | ||
| } | ||
|
|
||
| func generateTestTLSConfig() (*tls.Config, error) { |
There was a problem hiding this comment.
This is quite a chunky utility - can we use crypto/tls/testcert instead?
Maybe something like
func generateTestTLSConfig() (*tls.Config, error) {
cert, err := tls.X509KeyPair(testcert.LocalhostCert, testcert.LocalhostKey)
if err != nil {
return nil, err
}
return &tls.Config{Certificates: []tls.Certificate{cert}}, nil
}
There was a problem hiding this comment.
crypto/tls/testcert is an internal stdlib package and can't be imported outside the standard library. Used runtime self-signed cert generation via crypto/ecdsa and crypto/x509 instead.
| gomega.Expect(k8sClient.Create(ctx, trainJob)).To(gomega.Succeed()) | ||
|
|
||
| ginkgo.By("POSTing a valid status update") | ||
| resp := postStatus(httpClient, serverAddr, ns.Name, jobName, trainer.UpdateTrainJobStatusRequest{ |
There was a problem hiding this comment.
can we add a test for an empty update request? The TrainJob status should be unchanged; any existing status should not be removed.
trainer.UpdateTrainJobStatusRequest{}
There was a problem hiding this comment.
Added as a dedicated It block , "Should accept an empty update request but not overwrite existing status". It POSTs an initial status, then POSTs UpdateTrainJobStatusRequest{}, and uses Consistently to verify the existing TrainerStatus is unchanged
| var status metav1.Status | ||
| gomega.Expect(json.NewDecoder(resp.Body).Decode(&status)).To(gomega.Succeed()) | ||
| gomega.Expect(status.Message).To(gomega.ContainSubstring("progressPercentage"), | ||
| "error message should identify the invalid field") |
There was a problem hiding this comment.
should we check the train job status isn't updated?
There was a problem hiding this comment.
Added a Consistently block after the 422 assertions that polls the TrainJob for 2 seconds and asserts TrainerStatus remains nil, confirming the rejected request had no side-effects.
|
@robert-bell I've Implemented the changes, would you take a look, whenever possible. |
robert-bell
left a comment
There was a problem hiding this comment.
Thanks for the updates @digvijay-y. Just one small extra comment, but otherwise this is looking good. It'll need a review from a maintainer before it can be merged. @astefanutti @andreyvelich please could you take a look when you have bandwidth?
| err := k8sClient.Get(ctx, types.NamespacedName{ | ||
| Name: jobName, | ||
| Namespace: ns.Name, | ||
| }, notFound) |
There was a problem hiding this comment.
should this be wrapped in ginkgo.Consistently too?
There was a problem hiding this comment.
Wrapped it up. Thank you for the review.
Signed-off-by: DIGVIJAY <144053736+digvijay-y@users.noreply.github.com>
Signed-off-by: DIGVIJAY <144053736+digvijay-y@users.noreply.github.com>
Signed-off-by: DIGVIJAY <144053736+digvijay-y@users.noreply.github.com>
Signed-off-by: DIGVIJAY <144053736+digvijay-y@users.noreply.github.com>
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Signed-off-by: DIGVIJAY <144053736+digvijay-y@users.noreply.github.com>
Signed-off-by: DIGVIJAY <144053736+digvijay-y@users.noreply.github.com> signed-off-by: DIGVIJAY <yewaredigvijay@gmail.com> Signed-off-by: DIGVIJAY <144053736+digvijay-y@users.noreply.github.com>
|
@astefanutti @andreyvelich I have implemented changes suggested by @robert-bell , Could please take a look whenever possible? |
What this PR does / why we need it:
Adds integration tests for the status server handleTrainJobRuntimeStatus handler.
The tests include coverage of:
progressPercentage greater than 100+kubebuilder:validation:Minimum=0and+kubebuilder:validation:Maximum=100with this handler would return 422Which issue(s) this PR fixes (optional, in
Fixes #<issue number>, #<issue number>, ...format, will close the issue(s) when PR gets merged):Fixes #3346
Checklist: