Fix kubernetes container state#7214
Conversation
…AILED; handle empty reason as PENDING; unify constant as PENDING_WAITING_REASONS; add buildSparkAppUrl overload; restore POD IP URL test; add tests for failure waiting reasons; revert .idea/vcs.xml
…ng-state logic change
There was a problem hiding this comment.
Pull Request Overview
This PR fixes a bug where Kubernetes applications in waiting state due to container failures (like image pull errors) were incorrectly marked as PENDING forever instead of FAILED. The fix introduces proper state mapping based on container waiting reasons.
- Adds logic to distinguish between valid pending reasons and failure reasons in container waiting states
- Updates container state mapping to mark applications as FAILED when appropriate error conditions occur
- Includes comprehensive unit tests for the new container state logic
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| KubernetesApplicationOperation.scala | Implements proper container state mapping logic with failure detection |
| KubernetesApplicationOperationSuite.scala | Adds unit tests covering pending and failure waiting reasons |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
turboFei
left a comment
There was a problem hiding this comment.
LGTM, could you fix the code style with dev/reformat?
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #7214 +/- ##
======================================
Coverage 0.00% 0.00%
======================================
Files 695 696 +1
Lines 43479 43546 +67
Branches 5882 5892 +10
======================================
- Misses 43479 43546 +67 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
@turboFei done. |
This reverts commit e869a46.
### Why are the changes needed? This PR fixes #7195 where if `kyuubi.kubernetes.application.state.source` is `CONTAINER` and kubernetes fails to pull the image, or the image name is not valid, or any failure ocurs, kyuubi marks the application as pending, forever. ### How was this patch tested? - Added unit tests in KubernetesApplicationOperationSuite: To run the targeted suite: `./build/mvn -pl kyuubi-server -DskipITs -Dtest=org.apache.kyuubi.engine.KubernetesApplicationOperationSuite test` ### Was this patch authored or co-authored using generative AI tooling? No Closes #7214 from moelhoussein/fix-kubernetes-container-state. Closes #7214 4f83667 [Wang, Fei] Revert "reformatted" e869a46 [MElHoussein] reformatted 8ede9aa [Wang, Fei] code style d3b9cd2 [MElHoussein] revert unrelated test/url overload changes; keep only container waiting-state logic change 7d346fc [MElHoussein] engine(k8s): treat only specific waiting reasons as PENDING; others FAILED; handle empty reason as PENDING; unify constant as PENDING_WAITING_REASONS; add buildSparkAppUrl overload; restore POD IP URL test; add tests for failure waiting reasons; revert .idea/vcs.xml ca94d64 [MElHoussein] Fixing container state Lead-authored-by: MElHoussein <MElHoussein@geico.com> Co-authored-by: Wang, Fei <fwang12@ebay.com> Signed-off-by: Wang, Fei <fwang12@ebay.com> (cherry picked from commit b5d7f58) Signed-off-by: Wang, Fei <fwang12@ebay.com>
|
thanks, merged to master(1.11.0) and branch-1.10(1.10.3) |
Why are the changes needed?
This PR fixes #7195 where if
kyuubi.kubernetes.application.state.sourceisCONTAINERand kubernetes fails to pull the image, or the image name is not valid, or any failure ocurs, kyuubi marks the application as pending, forever.How was this patch tested?
To run the targeted suite:
./build/mvn -pl kyuubi-server -DskipITs -Dtest=org.apache.kyuubi.engine.KubernetesApplicationOperationSuite testWas this patch authored or co-authored using generative AI tooling?
No