Fix JRuby timeout to respect CF platform limit (180s max) #1141
Closed
PR #1140 increased the timeout to 300s, but the CF platform has a configured maximum of 180s (cc.maximum_health_check_timeout: 180). This causes deployment failures:

'health_check_timeout Maximum exceeded: max 180s'

The test polling timeout increase (3min -> 5min) from PR #1140 is correct and remains in place. That gives the test sufficient time to wait for JRuby warmup after the health check passes.

This fix reverts only the manifest.yml timeout back to 180s while keeping the test timeout at 5 minutes, which achieves the goal of reducing flaky failures without violating platform constraints.

Fixes: #1140 (partial revert of manifest.yml change)
Closing this PR. Reverting to 180s doesn't solve the problem; it just returns us to the original flaky test issue. We need a different approach that respects the platform's 180s limit while still addressing test flakiness.
Problem
PR #1140 was merged with a change that increases the JRuby test app timeout to 300 seconds in manifest.yml. However, the CF platform deployment has a hard limit of 180 seconds configured via cc.maximum_health_check_timeout. This causes deployment failures in CI:
health_check_timeout Maximum exceeded: max 180s
Root Cause
The Cloud Foundry platform has this configuration, which we cannot override:
cc.maximum_health_check_timeout: 180
Any attempt to set timeout: 300 in the manifest is rejected by the Cloud Controller.
Solution
This PR reverts only the manifest.yml change from PR #1140, setting the timeout back to 180 seconds.
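The constraint behind this revert can be illustrated with a small pre-deploy check. This is a minimal sketch, not the Cloud Controller's actual validation code: the helper name validate_manifest_timeout is hypothetical, and the 180s cap and error text are taken from the description above.

```python
# Assumed platform cap, taken from cc.maximum_health_check_timeout in this PR.
MAX_HEALTH_CHECK_TIMEOUT = 180


def validate_manifest_timeout(timeout_seconds: int,
                              maximum: int = MAX_HEALTH_CHECK_TIMEOUT) -> None:
    """Raise ValueError if the manifest timeout exceeds the platform cap.

    Hypothetical helper: it only mimics the rejection described in this PR
    and is not part of any Cloud Foundry API.
    """
    if timeout_seconds > maximum:
        raise ValueError(
            f"health_check_timeout Maximum exceeded: max {maximum}s"
        )


# 180 is exactly at the cap, so it is accepted.
validate_manifest_timeout(180)

# 300 is the value introduced by #1140; it is rejected.
try:
    validate_manifest_timeout(300)
except ValueError as err:
    print(err)
```

Running a check like this before pushing would surface the problem locally instead of as a failed deployment in CI.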
What this PR keeps from #1140:
What this PR reverts from #1140:
Changes
Only 1 file, 1 line changed.
Why The Original Goal Still Works
The test polling timeout (5 minutes) is separate from the CF health check timeout (180s):
CF Health Check (180s max):
Test Polling Timeout (5 minutes, increased in #1140):
Timeline:
The 5-minute test timeout gives sufficient time for:
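The separation between the two timeouts can be sketched as a polling loop with its own deadline. This is an illustration only, not the test suite's actual code: poll_until, FakeClock, make_check, and the 240-second readiness time are all hypothetical, and a simulated clock stands in for real waiting so the example runs instantly.

```python
import time
from typing import Callable


def poll_until(check: Callable[[], bool], timeout_s: float,
               interval_s: float = 5.0,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call check() every interval_s seconds until it succeeds or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


class FakeClock:
    """Simulated clock so the example does not actually wait."""

    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


def make_check(clock: FakeClock, ready_at_s: float) -> Callable[[], bool]:
    """Return a check that passes once the simulated app has warmed up."""
    return lambda: clock.now >= ready_at_s


# Hypothetical scenario: the app answers correctly 240s after polling starts.
old_clock = FakeClock()
print(poll_until(make_check(old_clock, 240.0), 3 * 60,
                 clock=old_clock.time, sleep=old_clock.sleep))

new_clock = FakeClock()
print(poll_until(make_check(new_clock, 240.0), 5 * 60,
                 clock=new_clock.time, sleep=new_clock.sleep))
```

In this simulated scenario the 3-minute poll gives up before the app is ready, while the 5-minute poll succeeds. The CF health check timeout in manifest.yml plays no part in the loop, which is why it can stay at 180s.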
Testing
This fix allows the test to:
Related Issues
Impact
Priority
Critical - Current master is broken for environments with a 180s health check timeout limit.