[pc] Pass ignore_loganalyzer to config_reload in test_retry_count teardown #25488
Open
sakshamkhurana21 wants to merge 1 commit into
…rdown Signed-off-by: sakshamkhurana <sakkhurana@microsoft.com>
Collaborator
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Contributor
Pull request overview
This PR addresses intermittent teardown failures in tests/pc/test_retry_count.py by muting LogAnalyzer during config_reload() inside the config_reload_on_cleanup fixture, preventing expected transient syslog ERR messages during reload from failing the test.
Changes:
- Add loganalyzer as a dependency to the config_reload_on_cleanup fixture.
- Pass ignore_loganalyzer=loganalyzer to config_reload(..., safe_reload=True) during fixture teardown.
Collaborator
This PR has backport request for branch(es): 202605.
Powered by SONiC BuildBot
saiarcot895
approved these changes
Jun 19, 2026
Description of PR
Summary: Fix the flaky pc/test_retry_count.py::TestDutRetryCount::test_kill_team_peer_lag_up test.
The test verifies LAG retry count behavior (that LAG stays up for 150s after killing teamd).
The test body passes, so the functional behavior is correct. However, the test fails intermittently on teardown because LogAnalyzer catches transient syslog errors that occur during config_reload in the config_reload_on_cleanup fixture.
Transient errors during config_reload include:
- ERR teamd#teamsyncd: Failed to initialize team handler for LAG ... Unable to initialize team socket
- ERR memory_checker: cgroup memory usage file ... does not exist
- ERR swss#orchagent: removeLag: Failed to remove ref count
These are expected during container restart: the system retries and recovers automatically.
The config_reload_on_cleanup fixture was not telling LogAnalyzer to expect these transient errors, so LogAnalyzer was failing the test.
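The failure mode can be pictured with a small standalone sketch. It assumes LogAnalyzer skips syslog lines that fall between a start and an end ignore marker, which is how the "How did you do it?" section describes ignore_loganalyzer. The marker strings and the reported_errors helper are hypothetical and are not sonic-mgmt code; the three ERR lines are the signatures quoted above, with their elisions left as-is.

```python
# Hypothetical marker strings; the real LogAnalyzer marker text is not shown in this PR.
START_IGNORE = "start-ignore-mark"
END_IGNORE = "end-ignore-mark"


def reported_errors(syslog_lines):
    """Return ERR lines that fall outside any start/end ignore window."""
    ignoring = False
    reported = []
    for line in syslog_lines:
        if START_IGNORE in line:
            ignoring = True
        elif END_IGNORE in line:
            ignoring = False
        elif line.startswith("ERR ") and not ignoring:
            reported.append(line)
    return reported


TRANSIENT = [
    "ERR teamd#teamsyncd: Failed to initialize team handler for LAG ... Unable to initialize team socket",
    "ERR memory_checker: cgroup memory usage file ... does not exist",
    "ERR swss#orchagent: removeLag: Failed to remove ref count",
]

# No markers around the reload: every transient line is reported, so teardown fails.
unmarked = reported_errors(TRANSIENT)

# Markers around the reload: only errors outside the window are reported.
# The trailing line is made up purely to show that later errors are still caught.
marked = reported_errors(
    [START_IGNORE, *TRANSIENT, END_IGNORE, "ERR example: made-up error after reload"]
)
```

In this model the ignore window hides only what happens during the reload; an error logged after the end marker would still fail the test.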
Fixes the intermittent failure observed in Elastictest test plans including:
- 6a338e53d2130994bb47b365 (https://elastictest.org/scheduler/testplan/6a338e53d2130994bb47b365)
Type of change
Back port request
Approach
What is the motivation for this PR?
test_kill_team_peer_lag_up is a flaky test. Over the last 14 days: 71 errors / 39 distinct PRs.
The dominant error signatures are:
- teamsyncd: Failed to initialize team handler (68 occurrences, 96%)
- memory_checker: cgroup memory usage file does not exist (3 occurrences, 4%)
Both occur during config_reload in the teardown fixture and are transient/self-healing.
How did you do it?
Added loganalyzer as a fixture dependency to config_reload_on_cleanup and passed ignore_loganalyzer=loganalyzer to the config_reload() call. This tells LogAnalyzer to add start/end ignore markers around the reload operation, so transient errors during reload are not captured.
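The shape of the change can be sketched without a testbed. Everything below is a stand-in: FakeLogAnalyzer, the stub config_reload, and run_with_teardown are hypothetical and only mirror the keyword names used in this PR (safe_reload, ignore_loganalyzer). The real fixture body, the real config_reload implementation, and the real loganalyzer fixture object are not reproduced here, and a plain generator is used in place of a pytest yield fixture so the snippet runs on its own.

```python
class FakeLogAnalyzer:
    """Stand-in for the loganalyzer fixture; it only records marker calls."""

    def __init__(self):
        self.events = []

    def add_start_ignore_mark(self):  # hypothetical method name for this sketch
        self.events.append("start_ignore")

    def add_end_ignore_mark(self):  # hypothetical method name for this sketch
        self.events.append("end_ignore")


def config_reload(duthost, safe_reload=False, ignore_loganalyzer=None):
    """Stub with the keyword names from the PR; the body is illustrative only."""
    if ignore_loganalyzer is not None:
        ignore_loganalyzer.add_start_ignore_mark()
    try:
        duthost.append("reload")  # placeholder for the actual reload work
    finally:
        # Close the ignore window even if the reload raises.
        if ignore_loganalyzer is not None:
            ignore_loganalyzer.add_end_ignore_mark()


def config_reload_on_cleanup(duthost, loganalyzer):
    """Generator shaped like a yield fixture: code after `yield` is teardown."""
    yield
    config_reload(duthost, safe_reload=True, ignore_loganalyzer=loganalyzer)


def run_with_teardown(duthost, loganalyzer):
    """Drive setup and teardown by hand, as a test runner would."""
    fixture = config_reload_on_cleanup(duthost, loganalyzer)
    next(fixture)        # setup: runs up to the yield
    # ... the test body would run here ...
    next(fixture, None)  # teardown: runs the code after the yield
    return loganalyzer.events


analyzer = FakeLogAnalyzer()
host_actions = []
events = run_with_teardown(host_actions, analyzer)
```

The point of the sketch is the ordering: the reload runs in teardown, and the analyzer's ignore window opens before it and closes after it. Passing ignore_loganalyzer=None leaves the stub's reload unwrapped, which corresponds to the behavior before this PR.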
This is the canonical pattern used by 8+ other tests in sonic-mgmt that perform config_reload, including:
- tests/route/test_route_perf.py
- tests/pc/test_lag_member_forwarding.py
- tests/drop_packets/drop_packets.py
- tests/wan/lacp/test_wan_lag_min_link.py
- tests/bgp/test_bgp_suppress_fib.py
How did you verify/test it?
Verified the fix follows the canonical pattern used by other tests.
Syntax and lint verified: py_compile and flake8 --max-line-length=120 clean.
Any platform specific information?
None
Supported testbed topology if it's a new test case?
N/A
Documentation
N/A