test: fix syslog assertions in test_prefix_list_suppress for multi-ASIC #25204

Merged
yxieca merged 1 commit into sonic-net:master from deepak-singhal0408:fix/prefix-list-suppress-syslog
Jun 12, 2026

@deepak-singhal0408 deepak-singhal0408 commented Jun 8, 2026

Contributor

Description of PR

Replace journalctl-based collect_recent_syslog() with syslog marker-based collect_syslog_since_marker() in tests/bgp/test_prefix_list_suppress.py.

On multi-ASIC chassis (T2/VoQ), container daemons log via rsyslog to /var/log/syslog, not the systemd journal. The old journalctl --since approach always returned empty on these platforms, causing negative assertions to silently pass and positive assertions to fail.

Summary:
Fixes ADO #38303118

The test was introduced in PR #24840 and fails on T2 multi-ASIC UpstreamLC linecards where journalctl does not capture container daemon logs (bgpcfgd logs are forwarded via rsyslog instead of systemd journal).

Type of change

  • Bug fix

Back port request

  • 202311
  • 202405
  • 202411
  • 202505
  • 202511
  • 202512
  • 202605

Approach

What is the motivation for this PR?

Nightly test test_prefix_list_mgr_running_on_every_device (from PR #24840) was failing on T2 multi-ASIC chassis UpstreamLC linecards. The root cause is that journalctl --since="N sec ago" returns nothing for container daemons that log via rsyslog (not systemd journal) on multi-ASIC platforms. This made the syslog-based assertions unreliable — negative checks silently pass (empty grep = no match) and positive checks always fail.
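
To illustrate why an empty log capture makes both kinds of assertion unreliable, here is a minimal sketch in plain Python. The helper name `log_contains` and the sample log line are hypothetical and are not taken from the test file; the point is only that a negative check passes vacuously on empty input, while a positive check can never pass.

```python
import re


def log_contains(log_text: str, pattern: str) -> bool:
    """Return True if any line of log_text matches the regex pattern."""
    return any(re.search(pattern, line) for line in log_text.splitlines())


# Hypothetical log line, for illustration only.
healthy_capture = "Jun  8 10:00:01 host bgp0#bgpcfgd: PrefixListMgr initialized\n"
# What a collector returns when it reads a log sink the daemon never writes to.
empty_capture = ""

# Negative check ("the message must NOT appear"): passes vacuously on empty input,
# so it says nothing about what the daemon actually did.
assert not log_contains(empty_capture, "PrefixListMgr")

# Positive check ("the message must appear"): works on a real capture...
assert log_contains(healthy_capture, "PrefixListMgr")
# ...but evaluates to False on an empty capture, no matter what the daemon logged.
assert log_contains(empty_capture, "PrefixListMgr") is False
```

One way to guard against this, under the same assumptions, is to first assert that the capture itself is non-empty before running any negative check against it.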

How did you do it?

  • Added place_syslog_marker() — injects a UUID-tagged marker via logger -t SONIC_TEST
  • Added collect_syslog_since_marker() — uses sed -n "/<marker>/,\$p" /var/log/syslog | grep <pattern>
  • Restructured test_prefix_list_mgr_running_on_every_device into a per-container loop (marker → restart → wait → assert) instead of batch restart + batch check
  • Removed BGPCFGD_LOG_WINDOW_SECONDS constant and unused restart_bgpcfgd_only() helper
  • Fixed SUPPRESS_PREFIX direct DB path test to use same marker approach

The /var/log/syslog approach works on all SONiC platforms (single-ASIC, multi-ASIC chassis, VS) since rsyslog is the universal log sink.
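
The marker approach described above can be sketched as follows. This is not the code from the PR: the function names, the `TEST_MARKER_` prefix, and the use of `grep -e` are assumptions made for illustration. The first two helpers only build the shell commands named in the description; the last one is a pure-Python equivalent of the `sed | grep` pipeline, so the windowing logic can be checked without a device.

```python
import re
import shlex
import uuid

SYSLOG_PATH = "/var/log/syslog"


def make_marker() -> str:
    """Build a unique marker string (the prefix is a hypothetical choice)."""
    return "TEST_MARKER_{}".format(uuid.uuid4().hex)


def marker_command(marker: str) -> str:
    """Shell command that writes the marker to syslog with the SONIC_TEST tag."""
    return "logger -t SONIC_TEST {}".format(shlex.quote(marker))


def collect_command(marker: str, pattern: str) -> str:
    """Shell command that prints syslog from the marker line onward, then filters."""
    sed_script = "/{}/,$p".format(marker)  # print from the first marker match to EOF
    return "sed -n {} {} | grep -e {}".format(
        shlex.quote(sed_script), shlex.quote(SYSLOG_PATH), shlex.quote(pattern)
    )


def lines_since_marker(log_lines, marker, pattern):
    """Pure-Python equivalent of the sed | grep pipeline above."""
    started = False
    matched = []
    for line in log_lines:
        if not started and marker in line:
            started = True  # the marker line itself is part of the window
        if started and re.search(pattern, line):
            matched.append(line)
    return matched


if __name__ == "__main__":
    marker = make_marker()
    print(marker_command(marker))
    print(collect_command(marker, "PrefixListMgr"))
```

Because the window starts at the marker line, entries written before the marker are ignored even if they match the pattern. If the marker is absent from the file, the window is empty, which is the case a caller would want to detect rather than treat as "no match".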

How did you verify/test it?

| Scenario | Result |
| --- | --- |
| T0 KVM VS (ToRRouter, single-ASIC) | ✅ PASSED |
| T0 KVM VS (SpineRouter/UpstreamLC via CONFIG_DB hack) | ✅ PASSED |
| test_suppress_prefix_on_non_spine_device (SUPPRESS_PREFIX direct DB path) | ✅ PASSED |

All three test functions exercising the marker-based syslog logic pass on latest master nightly VS image.

Any platform specific information?

None — the fix is platform-independent. Works on any SONiC device with rsyslog (all platforms).

Supported testbed topology if it's a new test case?

Not a new test case. Existing test supports topologies: t0, t1, t2.

Documentation

N/A — no new features or test cases added.

@mssonicbld

Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld

Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Copilot AI left a comment

Contributor

Pull request overview

This PR updates tests/bgp/test_prefix_list_suppress.py to make syslog-based assertions reliable on multi-ASIC chassis platforms by switching from journalctl --since log collection to a marker-based window over /var/log/syslog.

Changes:

  • Added syslog marker placement and “collect since marker” helpers to scope log assertions to a known point in time.
  • Reworked test_prefix_list_mgr_running_on_every_device to restart bgpcfgd per BGP container and assert logs per restart window.
  • Updated the SUPPRESS_PREFIX direct CONFIG_DB path test to use the same marker-based syslog windowing.
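
The per-container flow (marker, restart, wait, assert) could look roughly like the sketch below. It is an assumption-laden illustration, not the test's actual code: `check_each_container` and all of its callable parameters are hypothetical stand-ins for whatever the test uses to place a marker, restart bgpcfgd in one container, and collect matching log lines since the marker. A plain polling loop stands in for the test framework's own wait helper.

```python
import time
from typing import Callable, Dict, List


def check_each_container(
    containers: List[str],
    place_marker: Callable[[], str],
    restart: Callable[[str], None],
    collect_since: Callable[[str, str], str],
    pattern_for: Callable[[str], str],
    timeout_s: float = 30.0,
    poll_s: float = 1.0,
) -> Dict[str, bool]:
    """For each container: place a marker, restart, then poll the log window."""
    results = {}
    for container in containers:
        marker = place_marker()  # start of this container's log window
        restart(container)  # hypothetical: restarts bgpcfgd in one container
        deadline = time.monotonic() + timeout_s
        found = False
        while True:
            # Only lines logged after this container's marker are considered.
            if collect_since(marker, pattern_for(container)):
                found = True
                break
            if time.monotonic() >= deadline:
                break
            time.sleep(poll_s)
        results[container] = found
    return results
```

Giving every container its own marker keeps one container's startup messages from satisfying the assertion for another, which a single batch restart followed by a single batch check cannot guarantee.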

shixizhang previously approved these changes Jun 9, 2026
Replace journalctl-based collect_recent_syslog() with syslog marker-based
collect_syslog_since_marker(). On multi-ASIC chassis (T2/VoQ), container
daemons log via rsyslog to /var/log/syslog, not systemd journal. The old
journalctl approach always returned empty on these platforms, causing
assertions to silently pass (negative check) or fail (positive check).

Changes:
- Add place_syslog_marker() helper using logger + uuid
- Add collect_syslog_since_marker() using sed from marker in /var/log/syslog
- Restructure test_prefix_list_mgr_running_on_every_device into per-container
  loop: marker → restart → wait → assert for each container individually
- Remove BGPCFGD_LOG_WINDOW_SECONDS constant (no longer needed)
- Remove unused restart_bgpcfgd_only() helper
- Fix second caller (SUPPRESS_PREFIX direct DB path) to use marker approach

Fixes: ADO #38303118
Signed-off-by: Deepak Singhal <deepsinghal@microsoft.com>
@mssonicbld

Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@deepak-singhal0408

Contributor Author

Hi @yxieca, could you help merge this PR? This fix is for a nightly test failure, and the change is also needed in the 202605 branch. Thanks,

@yxieca yxieca merged commit dd8c726 into sonic-net:master Jun 12, 2026
23 checks passed
sdeotarse-msft pushed a commit to SoumyaMishra18/sonic-mgmt that referenced this pull request Jun 12, 2026
…IC (sonic-net#25204)

What: Replaces journalctl-based collect_recent_syslog() with a syslog marker-based collect_syslog_since_marker() in tests/bgp/test_prefix_list_suppress.py, restructures test_prefix_list_mgr_running_on_every_device into a per-container marker→restart→wait→assert loop, and removes the unused BGPCFGD_LOG_WINDOW_SECONDS constant and restart_bgpcfgd_only() helper.
Why: On multi-ASIC chassis (T2/VoQ), container daemons log via rsyslog to /var/log/syslog, not the systemd journal, so journalctl --since always returned empty — negative assertions silently passed and positive assertions failed, causing nightly failures on T2 UpstreamLC linecards. Fixes ADO #38303118.
How: Injects a UUID-tagged marker via logger -t SONIC_TEST and scopes assertions with sed -n "/<marker>/,\$p" /var/log/syslog | grep <pattern>, which works on all SONiC platforms since rsyslog is the universal log sink.
Testing: T0 KVM VS (ToRRouter single-ASIC) PASSED; T0 KVM VS (SpineRouter/UpstreamLC via CONFIG_DB) PASSED; test_suppress_prefix_on_non_spine_device PASSED. All CI green.

Signed-off-by: Deepak Singhal <deepsinghal@microsoft.com>

5 participants