[code sync] Merge code from sonic-net/sonic-platform-daemons:202505 to 202506 #45

Merged
mssonicbld merged 2 commits into Azure:202506 from mssonicbld:sonicbld/202506-merge on Sep 4, 2025

* 77c70bd - (origin/202505) xcvrd: gate config publish on syncd.restore_count = 1 (instead of system warm-reboot flag) to prevent post-warm-reboot port flaps (#671) (2025-09-03) [mssonicbld]

mssonicbld and others added 2 commits September 3, 2025 17:13
…tem warm-reboot flag) to prevent post-warm-reboot port flaps (#671)

#### Description
This change switches xcvrd’s warm-reboot readiness check from the system warm-reboot flag to syncd’s `restore_count != 0` in STATE_DB. In production, the finalizer clears the system warm-reboot flag before xcvrd publishes optics/port configs. xcvrd then assumes the warm reboot is over and pushes configs too early, which has caused all ports to flap.
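
The gating described above can be sketched roughly as follows. This is a minimal illustration, not the actual xcvrd code: the function names are hypothetical, STATE_DB is modeled as a plain nested mapping rather than a real database connection, and treating a missing or malformed `restore_count` as 0 is an assumption. The key `WARM_RESTART_TABLE|syncd` and the field `restore_count` are taken from the `redis-cli` output in the test section below.

```python
from typing import Mapping

# Key and field as they appear in the redis-cli output in this PR.
SYNCD_WARM_RESTART_KEY = "WARM_RESTART_TABLE|syncd"
RESTORE_COUNT_FIELD = "restore_count"


def syncd_restore_count(state_db: Mapping[str, Mapping[str, str]]) -> int:
    """Return syncd's restore_count from a dict-like view of STATE_DB.

    Assumption: a missing or non-numeric value is treated as 0,
    i.e. syncd did not warm-restore.
    """
    entry = state_db.get(SYNCD_WARM_RESTART_KEY, {})
    raw = entry.get(RESTORE_COUNT_FIELD)
    try:
        return int(raw)
    except (TypeError, ValueError):
        return 0


def should_publish_port_config(state_db: Mapping[str, Mapping[str, str]]) -> bool:
    """Hypothetical gate: publish only when syncd did not warm-restore.

    restore_count != 0 means syncd restored its state across a warm
    reboot, so publishing is skipped to avoid flapping the ports.
    """
    return syncd_restore_count(state_db) == 0
```

The point of keying on syncd rather than the system flag is that the finalizer can clear the system flag at any time, while syncd's `restore_count` still records whether the ASIC state was restored.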

#### Motivation and Context
All ports flapped during warm-reboot on an Arista device during the 202411->202505 upgrade.

#### How Has This Been Tested?
root@str-7060-cx32-1:/var/log# sudo grep -nEi 'down[[:space:]]+to[[:space:]]+up|up[[:space:]]+to[[:space:]]+down' /var/log/syslog
root@str-7060-cx32-1:/var/log# sudo grep -nEi 'down[[:space:]]+to[[:space:]]+up|up[[:space:]]+to[[:space:]]+down' /var/log/syslog.1
root@str-7060-cx32-1:/var/log#
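
The two `grep` commands above return nothing, which is the expected result when no port changed state. For reference, the same check could be scripted along these lines. This is only a sketch: `find_flap_lines` is a hypothetical helper, and `\s` is used as an approximate stand-in for the POSIX `[[:space:]]` class.

```python
import re
from typing import Iterable, List, Tuple

# Case-insensitive equivalent of the grep -nEi pattern used above.
FLAP_PATTERN = re.compile(r"down\s+to\s+up|up\s+to\s+down", re.IGNORECASE)


def find_flap_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for link-state transitions.

    Line numbers start at 1, matching grep -n. An empty result means
    no transition was logged.
    """
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if FLAP_PATTERN.search(line)
    ]
```

Passing an open file object for `/var/log/syslog` as `lines` would reproduce the first command; an empty list corresponds to the empty output shown.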

Warm upgrade from 202311 to 202505 (an unsupported upgrade path) caused syncd to crash in the new image. syncd then performs a cold restart and resets restore_count, so in this case xcvrd publishes SI settings correctly.
```
root@str2-7050cx3-acs-14:~# redis-cli -n 6 hgetall "WARM_RESTART_TABLE|syncd"
1) "restore_count"
2) "0"
root@str2-7050cx3-acs-14:~# show warm_restart state
name restore_count state
------------- --------------- -----------------------
fdbsyncd 0 disabled
teamsyncd 0 reconciled
bgp 0 disabled
teammgrd 0
syncd 0
neighsyncd 0 reconciled
nbrmgrd 0
warm-shutdown 0 warm-shutdown-succeeded
portsyncd 0
coppmgrd 0
xcvrd 0
vlanmgrd 0 reconciled
orchagent 0 reconciled
rebootbackend 0
gearsyncd 0
tunnelmgrd 0 reconciled
vxlanmgrd 0 reconciled
intfmgrd 0 disabled
vrfmgrd 0 disabled
root@str2-7050cx3-acs-14:~#
root@str2-7050cx3-acs-14:~# show reboot-cause
User issued 'warm-reboot' command [User: admin, Time: Tue 02 Sep 2025 10:31:47 PM UTC]
root@str2-7050cx3-acs-14:~#
root@str2-7050cx3-acs-14:~# sonic-db-cli STATE_DB hget "WARM_RESTART_ENABLE_TABLE|system" enable
false
root@str2-7050cx3-acs-14:~#
root@str2-7050cx3-acs-14:~# sudo zgrep -ai "xcvrd.*publish" /var/log/syslog
2025 Sep 2 22:42:45.394080 str2-7050cx3-acs-14 NOTICE pmon#xcvrd[37]: Publishing ASIC-side SI setting for port Ethernet108 in APP_DB:
2025 Sep 2 22:42:45.398251 str2-7050cx3-acs-14 NOTICE pmon#xcvrd[37]: Notify media setting: Published ASIC-side SI setting for lport Ethernet108 in APP_DB
2025 Sep 2 22:42:45.471204 str2-7050cx3-acs-14 NOTICE pmon#xcvrd[37]: Publishing ASIC-side SI setting for port Ethernet32 in APP_DB:

No ports flapped during the 202411->202505 warm upgrade after the changes were injected into the pmon container.
```
#### Additional Information (Optional)
mssonicbld merged commit f3bd628 into Azure:202506 on Sep 4, 2025
3 checks passed