lib: fix memory leak in link state message ownership model #22274

Open
guoguojia2021 wants to merge 1 commit into FRRouting:master from guoguojia2021:fix/lib-ls-msg-ownership-leak

@guoguojia2021

Contributor

ls_delete_msg() only frees the inner data (node/attr/prefix) when msg->event == LS_MSG_EVENT_DELETE. For ADD/UPDATE/SYNC events, it assumes the TED (Traffic Engineering Database) has taken ownership of the data via ls_msg2vertex/ls_msg2edge/ls_msg2subnet, which call ls_vertex_add/ls_edge_add (store pointer) or ls_vertex_update/ls_edge_update (store or free if duplicate).

This assumption breaks when the TED is not initialized or when processing fails. For example, path_ted_rcvd_message() returns early if !path_ted_is_initialized(), so the data is never stored in the TED. ls_delete_msg() then skips freeing because the event is not DELETE, and the inner data (including the admin_group bitmap allocated by admin_group_init) leaks.
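The leak path described above can be reproduced in miniature. The following is only a sketch under stated assumptions: `old_msg`, `old_rcvd_message`, `old_delete_msg` and the `old_live_nodes` counter are hypothetical stand-ins, not the real FRR types or functions, and an `int` plays the role of the node payload.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the real lib/link_state types. */
enum old_event { OLD_EV_ADD, OLD_EV_DELETE };

struct old_msg {
	enum old_event event;
	int *node; /* plays the role of msg->data.node */
};

static int old_live_nodes; /* payloads allocated but not yet freed */

static struct old_msg *old_msg_new(enum old_event event)
{
	struct old_msg *msg = calloc(1, sizeof(*msg));

	if (!msg)
		return NULL;
	msg->event = event;
	msg->node = calloc(1, sizeof(*msg->node));
	if (msg->node)
		old_live_nodes++;
	return msg;
}

/* Receiver that bails out early when there is no TED to store into. */
static bool old_rcvd_message(int **ted_slot, struct old_msg *msg)
{
	if (!ted_slot)
		return false; /* payload was never stored anywhere */
	*ted_slot = msg->node; /* the TED now owns the payload */
	return true;
}

/* Pre-fix cleanup: the payload is freed only for DELETE events. */
static void old_delete_msg(struct old_msg *msg)
{
	if (!msg)
		return;
	if (msg->event == OLD_EV_DELETE && msg->node) {
		free(msg->node);
		old_live_nodes--;
	}
	free(msg);
}
```

Calling `old_rcvd_message(NULL, msg)` followed by `old_delete_msg(msg)` on an ADD message leaves `old_live_nodes` at 1: nobody stored the payload and nobody freed it.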

Fix the ownership model with explicit ownership transfer:

  1. In ls_msg2vertex(): for SYNC/ADD/UPDATE, set msg->data.node = NULL after ls_vertex_add/ls_vertex_update (which either stores the pointer in TED or frees it if duplicate -- either way the caller no longer owns it). DELETE only uses the node for lookup via ls_find_vertex_by_id(), so msg->data.node remains valid for cleanup.

  2. In ls_msg2edge(): same pattern -- set msg->data.attr = NULL after ls_edge_add/ls_edge_update for SYNC/ADD/UPDATE.

  3. In ls_msg2subnet(): same pattern -- set msg->data.prefix = NULL after ls_subnet_add/ls_subnet_update for SYNC/ADD/UPDATE.

  4. In ls_delete_msg(): remove the msg->event == LS_MSG_EVENT_DELETE condition. Always call ls_node_del/ls_attributes_del/ls_prefix_del on the inner data. All three functions have NULL checks at entry (e.g., ls_attributes_del checks "if (!attr) return"), so when TED has taken ownership (pointer set to NULL), the call is a safe no-op. When TED has NOT taken ownership, the data is properly freed.
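The four steps above amount to an explicit ownership-transfer pattern. A minimal sketch of that pattern follows, assuming simplified stand-in types: `fix_msg`, `fix_ted`, `fix_msg2vertex`, `fix_delete_msg` and the `fix_live_nodes` counter are hypothetical names for illustration, and the single-slot "TED" is far simpler than the real database.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; not the real lib/link_state types. */
enum fix_event { FIX_EV_SYNC, FIX_EV_ADD, FIX_EV_UPDATE, FIX_EV_DELETE };

struct fix_msg {
	enum fix_event event;
	int *node; /* plays the role of msg->data.node */
};

struct fix_ted {
	int *stored; /* a one-entry "database" */
};

static int fix_live_nodes; /* payloads allocated but not yet freed */

static struct fix_msg *fix_msg_new(enum fix_event event, int id)
{
	struct fix_msg *msg = calloc(1, sizeof(*msg));

	if (!msg)
		return NULL;
	msg->event = event;
	msg->node = malloc(sizeof(*msg->node));
	if (msg->node) {
		*msg->node = id;
		fix_live_nodes++;
	}
	return msg;
}

/* NULL-safe delete helper, mirroring the "if (!attr) return" check. */
static void fix_node_del(int *node)
{
	if (!node)
		return;
	free(node);
	fix_live_nodes--;
}

static void fix_msg2vertex(struct fix_ted *ted, struct fix_msg *msg)
{
	if (!msg->node)
		return;
	if (msg->event == FIX_EV_DELETE) {
		/* Lookup only: the message keeps its payload for cleanup. */
		if (ted->stored && *ted->stored == *msg->node) {
			fix_node_del(ted->stored);
			ted->stored = NULL;
		}
		return;
	}
	/* SYNC/ADD/UPDATE: store the payload, replacing any old entry. */
	fix_node_del(ted->stored);
	ted->stored = msg->node;
	msg->node = NULL; /* explicit ownership transfer */
}

/* Post-fix cleanup: always delete; a no-op once ownership moved. */
static void fix_delete_msg(struct fix_msg *msg)
{
	if (!msg)
		return;
	fix_node_del(msg->node);
	free(msg);
}
```

In this sketch a message that is never consumed is fully freed by `fix_delete_msg()`, while a consumed ADD message leaves its payload alive in `ted.stored` and the cleanup call touches only the message itself.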

@greptile-apps

greptile-apps Bot commented Jun 9, 2026

Greptile Summary

This PR fixes a memory leak in the link-state message ownership model: ls_delete_msg previously skipped freeing inner data for non-DELETE events (assuming the TED had taken ownership), but when the TED is not initialized (e.g., path_ted_is_initialized() returns false), the data was never transferred and leaked.

  • ls_msg2vertex, ls_msg2edge, and ls_msg2subnet now set msg->data.{node,attr,prefix} to NULL after SYNC/ADD/UPDATE to signal ownership transfer, and ls_delete_msg unconditionally calls the _del helpers (which are NULL-safe no-ops).
  • The fix correctly handles the described scenario (TED not initialized → early return → data never stored → ls_delete_msg now frees it). A remaining gap: the NULL assignment fires unconditionally even when ls_vertex_add/ls_edge_add fails (returns NULL due to invalid key or AF_UNSPEC), meaning ownership is not actually transferred but msg->data is nulled anyway, leaving a pre-existing leak path the PR claims to have closed.

Confidence Score: 3/5

The fix correctly handles the main described scenario (TED not initialized), but the NULL assignment is unconditional: when ls_vertex_add or ls_edge_add returns NULL without taking ownership, the fix's stated invariant does not hold, and a narrow class of add-failure leaks remains.

The ownership transfer via msg->data.node = NULL is set unconditionally in ADD/SYNC/UPDATE branches even when ls_vertex_add returns NULL (key == 0, UNKNOWN origin) or ls_edge_add returns NULL (AF_UNSPEC, no valid local IP). In those paths, neither function stores nor frees the pointer, but the msg field is zeroed, making the allocation permanently unreachable. While this mirrors a pre-existing leak and does not cause new crashes, the PR description claims these paths are covered by the invariant, which is incorrect.

lib/link_state.c — specifically the SYNC/ADD/UPDATE branches in ls_msg2vertex and ls_msg2edge where the NULL assignment should be guarded by a non-NULL return check.

Important Files Changed

| Filename | Overview |
|---|---|
| lib/link_state.c | Fixes memory leak in ls_delete_msg for non-DELETE events by removing the event guard and nulling msg->data pointers after add/update; however the NULL assignment is unconditional even when ls_vertex_add/ls_edge_add return NULL without taking ownership, leaving a pre-existing leak path unaddressed and the stated invariant incorrect. |

Reviews (1): Last reviewed commit: "lib: fix memory leak in link state messa..."

Comment thread lib/link_state.c Outdated
Comment on lines +1957 to +1972
	vertex = ls_vertex_add(ted, node);
	if (vertex)
		vertex->status = SYNC;
	msg->data.node = NULL;
	break;
case LS_MSG_EVENT_ADD:
	vertex = ls_vertex_add(ted, node);
	if (vertex)
		vertex->status = NEW;
	msg->data.node = NULL;
	break;
case LS_MSG_EVENT_UPDATE:
	vertex = ls_vertex_update(ted, node);
	if (vertex)
		vertex->status = UPDATE;
	msg->data.node = NULL;

P1 Unconditional NULL when ls_vertex_add fails to take ownership

msg->data.node = NULL is set even when ls_vertex_add returns NULL — specifically when key == 0 (e.g., node->adv.origin == UNKNOWN). In that code path, ls_vertex_add neither stores nor frees the node, so the caller still owns the pointer. Setting msg->data.node = NULL silently discards it, causing ls_delete_msg to call ls_node_del(NULL) (a no-op) and leak the allocation. The same issue applies to ls_edge_add returning NULL when key.family == AF_UNSPEC (no valid local IP in attributes). The PR description's invariant — "either stores the pointer in TED or frees it if duplicate" — does not hold for add failures; ownership is only transferred when the function returns non-NULL. The NULL assignment should be guarded by if (vertex) (or if (edge)/if (subnet)) to ensure it only fires when ownership was actually transferred.
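The guard suggested in this comment can be illustrated with a small sketch. It is an illustration under stated assumptions rather than the FRR code: `g_vertex_add`, `g_msg2vertex` and `g_delete_msg` are hypothetical stand-ins, an `int` plays the role of the node, and a value of 0 models the rejected `key == 0` case.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; not the real lib/link_state types. */
struct g_vertex {
	int *node;
};

struct g_msg {
	int *node; /* plays the role of msg->data.node */
};

/*
 * Add that can fail: a zero key is rejected, and in that case the
 * node is neither stored nor freed, so the caller still owns it.
 */
static struct g_vertex *g_vertex_add(struct g_vertex *slot, int *node)
{
	if (!node || *node == 0)
		return NULL;
	slot->node = node;
	return slot;
}

static struct g_vertex *g_msg2vertex(struct g_vertex *slot, struct g_msg *msg)
{
	struct g_vertex *vertex = g_vertex_add(slot, msg->node);

	/* Clear the message pointer only if ownership really moved. */
	if (vertex)
		msg->node = NULL;
	return vertex;
}

/* Cleanup that always frees; free(NULL) is a no-op in standard C. */
static void g_delete_msg(struct g_msg *msg)
{
	free(msg->node);
	msg->node = NULL;
}
```

With the guard in place, a rejected node stays reachable through the message, so the unconditional cleanup still frees it; an accepted node is detached from the message and survives the cleanup.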

Comment thread lib/link_state.c Outdated
	edge = ls_edge_add(ted, attr);
	if (edge)
		edge->status = SYNC;
	msg->data.attr = NULL;

Member

Why is this not just set one time outside the switch? Same comment for all.

Contributor Author

Good point. I’ve moved the NULL assignment outside the switch.
Thanks for catching that!
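The revised code is not shown in this thread, so the following is only a guess at the shape of a single assignment placed after the switch. It assumes two things the thread does not confirm: that DELETE must be excluded (it uses the node for lookup only), and that the assignment is guarded by the returned pointer as the bot review suggested. All `sw_` names are hypothetical stand-ins, and `const int *` pointers model the payload without any allocation.

```c
#include <stddef.h>

/* Hypothetical stand-ins; not the real lib/link_state types. */
enum sw_event { SW_SYNC, SW_ADD, SW_UPDATE, SW_DELETE };
enum sw_status { SW_ST_UNSET, SW_ST_SYNC, SW_ST_NEW, SW_ST_UPDATE, SW_ST_DELETE };

struct sw_vertex {
	enum sw_status status;
	const int *node;
};

struct sw_msg {
	enum sw_event event;
	const int *node; /* plays the role of msg->data.node */
};

/* Store helper that rejects a zero key without taking the node. */
static struct sw_vertex *sw_store(struct sw_vertex *slot, const int *node)
{
	if (!node || *node == 0)
		return NULL;
	slot->node = node;
	return slot;
}

static struct sw_vertex *sw_msg2vertex(struct sw_vertex *slot, struct sw_msg *msg)
{
	struct sw_vertex *vertex = NULL;

	switch (msg->event) {
	case SW_SYNC:
		vertex = sw_store(slot, msg->node);
		if (vertex)
			vertex->status = SW_ST_SYNC;
		break;
	case SW_ADD:
		vertex = sw_store(slot, msg->node);
		if (vertex)
			vertex->status = SW_ST_NEW;
		break;
	case SW_UPDATE:
		vertex = sw_store(slot, msg->node);
		if (vertex)
			vertex->status = SW_ST_UPDATE;
		break;
	case SW_DELETE:
		/* Lookup only: the message keeps its payload. */
		if (slot->node && msg->node && *slot->node == *msg->node)
			vertex = slot;
		if (vertex)
			vertex->status = SW_ST_DELETE;
		break;
	}

	/* One assignment for the three storing events. */
	if (vertex && msg->event != SW_DELETE)
		msg->node = NULL;
	return vertex;
}
```

Keeping the event check next to the single assignment matters in this sketch because the DELETE branch also returns a non-NULL vertex when the lookup succeeds.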

Signed-off-by: guozhongfeng <guozhongfeng.gzf@alibaba-inc.com>
guoguojia2021 force-pushed the fix/lib-ls-msg-ownership-leak branch from e0353c0 to e4aeb9c on June 10, 2026 02:45
github-actions Bot added the rebase (PR needs rebase) label on Jun 10, 2026
2 participants