Skip to content

eve: fix pcap_filename race with --pcap-file-recursive - v7#15741

Open
oferda4 wants to merge 3 commits into
OISF:mainfrom
oferda4:fix/eve-pcap-filename-race
Open

eve: fix pcap_filename race with --pcap-file-recursive - v7#15741
oferda4 wants to merge 3 commits into
OISF:mainfrom
oferda4:fix/eve-pcap-filename-race

Conversation

@oferda4

@oferda4 oferda4 commented Jun 27, 2026

Copy link
Copy Markdown

When processing multiple pcap files with --pcap-file-recursive, the pcap_filename field in EVE JSON events could show the wrong filename. The RX thread updates a global variable as it moves to the next file, but worker threads logging events from the previous file would read the already-updated global.

Use the per-packet PcapFileFileVars reference (p->pcap_v.pfv->filename) that is set at capture time, making the filename immutable for the packet's lifetime. For flow/netflow events where no packet is available (p == NULL), the global fallback is correct as these events are emitted synchronously on the RX thread.

BUGFIX: We also fix in the first commit a bug that we did not increase the ref count in all needed cases. This raised now that we have to have it in this scenario. The first commit handles it.

Previous PR: #15176

v2:

  • Avoid extending the flow struct.

v3:

  • Update schema.

v4:

  • Name the union (capture) so direct access requires f->capture.livedev, preventing future code from accidentally misinterpreting the union value.
  • Update flow-hash.c, flow-manager.c, and util-ebpf.c for the named union.
  • Use FlowGetLiveDev() accessor in flow-manager.c (bypassed-flow stats).
  • Clarify PcapFileGetFilename() fallback comment (stats events only).

v5:

  • Split commit 1 into two: flow: introduce capture union (pure mechanical rename) and flow: fix ref_cnt leak and use-after-free (accessors, ref counting, tests)
  • Replace raw (LiveDevice_*) cast with FlowGetLiveDev(f) on pseudo-packet creation in stream-tcp.c and flow-timeout.c
  • Add comment in CmpFlowPacket() explaining the pointer-equality trick for pcap-file mode
  • Drop invalid "optional" JSON Schema keyword from in_iface in schema.json; keep description only

v6:

  • Squash "eve: guard in_iface field" into the ref_cnt commit so no intermediate state has the union misread bug (3 commits instead of 4)
  • Add clarifying comment in stream-tcp.c that Packet keeps livedev and pcap_v as separate fields (not a union), so both assignments are needed
  • Add comment in output-json.c that the PcapFileGetFilename() legacy path is only used for stats events

v7:

  • Rebased onto current main. Upstream's "flow: store livedev by id" replaced the livedev pointer with a uint16_t livedev_id, so the previous design (the capture union reusing the livedev pointer slot, the FlowGetLiveDev()/FlowGetPcapFileVars() union accessors, the in_iface guards and the CmpFlowPacket() comment) no longer applies and has been dropped
  • New approach: the flow remembers its source pcap file via per-flow storage (pcap_file_vars), registered for offline run modes only - so live capture is unaffected and the core Flow struct is not grown (addresses the earlier "avoid extending the flow struct" concern). The stored PcapFileFileVars is ref-counted (referenced in FlowInit, released by the storage free callback) so it outlives the source file in --pcap-file-recursive.
  • Flow/netflow events (p == NULL, emitted from the flow-manager thread) now read the flow's stored file instead of the global, correcting the earlier assumption that the global was safe for those events.
  • Flow-timeout and stream detect-log-flush pseudo packets carry the flow's pcap file (no extra ref: each holds a flow reference that outlives it), fixing both the filename and per-file alert attribution.
  • output-json resolves the filename as packet -> flow -> global (the global is now used for stats events only).
  • Addressed prior review: trimmed the verbose comments and reworked/renumbered the unit tests.

Redmine ticket: https://redmine.openinfosecfoundation.org/issues/5255
SV_BRANCH=OISF/suricata-verify#3013

oferda4 added 3 commits June 27, 2026 13:22
With --pcap-file-recursive the RX thread advances the global pcap_filename
as it moves between files, so an event emitted later for an earlier file
would report the wrong name. Remember each flow's source file so such events
can use it.

Register a per-flow storage holding the flow's PcapFileFileVars, set (and
referenced) in FlowInit and released through the storage free callback. It is
registered for offline run modes only, so live-capture flows are unaffected.
The reference keeps the file vars alive while the flow exists, even after the
RX thread has moved on and the file would otherwise be cleaned up.

Bug OISF#5255.
Flow-timeout and stream detect-log-flush pseudo packets have no capture file
of their own. Set their pcap file vars from the flow so events emitted while
processing them (and per-file alert counts) are attributed to the file the
flow's packets came from, not the file the RX thread happens to be on.

No extra reference is taken: each pseudo packet holds a flow reference and the
flow holds the file reference, so it outlives the packet.

Bug OISF#5255.
When running with --pcap-file-recursive, EVE output was using the global
pcap_filename string to populate the "pcap_filename" field.  This string
is updated by the RX thread as it moves from file to file, so by the time
a worker thread emits a log entry the global may already point to the next
file, recording the wrong filename.

Fix: read the filename from p->pcap_v.pfv->filename instead, which is set
per-packet by the RX thread and is stable for the lifetime of the packet.
The previous commit stores a pfv reference in the Flow so that flow-timeout
pseudo packets also carry the correct filename.

For stream pseudo-packets (PKT_SRC_STREAM_TCP_DETECTLOG_FLUSH), also
propagate p->pcap_v.pfv from the flow so that http/fileinfo events emitted
during stream flushing report the correct per-flow filename.

For events where p == NULL (flow and netflow records), use the pfv stored
in the flow (FlowGetPcapFileVars) so that flow events report the file the
flow's packets came from, not the last file the RX thread processed.  Fall
back to PcapFileGetFilename() only when f is also NULL (e.g. stats events).

A BUG_ON guards against p != NULL with a NULL pfv, which would indicate a
missing pfv assignment in the capture path.

Regression tests: SourcePcapFileHelperTest18 (per-packet filename is
independent of the global), SourcePcapFileHelperTest19 (stats fallback to
global when both p and f are NULL).

Fixes OISF#5255.
@oferda4 oferda4 requested a review from victorjulien as a code owner June 27, 2026 11:36
@github-actions

Copy link
Copy Markdown

NOTE: This PR may contain new authors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant