Skip to content

fireflyb: skip invalid collector events#52

Open
ShawnMcKee wants to merge 3 commits intomainfrom
fix/firefly-collector-validation
Open

fireflyb: skip invalid collector events#52
ShawnMcKee wants to merge 3 commits intomainfrom
fix/firefly-collector-validation

Conversation

@ShawnMcKee
Copy link
Copy Markdown
Contributor

@ShawnMcKee ShawnMcKee commented Mar 9, 2026

Summary

  • Add a validation gate before collector forwarding in firefly backend
  • Drop collector sends for synthetic/invalid flow IDs (unspecified IPs, zero ports, missing state timestamps)
  • Add unit tests for collector-flow validation paths

Why

perfSONAR marker bootstrap emits a synthetic wildcard FlowID used for marking setup.
Those records should not be sent to global collector endpoints.

Validation

  • IDE diagnostics clean for touched files
  • Unit tests added under backends/fireflyb/comms_test.go
  • Go test could not be executed in this environment because go toolchain is not installed on host shell

Add validation for collector-bound flowIDs to avoid sending synthetic wildcard perfsonar events.\n\nReject zero ports, unspecified addresses, and missing START/END timestamps before collector forwarding.\nInclude focused unit tests for validation paths.
@pcolladosoto
Copy link
Copy Markdown
Member

Hi @ShawnMcKee! Thanks a ton for the PR.

Just to be sure, these 'synthetic' FlowIDs you're observing are the ones generated by the perfSONAR plugin, right? The perfSONAR plugin was initially conceived to only cooperate with the marker backend: that's why the interaction of this flowdID might be a bit awkward with other backends. If you let me know what's the inteneded workflow for the perfSONAR use case we can maybe come up with a cleaner way to enable 'wildcard marking' of IPv6 datagrams without polluting the channels with an outright wrong flowdID. That way we can improve the design instead of putting a band-aid in a single place today to need one somewhere else tomorrow...

Also, I took the liberty to edit the PR title. I guess it was (somewhat) AI genearated as the content was a single line with inline control sequences... Feel free to revert the edit if you see fit.

Thanks again!

@ShawnMcKee
Copy link
Copy Markdown
Contributor Author

Thanks for looking at the PR closely. What we want is to have perfSONAR measurements send fireflies, both to the destination and to the global (or regional) firefly collector. However, I think we do NOT want to do this for latency measurements, which send packets at 10 Hz (depends upon config) nor for traceroutes. So, the main thing is to have throughput measurements send fireflies, at least for now. For packet marking, I think we can change the packet headers on ALL types, if the marker backend is enabled. Any questions or suggestions? Thanks!

The perfSONAR plugin is exclusively leveraged to mark all
IPv6 datagrams egressing from a machine. In an effort to
adhere to existing interfaces an initial implementation
generated a single, 'dummy' flow ID which was only interpreted
by the marker backend. This initialy flow ID caused problems
on backends not checking whether it was valid before ingesting
it.

One approach could hinge on checking flow ID validity on every
backend but this poses two problems. On the one hand, this
check should be made for **every** flow ID. In scenarios with
high churn this could lead to performance problems. On the other
hand, logic checking a flow ID's correctness should be leveraged
in every backend, thus imposing a constranint that would ideally
be specified on the Backend interface. Instead of complicating the
design or compromising performance we judge it best to sidestep
the problem altogether.

The solution implemented in this commit simply overwrites the
marker backend's configuration at runtime to trigger the marking
of every flow at backend startup.
@pcolladosoto
Copy link
Copy Markdown
Member

Hi @ShawnMcKee! Sorry for the late turnaround... I just added some changes sidesteppign the boradcast of invalid flow IDs altogether. Could you give it a try on a working perfSONAR machine (I don't have easy access to one...) and let me know how it fares?

I've simply removed the validity check on the firefly backend for now so that it doesn't interfere with the fix put in place.

Thanks a million!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants