
Add gRPC client to FlowAggregator and UI frontend charts #1190

Open: heanlan wants to merge 6 commits into main from feature/flow-visibility

heanlan (Contributor) commented Apr 17, 2026

This PR introduces real-time flow visibility to the Antrea UI by streaming flow records directly from the Flow Aggregator.

  • Backend Integration: Adds a gRPC client (GRPCFlowStreamSubscriber) that connects to the Flow Aggregator's FlowStreamService (port 14740) over server-side TLS. The Flow Aggregator CA certificate is fetched from the flow-aggregator-ca ConfigMap at startup and used to verify the server certificate. Flow records are streamed through a FlowStreamSubscriber interface and exposed to the browser as Server-Sent Events (SSE) at /api/v1/flows/stream.
  • Frontend Visualization: Wires the Service Map and Flow List to consume the live SSE stream using a FlowStreamClient backed by fetch(), which is required because EventSource does not support custom headers such as Authorization. Adds functional filtering by namespace, pod, service, IP, flow type, and direction.
  • Helm Charts: Adds a cross-namespace RoleBinding in the flow-aggregator namespace so the antrea-ui ServiceAccount can read the flow-aggregator-ca ConfigMap for TLS verification. Adds the flowAggregator.enabled, flowAggregator.address, flowAggregator.caConfigMap, and flowAggregator.namespace values to the chart.
  • Upstream Alignment: Copies the FlowStreamService proto stubs locally under pkg/flowpb instead of depending on antrea.io/antrea/v2, which would pull in the entire Antrea module and its transitive dependencies.
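
Because a fetch()-based client receives raw text chunks rather than parsed events, it has to do its own SSE framing. The following is a minimal sketch of that framing step, not the PR's FlowStreamClient: `SSEParser` and `push` are hypothetical names, only `data:` fields are handled, and only `\n` and `\r\n` line endings are recognized. In a browser, the chunks would come from decoding the body of a fetch() response requested with an Authorization header.

```typescript
// Hypothetical incremental SSE parser; not the PR's FlowStreamClient.
// Handles `data:` fields and ignores comment lines (":" prefix), which
// SSE servers commonly use as keepalives.
class SSEParser {
  private buffer = '';
  private dataLines: string[] = [];

  // Feed one decoded text chunk; returns the payloads of all events
  // completed by this chunk. Partial lines are kept for the next call.
  push(chunk: string): string[] {
    this.buffer += chunk;
    const events: string[] = [];
    let newline: number;
    while ((newline = this.buffer.indexOf('\n')) >= 0) {
      let line = this.buffer.slice(0, newline);
      this.buffer = this.buffer.slice(newline + 1);
      if (line.endsWith('\r')) line = line.slice(0, -1);
      if (line === '') {
        // A blank line terminates the current event.
        if (this.dataLines.length > 0) {
          events.push(this.dataLines.join('\n'));
          this.dataLines = [];
        }
      } else if (line.startsWith('data:')) {
        // One leading space after the colon is stripped, per the SSE format.
        const value = line.slice(5);
        this.dataLines.push(value.startsWith(' ') ? value.slice(1) : value);
      }
      // Comment lines and other fields (event:, id:, retry:) are ignored here.
    }
    return events;
  }
}

// An event split across two chunks, preceded by a keepalive comment.
const parser = new SSEParser();
const first = parser.push(': keepalive\n\ndata: {"n"');
const second = parser.push(':1}\n\n');
console.log(first, second);
```

The first call returns no events because the payload is still incomplete; the second returns the single reassembled payload.
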
[Screenshots: two images captured 2026-06-02]

Comment thread build/charts/antrea-ui/values.yaml Outdated
Comment thread build/charts/antrea-ui/templates/_nginx_conf.tpl Outdated
Comment thread pkg/handlers/flowstream/interface.go Outdated
Comment thread pkg/handlers/flowstream/handler.go
Dyanngg force-pushed the feature/flow-visibility branch 2 times, most recently from fad57dc to cb85ddc on May 5, 2026 20:14
heanlan force-pushed the feature/flow-visibility branch 3 times, most recently from 2e24f50 to ce625c4 on May 7, 2026 23:40
heanlan marked this pull request as draft on May 7, 2026 23:42
heanlan force-pushed the feature/flow-visibility branch 3 times, most recently from a5a9fb4 to 513b822 on May 8, 2026 21:04
heanlan marked this pull request as ready for review on May 8, 2026 21:05
heanlan force-pushed the feature/flow-visibility branch 4 times, most recently from 9f99b32 to eebb353 on June 2, 2026 22:28
heanlan requested review from Dyanngg, antoninbas and Copilot on June 2, 2026 22:54

Copilot AI left a comment


Pull request overview

This PR adds end-to-end live Flow Visibility to Antrea UI by introducing a backend gRPC subscriber for Flow Aggregator FlowStreamService, exposing it as an authenticated SSE endpoint, and wiring a new frontend “Flow Visibility” page (Flow List + Service Map) to consume that stream with filtering support. It also updates the Helm chart to configure and support the cross-namespace CA ConfigMap read required for Flow Aggregator server-side TLS verification.

Changes:

  • Backend: add Flow Aggregator gRPC client + SSE endpoint /api/v1/flows/stream, plus config plumbing and frontend feature flag exposure.
  • Frontend: add Flow Visibility routes/pages, flow store + SSE streaming client (fetch-based) with filters, and dev proxy updates.
  • Deployment: extend Helm chart values/templates for Flow Aggregator integration and nginx SSE timeouts; update Go module dependencies to use antrea.io/antrea/v2.

Reviewed changes

Copilot reviewed 37 out of 38 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| pkg/server/server.go | Threads FlowStreamSubscriber dependency into server construction. |
| pkg/server/server_test.go | Updates server constructor usage to include new flow stream dependency. |
| pkg/server/api/server.go | Adds flow stream routes and conditional SSE handler wiring. |
| pkg/server/api/server_test.go | Updates API server constructor usage for new flow stream parameter. |
| pkg/server/api/frontend_settings.go | Exposes Flow Visibility feature flag to frontend settings. |
| pkg/handlers/flowstream/testing/mock_interface.go | Adds generated GoMock for FlowStreamSubscriber. |
| pkg/handlers/flowstream/interface.go | Introduces FlowStreamSubscriber interface and mockgen directive. |
| pkg/handlers/flowstream/handler.go | Implements SSE handler, query parsing, and keepalive behavior. |
| pkg/handlers/flowstream/handler_test.go | Adds tests for filter parsing and SSE happy/error paths. |
| pkg/handlers/flowstream/grpc.go | Implements gRPC subscriber, TLS config, and proto→API conversion helpers. |
| pkg/handlers/flowstream/grpc_test.go | Adds unit tests for filter mapping, IP conversion, and proto conversion. |
| pkg/config/server/config.go | Adds Flow Aggregator config fields and defaults. |
| cmd/server/main.go | Creates Flow Aggregator subscriber, fetches CA from ConfigMap, injects into server. |
| apis/v1/frontend_settings.go | Adds features.flowVisibilityEnabled to settings API schema. |
| apis/v1/flow.go | Adds JSON API types for flows, filters, and SSE event payloads. |
| go.mod | Adds antrea.io/antrea/v2 + gRPC/protobuf requirements; bumps k8s deps. |
| go.sum | Updates module checksums for new / bumped dependencies. |
| client/web/antrea-ui/vite.config.ts | Adds dev proxy for /api and /auth. |
| client/web/antrea-ui/.env.development | Switches dev API base to rely on Vite proxy (empty VITE_API_SERVER). |
| client/web/antrea-ui/src/index.tsx | Adds router entry for new /flows page. |
| client/web/antrea-ui/src/components/nav.tsx | Adds nav entry for “Flow Visibility”. |
| client/web/antrea-ui/src/routes/flowvisibility.tsx | Adds Flow Visibility page wrapper (list/map, pause/clear, errors). |
| client/web/antrea-ui/src/routes/flowlist.tsx | Adds live flow list table with sorting + text filtering. |
| client/web/antrea-ui/src/routes/servicemap.tsx | Adds D3-based Service Map visualization for live flows. |
| client/web/antrea-ui/src/components/flow-filters.tsx | Adds filter UI (namespace/pod/service/IP/type/direction/label selector) + controls. |
| client/web/antrea-ui/src/api/settings.tsx | Extends settings typing to include feature flags. |
| client/web/antrea-ui/src/api/use-flow-stream.ts | Hook to manage FlowStreamClient lifecycle + local flow store state. |
| client/web/antrea-ui/src/api/flow-types.ts | Defines frontend flow types/helpers and connection dedup key. |
| client/web/antrea-ui/src/api/flow-types.test.ts | Adds unit tests for frontend flow type helpers. |
| client/web/antrea-ui/src/api/flow-store.ts | Adds bounded in-memory store with LRU eviction and dedup. |
| client/web/antrea-ui/src/api/flow-store.test.ts | Adds unit tests for FlowStore behavior. |
| client/web/antrea-ui/src/api/flow-stream.ts | Adds fetch-based SSE client, filter serialization, and reconnect logic. |
| client/web/antrea-ui/src/api/flow-stream.test.ts | Adds unit tests for filter key stability. |
| build/charts/antrea-ui/values.yaml | Adds chart values for antreaNamespace and Flow Aggregator integration. |
| build/charts/antrea-ui/templates/_backend_conf.tpl | Emits backend config for Flow Aggregator settings + quotes URL. |
| build/charts/antrea-ui/templates/_nginx_conf.tpl | Adds nginx location tuning for long-lived flow SSE stream. |
| build/charts/antrea-ui/templates/flow-aggregator-rolebinding.yaml | Adds cross-namespace RoleBinding to read Flow Aggregator CA ConfigMap. |
| build/charts/antrea-ui/README.md | Documents new Helm values for Flow Aggregator integration. |


Comment thread cmd/server/main.go Outdated
Comment thread client/web/antrea-ui/src/api/flow-stream.ts Outdated
Comment thread pkg/handlers/flowstream/grpc.go Outdated
heanlan force-pushed the feature/flow-visibility branch 2 times, most recently from 95d2a2c to 766c40a on June 3, 2026 20:28
Comment thread build/charts/antrea-ui/templates/flow-aggregator-rolebinding.yaml Outdated
Comment thread apis/v1/flow.go Outdated
Comment thread pkg/handlers/flowstream/grpc.go Outdated
Comment thread pkg/handlers/flowstream/grpc.go Outdated
Comment thread go.mod Outdated
Comment thread apis/v1/flow.go Outdated
Comment thread client/web/antrea-ui/src/hooks/use-flow-stream.ts
Comment thread client/web/antrea-ui/.env.development

import { Flow, connectionKey } from './flow-types';

const DEFAULT_MAX_ENTRIES = 10_000;

Contributor

we will need a way to test memory usage (in the browser) and responsiveness when the flow store is full

this may be possible by feeding synthetic data from the backend?

Contributor Author

We can track this in a follow-up issue?

Contributor

Yes, this is perfectly fine to do as a follow-up.
I didn't see anything concerning in my testing. I think Chrome has tools to profile memory, and maybe it can even be done without an actual browser?

heanlan force-pushed the feature/flow-visibility branch 2 times, most recently from 3091972 to 8d1addc on June 9, 2026 20:05
heanlan marked this pull request as draft on June 9, 2026 22:40
heanlan force-pushed the feature/flow-visibility branch 2 times, most recently from 674c86f to 2d3d5ac on June 11, 2026 21:22

Copilot AI left a comment


Pull request overview

Copilot reviewed 38 out of 42 changed files in this pull request and generated 4 comments.

Files not reviewed (2)
  • pkg/flowpb/flow.pb.go: Language not supported
  • pkg/flowpb/service.pb.go: Language not supported

Comment thread client/web/antrea-ui/.yarnrc.yml Outdated
Comment thread client/web/antrea-ui/src/routes/servicemap.tsx
Comment thread client/web/antrea-ui/src/routes/servicemap.tsx
Comment thread client/web/antrea-ui/src/store/flow-store.ts
heanlan force-pushed the feature/flow-visibility branch 3 times, most recently from abc4d58 to 3217f76 on June 11, 2026 23:31
heanlan marked this pull request as ready for review on June 11, 2026 23:34
heanlan requested a review from Copilot on June 12, 2026 05:34

Copilot AI left a comment


Pull request overview

Copilot reviewed 38 out of 42 changed files in this pull request and generated 4 comments.

Files not reviewed (2)
  • pkg/flowpb/flow.pb.go: Generated file
  • pkg/flowpb/service.pb.go: Generated file

Comment thread pkg/handlers/flowstream/handler.go
Comment thread pkg/handlers/flowstream/grpc.go
Comment thread pkg/handlers/flowstream/grpc.go Outdated
Comment thread pkg/handlers/flowstream/handler.go
heanlan and others added 5 commits June 12, 2026 16:23
- Add a gRPC client (GRPCFlowStreamSubscriber) connecting to the
  FlowAggregator FlowStreamService on port 14740 over server-side TLS.
  The FlowAggregator CA cert is fetched from the flow-aggregator-ca
  ConfigMap at startup and used to verify the server certificate.
- Add a cross-namespace RoleBinding so the antrea-ui ServiceAccount can
  read the flow-aggregator-ca ConfigMap from the flow-aggregator namespace.
- Expose a Server-Sent Events (SSE) endpoint at /api/v1/flows/stream
  that streams live flow records from the FlowAggregator to the browser.
- Add frontend Flow Visibility page (Service Map and Flow List views)
  with functional filtering by namespace, pod, service, IP, flow type,
  and direction.
- Wire the frontend to consume the SSE stream using a FlowStreamClient
  backed by fetch() to support Authorization headers (EventSource does
  not support custom headers).
- Add flowVisibilityEnabled feature flag to the frontend settings API
  so the UI can conditionally show the Flow Visibility nav item.
- Add flowAggregator.enabled / address / caConfigMap / namespace to
  the antrea-ui Helm chart values; add SSE-specific nginx location block
  with extended proxy_read_timeout to avoid 504 timeouts on idle streams.

Signed-off-by: Anlan He <anlan.he@broadcom.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
- Remove Follow field from FlowStreamFilter: the SSE endpoint always
  streams in follow mode; exposing a non-functional field on the API
  was misleading.

- Move FlowStreamFilter and FlowFilterDirection out of apis/v1 and into
  the flowstream handler package, as the filter is parsed from URL query
  parameters (not a JSON body) and does not belong in the public JSON
  API types.

- Copy FlowStreamService proto stubs locally under pkg/flowpb instead
  of depending on antrea.io/antrea/v2, which pulls in the entire Antrea
  module and its transitive dependencies.

- Merge DroppedCount and Flows into a single FlowStreamEvent per gRPC
  response, matching how the gRPC layer itself works.

- Expose flowAggregator.serverName as a Helm chart value and config
  field instead of hardcoding the TLS ServerName override.

- Use {{ .Release.Name }} in the flow-aggregator RoleBinding name to
  avoid collisions when multiple antrea-ui releases share the same
  flow-aggregator namespace.

- Move flow-store and use-flow-stream out of src/api/ into src/store/
  and src/hooks/ respectively, as they are not network client modules.

- Restore .env.development to VITE_API_SERVER=http://localhost:8080 and
  remove the Vite dev proxy; the backend already configures CORS for
  localhost:3000 in dev mode so the proxy is unnecessary.

Signed-off-by: Anlan He <anlan.he@broadcom.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
- Replace the broken cumulative-bytes-over-UI-window bit rate with a
  proper 120s sliding window computed from each flow's own endTs
  timestamps, which is immune to gRPC backfill distortion. Track
  per-connection (endTs, cumulativeBytes) samples in the flow store and
  derive the rate from the oldest/newest samples in the window.
- Drop the misleading connection rate (connections/s). It was computed
  as count / browser-watch-time, which produces nonsense values due to
  gRPC backfill collapsing the observation window to near-zero. Keep
  only the connection count, which is accurate.
- Fix arrow endpoints in the service map not reaching node borders by
  computing the true rectangle/diamond boundary intersection instead of
  treating every node as a circle.
- Fix the flow table overflowing its container at narrow window widths
  by adding min-width:0 to the flex content column in App.tsx.
- Add a 5s periodic refresh tick in useFlowStream so derived stats
  (bit rate, byte totals) stay current between flow record arrivals
  from the Flow Aggregator (~60s active timeout).
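
The sliding-window bit rate described above can be sketched as follows. This is an illustration under assumptions, not the code in flow-store: the `Sample` shape, the `pruneSamples` and `bitRate` names, millisecond timestamps, and samples sorted by ascending endTs are all hypothetical. Only the 120s window and the (endTs, cumulativeBytes) samples come from the commit message.

```typescript
// Hypothetical sample shape: one entry per flow record for a connection.
interface Sample {
  endTsMs: number;         // the flow's own end timestamp, in milliseconds
  cumulativeBytes: number; // total bytes reported for the connection so far
}

const WINDOW_MS = 120_000; // 120s window, per the commit message

// Drop samples older than the window, measured from the newest sample's
// timestamp rather than the browser clock, so backfilled records do not
// distort the rate. Assumes samples are sorted by ascending endTsMs.
function pruneSamples(samples: Sample[]): Sample[] {
  if (samples.length === 0) return samples;
  const newest = samples[samples.length - 1].endTsMs;
  return samples.filter((s) => newest - s.endTsMs <= WINDOW_MS);
}

// Bits per second derived from the oldest and newest samples in the window.
// Returns 0 when fewer than two samples remain or no time has elapsed.
function bitRate(samples: Sample[]): number {
  const inWindow = pruneSamples(samples);
  if (inWindow.length < 2) return 0;
  const oldest = inWindow[0];
  const newest = inWindow[inWindow.length - 1];
  const seconds = (newest.endTsMs - oldest.endTsMs) / 1000;
  if (seconds <= 0) return 0;
  const bytes = Math.max(0, newest.cumulativeBytes - oldest.cumulativeBytes);
  return (bytes * 8) / seconds;
}

const samples: Sample[] = [
  { endTsMs: 0, cumulativeBytes: 1_000 }, // older than 120s, pruned
  { endTsMs: 180_000, cumulativeBytes: 10_000 },
  { endTsMs: 240_000, cumulativeBytes: 70_000 },
];
console.log(bitRate(samples)); // 8000 bits/s: 60,000 bytes over 60s
```

Anchoring the window to the records' own timestamps means the result does not depend on how long the browser has been watching the stream.
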

Signed-off-by: Anlan He <anlan.he@broadcom.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
- Remove internal Broadcom npm registry from .yarnrc.yml; only the
  nodeLinker setting belongs in the repository.
- Fix policy allow/drop detection in the service map tooltip: replace
  fragile string matching on the formatted policy label with a direct
  check of the NetworkPolicyRuleAction enum values stored on the edge.
- Replace O(n) graph.edges.find() calls in D3 event handlers with O(1)
  Map lookups using the edgeMap now returned by buildGraph.
- Simplify the Date.parse validity check in flow-store nextSamples from
  !Number.isFinite() to the more idiomatic isNaN().
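
The switch from array scans to Map lookups can be pictured with the sketch below. The `Edge` shape, the `edgeKey` format, and this simplified `buildGraph` are hypothetical stand-ins; the commit message only states that buildGraph now returns an edgeMap that the D3 event handlers use.

```typescript
// Hypothetical edge shape and key format, for illustration only.
interface Edge {
  source: string;
  target: string;
  bytes: number;
}

const edgeKey = (source: string, target: string): string => `${source}->${target}`;

// Simplified stand-in for buildGraph: merges duplicate edges and also
// returns a Map keyed by edgeKey so handlers can find an edge in O(1).
function buildGraph(edges: Edge[]): { edges: Edge[]; edgeMap: Map<string, Edge> } {
  const edgeMap = new Map<string, Edge>();
  for (const e of edges) {
    const key = edgeKey(e.source, e.target);
    const existing = edgeMap.get(key);
    if (existing) {
      existing.bytes += e.bytes; // merge into the edge already recorded
    } else {
      edgeMap.set(key, { ...e }); // copy so the input is not mutated
    }
  }
  return { edges: Array.from(edgeMap.values()), edgeMap };
}

const graph = buildGraph([
  { source: 'ns1/a', target: 'ns2/b', bytes: 100 },
  { source: 'ns1/a', target: 'ns2/b', bytes: 50 },
  { source: 'ns2/b', target: 'ns3/c', bytes: 7 },
]);

// Before: graph.edges.find((e) => e.source === s && e.target === t) scans the array.
// After: a single Map lookup by key.
const hovered = graph.edgeMap.get(edgeKey('ns1/a', 'ns2/b'));
console.log(hovered?.bytes); // 150
```
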

Signed-off-by: Anlan He <anlan.he@broadcom.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
- Dismiss the hover tooltip and the click-selected edge stats panel
  when filters are applied, reset, or cleared so stale data from the
  previous filter is not shown alongside new results.
- Move the close (✕) button out of the title row into an
  absolute-positioned icon in the top-right corner of the panel so
  the title row only contains "Connection Stats".
- Keep the panel stats live after click by storing only the selected
  edge key in state and re-deriving EdgeDetails from the live
  graph.edgeMap on every render, so bytes, bit rate, and all other
  fields update continuously alongside incoming flow records.
- Add namespace clustering force so same-namespace nodes stay grouped
  instead of drifting far apart; tune link distance and repulsion to
  produce a more compact layout.
- Fix service map SVG to fill the full container width using a
  ResizeObserver on the container element; include svgWidth in the
  topology key so the simulation recenters on resize.
- Pin nodes after drag so user-placed positions are preserved;
  double-click a node to release the pin.
- Fix stale-closure bug where edge hover tooltip and click panel read
  byte/connection stats from the graph captured at topology-build time
  rather than the latest graph; introduce graphRef to always reflect
  the most recent flow data.
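
The stale-closure problem and the ref-based fix can be shown without React. In this sketch, `Graph`, `Ref`, and `makeHandlers` are hypothetical; `Ref` only mimics the `{ current }` shape of a React ref, which is assumed to be what graphRef is in the PR.

```typescript
// Hypothetical minimal graph: edge key -> cumulative bytes.
type Graph = Map<string, number>;

// A mutable holder with the same shape as a React ref ({ current }).
interface Ref<T> {
  current: T;
}

// Handlers are created once, when the topology is built.
function makeHandlers(graphAtBuild: Graph, graphRef: Ref<Graph>) {
  return {
    // Stale: closes over the graph object that existed at build time.
    staleBytes: (key: string) => graphAtBuild.get(key),
    // Live: dereferences the ref on every call, so it sees the latest graph.
    liveBytes: (key: string) => graphRef.current.get(key),
  };
}

const initial: Graph = new Map([['a->b', 100]]);
const graphRef: Ref<Graph> = { current: initial };
const handlers = makeHandlers(initial, graphRef);

// New flow records arrive: a fresh graph is derived and the ref is updated,
// but the handlers are not rebuilt because the topology did not change.
graphRef.current = new Map([['a->b', 250]]);

console.log(handlers.staleBytes('a->b')); // 100 (stale)
console.log(handlers.liveBytes('a->b'));  // 250 (current)
```
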

Signed-off-by: Anlan He <anlan.he@broadcom.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
heanlan force-pushed the feature/flow-visibility branch from 3217f76 to d1dd51d on June 12, 2026 23:40
- parseFlowStreamFilter: trim whitespace from each comma-separated
  value and skip empty elements to avoid passing blank strings to the
  gRPC filter; extract a splitTrimmed helper.
- protoFlowToAPI: guard GetStartTs/GetEndTs for nil before calling
  AsTime() to avoid a nil-pointer dereference on malformed records.
- grpc.go: fix defer ordering so flowsCh is closed before errCh (swap
  defers; LIFO means the last-registered runs first).
- handler.go: when errCh closes (!ok), set errCh = nil and return true
  instead of returning false immediately. A nil channel in a select is
  never ready, so the handler keeps streaming from flowsCh and respects
  ctx cancellation until flowsCh closes, preventing buffered flow events
  from being silently dropped on clean shutdown.
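
The trimming behavior in the first bullet is simple enough to sketch. The PR's splitTrimmed helper is Go code in the flowstream handler package; the version below is only a TypeScript rendering of the described logic (split on commas, trim each value, skip empty elements), and its signature is an assumption.

```typescript
// TypeScript rendering of the described behavior; the PR's helper is Go.
function splitTrimmed(value: string): string[] {
  return value
    .split(',')
    .map((part) => part.trim())          // drop surrounding whitespace
    .filter((part) => part.length > 0);  // skip blank elements
}

console.log(splitTrimmed(' default, kube-system ,,')); // ['default', 'kube-system']
```
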

Signed-off-by: Anlan He <anlan.he@broadcom.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
heanlan force-pushed the feature/flow-visibility branch from d1dd51d to a9801d2 on June 12, 2026 23:59

3 participants