feat(starrocks): translate FE plan fragments over BRPC PInternalService #941

Merged
mbrobbel merged 16 commits into sirius-db:dev from mbrobbel:starrocks-translate on Jun 16, 2026

Conversation

@mbrobbel

Member

Summary

Adds the StarRocks compute-node BRPC PInternalService path so a Sirius CN can receive plan fragments dispatched by a StarRocks FE and translate them. Part of #826.

  • Implement exec_plan_fragment and exec_batch_plan_fragments: deserialize the binary-thrift TExecPlanFragmentParams attachment and run it through a thrift→Substrait PlanTranslator.
  • Add a Baidu PRPC frame transport (a raw-TCP PRPC envelope, since FE→backend RPC is not gRPC/HTTP2) and a Tower-based BRPC server with tonic-like graceful shutdown.
  • Generate BRPC and StarRocks protobuf bindings in build.rs; vendor apache/brpc as a submodule for the upstream proto definitions.
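
For readers unfamiliar with the PRPC envelope mentioned above, here is a minimal sketch of parsing its fixed 12-byte header as described in the public brpc `baidu_std` protocol notes: the magic `PRPC`, then a big-endian `u32` body size (meta, payload, and attachment together), then a big-endian `u32` meta size. The names `PrpcHeader`, `HeaderError`, `parse_header`, and the `max_body` limit are hypothetical and are not taken from this PR's code.

```rust
/// Length of the fixed PRPC header: 4-byte magic + two big-endian u32 fields.
pub const HEADER_LEN: usize = 12;
/// Magic bytes that open every PRPC frame.
pub const MAGIC: &[u8; 4] = b"PRPC";

/// Decoded header fields (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PrpcHeader {
    /// Bytes that follow the header: meta + payload + attachment.
    pub body_size: u32,
    /// Bytes of serialized RPC meta at the start of the body.
    pub meta_size: u32,
}

/// Reasons a header is rejected (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum HeaderError {
    BadMagic,
    MetaLargerThanBody,
    BodyTooLarge,
}

/// Parse and sanity-check a PRPC header. `max_body` is an assumed
/// server-side limit on the declared body size.
pub fn parse_header(buf: &[u8; HEADER_LEN], max_body: u32) -> Result<PrpcHeader, HeaderError> {
    if !buf.starts_with(MAGIC) {
        return Err(HeaderError::BadMagic);
    }
    let body_size = u32::from_be_bytes([buf[4], buf[5], buf[6], buf[7]]);
    let meta_size = u32::from_be_bytes([buf[8], buf[9], buf[10], buf[11]]);
    if meta_size > body_size {
        return Err(HeaderError::MetaLargerThanBody);
    }
    if body_size > max_body {
        return Err(HeaderError::BodyTooLarge);
    }
    Ok(PrpcHeader { body_size, meta_size })
}
```

A caller would read exactly `HEADER_LEN` bytes from the socket, call `parse_header`, and only then read `body_size` further bytes. Rejecting a bad header before reading the body keeps a malformed frame cheap to discard.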

Notes

  • Translation currently validates the fragment and logs the resulting plan; fragment execution is out of scope for this PR.
  • All changes are under experimental/starrocks. CN registration and FE heartbeat were already wired; this adds the fragment-handling RPC surface on top.

mbrobbel added the enhancement and starrocks labels Jun 15, 2026
mbrobbel marked this pull request as draft June 15, 2026 13:14
mbrobbel and others added 6 commits June 15, 2026 13:29

Address review findings on the compute-node BRPC PInternalService path:

- Confine per-connection read/decode/write errors to the connection and
  keep the server running, instead of letting one bad frame propagate out
  of serve_* and crash the whole compute node.
- Spawn a task per accepted connection (tonic-style) so a slow or
  long-lived peer cannot block the accept loop; reap finished tasks in the
  accept loop and drain in-flight tasks on shutdown.
- Generate the BRPC service facade with quote/prettyplease instead of
  string concatenation, emit Send futures behind a Send+Sync service
  trait, and name method constants via heck.
- PRPC transport: return the distinct ENOMETHOD code for unknown methods,
  drop the dead attachment_size assignment, and grow the body buffer in
  bounded chunks rather than pre-allocating the declared size.
- Document that an OK status from exec_plan_fragment means accepted and
  translated, not executed.

Add a regression test that a malformed frame closes only its connection
while the server keeps serving.
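
The "grow the body buffer in bounded chunks" item above can be sketched as follows. This is an illustrative, synchronous version using `std::io::Read`; the PR's transport is async and its actual code is not shown on this page. The function name `read_body` and the `CHUNK` value are assumptions.

```rust
use std::io::{self, Read};

/// Upper bound on how much the buffer grows per read (hypothetical value).
const CHUNK: usize = 64 * 1024;

/// Read `declared` bytes from `reader`, growing the buffer by at most
/// `CHUNK` bytes at a time instead of allocating `declared` bytes up front.
/// A peer that declares a huge size but sends little data therefore costs
/// at most one extra chunk of memory before the read fails.
pub fn read_body<R: Read>(reader: &mut R, declared: usize) -> io::Result<Vec<u8>> {
    let mut body = Vec::with_capacity(declared.min(CHUNK));
    while body.len() < declared {
        let start = body.len();
        let want = (declared - start).min(CHUNK);
        // Extend by one bounded chunk, then fill exactly that chunk.
        body.resize(start + want, 0);
        reader.read_exact(&mut body[start..])?;
    }
    Ok(body)
}
```

With this shape, a truncated stream surfaces as an `UnexpectedEof` error on that one connection, which fits the goal of confining a bad frame to its own connection.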

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The starrocks CN build script compiles proto definitions from the
apache/brpc submodule, so CI must check it out alongside the starrocks
submodule or fmt/clippy/test fail before the build script runs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- Replace the hand-rolled startup registration retry loop with backon's
  exponential backoff (min 1s, capped at 30s, bounded by the configured
  attempt count).
- Type registration_max_attempts as NonZeroU32 so clap rejects 0 and the
  runtime guard is no longer needed.
- Instrument the registration, reporting, BRPC connection, and plan
  fragment paths with tracing::instrument and emit span close events
  (FmtSpan::CLOSE) so per-operation timings are visible.
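
To make the retry parameters above concrete, here is a hand-rolled sketch of the delay schedule they imply. It is not the backon API, which the commit uses instead. It assumes a doubling factor with no jitter and treats the configured count as total attempts; both are assumptions, and `backoff_delays` is a hypothetical helper.

```rust
use std::num::NonZeroU32;
use std::time::Duration;

/// Waits between registration attempts: start at 1s, double each time
/// (assumed factor), and cap at 30s. N attempts have N - 1 waits.
pub fn backoff_delays(max_attempts: NonZeroU32) -> Vec<Duration> {
    let min = Duration::from_secs(1);
    let max = Duration::from_secs(30);
    let mut delay = min;
    let mut delays = Vec::new();
    for _ in 1..max_attempts.get() {
        delays.push(delay);
        // Double the wait, but never exceed the cap.
        delay = (delay * 2).min(max);
    }
    delays
}

/// Parsing into `NonZeroU32` rejects "0" at the type level, which is why
/// no separate runtime guard is needed once the option has this type.
pub fn parse_attempts(raw: &str) -> Option<NonZeroU32> {
    raw.parse::<NonZeroU32>().ok()
}
```

Under these assumptions, seven attempts wait 1, 2, 4, 8, 16, and 30 seconds between tries, and `parse_attempts("0")` yields `None`.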

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- Drive the heartbeat/backend/BRPC server joins through a JoinSet so
  wait_until_shutdown observes the first exit, stops everything once, and
  drains the rest in one loop, removing the per-branch repetition.
- Add Frame::into_request to move the request body/attachment into the
  service request instead of cloning them, plus a correlation_id getter
  and a correlation-id-based response_frame builder.
- Derive Copy on FrameSizes.
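
The move-instead-of-clone change described above might look roughly like this. The field layout of `Frame` and the `Request` type are hypothetical; only the method names `into_request` and `correlation_id` come from the commit message.

```rust
/// A decoded frame (hypothetical layout).
pub struct Frame {
    correlation_id: i64,
    body: Vec<u8>,
    attachment: Vec<u8>,
}

/// What the service handler receives (hypothetical type).
pub struct Request {
    pub body: Vec<u8>,
    pub attachment: Vec<u8>,
}

impl Frame {
    pub fn new(correlation_id: i64, body: Vec<u8>, attachment: Vec<u8>) -> Self {
        Frame { correlation_id, body, attachment }
    }

    /// Cheap getter, so the id can be saved before the frame is consumed.
    pub fn correlation_id(&self) -> i64 {
        self.correlation_id
    }

    /// Consume the frame and hand its buffers to the request. Taking
    /// `self` by value moves the `Vec`s, so no bytes are copied.
    pub fn into_request(self) -> Request {
        Request { body: self.body, attachment: self.attachment }
    }
}
```

The caller reads `correlation_id()` first, then calls `into_request()`. The saved id is all that is needed to build the response frame later, so the frame itself does not have to outlive the request.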

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Remove the serve/serve_with_shutdown/serve_with_listener variants from
both BrpcServer and the inner BrpcServiceServer; only new, bind, and
serve_with_listener_shutdown are used. Also drop a stale comment
reference to PR notes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

mbrobbel marked this pull request as ready for review June 15, 2026 18:40
mbrobbel requested a review from mike-wendt as a code owner June 15, 2026 18:40

mike-wendt left a comment

Collaborator

Approved for changes in Ops CODEOWNERS files

dhruv9vats self-requested a review June 16, 2026 10:06

let shutdown = CancellationToken::new();
let server_shutdown = shutdown.clone();
let join = tokio::task::spawn_blocking(move || {
    let runtime = tokio::runtime::Builder::new_current_thread()

Collaborator

Is this only a single-threaded runtime? Are there plans to switch to something else later?

Member Author

Correct. We should revisit this later.

) -> std::result::Result<(), String> {
    let translated = self
        .translator
        .translate_fragment(params)

Collaborator

translate_fragment is synchronous, right? If so, and it runs on that single BRPC worker, is it expected to stay cheap, or is spawn_blocking / a multi-thread runtime planned once fragment execution lands?

Member Author

Yes - this will be updated once fragment execution is added.

    peer: SocketAddr,
    shutdown: CancellationToken,
) {
    loop {

Collaborator

Is my understanding correct that requests on one connection are handled serially, even though BRPC allows multiplexing concurrent requests? And does the FE actually pipeline fragments over one connection?

Member Author

Yes, correct. We can add per-request concurrency in a follow-up.

9prady9 left a comment

Collaborator

/merge

mbrobbel added this pull request to the merge queue Jun 16, 2026

9prady9 commented Jun 16, 2026

Collaborator

my bad, for a second I thought it was quent 😄 and hence made that merge comment

Merged via the queue into sirius-db:dev with commit f5b0388 Jun 16, 2026
12 checks passed
mbrobbel deleted the starrocks-translate branch June 16, 2026 14:05