Support chunking for fetchBaseJoin in chaining streaming path by yuli-han · Pull Request #1090 · airbnb/chronon

yuli-han · 2026-02-19T06:36:57Z

Summary

Add configurable chunking for fetchBaseJoin calls in JoinSourceRunner.enrichBaseJoin to prevent KV store overload or timeouts when micro-batches are large.
When spark.chronon.stream.chain.fetch_chunk_size is set to a positive integer and a micro-batch contains more requests than that value, the requests are split into chunks and fetched in parallel as separate Futures.
The default is 0 (no chunking), which preserves current behavior.
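
The chunking described above can be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: `Request`, `Response`, `fetchBatch`, and `fetchChunked` are hypothetical stand-ins (with `fetchBatch` playing the role of a batched call such as fetchBaseJoin), and only the guard condition mirrors the diff under review.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ChunkedFetchSketch {
  // Hypothetical stand-ins for the fetcher's request and response types.
  final case class Request(key: String)
  final case class Response(key: String, value: String)

  // Hypothetical stand-in for one batched KV-store call (e.g. fetchBaseJoin).
  def fetchBatch(requests: Seq[Request])(implicit ec: ExecutionContext): Future[Seq[Response]] =
    Future(requests.map(r => Response(r.key, s"value-for-${r.key}")))

  // A chunkSize <= 0, or a batch that already fits in one chunk, takes the
  // original single-call path, so the default of 0 leaves behavior unchanged.
  def fetchChunked(requests: Seq[Request], chunkSize: Int)(
      implicit ec: ExecutionContext): Future[Seq[Response]] =
    if (chunkSize > 0 && requests.length > chunkSize) {
      // Start one Future per chunk so the chunks are fetched concurrently.
      val chunkFutures = requests.grouped(chunkSize).toList.map(chunk => fetchBatch(chunk))
      // Future.sequence keeps chunk order, so flattening restores request order.
      Future.sequence(chunkFutures).map(_.flatten)
    } else {
      fetchBatch(requests)
    }

  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global
    val requests = (1 to 10).map(i => Request(s"k$i"))
    // Ten requests with chunkSize = 4 become three calls of sizes 4, 4, and 2.
    val responses = Await.result(fetchChunked(requests, chunkSize = 4), 5.seconds)
    println(responses.map(_.key).mkString(","))
  }
}
```

Because every chunk is started before any result is awaited, the total wait is bounded by the slowest chunk rather than the sum of all chunks, though the KV store still sees the same total number of keys.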

Why / Goal

Test Plan

Added Unit Tests
Covered by existing CI
Integration tested

Checklist

Documentation update

Reviewers

@hzding621 @pengyu-hou @Shiyinghaha

pengyu-hou · 2026-02-19T07:41:02Z

spark/src/main/scala/ai/chronon/spark/streaming/JoinSourceRunner.scala

          // this might be potentially slower, but spark doesn't work when the internal derivation functionality triggers
          // its own spark session, or when it passes around objects
-          val responses = Await.result(responsesFuture, 5.second)
+          val responses = if (fetchChunkSize > 0 && requests.length > fetchChunkSize) {


Let's try to use the joinFetchParallelChunkSize from https://github.com/airbnb/chronon/blob/main/online/src/main/scala/ai/chronon/online/Fetcher.scala#L101C15-L101C41 so you don't have to implement the chunking again here.

Hi @pengyu-hou, using joinFetchParallelChunkSize would require some changes in Api.buildFetcher, since joinFetchParallelChunkSize is currently not passed to that function. Also, the chaining path uses fetchBaseJoin instead of fetchJoin.
The duplicated chunking is just 3 lines of code, so I think it might be fine to have some duplication here and avoid changing too much code. Let me know what you think.

yuli-han added 2 commits February 18, 2026 22:29

support chunking in chaining

34bca40

remove unused change

c123ea1

yuli-han wants to merge 2 commits into main from ylh--support-chunking-chaining