SW-1246: cursor pagination on v2 data endpoint #690
Merged
Pull request overview
Adds keyset (cursor) pagination support to the v2 consumer data endpoint (GET /v2/:dataset_id/data, including filtered variant) to avoid deep LIMIT/OFFSET reads and enable forward/backward navigation via opaque cursors.
Changes:
- Introduces `cursor` query param parsing/validation and extends `PageOptions` to carry it.
- Implements cursor encoding/decoding, deterministic default sort resolution, and keyset WHERE-clause construction; wires these into `consumer-view-v2` query building and frontend response envelopes (`next_cursor`/`prev_cursor`).
- Regenerates/updates Swagger/OpenAPI docs and adds unit/integration test coverage for cursor mode.
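As a rough illustration of the parsing/validation step, here is a minimal sketch assuming the mutual exclusivity with `page_number > 1` that this PR describes. The `PageOptions` shape, `MAX_CURSOR_LENGTH`, `BadRequestError`, and the `errors.page_number` key are assumptions for illustration, not the PR's actual code; only `cursor`, `page_number`, and `errors.invalid_cursor` appear in the PR text.

```typescript
// Hypothetical sketch: names follow the PR text, but the real signatures may differ.
interface PageOptions {
  pageNumber: number;
  cursor?: string;
}

// Assumed value: the PR enforces a max length but does not state it.
const MAX_CURSOR_LENGTH = 512;

class BadRequestError extends Error {
  readonly status = 400;
}

function parsePageOptions(query: Record<string, string | undefined>): PageOptions {
  const rawPage = query.page_number;
  const pageNumber = rawPage === undefined ? 1 : Number(rawPage);
  if (!Number.isInteger(pageNumber) || pageNumber < 1) {
    throw new BadRequestError("errors.page_number"); // hypothetical error key
  }

  const cursor = query.cursor;
  if (cursor === undefined) return { pageNumber }; // plain offset mode

  // Accept only the base64url alphabet, within a bounded length.
  if (cursor.length > MAX_CURSOR_LENGTH || !/^[A-Za-z0-9_-]+$/.test(cursor)) {
    throw new BadRequestError("errors.invalid_cursor");
  }
  // A cursor and an explicit page beyond 1 are mutually exclusive.
  // The error key used for this case is an assumption.
  if (pageNumber > 1) throw new BadRequestError("errors.invalid_cursor");

  return { pageNumber, cursor };
}
```

Rejecting the combination up front keeps the query builder from having to decide which pagination mode wins.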
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| test/unit/utils/parse-page-options.test.ts | Adds unit tests for cursor parsing + mutual exclusivity with page_number > 1. |
| test/unit/utils/cursor-codec.test.ts | Adds unit tests for cursor codec round-trips and invalid cursor rejection cases. |
| test/unit/services/keyset-where-builder.test.ts | Adds unit tests for keyset predicate generation (multi-column, NULL semantics, escaping). |
| test/unit/services/default-sort-resolver.test.ts | Adds unit tests ensuring default sort selection and PK tie-breakers are deterministic and language-aware. |
| test/unit/services/consumer-view-v2.test.ts | Updates tests for new BuildDataQueryResult shape and adds cursor-mode negative cases. |
| test/integration/routes/consumer-v2/data-and-pivot.test.ts | Adds integration coverage for cursor traversal forward/backward and invalid cursor scenarios. |
| src/validators/index.ts | Adds cursorValidator enforcing type and max length. |
| src/utils/parse-page-options.ts | Plumbs cursor into parsed page options and enforces cursor/page_number exclusivity. |
| src/utils/cursor-codec.ts | New: base64url JSON cursor codec + sort-hash binding and validation. |
| src/services/keyset-where-builder.ts | New: OR-ladder keyset WHERE clause builder with explicit NULL handling. |
| src/services/default-sort-resolver.ts | New: resolves default deterministic sort and PK tie-breakers from dataset fact table + column mapping. |
| src/services/consumer-view-v2.ts | Core implementation: cursor/offset query building, _sort projection injection, and response page_info cursor fields. |
| src/routes/consumer/v2/schema.ts | Adds cursor query parameter to Swagger schema definitions. |
| src/routes/consumer/v2/api.ts | Wires cursor param into swagger refs and expands endpoint descriptions for pagination modes. |
| src/routes/consumer/v2/openapi-en.json | Regenerated OpenAPI (EN) including cursor param and updated descriptions. |
| src/routes/consumer/v2/openapi-cy.json | Regenerated OpenAPI (CY) including cursor param. |
| src/interfaces/page-options.ts | Adds optional cursor to PageOptions. |
| src/controllers/dataset.ts | Updates preview controller to pass dataset (with factTable) into buildDataQuery and forward BuildDataQueryResult. |
| src/controllers/consumer-v2.ts | Updates consumer controller to ensure dataset is loaded with factTable for deterministic sorting and forwards BuildDataQueryResult. |
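To make the OR-ladder in `keyset-where-builder.ts` more concrete, the following is a minimal sketch of a forward-direction keyset predicate with mixed ASC/DESC and explicit NULL handling. It assumes Postgres default NULL ordering (NULLS LAST on ASC, NULLS FIRST on DESC) and uses `$n` bind parameters rather than value escaping. The function name, types, and parameter style are assumptions for illustration and may not match the PR's implementation.

```typescript
// Hypothetical sketch of a forward ("next page") keyset predicate.
type Direction = "ASC" | "DESC";
type KeyValue = string | number | null;

interface SortColumn {
  column: string;
  direction: Direction;
}

interface KeysetPredicate {
  sql: string;
  params: KeyValue[];
}

// Quote an identifier, doubling any embedded double quotes.
const quoteIdent = (name: string): string => `"${name.replace(/"/g, '""')}"`;

function buildKeysetWhere(sort: SortColumn[], key: KeyValue[]): KeysetPredicate {
  if (sort.length === 0 || sort.length !== key.length) {
    throw new Error("cursor key tuple does not match sort columns");
  }
  const params: KeyValue[] = [];
  const slots: (string | undefined)[] = [];
  // Allocate a bind parameter lazily so every parameter sent is referenced.
  const placeholder = (i: number): string => {
    let slot = slots[i];
    if (slot === undefined) {
      params.push(key[i]);
      slot = `$${params.length}`;
      slots[i] = slot;
    }
    return slot;
  };

  // "Sorts strictly after the cursor value" for column i.
  const after = (i: number): string => {
    const col = quoteIdent(sort[i].column);
    if (sort[i].direction === "ASC") {
      // NULLS LAST: NULL rows follow every value; nothing follows a NULL.
      return key[i] === null ? "FALSE" : `(${col} > ${placeholder(i)} OR ${col} IS NULL)`;
    }
    // NULLS FIRST: every non-NULL row follows a NULL; NULL rows never follow a value.
    return key[i] === null ? `${col} IS NOT NULL` : `${col} < ${placeholder(i)}`;
  };

  // Rung i: columns 0..i-1 equal the cursor, column i sorts after it.
  const rungs = sort.map((_, i) => {
    const equalities = sort
      .slice(0, i)
      .map((s, j) => `${quoteIdent(s.column)} IS NOT DISTINCT FROM ${placeholder(j)}`);
    return `(${equalities.concat(after(i)).join(" AND ")})`;
  });
  return { sql: rungs.join(" OR "), params };
}
```

For a sort of `year DESC, id ASC` and a cursor key of `[2020, 7]`, this yields `("year" < $1) OR ("year" IS NOT DISTINCT FROM $1 AND ("id" > $2 OR "id" IS NULL))` with params `[2020, 7]`. A backward traversal would flip each comparison and reverse the result set; that is omitted here.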
Summary
Adds keyset (cursor) pagination to the v2 `GET /:dataset_id/data` endpoint as a faster, more attack-resistant alternative to deep `LIMIT/OFFSET` reads. v1 is out of scope (legacy, to be deprecated). Pivot endpoint is out of scope (separate ticket).
SW-1246: follow-up to the 2026-04-23 incident, where deep-paginated requests with rotating `page_number` bypassed Front Door cache and saturated the application DB pool.
What changed
- `cursor` query param on `/v2/:dataset_id/data` (and the filtered variant). Opaque base64url token, serialised as a compact positional array `[version, contextHash, direction, keyTuple]`. Bound to `queryStoreId` + `revisionId` + `language` + `sortHash` via a single context hash that is recomputed and revalidated on every decode. Any mismatch → `400 errors.invalid_cursor`.
- […] `sort_by`: first `Time` column on the fact table, falling back to first `Dimension`, with the composite PK appended as tie-breakers. Applied to both cursor and offset paths so the offset→cursor handover at the page-100 boundary is seamless.
- `sort_by` supported on the cursor path. OR-ladder WHERE construction handles mixed ASC/DESC and NULLs explicitly (`IS NOT DISTINCT FROM` for equality rungs; explicit NULL placement rungs for inequality; Postgres NULLS LAST on ASC / NULLS FIRST on DESC).
- `next_cursor`/`prev_cursor` (nullable). In cursor mode `current_page`/`start_record`/`end_record` are `null`, since they are not meaningful when paginating by keyset. `total_records`/`total_pages` stay (cached on `QueryStore`).
- `cursor` parameter and updated `PageInfoV2`.
What did not change
- `page_number` is still accepted as before (no cap enforced server-side in this PR; the frontend caps at 100 today, and the backend cap will land separately).
- `sort_by` is unchanged.
Notable design choices
- […] (`queryStoreId`, `revisionId`, `lang`, `sortHash`) and revalidating on every decode. Tampering produces a 400, not silent corruption.
- `queryStoreId`/`revisionId` are folded into the hash rather than transmitted, so a typical cursor is ~70 chars instead of ~220. The binding is unchanged in effect: a cursor replayed against different filters / revision / language / sort recomputes a different hash and 400s rather than silently mis-paginating.
- `sort_by` change, language switch, or revision change all invalidate the cursor. Frontend strips `cursor` on language/sort changes (see paired frontend PR) so users land on page 1 of the new traversal rather than hitting a 400.
- […] `sort_by` may not be. Treated as aspirational; per-cube sort indexes are a follow-up.
Test plan
- `npm run check` is green
- `next_cursor` is `null` on the last page
- `cursor` + `page_number > 1` → 400 (mutually exclusive)
- `cursor` parameter on both v2 data routes
- `page_number=1` vs `page_number=100` (offset) vs `cursor=<deep>` (cursor); record in PR before merge
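The cursor token described under "What changed" can be sketched roughly as follows: a base64url-encoded JSON positional array `[version, contextHash, direction, keyTuple]`, where the context hash is recomputed from the request on every decode. This is a sketch under stated assumptions rather than the PR's codec: the SHA-256 hash truncated to 16 hex chars, the `"next"`/`"prev"` direction values, and all function and type names are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch of the cursor codec; the hash choice and names are assumptions.
type Direction = "next" | "prev";
type KeyValue = string | number | null;

interface CursorContext {
  queryStoreId: string;
  revisionId: string;
  language: string;
  sortHash: string;
}

interface DecodedCursor {
  direction: Direction;
  key: KeyValue[];
}

const CURSOR_VERSION = 1;

class InvalidCursorError extends Error {
  readonly status = 400;
  constructor() {
    super("errors.invalid_cursor");
  }
}

// Fold the whole request context into one short hash; none of it is transmitted.
function contextHash(ctx: CursorContext): string {
  const material = JSON.stringify([ctx.queryStoreId, ctx.revisionId, ctx.language, ctx.sortHash]);
  return createHash("sha256").update(material).digest("hex").slice(0, 16);
}

function encodeCursor(ctx: CursorContext, direction: Direction, key: KeyValue[]): string {
  const payload = [CURSOR_VERSION, contextHash(ctx), direction, key];
  return Buffer.from(JSON.stringify(payload), "utf8").toString("base64url");
}

function decodeCursor(token: string, ctx: CursorContext): DecodedCursor {
  let payload: unknown;
  try {
    payload = JSON.parse(Buffer.from(token, "base64url").toString("utf8"));
  } catch {
    throw new InvalidCursorError();
  }
  if (!Array.isArray(payload) || payload.length !== 4) throw new InvalidCursorError();

  const [version, hash, direction, key] = payload as unknown[];
  const directionOk = direction === "next" || direction === "prev";
  const keyOk =
    Array.isArray(key) &&
    key.every((v) => v === null || typeof v === "string" || typeof v === "number");
  // Recompute the hash from the current request; any mismatch rejects the cursor.
  if (version !== CURSOR_VERSION || hash !== contextHash(ctx) || !directionOk || !keyOk) {
    throw new InvalidCursorError();
  }
  return { direction: direction as Direction, key: key as KeyValue[] };
}
```

Because the context is hashed rather than embedded, a cursor replayed with a different language, revision, filter set, or sort fails the hash comparison and surfaces as a 400 instead of paginating against the wrong traversal.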