SchemaView performance improvements by rschili · Pull Request #9431 · iTwin/itwinjs-core

rschili · 2026-06-20T14:52:18Z

What

SchemaView was built and tuned against a very large iModel, balanced for the general mix of use cases.
Presentation has been testing it for adoption and the results are promising in almost every scenario, but
one critical scenario regresses on very large iModels - enough that they cannot adopt SchemaView as-is.

The scenario

Very large iModel (~30 GB, ~100 schemas, hundreds of thousands of properties).
Open the iModel and walk the model tree as fast as possible.

That path only needs BisCore, yet SchemaView paid to hydrate every schema in the file before returning. This PR makes SchemaView pay only for what the caller asks for, and removes a fixed cost that hurt every consumer.

Two changes, independent but shipped together.

Change 1: a cheaper schema-identity token (`PRAGMA schema_token`)

SchemaView uses a token to detect when an iModel's schemas have changed so it can drop its cached view.
It was obtaining that token from PRAGMA checksum(ecdb_schema), a SHA3 hash over the full contents of all schema tables (every class, property, and custom-attribute instance).

On the large iModel above, that checksum alone takes ~1.7 seconds, almost entirely CPU-bound on the hash - while loading BisCore plus its references via the binary blob is only ~30 ms. The token, meant to be a cheap "did anything change?" check, was dominating the whole operation, and every SchemaView consumer paid it regardless of how much schema they needed.

This PR adds PRAGMA schema_token, which hashes schema identity only - the name and version of each schema (one row per schema in ec_Schema, ordered by name). It is essentially free. The same cheap hash now also backs the schemaToken column returned by PRAGMA schema_view and the new PRAGMA schema_view_fragment, so a cached view and any later token check derive from the same value.

Measured on the ~30 GB / ~100-schema file: the token computation dropped from ~1744 ms to 1 ms, and the schema fragment pragma (see Change 2 below) from ~1759 ms to ~14 ms (cold ≈ warm, confirming the cost was CPU, not disk).

Known limitation (accepted): because the token hashes name + version only, it does not detect a schema whose contents change without a version bump. ECDb only allows in-place re-import for dynamic schemas, so that is the only case affected. We accept this for now and can strengthen the hash later (for example with a cached per-schema content checksum) without changing the pragma's contract.
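To make the identity-only hashing concrete, here is a minimal sketch. It is not the native implementation: the `SchemaIdentity` shape, the `identityToken` helper, the separator characters and the FNV-1a stand-in hash are all assumptions made for illustration. The only things taken from the description above are that the token covers name and version of each schema, ordered by name.

```typescript
// Hypothetical row shape: one entry per schema (name + version only).
interface SchemaIdentity {
  name: string;
  version: string; // illustrative format, e.g. "01.00.17"
}

// FNV-1a (32-bit) used purely as a stand-in; the hash used by the real pragma is not shown here.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Hash name + version of every schema, ordered by name so input order cannot change the token.
function identityToken(schemas: SchemaIdentity[]): string {
  const ordered = [...schemas].sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
  return fnv1a(ordered.map((s) => `${s.name}:${s.version}`).join("|"));
}
```

Because nothing but name and version enters the hash, a version bump changes the token while a content-only change cannot, which is exactly the accepted limitation.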

Frontend IModelConnection.invalidateSchemaViewIfChanged now queries PRAGMA schema_token (reading .token) instead of PRAGMA checksum(ecdb_schema) (.sha3_256).
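The invalidation pattern itself is small. The sketch below is an assumption-laden illustration, not the code in `IModelConnection`: `TokenGuardedCache` and its members are hypothetical names, and the caller is expected to pass in whatever string it read from the `token` column of `PRAGMA schema_token`.

```typescript
// Hypothetical holder for a cached value that is only valid for one schema token.
class TokenGuardedCache<T> {
  private cached?: { token: string; value: T };

  public store(token: string, value: T): void {
    this.cached = { token, value };
  }

  // Drops the cached value when the token read from the iModel no longer matches.
  // Returns true if something was invalidated.
  public invalidateIfChanged(currentToken: string): boolean {
    if (this.cached !== undefined && this.cached.token !== currentToken) {
      this.cached = undefined;
      return true;
    }
    return false;
  }

  public get value(): T | undefined {
    return this.cached?.value;
  }
}
```

With a token that costs about a millisecond, a check like this can run on every potential schema change without dominating the operation it protects.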

Change 2: `getSchemaView({ schemas })` - load only a subset

getSchemaView() gains an optional argument. It is purely additive:

```typescript
// Unchanged: loads every schema in the iModel using an optimized binary blob, exactly as before.
const full = await iModel.getSchemaView();

// New: ensure only BisCore and its references are loaded.
const view = await iModel.getSchemaView({ schemas: ["BisCore"] });
view.findClass("BisCore.Subject");        // present
view.findClass("Generic.PhysicalObject"); // undefined - not loaded
```

Behavior:

- The subset view is an accumulating instance. A later call with different schemas merges their closure into the same view, so schemas loaded earlier stay available.
- If the requested schemas (or all schemas) are already loaded, the call is a synchronous no-op that returns the existing view.
- Inside SchemaView, a schema that is not loaded looks identical to a schema the iModel does not contain: findClass and friends return undefined. Cross-schema walks (derivedClasses, etc.) are complete only over what is currently loaded.
- Schema names the iModel does not contain are ignored.
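A deliberately tiny model of these rules, for readers who want to see the consequences in code. `SubsetViewSketch` is a hypothetical stand-in that maps schema names to class names; it shares nothing with the real SchemaView beyond the behavior listed above.

```typescript
// Hypothetical, much-simplified stand-in for a subset view: schema name -> class names.
class SubsetViewSketch {
  private readonly loaded = new Map<string, Set<string>>();

  // Merging is additive: schemas loaded earlier stay available and are not replaced.
  public merge(schemaName: string, classNames: string[]): void {
    if (!this.loaded.has(schemaName))
      this.loaded.set(schemaName, new Set(classNames));
  }

  // Returns the full name when the class belongs to a loaded schema, otherwise undefined.
  // "Not loaded" and "not in the iModel" are indistinguishable from the caller's side.
  public findClass(fullName: string): string | undefined {
    const dot = fullName.indexOf(".");
    if (dot < 0)
      return undefined;
    const classes = this.loaded.get(fullName.slice(0, dot));
    return classes !== undefined && classes.has(fullName.slice(dot + 1)) ? fullName : undefined;
  }
}
```

The practical consequence is that a caller who needs a definitive "does not exist" answer has to make sure the relevant schema was requested first.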

How it works

Manifest (cheap reference graph). On first subset request, the backend reads the schema reference graph from ECDbMeta (meta.ECSchemaDef + meta.SchemaHasSchemaReferences) - just names, versions, ids, and reference edges, no schema data. This is exposed as the new SchemaManifest type.
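Resolving a subset request against such a manifest boils down to a transitive walk over reference edges. The sketch below assumes a shape for the manifest (`ManifestEntry`, `ManifestSketch`) and a helper name (`referenceClosure`); the real `SchemaManifest` type may look different.

```typescript
// Hypothetical entry shape; only names, versions and reference edges, no schema data.
interface ManifestEntry {
  name: string;
  version: string;
  references: string[]; // names of directly referenced schemas
}

type ManifestSketch = Map<string, ManifestEntry>;

// Collect the requested schemas plus everything they reference, transitively.
// Names the manifest does not contain are skipped.
function referenceClosure(manifest: ManifestSketch, requested: string[]): Set<string> {
  const closure = new Set<string>();
  const pending = [...requested];
  while (pending.length > 0) {
    const name = pending.pop()!;
    const entry = manifest.get(name);
    if (entry === undefined || closure.has(name))
      continue;
    closure.add(name);
    pending.push(...entry.references);
  }
  return closure;
}
```

The `closure.has(name)` check also keeps the walk finite if the same schema is reachable through several paths.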

All loading goes through one incoming path, getSchemaView:

- If the caller specified no filter and nothing is loaded yet -> load the full blob.
- If a filter is specified OR (some schemas are already loaded and no filter is given) -> calculate which schemas still need to be loaded and load them as a fragment.
- If no additional schemas need to be loaded, just return the object.
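The three cases above can be written as a single pure decision function. This is a sketch under assumptions: `LoadPlan` and `planLoad` are hypothetical names, and `closure` stands for the requested schemas plus their references (or every schema in the iModel when no filter is given).

```typescript
type LoadPlan =
  | { kind: "full" }                          // load the full blob
  | { kind: "fragment"; schemas: string[] }   // load only what is missing
  | { kind: "none" };                         // everything needed is already loaded

function planLoad(hasFilter: boolean, closure: string[], loaded: Set<string>): LoadPlan {
  // No filter and an empty view: the optimized full blob is the cheapest way in.
  if (!hasFilter && loaded.size === 0)
    return { kind: "full" };
  // Otherwise only fetch what the view does not hold yet.
  const missing = closure.filter((name) => !loaded.has(name));
  return missing.length === 0 ? { kind: "none" } : { kind: "fragment", schemas: missing };
}
```

Keeping the decision free of I/O makes it straightforward to unit test independently of the backend.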

Serialized merges. Overlapping concurrent requests are serialized behind one in-flight promise and re-check the loaded-set inside the continuation, so two requests can never double-merge a schema.
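One possible way to get this guarantee is a promise chain with a re-check inside each continuation. The sketch below is illustrative only: `SerializedLoader`, `ensure`, `fetchFragment` and the `fetched` log are hypothetical, and the backend round trip is replaced by a resolved promise.

```typescript
// Hypothetical loader that serializes merges behind a single promise chain.
class SerializedLoader {
  private readonly loaded = new Set<string>();
  private tail: Promise<void> = Promise.resolve();
  public readonly fetched: string[] = []; // records what was actually fetched, for illustration

  public async ensure(schemas: string[]): Promise<void> {
    const run = this.tail.then(async () => {
      // Re-check inside the continuation: an earlier request may have loaded these already.
      const missing = schemas.filter((name) => !this.loaded.has(name));
      if (missing.length === 0)
        return;
      await this.fetchFragment(missing);
      missing.forEach((name) => this.loaded.add(name));
    });
    // Keep the chain usable for later requests even if this one fails.
    this.tail = run.catch(() => {});
    return run;
  }

  // Stand-in for the backend round trip.
  private async fetchFragment(schemas: string[]): Promise<void> {
    await Promise.resolve();
    this.fetched.push(...schemas);
  }
}
```

Two overlapping calls such as `ensure(["BisCore"])` and `ensure(["BisCore", "Generic"])` then fetch each schema once, because the second continuation only runs after the first has updated the loaded set.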

Fixed Primitive Type enum width

The binary blob incorrectly used 8 bits for the primitive type, but we need 16 bits for it. I decided to fix this format in-place. The API is still beta, and the addon + backend are coupled together, so both should always speak the same language.
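Why 8 bits are not enough is easy to show with a two-byte buffer. The sketch assumes primitive type values above 0xFF (0x101 and 0x901 are used as examples, in the style of the PrimitiveType enum in iTwin.js) and assumes little-endian byte order; neither detail is taken from the blob format specification, and the helper names are hypothetical.

```typescript
// Write a primitive type value as a 16-bit field (little-endian is an assumption of this sketch).
function writePrimitiveType(value: number): DataView {
  const view = new DataView(new ArrayBuffer(2));
  view.setUint16(0, value, true);
  return view;
}

// Correct read: both bytes.
const readAsUint16 = (view: DataView): number => view.getUint16(0, true);

// Too-narrow read: only the low byte survives, so distinct types collapse to the same value.
const readAsUint8 = (view: DataView): number => view.getUint8(0);
```

With these example values, a uint8 read returns 0x01 for both 0x101 and 0x901, so the two types become indistinguishable.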

One could argue that this warrants a v2 binary blob format, and in the future such changes probably should. However, since no real frontend consumer has picked this up yet and we will backport the fix to 5.10, I believe it is safe to fix this in place. It's a fix for the v1 format, not an evolution of it.

…older

Relocate SchemaView.ts, SchemaViewBinaryReader.ts and SchemaViewInterfaces.ts into a dedicated src/SchemaView/ folder and fix the relative import paths in the moved files and their importers (barrel, SchemaLocalization, test). Pure move, no behavior change - prepares the package for the incremental schema-loading work.

- Introduced `PRAGMA schema_view_fragment` to return a subset of schemas as a binary blob, enabling incremental loading.
- Updated documentation for `Pragmas.md`, `SchemaView.md`, and `SchemaViewBinaryFormat.md` to reflect the new pragma and its usage.
- Enhanced `getSchemaView` method to support loading only specified schemas and their dependencies, improving performance for large iModels.
- Added tests for schema view fragment loading, ensuring correct behavior when loading subsets of schemas and handling dependencies.
- Implemented `SchemaManifest` to manage schema references and loading order, facilitating efficient schema management.

…pragma

- Added support for incremental schema loading in IModelDb using the new schema_view_fragment pragma.
- Updated the SchemaView class to handle schema tokens for cache invalidation.
- Modified the getSchemaView method to utilize the new incremental loading strategy.
- Enhanced documentation for schema_view and schema_token pragmas to clarify their usage and benefits.
- Updated tests to reflect changes in schema view lifecycle and cache invalidation logic.

grigasp · 2026-06-29T05:20:45Z

> This PR adds PRAGMA schema_token, which hashes schema identity only - the name and version of each schema

> Known limitation (accepted): because the token hashes name + version only, it does not detect a schema whose contents change without a version bump. ECDb only allows in-place re-import for dynamic schemas, so that is the only case affected; We accept this for now and can strengthen the hash later (for example a cached per-schema content checksum) without changing the pragma's contract.

How does this work with schema editing? Would editing a schema automatically increase the schema version?

rschili · 2026-06-30T07:04:46Z

@grigasp

> How does this work with schema editing? Would editing a schema automatically increase the schema version?

Not automatically. When you import an edited schema, you have to increment the version yourself or ECDb will consider the schema unchanged. It's pretty safe to assume this works for all normal scenarios, with the only exception of dynamic schemas.

Our rules already state how callers should increment schema versions. Any change in non-dynamic schemas needs at least a minor digit increment. Frankly, we loosened the rules for dynamic schemas a few years ago, and I now believe that may have been a mistake. They incremented the minor version automatically very often and were worried it would eventually overflow.

If this is a problem, we can add a short-term fix that force-invalidates the SchemaView whenever dynamic schemas are involved, but that feels beyond the scope of this PR.

rschili added 5 commits June 20, 2026 00:15

docs: update SchemaView documentation for clarity and consistency

016920f

docs: rename SchemaView section to Intro for improved clarity

84f8772

rschili mentioned this pull request Jun 21, 2026

core-interop: SchemaView-based ECSchemaProvider iTwin/presentation#1394

Draft

rschili added 4 commits June 23, 2026 09:35

docs: update references from PRAGMA schema_token to PRAGMA checksum(schema_token) for accuracy

bf9bacd

fix: update primitiveType from uint8 to uint16 in SchemaView binary format

1d668d1

test: enhance SchemaViewFragmentLoading tests with native log capturing

61ae7d2

Merge branch 'master' into rschili/schema-view-fragment

17b73ef

rschili mentioned this pull request Jun 26, 2026

SchemaView performance improvements iTwin/imodel-native#1479

Draft

Fix documentation for PRAGMA schema_view and schema_view_fragment to clarify failure conditions

7dcbd6e

rschili added 3 commits June 29, 2026 13:24

Refactor comments in SchemaViewFragmentLoading tests for clarity and conciseness

df29492

Refactor SchemaView documentation for clarity and conciseness

be62c05

Refactor SchemaView documentation for clarity and conciseness

9770c9a

aruniverse added this to the iTwin.js 5.12 milestone Jun 29, 2026

rschili changed the title ~~WIP SchemaView performance improvements~~ SchemaView performance improvements Jun 30, 2026

Merge branch 'master' into rschili/schema-view-fragment

51b3ac0

rschili added 2 commits June 30, 2026 09:36

Refactor schema loading logic in IModelDb to improve promise handling

33510bc

Merge branch 'rschili/schema-view-fragment' of https://github.com/iTwin/itwinjs-core into rschili/schema-view-fragment

4e92bc2

rschili wants to merge 16 commits into master from rschili/schema-view-fragment


3 participants
