perf: replace linear scans with indexed lookups in tracker import DHIS2-21287#23590

Open
teleivo wants to merge 3 commits into master from DHIS2-21287

@teleivo teleivo commented Apr 14, 2026

Replace O(n) linear stream scans with O(1) indexed lookups in tracker import validation and preheat. Profiling shows validation at 6.7% CPU. The main costs:

  • TrackerBundle.findById: linear scan of the entity list on every call, invoked 6+ times per enrollment from different validators. With 500 TEs per request that is ~3,000 linear scans.
  • UniqueAttributesSupplier.getEntityForEnrollment: linear scan of the TE list per enrollment during preheat.
  • AttributeValidator.validateRequiredProperties: streams through program attributes per attribute per enrollment to check if mandatory.

Changes

TrackerBundle: add lazily-built Map<UID, T> indexes for each entity type. findXxxByUid does a map lookup instead of a stream scan. Maps are invalidated when entity lists change (after validation, after rule engine).
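A minimal sketch of the lazy index idea, not the actual TrackerBundle code. `Entity`, `IndexedBundleSketch`, `setEntities` and the `indexBuilds` counter are hypothetical names used only for illustration, and UIDs are plain strings here:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a tracker entity; only the UID matters here.
record Entity(String uid, String payload) {}

// Sketch of a bundle that builds a UID index on the first lookup.
class IndexedBundleSketch {
  private List<Entity> entities = new ArrayList<>();
  // null means "not built yet" or "invalidated".
  private Map<String, Entity> byUid;
  // Exposed only so the sketch can show how often the index is rebuilt.
  int indexBuilds = 0;

  void setEntities(List<Entity> newEntities) {
    this.entities = new ArrayList<>(newEntities);
    this.byUid = null; // the list changed, so drop the stale index
  }

  Optional<Entity> findByUid(String uid) {
    if (byUid == null) {
      Map<String, Entity> index = new HashMap<>();
      for (Entity e : entities) {
        // Keep the first entity per UID, matching findFirst() on a stream.
        index.putIfAbsent(e.uid(), e);
      }
      byUid = index;
      indexBuilds++;
    }
    return Optional.ofNullable(byUid.get(uid));
  }
}
```

Repeated calls to `findByUid` between two list changes pay for one pass over the list and then do hash lookups. The cost is that every code path that replaces or mutates the list has to reset the index.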

UniqueAttributesSupplier: build a Map<UID, TrackedEntity> once from the payload and use it in getEntityForEnrollment instead of scanning the list per enrollment.
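Roughly, the preheat change amounts to the following. This is an illustrative sketch under assumed types: `TrackedEntitySketch`, `EnrollmentSketch` and `EnrollmentResolverSketch` are hypothetical and do not mirror the real DHIS2 classes:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical payload types; the real tracker classes carry far more state.
record TrackedEntitySketch(String uid, String name) {}

record EnrollmentSketch(String uid, String trackedEntityUid) {}

class EnrollmentResolverSketch {
  // Build the UID index once per payload.
  static Map<String, TrackedEntitySketch> indexByUid(List<TrackedEntitySketch> trackedEntities) {
    return trackedEntities.stream()
        .collect(
            Collectors.toMap(
                TrackedEntitySketch::uid,
                Function.identity(),
                (first, second) -> first)); // on duplicate UIDs keep the first one
  }

  // Map lookup per enrollment instead of a list scan.
  // Returns null when the tracked entity is not part of the payload.
  static TrackedEntitySketch entityFor(
      EnrollmentSketch enrollment, Map<String, TrackedEntitySketch> index) {
    return index.get(enrollment.trackedEntityUid());
  }
}
```

The merge function matters: without it `Collectors.toMap` throws `IllegalStateException` on duplicate keys, whereas a stream scan with `findFirst()` would silently return the first match.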

TrackerPreheat: cache mandatory attribute sets per program and per tracked entity type. getMandatoryProgramAttributes and getMandatoryTrackedEntityTypeAttributes compute the set once on first access and reuse it for all entities of the same program/type.
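The per-program cache can be pictured like this. It is a sketch, not the TrackerPreheat implementation: `ProgramSketch`, `ProgramAttributeSketch`, `MandatoryAttributeCacheSketch` and the `computations` counter are hypothetical, and only the method name `getMandatoryProgramAttributes` is taken from this PR:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical, simplified metadata types.
record ProgramAttributeSketch(String attributeUid, boolean mandatory) {}

record ProgramSketch(String uid, List<ProgramAttributeSketch> attributes) {}

class MandatoryAttributeCacheSketch {
  // Mandatory attribute UIDs keyed by program UID.
  private final Map<String, Set<String>> mandatoryByProgram = new HashMap<>();
  // Exposed only so the sketch can show how often the set is computed.
  int computations = 0;

  Set<String> getMandatoryProgramAttributes(ProgramSketch program) {
    // The mapping function runs only when no set is cached for this program UID.
    return mandatoryByProgram.computeIfAbsent(
        program.uid(),
        uid -> {
          computations++;
          return program.attributes().stream()
              .filter(ProgramAttributeSketch::mandatory)
              .map(ProgramAttributeSketch::attributeUid)
              .collect(Collectors.toUnmodifiableSet());
        });
  }
}
```

A validator can then check `mandatory.contains(attributeUid)` per attribute instead of streaming through the program attributes each time. A plain `HashMap` assumes single-threaded access to the cache.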

Cleanup: remove unused @JsonProperty/@JsonIgnore annotations from TrackerBundle.

Performance

CPU profile: 1 user, 5 min, skipRuleEngine=true&skipSideEffects=true, 500 TEs/request.

Baseline run 24357292097. PR cpu run 24394901884, alloc run 24398234369.

| Subsystem | Baseline (17,652 samples) | PR (15,848 samples) |
|---|---|---|
| Hibernate flush/persist | ~9,000 (51%) | 8,695 (54.8%) |
| DB I/O (__write) | ~2,475 (14%) | 2,528 (15.9%) |
| Validation | ~1,177 (6.7%) | 534 (3.3%) |
| Preheat | ~1,131 (6.4%) | 1,061 (6.6%) |
| TrackerBundle.findById | 466 (2.6%) | 5 (0.0%) |
| validateRequiredProperties | 164 (0.9%) | 21 (0.1%) |

Validation's share of CPU samples halved (6.7% -> 3.3%). TrackerBundle.findById dropped from 466 samples to 5. No regressions in the other subsystems. Throughput improved ~5% (220 -> 231 imports in 5 min), though these are single-user profile runs with profiler overhead, not load tests.

@teleivo teleivo force-pushed the DHIS2-21287 branch 3 times, most recently from 961ce6b to 55f353c April 14, 2026 13:15
@teleivo teleivo changed the title from "perf: replace linear stream scans with Map lookups in tracker import DHIS2-21287" to "perf: replace linear scans with indexed lookups in tracker import DHIS2-21287" Apr 14, 2026
@teleivo teleivo force-pushed the DHIS2-21287 branch 3 times, most recently from 14e3096 to 79726d3 April 14, 2026 13:18
@teleivo teleivo marked this pull request as ready for review April 14, 2026 13:18
@teleivo teleivo requested a review from a team as a code owner April 14, 2026 13:18
teleivo added 3 commits April 14, 2026 15:20
TrackerBundle.findById does a linear stream scan on every call, invoked
6+ times per enrollment from different validators. With 500 TEs per
request this means ~3,000 linear scans. Replace with lazily-built
Map<UID, T> indexes that are invalidated when lists change.

Also replace the linear TE scan in
UniqueAttributesSupplier.getEntityForEnrollment with a pre-built Map lookup.

DHIS2-21287

validateRequiredProperties and validateMandatoryAttributes rebuild the
set of mandatory program/TE-type attributes per entity via stream scans
of program.getProgramAttributes() and trackedEntityType
.getTrackedEntityTypeAttributes(). With 500 TEs and 20 attributes per
request, this causes thousands of redundant stream operations.

Cache the mandatory attribute sets lazily in TrackerPreheat keyed by
program/TE-type UID. The set is computed once on first access and reused
for all entities of the same program/type.

DHIS2-21287

TrackerBundle does not implement Serializable so transient has no
effect. The private getter methods prevent Lombok from generating
public getters, so Jackson will not see these fields and JsonIgnore
is also unnecessary.

DHIS2-21287