
perf: add direct org unit predicate for orgUnitMode=SELECTED event queries [2.41] #23492

Draft
jason-p-pickering wants to merge 4 commits into 2.41 from perf-oumode-selected-241-redux

@jason-p-pickering jason-p-pickering commented Apr 2, 2026

Background

For single-event (WITHOUT_REGISTRATION) programs, the GET /tracker/events endpoint with
orgUnitMode=SELECTED was producing slow queries on a production-scale system with ~12.4 million
events in a program. Two bugs combined to cause this: the org unit predicate was never emitted
for this program type, and the enrollmentid index on the shared event table was systematically
misleading the query planner into a catastrophic scan path.

Root cause — part 1: org unit predicate never emitted

The 2.41 query uses COALESCE(po.organisationunitid, ev.organisationunitid) to handle tracker
program ownership semantics — the effective org unit for a tracker event may differ from the
stored ev.organisationunitid if ownership has been transferred. This expression prevents the
planner from using the org unit equality as an early filter: it cannot see through a COALESCE to
use an index on the underlying column.

For WITHOUT_REGISTRATION programs there are no tracked entities and therefore no tracked entity
program owner (TPO) rows, so the COALESCE always resolves to ev.organisationunitid. The fix emits the predicate directly on
ev.organisationunitid at query build time for this program type, bypassing the COALESCE
entirely. Without this predicate the planner has no selective entry point and falls back to
scanning all events for the program stage (~918K rows, ~6,200ms).
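
A minimal sketch of the branch described above, assuming the predicate is assembled as a SQL string at query build time. The class, enum, and method names are hypothetical stand-ins, not the DHIS2 code; only the :ou_id placeholder and the two column expressions are taken from this description.

```java
/** Illustrative only: hypothetical names, not the DHIS2 query builder. */
final class OrgUnitPredicateSketch {

    enum ProgramType { WITH_REGISTRATION, WITHOUT_REGISTRATION }

    /** Returns the org unit predicate for orgUnitMode=SELECTED. */
    static String selectedOrgUnitPredicate(ProgramType programType) {
        if (programType == ProgramType.WITHOUT_REGISTRATION) {
            // No ownership rows exist for single-event programs, so the
            // stored column can be compared directly and stays indexable.
            return "ev.organisationunitid = :ou_id";
        }
        // Tracker programs still need the ownership-aware expression.
        return "coalesce(po.organisationunitid, ev.organisationunitid) = :ou_id";
    }

    public static void main(String[] args) {
        System.out.println(selectedOrgUnitPredicate(ProgramType.WITHOUT_REGISTRATION));
        System.out.println(selectedOrgUnitPredicate(ProgramType.WITH_REGISTRATION));
    }
}
```

The point of the split is that the first fragment is a plain column equality, which the planner can match against an index on organisationunitid, while the second is an expression it cannot.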

Root cause — part 2: enrollmentid index poisons the planner for single-event programs

For WITHOUT_REGISTRATION programs, DHIS2 creates a single synthetic enrollment per program.
Every event in that program shares the same enrollmentid. This means
programstageinstance_programinstanceid — the index on event.enrollmentid — can never be
selective for these programs: a scan by enrollmentid always returns the entire program's event
set. On this production system that is 12.3M rows from a single index entry.

The query planner cannot know this from column statistics alone. It estimates selectivity from the
average rows per distinct enrollmentid value. With a few thousand distinct values and one
outlier owning 97% of the table, the average is ~4,200 rows per value, so the planner consistently
underestimates the true cost by a factor of ~3,000 and prefers a BitmapAnd using the enrollmentid
index regardless of what other, genuinely selective indexes are available.
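
The size of that underestimate can be checked with simple arithmetic. The sketch below takes the description of the estimate at face value (average rows per distinct value) and uses the row count of the dominant enrollment reported for this system; the class and method names are illustrative only, and the distinct-value count is back-derived from the quoted average, not measured.

```java
/** Illustrative arithmetic only; not a model of the PostgreSQL planner. */
final class EnrollmentSelectivitySketch {

    /** Average rows per distinct value, i.e. the estimate described above. */
    static double averageRowsPerValue(long totalRows, long distinctValues) {
        return (double) totalRows / distinctValues;
    }

    /** How many times larger the real row count is than the estimate. */
    static double underestimateFactor(long actualRows, double estimatedRows) {
        return actualRows / estimatedRows;
    }

    public static void main(String[] args) {
        long dominantEnrollmentRows = 12_324_987L; // from the distribution reported in this PR
        double plannerEstimate = 4_200.0;          // approximate average quoted in this PR

        System.out.printf("underestimate factor: %.0f%n",
            underestimateFactor(dominantEnrollmentRows, plannerEstimate));

        // Assumption: about 3,000 distinct values, back-derived from 12.7M rows / 4,200.
        System.out.printf("average rows per value: %.0f%n",
            averageRowsPerValue(12_700_000L, 3_000L));
    }
}
```

The factor comes out a little under 3,000, which matches the figure quoted above.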

The shared event table (pre-2.43) therefore has a structural tension: the enrollmentid index is
designed for tracker semantics (one enrollment per patient, few events each) but is also visible
to single-event program queries where it is always the wrong choice — actively preventing the
planner from choosing better paths. In 2.43 the tracker and event stores are separated, which
resolves this at the schema level. For 2.41 and 2.42 — which will remain in production at many
sites for years — the tension persists.

Historical context

The default DHIS2 client sends no occurredAfter/occurredBefore parameters. This means the
true upstream baseline is an unbounded query — no date range, no OU predicate — and every
production instance running stock DHIS2 hits this path when loading the working list in the
Capture app.

The progression below uses real EXPLAIN ANALYZE output from the same production database:

| Stage | What changed | Execution time | Rows entering access-check subplans |
|---|---|---|---|
| True upstream baseline | No date range, no OU predicate (stock DHIS2 client) | ~41,000 ms | 10,661,309 |
| With date range injected via reverse proxy | 7-day window injected when client sends none | ~2,600 ms | 3,233 |
| This branch | Correct OU predicate + composite index, 7-day date range | ~11 ms | 52 |
| This branch + partial enrollmentid index ¹ | No date range at all | ~9 ms | 16 |

¹ Site-specific optimisation described below — not part of this branch.

Baseline → date range: reducing the event window cut SubPlan iterations from 10.6M to 3,233.
However, the dominant cost shifted rather than disappeared: PostgreSQL still had to build a bitmap
over all 10.6M program events via programstageinstance_programinstanceid before intersecting
with the date bitmap. That bitmap construction alone consumed ~2,386ms of the 2,613ms total. The
date range was treating the symptom — reducing the bitmap size — while the enrollmentid index
remained the structural cause.

Date range → this branch: the ev.organisationunitid = :ou_id predicate gives the planner a
direct index entry via idx_event_ou_occurreddate(organisationunitid, occurreddate). With a
narrow date range, the planner uses an Index Scan Backward — no bitmap construction, no mass
SubPlan iterations. The two predicates compound: the date range narrows the index range scan; the
OU equality limits the result to events at the requested facility.

On the date injection: with the partial enrollmentid index applied (see below), an unbounded
query with no date range runs in ~9ms. The date injection at the reverse proxy was never a fix —
it was a workaround for the enrollmentid index problem that remained hidden until this analysis.
A bounded date range remains good UX practice — users generally have no interest in events from
5+ years ago in a working list — but it is no longer required to prevent database overload on
deployments that have addressed the enrollmentid index.


Changes

Bug fix — program type condition always false

The WITHOUT_REGISTRATION branch condition used params.getProgramType(), which is never
populated on EventQueryParams in the tracker store — always returning null. The condition was
therefore always false, meaning the COALESCE path was used for all programs regardless of type.
Fixed to use params.getEnrolledInProgram().getProgramType(), consistent with how
isProgramRestricted() works in the same class.
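
A small sketch of why the old condition could never be true, assuming it was an identity comparison against the enum constant. Params and Program here are hypothetical stand-ins for the real classes, and the null guard in the "after" version is an assumption added for the example.

```java
/** Illustrative only: stand-in types, not the DHIS2 classes. */
final class ProgramTypeConditionSketch {

    enum ProgramType { WITH_REGISTRATION, WITHOUT_REGISTRATION }

    static final class Program {
        private final ProgramType programType;
        Program(ProgramType programType) { this.programType = programType; }
        ProgramType getProgramType() { return programType; }
    }

    static final class Params {
        private final Program enrolledInProgram;
        Params(Program enrolledInProgram) { this.enrolledInProgram = enrolledInProgram; }
        // Never populated in the tracker store, per the description above.
        ProgramType getProgramType() { return null; }
        Program getEnrolledInProgram() { return enrolledInProgram; }
    }

    /** Before: null is compared with the constant, so this is always false. */
    static boolean isSingleEventBefore(Params params) {
        return params.getProgramType() == ProgramType.WITHOUT_REGISTRATION;
    }

    /** After: the type is read from the program itself. */
    static boolean isSingleEventAfter(Params params) {
        Program program = params.getEnrolledInProgram();
        return program != null
            && program.getProgramType() == ProgramType.WITHOUT_REGISTRATION;
    }

    public static void main(String[] args) {
        Params params = new Params(new Program(ProgramType.WITHOUT_REGISTRATION));
        System.out.println("before: " + isSingleEventBefore(params));
        System.out.println("after:  " + isSingleEventAfter(params));
    }
}
```

Because the comparison fails silently and no exception is thrown, only a test that runs a WITHOUT_REGISTRATION program through this path would have exposed it.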

This was invisible to CI because all existing orgUnitMode=SELECTED integration tests in
EventExporterTest used a WITH_REGISTRATION program (BFcipDERJnf).

Integration test added

shouldReturnEventsForWithoutRegistrationProgramGivenOrgUnitModeSelected in EventExporterTest
— exercises orgUnitMode=SELECTED with a WITHOUT_REGISTRATION program (iS7eutanDry), closing
the test gap that hid both bugs.

Flyway migration V2_41_58 — composite index

Even with the correct ev.organisationunitid = :ou_id predicate in place, the enrollmentid
index can still lure the planner into a BitmapAnd for high-volume org units with wide date ranges.
The composite index (organisationunitid, occurreddate) gives the planner a competing entry point
that wins when the (OU, date range) combination is selective — typically up to a few weeks of
events per facility. The planner uses Index Scan Backward covering both predicates in a single
pass and can stop as soon as the LIMIT is satisfied without sorting the full result set.
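
To illustrate why an (organisationunitid, occurreddate) ordering lets the scan stop early, here is a toy in-memory model built on a sorted set. It is not how PostgreSQL implements B-tree scans; the class, the epoch-day date representation, and the event ids are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Toy model of a composite index; illustrative only. */
final class CompositeIndexSketch {

    /** One index entry: (org unit, occurred date as epoch day, event id). */
    static final class Entry {
        final long orgUnitId;
        final long occurredDay;
        final long eventId;
        Entry(long orgUnitId, long occurredDay, long eventId) {
            this.orgUnitId = orgUnitId;
            this.occurredDay = occurredDay;
            this.eventId = eventId;
        }
    }

    // Entries are kept sorted by org unit first, then date, then event id.
    private final TreeSet<Entry> index = new TreeSet<>(
        Comparator.comparingLong((Entry e) -> e.orgUnitId)
            .thenComparingLong(e -> e.occurredDay)
            .thenComparingLong(e -> e.eventId));

    void add(long orgUnitId, long occurredDay, long eventId) {
        index.add(new Entry(orgUnitId, occurredDay, eventId));
    }

    /** Newest-first event ids for one org unit in [fromDay, toDay], at most limit. */
    List<Long> newestEvents(long orgUnitId, long fromDay, long toDay, int limit) {
        Entry low = new Entry(orgUnitId, fromDay, Long.MIN_VALUE);
        Entry high = new Entry(orgUnitId, toDay, Long.MAX_VALUE);
        List<Long> result = new ArrayList<>();
        // Walk the matching range backwards; entries are already in date order,
        // so no sort is needed and the loop ends once the limit is reached.
        for (Entry e : index.subSet(low, true, high, true).descendingSet()) {
            if (result.size() == limit) {
                break;
            }
            result.add(e.eventId);
        }
        return result;
    }
}
```

Both predicates narrow the same contiguous range of the ordered structure, which is the property the composite index provides and two separate single-column indexes do not.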

For very wide date ranges (months to years) on high-volume org units, the planner may still choose
the BitmapAnd path (~900ms). In practice the intended use case — Capture app working list with a
bounded date window — stays well inside the fast path. For deployments where this ceiling is still
a problem, see the site-specific optimisation below.

Site-specific optimisation for pre-2.43 event-heavy deployments

On deployments that have been collecting single-event program data for many years, the enrollment
distribution can become pathologically skewed. The production system used for this analysis:

 enrollmentid |  count
--------------+----------
       583732 | 12,324,987   ← one enrollment owns 97% of the table
      5346233 |    200,690
      5346235 |     67,028
         5337 |      2,965
      (others) |       < 600

In this shape the event table is functionally a flat aggregate table
(organisationunitid, occurreddate, datavalues); enrollmentid is a data model artefact with
no filtering value. DB admins can eliminate the BitmapAnd path entirely by replacing the full
enrollmentid index with a partial index that excludes WITHOUT_REGISTRATION program stages:

-- Step 1: find the programstageids to exclude
SELECT ps.programstageid, ps.name, p.name AS program_name
FROM programstage ps
JOIN program p ON ps.programid = p.programid
WHERE p.type = 'WITHOUT_REGISTRATION';

-- Step 2: recreate the index as a partial index
DROP INDEX programstageinstance_programinstanceid;
CREATE INDEX programstageinstance_programinstanceid
    ON event (enrollmentid)
    WHERE programstageid NOT IN (<ids from step 1>);

With this in place, even 1-year date range queries on high-volume org units use
Index Scan Backward via idx_event_ou_occurreddate and execute in under 10ms. This is not
included in the Flyway migration because it requires knowledge of site-specific programstageids
and is inappropriate for tracker-heavy deployments, but it is a straightforward DBA operation and
the correct long-term fix for deployments where single-event programs dominate the event table.


Performance results (production, ~12.7M row event table)

Baseline is stock 2.41 with no date bounds and no OU predicate — the query shape every production
instance hits when the Capture app loads a working list without custom date parameters.

| Metric | Stock 2.41 baseline | With date range injection ² | This branch |
|---|---|---|---|
| p50 | ~41,000ms | ~950ms | ~11ms |
| p95 | ~41,000ms | ~950ms | ~11ms |
| p99 | ~41,000ms | ~950ms | ~11ms |
| Typical OU execution time | >40,000ms | ~838ms | ~11ms |
| Worst-case (wide date range, high-volume OU) | >40,000ms | ~950ms | ~950ms |
| Worst-case with partial enrollmentid index ¹ | >40,000ms | ~950ms | ~10ms |

¹ Site-specific DBA optimisation — not part of this branch.
² Date range injection via reverse proxy — a workaround, not a fix.

For the common case — facility-level queries with a bounded date range — the 12M-row
programstageinstance_programinstanceid scan is eliminated from the plan. For very wide date
ranges on high-volume org units, the planner may still select the BitmapAnd path; this is a known
ceiling addressable with the partial index approach described above.


Scope

These changes apply to the event query for orgUnitMode=SELECTED on WITHOUT_REGISTRATION
programs only. This query pattern is issued by the Capture app when loading the initial working
list — it is high volume, fired on every page load for every user, and therefore latency-sensitive.

The WITH_REGISTRATION (tracker) case uses the COALESCE join which cannot be bypassed in the
same way — that ceiling requires a separate architectural change and is out of scope here.

Disclaimer: 🤖 AI was used for portions of this PR.

@jason-p-pickering jason-p-pickering changed the title from "Optimize event query for orgunitMode=SELECTED" to "perf: add direct org unit predicate for orgUnitMode=SELECTED event queries [2.41]" Apr 2, 2026
@jason-p-pickering jason-p-pickering marked this pull request as draft April 8, 2026 07:24
@jason-p-pickering jason-p-pickering requested a review from a team April 13, 2026 17:09