fix(cohorts): fix cross-DB alias error in static cohort insertion by dmarticus · Pull Request #53804 · PostHog/posthog

dmarticus · 2026-04-08T22:01:44Z

Problem

#52819 fixed static cohort duplication truncation but was reverted (#53670) because it broke adding users to static cohorts in production.

The root cause: the ClickHouse dedup path mixed db_read (persons_db_reader) and db_write (persons_db_writer) Django database aliases in a single queryset with .exclude(). Django rejects cross-alias subqueries — even when both aliases point to the same physical database — raising:

ValueError: Subqueries aren't allowed across different databases.

This error was silently swallowed by the except block in _insert_users_list_with_batching, so API calls returned 200 but cohort membership was never updated. Tests passed because db_read == db_write in the test config.

Changes

CH dedup path: Use db_write for both Person and CohortPeople querysets so Django can merge the .exclude() into a single NOT IN subquery. No cross-DB error, no memory explosion.
PG insert path: Replace the string-replacement hack (UPDATE_QUERY.format(sql.replace(...))) with a LEFT JOIN via raw SQL on the db_write cursor. Dedup stays entirely in SQL, avoiding the O(cohort_size) memory cost of loading all existing member IDs into Python.
Remove unused UPDATE_QUERY constant.
Re-apply admin improvements from fix(cohorts): fix static cohort duplication truncation and add admin re-sync #52819 (resync action, list fields).
Add regression test that simulates production's split DB aliases.

How did you test this code?

All 40 static cohort API tests pass
All 4 feature flag cohort generation tests pass (with updated query counts)
All 5 cohort model insertion tests pass
Regression test validates the fix: passes with db_write for both querysets, fails with db_read + db_write mix (reproduces the production ValueError)

Publish to changelog?

No

Docs update

N/A

LLM context

Re-land of #52819 with the cross-DB alias bug fixed. The key insight is that Django checks database aliases, not physical databases, when validating subqueries. Production uses separate persons_db_reader/persons_db_writer aliases, but tests use a single alias for both, which is why the original PR's tests passed.

The original fix (#52819) for static cohort duplication truncation was reverted because it caused a ValueError in production: "Subqueries aren't allowed across different databases" Root cause: the CH dedup path created a queryset mixing db_read (persons_db_reader) and db_write (persons_db_writer) aliases. Django rejects cross-alias subqueries even when both point to the same physical database. The error was silently swallowed by the except block, so API calls returned 200 but cohort membership never updated. Fix: use db_write for both Person and CohortPeople in the CH dedup path so Django can merge the .exclude() into a single NOT IN subquery. The PG path uses a LEFT JOIN via raw cursor (bypasses Django's check). Also re-applies the admin improvements and adds a regression test that simulates production's split DB aliases.

greptile-apps · 2026-04-08T22:06:45Z

Vulnerabilities

No security concerns identified. The new f-string SQL interpolates only self.pk (Django auto-incremented integer PK) and self.version or "NULL" (integer or SQL NULL literal) — neither is user-controlled. The inner subquery comes from Django ORM's sql_with_params(), with user-supplied values kept in the params tuple passed separately to cursor.execute(). Table names are sourced from Model._meta.db_table (compile-time constants).

Prompt To Fix All With AI

This is a comment left during a code review.
Path: posthog/models/cohort/cohort.py
Line: 727-735

Comment:
**Missing `ON CONFLICT DO NOTHING` for concurrent inserts**

The removed `UPDATE_QUERY` had `ON CONFLICT DO NOTHING` to handle the race window between the SELECT and the INSERT. The new LEFT JOIN dedup closes the race most of the time, but if two concurrent processes run the same batch simultaneously, both can pass the `WHERE cp."person_id" IS NULL` check before either inserts, and the second will hit a unique-constraint violation. The `except Exception` handler swallows it, silently skipping the entire batch.

Adding `ON CONFLICT DO NOTHING` to the tail of the INSERT restores the original behaviour at no cost:
```suggestion
                query = f"""
                    INSERT INTO "{cohort_people_table}" ("person_id", "cohort_id", "version")
                    SELECT p."id", {self.pk}, {self.version or "NULL"}
                    FROM ({sql}) AS p
                    LEFT JOIN "{cohort_people_table}" AS cp
                        ON cp."person_id" = p."id" AND cp."cohort_id" = {self.pk}
                    WHERE cp."person_id" IS NULL
                    ON CONFLICT DO NOTHING
                """
```

How can I resolve this? If you propose a fix, please make it concise.

_{Reviews (1): Last reviewed commit: "fix(cohorts): fix cross-DB alias error i..." | Re-trigger Greptile}

posthog/models/cohort/cohort.py

Defense-in-depth for concurrent inserts, matching the original query's behavior.

github-actions · 2026-04-08T22:27:25Z

🎭 Playwright report · View test results →

⚠️ 2 flaky tests:

Inline editing insight title via compact card popover (chromium)
Materialize view pane (chromium)

These issues are not necessarily caused by your changes.
Annoyed by this comment? Help fix flakies and failures and it'll disappear!

gustavohstrassburger

Hope this works.

assign-reviewers-posthog bot requested review from a team April 8, 2026 22:02

posthog-project-board-bot bot moved this to In Review in Feature Flags Apr 8, 2026

posthog-project-board-bot bot added this to Feature Flags Apr 8, 2026

greptile-apps bot reviewed Apr 8, 2026

View reviewed changes

posthog/models/cohort/cohort.py Show resolved Hide resolved

fix(cohorts): add ON CONFLICT DO NOTHING to PG insert

8dbdc00

Defense-in-depth for concurrent inserts, matching the original query's behavior.

gustavohstrassburger approved these changes Apr 9, 2026

View reviewed changes

github-project-automation bot moved this from In Review to Approved in Feature Flags Apr 9, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(cohorts): fix cross-DB alias error in static cohort insertion#53804

fix(cohorts): fix cross-DB alias error in static cohort insertion#53804
dmarticus wants to merge 2 commits intomasterfrom
dmarticus/fix-static-cohorts

dmarticus commented Apr 8, 2026

Uh oh!

greptile-apps bot commented Apr 8, 2026

Vulnerabilities

Uh oh!

Uh oh!

github-actions bot commented Apr 8, 2026

Uh oh!

gustavohstrassburger left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

dmarticus commented Apr 8, 2026

Problem

Changes

How did you test this code?

Publish to changelog?

Docs update

LLM context

Uh oh!

greptile-apps bot commented Apr 8, 2026

Vulnerabilities

Uh oh!

Uh oh!

github-actions bot commented Apr 8, 2026

Uh oh!

gustavohstrassburger left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants