fix(cohorts): fix cross-DB alias error in static cohort insertion #53804
The original fix (#52819) for static cohort duplication truncation was reverted because it caused a ValueError in production: "Subqueries aren't allowed across different databases".
Root cause: the CH dedup path created a queryset mixing db_read (persons_db_reader) and db_write (persons_db_writer) aliases. Django rejects cross-alias subqueries even when both point to the same physical database. The error was silently swallowed by the except block, so API calls returned 200 but cohort membership never updated.
Fix: use db_write for both Person and CohortPeople in the CH dedup path so Django can merge the .exclude() into a single NOT IN subquery. The PG path uses a LEFT JOIN via raw cursor (bypasses Django's check).
Also re-applies the admin improvements and adds a regression test that simulates production's split DB aliases.
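For readers unfamiliar with the LEFT JOIN dedup pattern mentioned for the PG path, here is a minimal sketch using Python's built-in sqlite3 module rather than Postgres. The table names, columns, and the insert_missing_members helper are hypothetical and simplified; the PR's actual SQL is not shown on this page and may well differ.

```python
import sqlite3

# Hypothetical, simplified schema; the real tables are not shown in this PR.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (id INTEGER PRIMARY KEY, uuid TEXT NOT NULL);
    CREATE TABLE cohortpeople (
        person_id INTEGER NOT NULL,
        cohort_id INTEGER NOT NULL
    );
    INSERT INTO person (id, uuid) VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO cohortpeople (person_id, cohort_id) VALUES (1, 7);
    """
)


def insert_missing_members(conn, cohort_id, uuids):
    """Insert only persons not already in the cohort; return rows added."""
    placeholders = ", ".join("?" for _ in uuids)
    cur = conn.execute(
        f"""
        INSERT INTO cohortpeople (person_id, cohort_id)
        SELECT p.id, ?
        FROM person AS p
        LEFT JOIN cohortpeople AS cp
            ON cp.person_id = p.id AND cp.cohort_id = ?
        WHERE p.uuid IN ({placeholders})
          AND cp.person_id IS NULL  -- anti-join: keep only non-members
        """,
        [cohort_id, cohort_id, *uuids],
    )
    return cur.rowcount


# Person 1 is already a member, so only persons 2 and 3 are inserted.
first = insert_missing_members(conn, 7, ["a", "b", "c"])
# Running the same insert again adds nothing.
second = insert_missing_members(conn, 7, ["a", "b", "c"])
members = [
    row[0]
    for row in conn.execute(
        "SELECT person_id FROM cohortpeople WHERE cohort_id = 7 ORDER BY person_id"
    )
]
```

The point of the pattern is that existing member IDs never leave the database: the anti-join filters them out inside the INSERT ... SELECT itself, so Python never holds a list proportional to the cohort size.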
Defense-in-depth for concurrent inserts, matching the original query's behavior.
gustavohstrassburger
left a comment
Hope this works.
Problem
#52819 fixed static cohort duplication truncation but was reverted (#53670) because it broke adding users to static cohorts in production.
The root cause: the ClickHouse dedup path mixed db_read (persons_db_reader) and db_write (persons_db_writer) Django database aliases in a single queryset with .exclude(). Django rejects cross-alias subqueries, even when both aliases point to the same physical database, raising a ValueError: "Subqueries aren't allowed across different databases".
This error was silently swallowed by the except block in _insert_users_list_with_batching, so API calls returned 200 but cohort membership was never updated. Tests passed because db_read == db_write in the test config.
Changes
- CH dedup path: use db_write for both Person and CohortPeople querysets so Django can merge the .exclude() into a single NOT IN subquery. No cross-DB error, no memory explosion.
- PG path: replace the UPDATE_QUERY.format(sql.replace(...)) approach with a LEFT JOIN via raw SQL on the db_write cursor. Dedup stays entirely in SQL, avoiding the O(cohort_size) memory cost of loading all existing member IDs into Python.
- … UPDATE_QUERY constant.
How did you test this code?
- Regression test that simulates production's split DB aliases: passes with db_write for both querysets, fails with a db_read + db_write mix (reproduces the production ValueError).
Publish to changelog?
No
Docs update
N/A
LLM context
Re-land of #52819 with the cross-DB alias bug fixed. The key insight is that Django checks database aliases, not physical databases, when validating subqueries. Production uses separate persons_db_reader/persons_db_writer aliases, but tests use a single alias for both, which is why the original PR's tests passed.
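To make the alias-versus-physical-database point concrete, here is a toy model in plain Python. It is not Django's code: FakeQuerySet, exclude_subquery, dedup, the host and database names, and the "default" test alias are all hypothetical, and which model the original code pinned to db_read is an assumption. It only mimics the shape of the check, comparing alias strings and never looking at where they point.

```python
from dataclasses import dataclass

# Hypothetical settings: two aliases that resolve to the same physical database.
PRODUCTION_DATABASES = {
    "persons_db_reader": {"HOST": "pg.internal", "NAME": "persons"},
    "persons_db_writer": {"HOST": "pg.internal", "NAME": "persons"},
}


@dataclass
class FakeQuerySet:
    alias: str  # stands in for the alias a queryset is pinned to


def exclude_subquery(outer, inner):
    """Mimic the rule: a subquery must share the outer query's alias."""
    if outer.alias != inner.alias:
        raise ValueError("Subqueries aren't allowed across different databases")
    return f"NOT IN subquery on {outer.alias}"


def dedup(db_read, db_write, mix_aliases):
    # Assumption: the reverted code read persons via db_read and
    # existing members via db_write; the fix uses db_write for both.
    persons = FakeQuerySet(db_read if mix_aliases else db_write)
    existing = FakeQuerySet(db_write)
    return exclude_subquery(persons, existing)


def try_dedup(db_read, db_write, mix_aliases):
    try:
        return dedup(db_read, db_write, mix_aliases)
    except ValueError as exc:
        return f"ValueError: {exc}"


# Both aliases describe the same physical database...
same_physical_db = (
    PRODUCTION_DATABASES["persons_db_reader"]
    == PRODUCTION_DATABASES["persons_db_writer"]
)
# ...yet mixing them fails, because only the alias strings are compared.
prod_mixed = try_dedup("persons_db_reader", "persons_db_writer", mix_aliases=True)
prod_fixed = try_dedup("persons_db_reader", "persons_db_writer", mix_aliases=False)
# With a single alias for both (as in the test config), the mixed code passes.
test_mixed = try_dedup("default", "default", mix_aliases=True)
```

In this model the mixed-alias path only fails when the two aliases differ, which is why a test configuration with one shared alias cannot reproduce the production error.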