Skip to content

pghistory.middleware.HistoryMiddleware is incompatible with django.db.connection.cursor().executemany(sql, batch) for UPSERTs #258

Description

Issue Description

Howdy!

We ran into an issue where we were getting a query parameters should be a sequence or a mapping, got str exception from Django's executemany. We dug into the issue and found that when executemany was run in an http request context with HistoryMiddleware running, _inject_history_context injects the context UUID and JSON metadata before the passed array of parameters, which causes executemany's param_list[0] to resolve to a string, when it expects an iterable or mapping.

Example

Let's take this setup as an example. This code would be running in an http request scope that has the HistoryMiddleware enabled:

class VehicleCarManufacturer(Model):
    name = CharField(max_length=128)

class Vehicle(Model):
    year = IntegerField()
    make = ForeignKey(VehicleManufacturer, related_name="notification_records", on_delete=models.CASCADE)
    model = CharField(max_length=128)
    colours = ArrayField(CharField(max_length=128))

def upsert_colours_for_year(vehicle_to_colours: dict[Vehicle, list[str]], model_year: int):
    sql = """
        INSERT INTO vehicle (
            year,
            make_id,
            model,
            colours,
        ) VALUES (%s, %s, %s, %s)
        ON CONFLICT (year, make_id, model)
        DO UPDATE SET
            colours = EXCLUDED.colours
    """

    rows = []

    for vehicle, colours in vehicle_to_colours.items():
        rows.append(
            (
                year, 
                vehicle.make_id, 
                vehicle.model, 
                colours,
            )
        )

    with connection.cursor() as cursor:
    for batch in itertools.batched(rows, 500):
        cursor.executemany(sql, batch)

If our batch of rows ends up being:

batch = (
    (2025, 23, "Wrangler", ["Black", "Red", "Olive"]),
    (2026, 15, "F-150", ["Black", "Red", "Grey"]),
)

When we get into _inject_history_context, we'd end up with a context_params that looks like:

context_params = {
    'pghistory__context_id': '80f1506e-f375-4e7f-a896-de72093hd08ed', 
    'pghistory__context_metadata': {
        "user": 2309754, 
        "url": "/update/colour/inventory"
    }
}

Then when the params are updated, the context id and user are pre-pended to our tuple of tuples as strings:

>>> params = (*context_params.values(), *(params or ()))
>>> params
(
    '80f1506e-f375-4e7f-a896-de72093hd08ed', 
    '{"user": 2309754, "url": "/update/colour/inventory"}', 
    (2025, 23, "Wrangler", ["Black", "Red", "Olive"]),
    (2026, 15, "F-150", ["Black", "Red", "Grey"]),
)

So when the _execute_wrapper call runs execute_many, the first iteration (batch[0]) gets '80f1506e-f375-4e7f-a896-de72093hd08ed' when it expects a 4-tuple.

Versioning

Django~=4.2.0
django-pghistory==3.9.2
django-pgtrigger==4.17.0
psycopg[binary,pool]~=3.0

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions