
Commit a20cf88

Add support for converting pyformat SQL to positional markers with bulk_parameters (#812)
* Add support for converting pyformat SQL to positional markers with bulk parameters
* Add support for dict bulk parameters and new tests
* Update docs with new changes of named parameters
* Apply suggestions from code review

  Co-authored-by: Mathias Fußenegger <mfussenegger@users.noreply.github.com>
* Apply suggestions from code review

  Co-authored-by: Bilal Tonga <bilaltonga@gmail.com>
* Update docs/query.rst

---------

Co-authored-by: Mathias Fußenegger <mfussenegger@users.noreply.github.com>
1 parent 17656c3 commit a20cf88

5 files changed

Lines changed: 181 additions & 17 deletions


CHANGES.rst

Lines changed: 8 additions & 0 deletions
@@ -2,6 +2,14 @@
 Changes for crate
 =================

+Unreleased
+================
+
+- Fixed ``cursor.execute()`` with ``bulk_parameters`` and pyformat SQL: when
+  rows are dicts, both the SQL template and the rows are now converted to
+  positional format before sending to CrateDB. Positional-list rows
+  continue to work as before.
+
 2026/06/04 2.2.0
 ==========
 - Added JSON serialization support for Python's ``datetime.time`` type,

docs/by-example/cursor.rst

Lines changed: 31 additions & 0 deletions
@@ -266,6 +266,37 @@ For completeness' sake the cursor description is updated nonetheless:
 >>> [ desc[0] for desc in cursor.description ]
 ['name', 'position']

+executemany() with named parameters
+====================================
+
+``executemany()`` also accepts a :class:`py:list` of :class:`py:dict` when
+the SQL statement contains ``%(name)s`` placeholders. The client converts both the SQL
+template and all rows to positional format before sending them to CrateDB:
+
+.. Hidden: set up mocked response
+
+>>> connection.client.set_next_response({
+... "results": [
+... {"rowcount": 1},
+... {"rowcount": 1}
+... ],
+... "duration": 123,
+... "cols": [],
+... })
+
+>>> cursor = connection.cursor()
+
+>>> cursor.executemany(
+... "INSERT INTO t (id, val) VALUES (%(id)s, %(val)s)",
+... [{"id": 1, "val": "foo"}, {"id": 2, "val": "bar"}])
+[{'rowcount': 1}, {'rowcount': 1}]
+
+>>> cursor.rowcount
+2
+
+>>> cursor.duration
+123
+
 >>> connection.client.set_next_response({
 ... "rows":[ [ "North West Ripple", 1 ], [ "Arkintoofle Minor", 3 ], [ "Alpha Centauri", 3 ] ],
 ... "cols":[ "name", "position" ],
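
The list returned by ``executemany()`` can be inspected row by row. The following is a small sketch, not client code: ``summarize_bulk_results`` is a hypothetical helper, and it relies only on what ``docs/query.rst`` in this commit states, namely that every result dictionary has a ``rowcount`` key, that ``-2`` signals an error, and that an ``error_message`` key may be present. The error text in the second call is made-up sample data.

```python
from typing import Any, Dict, List, Tuple

ERROR_ROWCOUNT = -2  # documented marker for a failed bulk row


def summarize_bulk_results(
    results: List[Dict[str, Any]],
) -> Tuple[int, List[Tuple[int, str]]]:
    """Return (rows inserted, [(row index, error message), ...])."""
    inserted = 0
    errors: List[Tuple[int, str]] = []
    for index, result in enumerate(results):
        rowcount = result["rowcount"]
        if rowcount == ERROR_ROWCOUNT:
            # error_message is optional, so fall back to an empty string
            errors.append((index, result.get("error_message", "")))
        elif rowcount > 0:
            inserted += rowcount
    return inserted, errors


# Shape taken from the doctest in this file's diff.
inserted, errors = summarize_bulk_results([{"rowcount": 1}, {"rowcount": 1}])

# Hypothetical failure on the second row (sample data, not real server output).
failed_inserted, failed_errors = summarize_bulk_results(
    [{"rowcount": 1}, {"rowcount": -2, "error_message": "duplicate key"}]
)
```

Keeping the row index next to each error makes it possible to map a failure back to the parameter row that caused it.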

docs/query.rst

Lines changed: 57 additions & 10 deletions
@@ -72,20 +72,19 @@ The same parameter name may appear multiple times in the query:
 ... "SELECT * FROM locations WHERE name = %(q)s OR kind = %(q)s",
 ... {"q": "Quasar"})

-The client converts the ``%(name)s`` placeholders to positional ``?`` markers
-before sending the query to CrateDB, so no server-side changes are required.
-
-.. NOTE::
-
-    Named parameters are not yet supported by ``executemany()``. Use
-    positional ``?`` placeholders with a :class:`py:list` of tuples for bulk
-    operations.
+The client converts the ``%(name)s`` placeholders to ``$N`` positional
+markers before sending the query to CrateDB.

 Bulk inserts
 ------------

 :ref:`Bulk inserts <crate-reference:http-bulk-ops>` are possible with the
-``executemany()`` method, which takes a :class:`py:list` of tuples to insert:
+``executemany()`` method.
+
+Positional parameters
+.....................
+
+Pass a :class:`py:list` of sequences using ``?`` placeholders:

 >>> cursor.executemany(
 ... "INSERT INTO locations (name, date, kind, position) VALUES (?, ?, ?, ?)",
@@ -94,10 +93,58 @@ Bulk inserts
 [{'rowcount': 1}, {'rowcount': 1}]

 The ``executemany()`` method returns a result :class:`dictionary <py:dict>`
-for every tuple. This dictionary always has a ``rowcount`` key, indicating
+for every row. This dictionary always has a ``rowcount`` key, indicating
 how many rows were inserted. If an error occurs, the ``rowcount`` value is
 ``-2``, and the dictionary may additionally have an ``error_message`` key.

+Named parameters
+................
+
+``executemany()`` also accepts a :class:`py:list` of :class:`py:dict` using
+``%(name)s`` placeholders. The client converts both the SQL template and all
+rows to positional format before sending to CrateDB:
+
+>>> cursor.executemany(
+... "INSERT INTO locations (name, date, kind, position) "
+... "VALUES (%(name)s, %(date)s, %(kind)s, %(pos)s)",
+... [{"name": "Cloverleaf", "date": "2007-03-11", "kind": "Quasar", "pos": 7},
+... {"name": "Old Faithful", "date": "2007-03-11", "kind": "Quasar", "pos": 8}])
+[{'rowcount': 1}, {'rowcount': 1}]
+
+Using ``bulk_parameters`` directly
+...................................
+
+``execute()`` accepts a ``bulk_parameters`` keyword argument directly:
+
+.. NOTE::
+    Please prefer ``executemany()`` for bulk inserts, it is the standard DB-API 2.0
+    interface. The ``bulk_parameters`` argument is a lower-level alternative.
+
+>>> cursor.execute(
+... "INSERT INTO locations (name, kind, position) VALUES (?, ?, ?)",
+... bulk_parameters=[('Cloverleaf', 'Quasar', 7),
+... ('Old Faithful', 'Quasar', 8)])
+
+Named ``%(name)s`` placeholders are also supported. When the rows are
+:class:`py:dict` objects the SQL template and rows are fully converted,
+identical to the ``executemany()`` path:
+
+>>> cursor.execute(
+... "INSERT INTO locations (name, kind, position) "
+... "VALUES (%(name)s, %(kind)s, %(pos)s)",
+... bulk_parameters=[{"name": "Cloverleaf", "kind": "Quasar", "pos": 7},
+... {"name": "Old Faithful", "kind": "Quasar", "pos": 8}])
+
+When the rows are already positional lists (e.g. data coming from a
+DataFrame), only the SQL template is rewritten. In this case the caller must
+ensure the value order in each row matches the placeholder order in the SQL:
+
+>>> cursor.execute(
+... "INSERT INTO locations (name, kind, position) "
+... "VALUES (%(name)s, %(kind)s, %(pos)s)",
+... bulk_parameters=[['Cloverleaf', 'Quasar', 7],
+... ['Old Faithful', 'Quasar', 8]])
+
 .. _selects:

 Selecting data
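
For the positional-list case documented in this file's diff, the caller is responsible for keeping each row in placeholder order. One way to reduce ordering mistakes with column-oriented data is to derive the order from the SQL itself. The sketch below is an illustration only: ``reorder_rows`` is a hypothetical helper, not part of the crate client, and the regular expression is the ``%(name)s`` pattern from ``cursor.py`` in this commit. It assumes every placeholder name matches a column name and emits one value per placeholder occurrence.

```python
import re
from typing import Any, List, Sequence

# Same %(name)s pattern as _NAMED_PARAM_RE in the diff.
NAMED_PARAM_RE = re.compile(r"%\(([^)]+)\)s")


def reorder_rows(
    sql: str, columns: Sequence[str], rows: Sequence[Sequence[Any]]
) -> List[List[Any]]:
    """Reorder positional rows so values follow the SQL's placeholder order."""
    names = NAMED_PARAM_RE.findall(sql)  # placeholder names, in SQL order
    index_of = {column: i for i, column in enumerate(columns)}
    # Raises KeyError if a placeholder has no matching column.
    picks = [index_of[name] for name in names]
    return [[row[i] for i in picks] for row in rows]


sql = (
    "INSERT INTO locations (name, kind, position) "
    "VALUES (%(name)s, %(kind)s, %(pos)s)"
)
# Column order as it might come from a tabular source; differs from the SQL.
columns = ["pos", "name", "kind"]
rows = [[7, "Cloverleaf", "Quasar"], [8, "Old Faithful", "Quasar"]]

ordered = reorder_rows(sql, columns, rows)
```

The ``ordered`` list could then be passed as ``bulk_parameters`` together with the same SQL. When the rows are already dicts, the dict path added by this commit makes this step unnecessary.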

src/crate/client/cursor.py

Lines changed: 20 additions & 6 deletions
@@ -22,20 +22,27 @@
 import typing as t
 import warnings
 from datetime import datetime, timedelta, timezone
+from itertools import count

 from .converter import Converter, DataType
 from .exceptions import ProgrammingError

 _NAMED_PARAM_RE = re.compile(r"%\(([^)]+)\)s")


+def _rewrite_pyformat_sql(sql: str) -> str:
+    """Replace %(name)s placeholders with $N positional markers (1-indexed)."""
+    counter = count(1)
+    return _NAMED_PARAM_RE.sub(lambda _: f"${next(counter)}", sql)
+
+
 def _convert_named_to_positional(
     sql: str, params: t.Dict[str, t.Any]
 ) -> t.Tuple[str, t.List[t.Any]]:
-    """Convert pyformat-style named parameters to positional qmark parameters.
+    """Convert pyformat-style named parameters to positional parameters.

-    Converts ``%(name)s`` placeholders to ``?`` and returns an ordered list
-    of corresponding values extracted from ``params``.
+    Converts ``%(name)s`` placeholders to ``$N`` (1-indexed) and returns an
+    ordered list of corresponding values extracted from ``params``.

     The same name may appear multiple times; each occurrence appends the
     value to the positional list independently.
@@ -47,7 +54,7 @@ def _convert_named_to_positional(

         sql = "SELECT * FROM t WHERE a = %(a)s AND b = %(b)s"
         params = {"a": 1, "b": 2}
-        # returns: ("SELECT * FROM t WHERE a = ? AND b = ?", [1, 2])
+        # returns: ("SELECT * FROM t WHERE a = $1 AND b = $2", [1, 2])
     """
     positions = {}
     idx = 1
@@ -91,8 +98,8 @@ def _convert_named_bulk_params(
     for row in seq_of_dicts:
         if not isinstance(row, dict):
             raise ProgrammingError(
-                "executemany() requires all parameter rows to be dicts "
-                "when the SQL uses pyformat (%(name)s) placeholders"
+                "All bulk parameter rows must be dicts when SQL uses "
+                "pyformat (%(name)s) placeholders; got a non-dict row"
             )
         positional: t.List[t.Any] = [None] * n
         for name, pos in positions.items():
@@ -136,6 +143,13 @@ def execute(self, sql, parameters=None, bulk_parameters=None):

         if isinstance(parameters, dict):
             sql, parameters = _convert_named_to_positional(sql, parameters)
+        elif bulk_parameters is not None and _NAMED_PARAM_RE.search(sql):
+            if bulk_parameters and isinstance(bulk_parameters[0], dict):
+                sql, bulk_parameters = _convert_named_bulk_params(
+                    sql, bulk_parameters
+                )
+            else:
+                sql = _rewrite_pyformat_sql(sql)

         self._result = self.connection.client.sql(
             sql, parameters, bulk_parameters
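
A standalone sketch of the ``$N`` rewrite from this diff, for trying it outside the client. The regular expression and the ``count(1)`` counter are taken from the diff; the name ``rewrite_pyformat_sql`` (without the leading underscore) is a stand-in and not a public API of the library. One consequence of the code as shown: the replacement callback ignores the matched name, so a name that appears twice receives two different markers, and positional rows then need one value per occurrence.

```python
import re
from itertools import count

# Same pattern as in the diff: matches %(name)s and captures the name.
NAMED_PARAM_RE = re.compile(r"%\(([^)]+)\)s")


def rewrite_pyformat_sql(sql: str) -> str:
    """Replace each %(name)s occurrence with the next $N marker (1-indexed)."""
    counter = count(1)
    # The match object is unused: numbering depends only on occurrence order.
    return NAMED_PARAM_RE.sub(lambda _match: f"${next(counter)}", sql)


insert_sql = rewrite_pyformat_sql(
    "INSERT INTO t (id, val) VALUES (%(id)s, %(val)s)"
)
print(insert_sql)  # INSERT INTO t (id, val) VALUES ($1, $2)

# A repeated name is numbered per occurrence, not per name.
repeated_sql = rewrite_pyformat_sql(
    "SELECT * FROM locations WHERE name = %(q)s OR kind = %(q)s"
)
print(repeated_sql)  # SELECT * FROM locations WHERE name = $1 OR kind = $2
```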

tests/client/test_cursor.py

Lines changed: 65 additions & 1 deletion
@@ -203,7 +203,9 @@ def test_executemany_with_mixed_param_types(mocked_connection):
     parameter sequence mixes dicts and non-dicts while the SQL uses pyformat.
     """
     cursor = mocked_connection.cursor()
-    with pytest.raises(ProgrammingError, match="requires all parameter rows"):
+    with pytest.raises(
+        ProgrammingError, match="All bulk parameter rows must be dicts"
+    ):
         cursor.executemany(
             "INSERT INTO characters (name) VALUES (%(name)s)",
             [{"name": "Arthur"}, ["Trillian"]],  # second row is a list
@@ -329,6 +331,68 @@ def test_execute_with_bulk_args(mocked_connection):
     mocked_connection.client.sql.assert_called_once_with(statement, None, [[1]])


+def test_execute_with_pyformat_sql_and_bulk_parameters(mocked_connection):
+    """
+    cursor.execute() converts %(name)s SQL to $N when bulk_parameters is
+    provided. Rows are already positional; only the SQL needs conversion.
+    """
+    cursor = mocked_connection.cursor()
+    sql = "INSERT INTO t (id, val) VALUES (%(id)s, %(val)s)"
+    bulk = [[1, "hello"], [2, "world"]]
+    cursor.execute(sql, bulk_parameters=bulk)
+    mocked_connection.client.sql.assert_called_once_with(
+        "INSERT INTO t (id, val) VALUES ($1, $2)", None, bulk
+    )
+
+
+def test_execute_with_pyformat_sql_and_dict_bulk_parameters(mocked_connection):
+    """
+    cursor.execute() with pyformat SQL and dict-format bulk_parameters converts
+    both the SQL template (%(x)s → $N) and the rows (dicts → positional lists).
+    """
+    cursor = mocked_connection.cursor()
+    sql = "INSERT INTO t (id, val) VALUES (%(id)s, %(val)s)"
+    bulk = [{"id": 1, "val": "hello"}, {"id": 2, "val": "world"}]
+    cursor.execute(sql, bulk_parameters=bulk)
+    mocked_connection.client.sql.assert_called_once_with(
+        "INSERT INTO t (id, val) VALUES ($1, $2)",
+        None,
+        [[1, "hello"], [2, "world"]],
+    )
+
+
+def test_execute_with_dict_bulk_parameters_mixed_types_raises(
+    mocked_connection,
+):
+    """
+    cursor.execute() raises ProgrammingError when bulk_parameters mixes
+    dict and non-dict rows with pyformat SQL.
+    """
+    cursor = mocked_connection.cursor()
+    with pytest.raises(
+        ProgrammingError, match="All bulk parameter rows must be dicts"
+    ):
+        cursor.execute(
+            "INSERT INTO t (id) VALUES (%(id)s)",
+            bulk_parameters=[{"id": 1}, [2]],
+        )
+    mocked_connection.client.sql.assert_not_called()
+
+
+def test_execute_with_pyformat_sql_and_bulk_parameters_no_placeholders(
+    mocked_connection,
+):
+    """
+    SQL without %(name)s placeholders is passed through unchanged
+    even when bulk_parameters is provided.
+    """
+    cursor = mocked_connection.cursor()
+    sql = "INSERT INTO t (id, val) VALUES (?, ?)"
+    bulk = [[1, "hello"], [2, "world"]]
+    cursor.execute(sql, bulk_parameters=bulk)
+    mocked_connection.client.sql.assert_called_once_with(sql, None, bulk)
+
+
 def test_execute_custom_converter(mocked_connection):
     """
     Verify that a custom converter is correctly applied when passed to a cursor.
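
The four scenarios covered by the new tests can be reproduced without the client or a mocked connection. The sketch below is a simplified stand-in for the branch added to ``execute()``, not the library's implementation: ``prepare_bulk`` is a hypothetical function, it raises ``TypeError`` where the client raises ``ProgrammingError``, and in the dict path it numbers repeated names per occurrence, which is an assumption (the tests shown do not exercise repeated names, and the real ``_convert_named_bulk_params`` may behave differently).

```python
import re
from itertools import count
from typing import Any, List, Optional, Sequence, Tuple

NAMED_PARAM_RE = re.compile(r"%\(([^)]+)\)s")


def prepare_bulk(
    sql: str, bulk_parameters: Optional[Sequence[Any]]
) -> Tuple[str, Optional[Sequence[Any]]]:
    """Return the (sql, bulk_parameters) pair that would be sent on."""
    if bulk_parameters is None or not NAMED_PARAM_RE.search(sql):
        return sql, bulk_parameters  # no pyformat placeholders: pass through

    names = NAMED_PARAM_RE.findall(sql)
    counter = count(1)
    rewritten = NAMED_PARAM_RE.sub(lambda _m: f"${next(counter)}", sql)

    if bulk_parameters and isinstance(bulk_parameters[0], dict):
        rows: List[List[Any]] = []
        for row in bulk_parameters:
            if not isinstance(row, dict):
                # Stand-in for the client's ProgrammingError.
                raise TypeError("all bulk parameter rows must be dicts")
            rows.append([row[name] for name in names])
        return rewritten, rows

    # Rows are already positional: only the SQL template changes.
    return rewritten, bulk_parameters


pyformat_sql = "INSERT INTO t (id, val) VALUES (%(id)s, %(val)s)"
positional_sql = "INSERT INTO t (id, val) VALUES ($1, $2)"

list_case = prepare_bulk(pyformat_sql, [[1, "hello"], [2, "world"]])
dict_case = prepare_bulk(
    pyformat_sql, [{"id": 1, "val": "hello"}, {"id": 2, "val": "world"}]
)
qmark_case = prepare_bulk("INSERT INTO t (id, val) VALUES (?, ?)", [[1, "hello"]])
```

Checking only the first row to choose the path, as the diff does, keeps the common case cheap; mixed input is then caught while the rows are converted.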
