Conversation
32a9b0e to
824ea57
Compare
There was a problem hiding this comment.
Pull request overview
Adds experimental JSON type support to the clickhouse-cpp client by introducing a ColumnJSON that represents JSON values as strings in the Native protocol (requiring output_format_native_write_json_as_string=1 when reading).
Changes:
- Add
Type::JSON, parsing support, andColumnJSONimplementation (string-backed with a JSON serialization prefix check). - Wire JSON into factory/client headers and extend
ItemView/printing support. - Add unit tests and generators covering JSON columns (including arrays) and roundtrip behavior.
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
clickhouse/types/types.h |
Adds Type::JSON and CreateJSON() API. |
clickhouse/types/types.cpp |
Maps JSON type name/unique-id behavior and implements CreateJSON(). |
clickhouse/types/type_parser.cpp |
Enables parsing "JSON" into Type::JSON. |
clickhouse/columns/json.h |
Introduces ColumnJSON and a ColumnNullableT<ColumnJSON> null-handling specialization. |
clickhouse/columns/json.cpp |
Implements JSON column serialization/deserialization using string encoding (versioned prefix). |
clickhouse/columns/itemview.cpp |
Treats JSON as a variable-size (string-like) ItemView payload. |
clickhouse/columns/factory.cpp |
Allows creating ColumnJSON from parsed terminal type. |
clickhouse/client.h |
Exposes JSON column to users by including the header. |
clickhouse/CMakeLists.txt |
Adds JSON sources/headers to build inputs. |
README.md |
Documents experimental JSON support and the required read setting. |
ut/value_generators.h / ut/value_generators.cpp |
Adds JSON sample value generator for tests. |
ut/columns_ut.cpp |
Adds ColumnJSON init/append unit tests. |
ut/column_array_ut.cpp |
Adds ColumnArrayT<ColumnJSON> unit test. |
ut/itemview_ut.cpp |
Adds ItemView storage tests for JSON type. |
ut/utils.cpp / ut/utils_ut.cpp |
Adds printing/serialization coverage for JSON. |
ut/type_parser_ut.cpp |
Adds JSON parsing unit test. |
ut/types_ut.cpp |
Extends type equality test coverage to include JSON. |
ut/roundtrip_column.cpp |
Modifies server roundtrip helper to read JSON as string via settings. |
ut/Column_ut.cpp |
Adds JSON to the generic server roundtrip test matrix. |
ut/CreateColumnByType_ut.cpp |
Includes JSON header (supports compilation with JSON type). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| GenericColumnTestCase<ColumnString, &makeColumn<ColumnString>, std::string, &MakeStrings>, | ||
| GenericColumnTestCase<ColumnFixedString, &makeColumn<ColumnFixedString, 12>, std::string, &MakeFixedStrings<12>>, | ||
| GenericColumnTestCase<ColumnJSON, &makeColumn<ColumnJSON>, std::string, &MakeJSONs>, |
There was a problem hiding this comment.
ColumnJSON is added to the generic server roundtrip test matrix, but CheckIfShouldSkipTest() doesn't currently have a capability/version gate for JSON. This will cause failures against ClickHouse versions without the JSON type (again, .travis.yml uses 21.3.x). Consider adding a skip similar to Date32/Int128, either via a known minimum server version for JSON or by probing server support (e.g. attempt a trivial SELECT CAST('{}' AS JSON) and skip on error).
| GenericColumnTestCase<ColumnJSON, &makeColumn<ColumnJSON>, std::string, &MakeJSONs>, |
| * Works only when ClickHouse outputs JSON as strings and requires the setting | ||
| * output_format_native_write_json_as_string to be set to 1. |
There was a problem hiding this comment.
The header comment says ColumnJSON "requires" output_format_native_write_json_as_string=1, but this setting only affects reading (server output) and the PR description notes inserts don't require it. Consider rewording the comment to clarify when the setting is needed (e.g., required for decoding JSON coming from the server via Native protocol) to avoid misleading users.
| * Works only when ClickHouse outputs JSON as strings and requires the setting | |
| * output_format_native_write_json_as_string to be set to 1. | |
| * When decoding JSON returned by ClickHouse over the Native protocol, this | |
| * requires ClickHouse to output JSON as strings by setting | |
| * output_format_native_write_json_as_string to 1. |
| std::string query = "SELECT col FROM temporary_roundtrip_table ORDER BY id " | ||
| "SETTINGS output_format_native_write_json_as_string=1"; |
There was a problem hiding this comment.
RoundtripColumnValues() now unconditionally adds SETTINGS output_format_native_write_json_as_string=1 to every SELECT. This breaks the unit tests on older ClickHouse versions that don't recognize this setting (e.g. the repo's .travis.yml pins 21.3.x), even for non-JSON columns. Consider only adding this setting when the tested column type contains JSON (or otherwise gating it based on server capability), and keep the original query for all other types.
| std::string query = "SELECT col FROM temporary_roundtrip_table ORDER BY id " | |
| "SETTINGS output_format_native_write_json_as_string=1"; | |
| std::string query = "SELECT col FROM temporary_roundtrip_table ORDER BY id"; | |
| if (type_name.find("JSON") != std::string::npos) { | |
| query += " SETTINGS output_format_native_write_json_as_string=1"; | |
| } |
| columns/factory.h | ||
| columns/geo.h | ||
| columns/ip4.h | ||
| columns/ip6.h | ||
| columns/json.h | ||
| columns/itemview.h | ||
| columns/lowcardinality.h |
There was a problem hiding this comment.
columns/json.h is added to the library sources/headers list, and client.h now includes it, but clickhouse/CMakeLists.txt's install section doesn't install columns/json.h. This will break consumers using make install (installed <clickhouse/client.h> will include a header that isn't installed). Please add INSTALL(FILES columns/json.h DESTINATION include/clickhouse/columns/) alongside the other column headers.
Experimental support for the JSON data type
Special thanks to @theory for the initial research.
Warning
This is an experimental implementation, and the API may change as we continue working toward full support for JSON columns..
This PR adds basic support for the JSON data type, but only when the data is stringified using
the
output_format_native_write_json_as_stringsetting. Inserting JSON data does not require this setting.Caution
ClickHouse validates JSON syntax even when values are passed as strings. Empty strings are not allowed and will cause a server error; use an empty object (
{}) instead. This is especially important for nullable columns, where the server still expects a value for NULL cells.Examples
Reading JSON values
Inserting simple non-nullable columns
Inserting Nullable columns using ColumnNullableT