Commit 7412cf6

Merge remote-tracking branch 'origin/main' into locadex/parallel/t9n-main-mtlkv0dh97mm5xszdcmdetec
2 parents 23026b8 + 9a7b67c commit 7412cf6

7 files changed

Lines changed: 309 additions & 0 deletions

docs/integrations/index.mdx

Lines changed: 4 additions & 0 deletions
@@ -20,6 +20,10 @@ ClickHouse integrations are organized by their support level:
2020
| Partner integrations | Built or maintained, and supported by, third-party software vendors |
2121
| Community integrations | Built or maintained and supported by community members. No direct support is available besides the public GitHub repositories and community Slack channels |
2222

23+
:::tip[Building a ClickHouse integration?]
24+
If you are building a product that connects to ClickHouse, start with the [Integration development](/integrations/integration-development) guides: [building integrations](/integrations/integration-development/building-integrations), [testing](/integrations/integration-development/testing-your-integration), and [documenting your product](/integrations/integration-development/documenting-your-integration). Register your integration in the [partner portal](https://clickhouse.com/partners) when you're ready to submit.
25+
:::
26+
2327
<IntegrationGrid />
2428

2529
:::note Notice
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
position: 500
2+
label: 'Integration development'
3+
collapsible: true
4+
collapsed: true
5+
link:
6+
type: doc
7+
id: integrations/integration-development/index
Lines changed: 85 additions & 0 deletions
@@ -0,0 +1,85 @@
1+
---
2+
slug: /integrations/integration-development/building-integrations
3+
title: 'Building integrations with ClickHouse'
4+
sidebar_label: 'Building integrations'
5+
sidebar_position: 2
6+
description: 'Orientation on ingestion, consumption, wire protocols, and client conventions for ClickHouse integrations.'
7+
keywords: ['partner', 'integration', 'ingestion', 'consumption', 'ClickPipes', 'language clients', 'user-agent']
8+
doc_type: 'guide'
9+
---
10+
11+
# Building integrations with ClickHouse
12+
13+
This page orients you to the integration surface so you can scope ingestion and consumption work. For validation and publishing, continue with [Testing your integration](/integrations/integration-development/testing-your-integration) and [Documenting your integration](/integrations/integration-development/documenting-your-integration).
14+
15+
## Ingestion {#ingestion}
16+
17+
Two paths bring data into ClickHouse. Choose based on whether your product should own the ingestion plane or delegate it.
18+
19+
### Path A: ClickPipes (managed, ClickHouse Cloud only) {#path-a-clickpipes}
20+
21+
If you prefer not to build and operate ingestion infrastructure, [ClickPipes](/integrations/clickpipes) is the managed service that pulls from your customer's sources into their ClickHouse Cloud service. ClickPipes handles scaling, parallelization, retries, and lag reporting.
22+
23+
Supported sources today include:
24+
25+
- **Streaming:** Apache Kafka (including MSK, Confluent Cloud, Redpanda, Azure Event Hubs, WarpStream), Amazon Kinesis
26+
- **Object storage:** Amazon S3 (and S3-compatible stores), Google Cloud Storage, Azure Blob Storage
27+
- **CDC:** PostgreSQL, MySQL, MongoDB, BigQuery
28+
29+
### Path B: Self-driven ingestion via an official language client {#path-b-language-client}
30+
31+
If you own the pipeline, use one of the [official language clients](/integrations/language-clients). They handle serialization, batching, TLS, compression, and connection pooling. You pass runtime primitives; the client handles the wire format.
32+
33+
- Official clients: Python, Go, Java, JavaScript, Rust, C#, C++
34+
- Wire protocols: HTTP (all clients) and native TCP (Go and C++ clients only)
35+
- Auth: username and password over TLS by default; mTLS and SSL client-certificate auth are supported by all major clients
36+
- Data format is usually an implementation detail. Clients convert runtime types to ClickHouse Native or RowBinary format. If you already produce Arrow, Parquet, JSONEachRow, or another format, most clients expose a raw-bytes API for pre-serialized data
37+
- For throughput, batch **10,000–100,000 rows** per insert and keep synchronous inserts to roughly **one insert per second** at most. If client-side batching is impractical, use [asynchronous inserts](/optimize/asynchronous-inserts) to shift batching to the server
38+
39+
See also: [Bulk inserts](/optimize/bulk-inserts).
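
The batching guidance above can be sketched as a small buffer that flushes on row count or elapsed time. This is a minimal sketch under stated assumptions, not part of any official client: `BatchBuffer`, `flush_fn`, and the default thresholds are hypothetical names chosen for illustration, and `flush_fn` stands in for whatever insert call your language client provides.

```python
import time
from typing import Callable, List, Sequence


class BatchBuffer:
    """Accumulates rows and hands them to `flush_fn` one batch at a time.

    Hypothetical helper for illustration; `flush_fn` stands in for a
    real client insert call.
    """

    def __init__(
        self,
        flush_fn: Callable[[List[Sequence]], None],
        max_rows: int = 10_000,
        max_interval_s: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush_fn = flush_fn
        self._max_rows = max_rows
        self._max_interval_s = max_interval_s
        self._clock = clock
        self._rows: List[Sequence] = []
        self._last_flush = clock()

    def add(self, row: Sequence) -> None:
        """Buffer one row; flush when the size or time threshold is reached."""
        self._rows.append(row)
        too_big = len(self._rows) >= self._max_rows
        too_old = self._clock() - self._last_flush >= self._max_interval_s
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        """Send buffered rows as a single batch, then reset the buffer."""
        if self._rows:
            batch, self._rows = self._rows, []
            self._flush_fn(batch)
        self._last_flush = self._clock()
```

The time check only runs when a row arrives, so call `flush()` on shutdown to drain the remainder. Under heavy load the row threshold can trigger more than one flush per second; raise `max_rows` if that matters for your workload.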
40+
41+
## Consumption {#consumption}
42+
43+
HTTP and native TCP both carry queries. Native is binary and lower overhead. HTTP works through load balancers and proxies. Both are first-class; pick based on infrastructure, not feature gaps.
44+
45+
- **Application code:** use the same [official language clients](/integrations/language-clients) as for ingestion
46+
- **BI and SQL tools:** ClickHouse ships an official [JDBC v2 driver](/integrations/java) (Java) and an [ODBC driver](/interfaces/odbc). Tableau, Looker, Power BI, Metabase, Apache Superset, and Grafana integrate via these drivers or dedicated connectors maintained by ClickHouse and partners
47+
- **Result format:** clients typically own serialization. You can request Arrow, Parquet, or other columnar formats on the wire if your product needs them
48+
49+
### Result-set sizing {#result-set-sizing}
50+
51+
Most analytical queries return small result sets (aggregates, summaries, top-N), so the wire is rarely the bottleneck. However, ClickHouse tables can hold billions of rows, and an unbounded `SELECT *` over a large fact table can move terabytes. **Shape the request in your application:** use `LIMIT`, pagination, streaming reads, and explicit column lists. If you build user-facing analytics, treat unbounded result sets as a UX problem, not a transport problem.
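
One way to shape requests is keyset pagination with an explicit column list. The sketch below is illustrative only: `build_page_query` is a hypothetical helper, the `UInt64` cursor type is an assumption about your ordering key, and how the returned parameters are passed to the server depends on the client you use. It relies on ClickHouse's `{name:Type}` query-parameter syntax so the cursor value is never interpolated into the SQL text.

```python
import re
from typing import Dict, Optional, Sequence, Tuple

# Conservative identifier check: `name` or `db.name`, no quoting support.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def _check_ident(name: str) -> str:
    if not _IDENT.match(name):
        raise ValueError(f"unexpected identifier: {name!r}")
    return name


def build_page_query(
    table: str,
    columns: Sequence[str],
    order_key: str,
    page_size: int,
    after: Optional[int] = None,
) -> Tuple[str, Dict[str, int]]:
    """Return a bounded SELECT and its parameters for one page of results."""
    if not columns:
        raise ValueError("list columns explicitly instead of using SELECT *")
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    cols = ", ".join(_check_ident(c) for c in columns)
    key = _check_ident(order_key)
    params: Dict[str, int] = {}
    where = ""
    if after is not None:
        # The cursor travels as a query parameter, not as SQL text.
        where = f" WHERE {key} > {{after:UInt64}}"
        params["after"] = after
    sql = f"SELECT {cols} FROM {_check_ident(table)}{where} ORDER BY {key} LIMIT {page_size}"
    return sql, params
```

To fetch the next page, pass the last value of the ordering key from the previous page as `after`. Keyset pagination avoids the growing cost of large `OFFSET` values, but it assumes the ordering key is unique and monotonic within the result.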
52+
53+
ClickHouse has a rich type system: arrays, tuples, maps, JSON, nested, LowCardinality, and more. Official clients map these to idiomatic language types. If your product surfaces ClickHouse data to end users, plan a type-mapping strategy early.
54+
55+
## Next steps {#next-steps}
56+
57+
Pick a path and prototype against a [ClickHouse Cloud trial](https://clickhouse.com/cloud), then register your integration in the [partner portal](https://clickhouse.com/partners).
58+
59+
## User-agent string convention {#user-agent-string-convention}
60+
61+
HTTP clients should set a `User-Agent` string that identifies your integration. ClickHouse parses this server-side to track adoption, surface usage telemetry, and inform the roadmap.
62+
63+
Format:
64+
65+
```text
66+
<app_name>/<app_version> <client_name>/<client_version> (<comment>; <key1>: <value1>; <key2>: <value2>)
67+
```
68+
69+
Examples:
70+
71+
- `clickhouse-java/0.8.0`
72+
- `my-analytics-app/3.1.2 clickhouse-js/1.2.0 (env: staging; region: us-east-1; lv: node/20.10)`
73+
74+
Rules:
75+
76+
- No whitespace in client name or version
77+
- If you include a comment, it must come first inside the parentheses, before any key-value pairs
78+
- Standard metadata keys: `lv` (language or framework version), `os`, `arch`
79+
- Native TCP protocol clients report client name and version via protocol fields, not `User-Agent`
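
The format and rules above can be sketched as a small builder. This is an illustration, not an official API: `build_user_agent` is a hypothetical helper, and applying the no-whitespace rule to the application name as well as the client name is an assumption. Official clients typically add their own `<client_name>/<client_version>` token, so check your client's documentation before composing the full string yourself.

```python
from typing import Mapping, Optional


def _product(name: str, version: str) -> str:
    """Format one `name/version` token, rejecting empty or spaced values."""
    if not name or not version or any(ch.isspace() for ch in name + version):
        raise ValueError(f"invalid product token: {name!r}/{version!r}")
    return f"{name}/{version}"


def build_user_agent(
    app_name: str,
    app_version: str,
    client_name: str,
    client_version: str,
    comment: Optional[str] = None,
    metadata: Optional[Mapping[str, str]] = None,
) -> str:
    """Compose a User-Agent string in the documented format."""
    parts = [_product(app_name, app_version), _product(client_name, client_version)]
    details = []
    if comment:
        details.append(comment)  # the comment precedes key-value pairs
    details.extend(f"{key}: {value}" for key, value in (metadata or {}).items())
    if details:
        parts.append("(" + "; ".join(details) + ")")
    return " ".join(parts)
```

For example, calling `build_user_agent("my-analytics-app", "3.1.2", "clickhouse-js", "1.2.0", metadata={"env": "staging", "region": "us-east-1", "lv": "node/20.10"})` reproduces the second example string listed above.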
80+
81+
If you use JDBC, see [client identification](/integrations/language-clients/java/jdbc#client-identification) for how the driver sets `User-Agent` and related fields.
82+
83+
## Sandbox and trial access {#sandbox-and-trial-access}
84+
85+
[ClickHouse Cloud](https://clickhouse.com/cloud) offers a free trial for development and integration validation. If you are a House Mate partner, you can request additional development credits through the [partner portal](https://clickhouse.com/partners).
Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
1+
---
2+
slug: /integrations/integration-development/documenting-your-integration
3+
sidebar_label: 'Documenting your integration'
4+
sidebar_position: 4
5+
title: 'Documenting your ClickHouse integration'
6+
description: 'How to contribute integration pages to clickhouse-docs, including required sections and a copy-paste skeleton.'
7+
keywords: ['partner', 'integration', 'documentation', 'contributing', 'pull request', 'integration docs']
8+
doc_type: 'guide'
9+
---
10+
11+
# Documenting your ClickHouse integration
12+
13+
Integration documentation on this site gives end users one place to scope and troubleshoot setups. This page describes what to include, where files go, and how to open a pull request.
14+
15+
Start with [Building integrations](/integrations/integration-development/building-integrations) and [Testing your integration](/integrations/integration-development/testing-your-integration) if you have not already.
16+
17+
## Where docs live {#where-docs-live}
18+
19+
- **Repository:** [`ClickHouse/clickhouse-docs`](https://github.com/ClickHouse/clickhouse-docs)
20+
- **Format:** Markdown, built with Docusaurus
21+
- **Location:** `/docs/integrations/<category>/<your-integration>/`, where `<category>` reflects what your product does (`data-visualization`, `data-ingestion`, `language-clients`, and so on)
22+
- **Process:** open a pull request against `main`. The ClickHouse integrations team reviews. First-time contributors sign the Contributor License Agreement when the bot prompts on the PR
23+
24+
Integration pages in this repository are the primary reference for end users. You can link to supplementary documentation on your site from your integration page for product-specific details.
25+
26+
Good exemplars: [Tableau](/integrations/tableau) and [Metabase](/integrations/metabase).
27+
28+
## Choosing a category {#choosing-a-category}
29+
30+
Pick the category that best matches what your product does. Browse existing categories under [Integrations](/integrations) before you open a PR. If you are unsure, note your proposed category in the PR description and the integrations team will help place the page.
31+
32+
## Required sections {#required-sections}
33+
34+
Every integration page should cover the following, ideally in this order:
35+
36+
1. **Purpose.** What problem the integration solves, in two or three sentences. Avoid marketing copy. Readers are usually engineers scoping a setup
37+
2. **Prerequisites and supported version matrix.** What the user needs installed and which versions you support for **both ClickHouse Cloud and self-hosted (open source)**. A small table works well
38+
3. **Setup walkthrough.** Step-by-step instructions to a working connection, with **side-by-side coverage of Cloud and self-hosted** where they differ (host, port, TLS)
39+
4. **Authentication.** Which auth modes you support (username and password over TLS at minimum, plus mTLS, SSL client cert, IP allow-list notes if relevant)
40+
5. **End-to-end example.** At least one realistic example from connection through a meaningful result. Use a [ClickHouse example dataset](/getting-started/example-datasets) so readers can reproduce it
41+
6. **Known limits and performance characteristics.** Type-system gaps, result-set thresholds, throughput notes, unsupported features. Honesty here saves support cycles
42+
7. **Troubleshooting.** Common errors and resolutions. Two or three frequent cases are enough for a first version
43+
44+
## Style notes {#style-notes}
45+
46+
- **Show both Cloud and self-hosted.** Cloud typically uses HTTPS on port `8443` and native TCP over TLS on `9440`. Self-hosted defaults to HTTP on `8123` and native TCP on `9000`
47+
- **Use Docusaurus admonitions** (`:::note`, `:::warning`, `:::tip`) for callouts instead of bold paragraphs
48+
- **Link out for depth.** Link to existing docs for data types, formats, JDBC, ClickPipes, and similar topics instead of re-explaining them
49+
- **No marketing.** Integration pages here are technical reference. Promotional content belongs on your site; we can link to it from the partner directory
50+
51+
## Copy-paste skeleton {#copy-paste-skeleton}
52+
53+
Fill in the bracketed sections, save as `/docs/integrations/<category>/<your-integration>/index.md`, and open a PR.
54+
55+
```markdown
56+
# [Your product] and ClickHouse
57+
58+
[One to three sentences: what the integration does and why a
59+
ClickHouse user would want it.]
60+
61+
## Prerequisites
62+
63+
- [Your product, version X.Y or later]
64+
- ClickHouse Cloud, or self-hosted ClickHouse version [X.Y] or later
65+
- [Anything else: driver, plugin, network access requirements]
66+
67+
### Version matrix
68+
69+
| [Your product] | ClickHouse Cloud | ClickHouse open source | Notes |
70+
| -------------- | ---------------- | ---------------------- | -------- |
71+
| X.Y            | ✅               | ✅ 24.x+               | [if any] |
72+
73+
## Setup
74+
75+
### Connect to ClickHouse Cloud
76+
77+
1. In the ClickHouse Cloud console, select your service and click **Connect**.
78+
2. Choose **HTTPS**. Copy the host, port (8443), username, and password.
79+
3. In [your product], [steps to configure the connection].
80+
81+
### Connect to self-hosted ClickHouse
82+
83+
1. [How to point at a self-hosted instance — host, port 8123 or 9000, TLS notes.]
84+
2. In [your product], [steps to configure the connection].
85+
86+
## Authentication
87+
88+
[List supported auth modes — username/password over TLS, mTLS, etc. — and how
89+
to configure each.]
90+
91+
## Example: querying the [dataset] dataset
92+
93+
[Walkthrough using one of the ClickHouse example datasets, end-to-end.]
94+
95+
## Known limits
96+
97+
- [Types not yet supported, e.g., deeply nested JSON]
98+
- [Result-set size thresholds or other performance notes]
99+
- [Feature gaps]
100+
101+
## Troubleshooting
102+
103+
### [Common error message]
104+
105+
[Cause and resolution.]
106+
107+
### [Another common error]
108+
109+
[Cause and resolution.]
110+
```
111+
112+
## Review {#review}
113+
114+
The ClickHouse integrations team reviews PRs for technical accuracy, Cloud and self-hosted coverage, and docs style. Iterate in the PR until reviewers approve. That approval is the merge gate.
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
1+
---
2+
slug: /integrations/integration-development
3+
title: 'Integration development'
4+
sidebar_label: 'Overview'
5+
sidebar_position: 1
6+
description: 'Guides for building, testing, and documenting ClickHouse integrations.'
7+
keywords: ['integration development', 'build integration', 'partner', 'integration partner']
8+
doc_type: 'landing-page'
9+
---
10+
11+
# Integration development
12+
13+
These guides orient you if you build a product that connects to ClickHouse. They cover the integration surface, how to validate your connector, and how to publish documentation on this site.
14+
15+
:::note[Partner portal]
16+
Use the [partner portal](https://clickhouse.com/partners) to register your integration and access partner resources. The guides below cover how to build, test, and document your connector.
17+
:::
18+
19+
## Guides {#guides}
20+
21+
Read them in this order:
22+
23+
| Guide | What it covers |
24+
| ----- | -------------- |
25+
| [Building integrations](/integrations/integration-development/building-integrations) | Ingestion and consumption paths, wire protocols, clients, and user-agent conventions |
26+
| [Testing your integration](/integrations/integration-development/testing-your-integration) | Deployment modes, datasets, type coverage, and what to report before review |
27+
| [Documenting your integration](/integrations/integration-development/documenting-your-integration) | Required doc sections, style rules, and a PR skeleton for your product page |
28+
29+
After you prototype and test, contribute your integration page under [`/docs/integrations/<category>/<your-integration>/`](/integrations/integration-development/documenting-your-integration) and open a pull request against [`clickhouse-docs`](https://github.com/ClickHouse/clickhouse-docs).
Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
1+
---
2+
slug: /integrations/integration-development/testing-your-integration
3+
sidebar_label: 'Testing your integration'
4+
sidebar_position: 3
5+
title: 'Testing your ClickHouse integration'
6+
description: 'Entry-level validation matrix for integrations on ClickHouse Cloud and self-hosted open source.'
7+
keywords: ['partner', 'integration', 'testing', 'validation', 'example datasets', 'ClickHouse Cloud', 'open source']
8+
doc_type: 'guide'
9+
---
10+
11+
# Testing your ClickHouse integration
12+
13+
Validate your integration against both ClickHouse deployment modes and datasets that exercise ClickHouse's type system at a meaningful scale before you submit it for review. This page defines what "tested" means at the entry level. Formal validation is a separate process for partners progressing to higher partnership tiers.
14+
15+
See [Building integrations](/integrations/integration-development/building-integrations) for ingestion and consumption paths, and [Documenting your integration](/integrations/integration-development/documenting-your-integration) for how to publish your results.
16+
17+
## Test matrix {#test-matrix}
18+
19+
Cover both deployment modes. Most customers run one or the other, and behavior differs in places (auth, networking, available features).
20+
21+
- **ClickHouse Cloud:** sign up for a [free trial](https://clickhouse.com/cloud). No credit card is required for the development tier
22+
- **Self-hosted (open source):** use the latest stable release from [GitHub releases](https://github.com/ClickHouse/ClickHouse/releases). The [install guide](/install) is the fastest path to a local instance with Docker
23+
24+
Test against both, and document any feature gaps in your integration page.
25+
26+
## What to test {#what-to-test}
27+
28+
**Functional correctness.** Exercise every code path your integration exposes: ingestion, querying, schema discovery, error handling, and reconnection. If your product surfaces SQL to end users, confirm that the queries your UI generates round-trip cleanly.
29+
30+
**Type-system coverage.** ClickHouse supports arrays, tuples, maps, JSON, nested, LowCardinality, Decimal, Date and DateTime variants, UUID, IPv4 and IPv6, enums, and aggregate-function types. Integrations often hit issues with nested arrays, deeply nested tuples, and JSON columns. Your client library and UI should handle these gracefully or, at a minimum, fail with a readable error instead of silently truncating or misrendering.
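
As a rough illustration of failing with a readable error, the sketch below maps a few ClickHouse type names to display labels and raises a named exception for anything it does not recognize. Everything here is hypothetical: `map_type`, `UnsupportedTypeError`, and the deliberately tiny scalar table are assumptions for the example, not how any official client maps types.

```python
import re

# Deliberately small scalar table; a real integration would cover far more.
_SCALARS = {
    "String": "str",
    "UInt64": "int",
    "Int32": "int",
    "Float64": "float",
    "Bool": "bool",
}
_WRAPPER = re.compile(r"^(Nullable|LowCardinality|Array)\((.*)\)$")


class UnsupportedTypeError(TypeError):
    """Raised when a column type has no mapping in this integration."""


def map_type(ch_type: str) -> str:
    """Map a ClickHouse type name to a display label, or fail loudly."""
    ch_type = ch_type.strip()
    match = _WRAPPER.match(ch_type)
    if match:
        wrapper, inner = match.groups()
        mapped = map_type(inner)
        if wrapper == "Array":
            return f"list[{mapped}]"
        if wrapper == "Nullable":
            return f"Optional[{mapped}]"
        return mapped  # LowCardinality does not change the value type
    if ch_type in _SCALARS:
        return _SCALARS[ch_type]
    raise UnsupportedTypeError(
        f"ClickHouse type {ch_type!r} is not supported by this integration yet"
    )
```

The unsupported types raised here are a natural starting list for the **Known limits** section of your integration page.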
31+
32+
**Scale.** Test at result-set sizes and row counts your customers will run. For user-facing BI, that often means tables with hundreds of millions to billions of rows, and result sets from single aggregates to tens of thousands of rows. Unbounded reads (`SELECT *`) should fail predictably or paginate, not hang.
33+
34+
**Authentication.** Validate at least one TLS-enabled connection. If you expose auth configuration, test every mode you document (username and password over TLS, mTLS, SSL client certificate).
35+
36+
**Connection lifecycle.** Confirm sane behavior on dropped connections, server restarts, and slow queries. Many escalations trace back to connection handling rather than query semantics.
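
One common way to get predictable behavior on dropped connections is a bounded retry with exponential backoff. The sketch below is a generic pattern under stated assumptions: `with_retries` is a hypothetical helper, and treating Python's built-in `ConnectionError` as the retryable failure is an assumption, since each client raises its own exception types.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    op: Callable[[], T],
    attempts: int = 3,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `op`, retrying on ConnectionError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return op()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay_s * 2 ** attempt)
    raise AssertionError("unreachable")
```

Retry only operations that are safe to repeat. Re-running a query is usually harmless, while blindly retrying an insert can write the same batch twice unless your pipeline deduplicates.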
37+
38+
## Recommended example datasets {#recommended-example-datasets}
39+
40+
The full set can be found in the [**Example datasets**](/getting-started/example-datasets) section. The following four datasets cover most integration testing needs:
41+
42+
- **[GitHub events](/getting-started/example-datasets/github-events):** 3.1B rows with nested event payloads. Best for arrays, tuples, and nested types
43+
- **[NYC taxi data](/getting-started/example-datasets/nyc-taxi):** billions of rows with a well-known schema. Good for throughput and read-path testing
44+
- **[Stack Overflow](/getting-started/example-datasets/stackoverflow):** multi-table relational data for JOIN-heavy BI scenarios
45+
- **[Hacker News](/getting-started/example-datasets/hacker-news):** 28M rows, fast to load, useful for iteration
46+
47+
For extreme-scale validation, use **[WikiStat](/getting-started/example-datasets/wikistat)** (~0.5 trillion records).
48+
49+
## What to capture from your testing {#what-to-capture-from-your-testing}
50+
51+
When you submit your integration for review, share:
52+
53+
- ClickHouse versions tested (Cloud and open source)
54+
- Datasets and approximate scale (rows, on-disk size)
55+
- Types your integration handles and types it does not (this becomes the **Known limits** section of your docs)
56+
- Performance characteristics worth flagging, such as result-set thresholds where behavior changes
57+
58+
A short test report saves review cycles. A paragraph plus a table is enough.

sidebars.js

Lines changed: 12 additions & 0 deletions
@@ -1279,6 +1279,18 @@ const sidebars = {
12791279
},
12801280
],
12811281
},
1282+
{
1283+
type: 'category',
1284+
label: 'Integration development',
1285+
collapsed: true,
1286+
collapsible: true,
1287+
link: { type: 'doc', id: 'integrations/integration-development/index' },
1288+
items: [
1289+
'integrations/integration-development/building-integrations',
1290+
'integrations/integration-development/testing-your-integration',
1291+
'integrations/integration-development/documenting-your-integration',
1292+
],
1293+
},
12821294
],
12831295

12841296
managingData: [
