[0.49-topru] topru manager keyspace cluster#82
Conversation
|
|
|
跟一下这轮 review comments,刚刚已经在
本地已跑过定向测试:
如果你更倾向于第 5 点走“fallback 回 etcd”而不是 fail-fast,我也可以再改。 |
# Conflicts: # src/sinks/topsql_data_deltalake/processor.rs
There was a problem hiding this comment.
claude:整体代码质量不错,测试覆盖充分,review comment 的处理也很到位(LRU cache、body 限制、fail-fast 配置校验等)。
另外有一个关于 TopSQLDeltaLakeSink::new 中 unsafe 块的建议:通过 raw pointer 从 Arc 中取出内部值比较脆弱,如果以后有人改了代码多加了一个 clone 就可能导致 UB。建议重构为:先创建 channel 和 shared state(Arc<Mutex<HashMap<...>>>),spawn task 时直接传这些 Arc,然后正常构造 Self 返回,完全不需要 unsafe。
|
跟进 review 处理内容:
本地定向验证:
|
|
codex: 关于 review body 里提到的 |
There was a problem hiding this comment.
claude:LGTM 🎉
上一轮的 review comments 都已修复:
- ✅
process_events_loop加了 max retry (5次) + exponential backoff (5s→60s cap) + 超限丢弃日志 - ✅
unsafe块移除,改用#[derive(Clone)]+sink.clone()(data 和 meta processor 都改了) - ✅
is_not_found_body收紧为"keyspace not found",支持纯文本和 JSON 格式,加了测试 - ✅
normalize_namespaces抽成公共函数normalize_namespace_list - ✅ shard config 初始化时读一次环境变量,缓存到 struct
- ✅ doc comment PLACEHOLDER 已补上
- ✅ FNV-1a hash 加了注释说明选择原因和跨服务契约
- Replace recursive JSON extraction with direct serde deserialization for ActiveTiDBAddress, removing ~100 lines of guessing logic - Remove redundant normalize_namespaces wrapper method - Change PdKeyspaceResolver::new to take Option<&TlsConfig> instead of owned Option<TlsConfig> to avoid unnecessary clones - Add debug_assert on replace_keyspace_route_segments to catch org_id/cluster_id containing path separators - Add TODO for resolve_keyspace concurrent cache miss dedup - Add comment explaining org=/cluster= path segment convention
Empty array from manager is a legitimate response (no active TiDB instances). Previously it was treated as an error while filter-to-empty after keyspace sharding was treated as Ok — inconsistent behavior. Now both cases flow through get_up_tidbs which logs and returns Ok, making the behavior uniform. Also removes the now-unused InvalidManagerResponse error variant.
… on retry process_events wrote dedup keys to LRU before flush_buffer persisted data to Delta Lake. If flush failed (e.g. PD keyspace lookup error), the retry path would find all keys already in LRU and silently skip them, causing permanent meta data loss. Fix: stage dedup keys in a pending_dedup_keys buffer during process_events (read-only LRU check). Commit to LRU only after flush_buffer succeeds. On failure, process_events_loop clears pending state so the cloned retry snapshot is re-processed cleanly.
# Conflicts: # src/sinks/topsql_data_deltalake/mod.rs # src/sinks/topsql_data_deltalake/processor.rs # src/sinks/topsql_meta_deltalake/mod.rs # src/sinks/topsql_meta_deltalake/processor.rs
…eyspace-cluster # Conflicts: # src/sources/topsql_v2/upstream/tidb/parser.rs
…ster module Deduplicate identical retry delay function and constants from both topsql_data_deltalake and topsql_meta_deltalake processors into common/keyspace_cluster.rs. Move the associated test as well.
Signed-off-by: zeminzhou <zhouzemin@pingcap.com>
…king Replace the single cache-miss-then-fetch pattern with double-check locking using per-keyspace Mutex guards. Concurrent resolve_keyspace calls for the same keyspace now serialize behind a shared lock so only the first caller issues the HTTP request; subsequent callers find the result in cache after acquiring the lock. Also extracts the HTTP fetch logic into fetch_keyspace_from_pd for clarity and removes the TODO comment.
…rowth Wrap the post-lock logic in an async block so the per-keyspace lock cleanup runs on every exit path: double-check cache hit, fetch success, and fetch error. Previously cleanup only ran on the straight-line success path, leaving map entries permanently for early-return callers.
PR Review Summary
22 files changed, +2510 / -971
主要改动
Manager-based TiDB 拓扑发现 (
tidb_manager.rs,fetch/mod.rs)TiDBManagerTopologyFetcher,通过 manager 的/api/tidb/get_active_tidb接口获取活跃 TiDB 实例,替代原来只能从 etcd 读取的方式manager_server_address+tidb_namespace配置,tidb_namespace在配了 manager 时为必填(fail-fastConfigurationError)Keyspace 分片路由 (
tidb_manager.rs)VECTOR_STS_REPLICA_COUNT/VECTOR_STS_ID环境变量实现 keyspace 级别的 StatefulSet 分片keyspace_name取模分配 TiDB 实例keyspace_name的 TiDB 实例在分片模式下被跳过并打 warn 日志PD Keyspace → org/cluster 路由 (
keyspace_cluster.rs)PdKeyspaceResolver,通过 PD/pd/api/v2/keyspaces/{name}接口查询 keyspace 的serverless_tenant_id/serverless_cluster_idSink 层 keyspace 路由写入 (
topsql_data_deltalake/processor.rs,topsql_meta_deltalake/processor.rs)enable_keyspace_cluster_mapping配置项,开启后base_path中的org=xxx/cluster=xxx模板段会被替换为实际路由值HashMap<String, DeltaLakeWriter>改为HashMap<WriterKey, DeltaLakeWriter>,WriterKey包含table_name+table_path,支持同一 table_name 写入不同 org/cluster 路径unsafe块(原来通过 raw pointer 从Arc取值),改用#[derive(Clone)]+sink.clone()Meta sink dedup 修复 (
topsql_meta_deltalake/processor.rs)process_events阶段就写入 LRU,如果后续flush_buffer失败,retry 时这些 key 已在 LRU 中被跳过,导致永久丢失pending_dedup_keys暂存,仅在flush_buffer成功后才commit_dedup_keys写入 LRU{keyspace}_{digest}_{date}),避免不同 keyspace 的 meta 互相去重TiKV TopSQL 采集开关 (
topsql_v2/controller.rs,topsql_v2/mod.rs)enable_tikv_topsql配置(默认 true),关闭后只采集 TiDB TopSQL/TopRU 数据Keyspace 传播到 meta 事件 (
upstream/tidb/parser.rs)topsql_sql_meta和topsql_plan_meta事件现在携带keyspace字段(从 proto 的keyspace_name解析)decode_keyspace_name抽成公共方法,统一处理空值和 UTF-8 解码Schema evolution 代码清理 (
deltalake_writer/mod.rs,deltalake_writer/schema.rs,tests/system_tables_integration_test.rs)check_schema_evolution/reset_schema_cache/EvolutionResult等未使用的 schema evolution 检测逻辑test_sqlstatement_schema_evolution_add_column等)对现有功能的影响
enable_keyspace_cluster_mapping=false,enable_tikv_topsql=true,manager_server_address=None),不配置时行为与之前完全一致manager_server_address和tidb_namespace参数,所有调用方(keyviz, system_tables, topsql, topsql_v2)已适配,传NoneString改为WriterKeystruct,影响 writer 缓存的 key 结构,但逻辑等价deltalake_writer中的 schema evolution 检测逻辑被删除,如果有其他 sink 依赖这个功能需要注意(从 diff 看目前没有其他调用方)内存泄漏分析
无内存泄漏风险,具体分析:
PdKeyspaceResolver的 keyspace 路由缓存使用LruCache,默认容量 10000,超出自动淘汰HashMap<WriterKey, DeltaLakeWriter>按 (table_name, table_path) 组合增长,实际数量受 keyspace × component 类型限制,不会无限增长Arc::into_raw+ 手动重建的 unsafe 模式已被#[derive(Clone)]替代,消除了潜在的引用计数泄漏风险seen_keys_sql_meta/seen_keys_plan_meta使用配置化容量的 LRU,有上限其他观察
process_events_loop的 retry 会 clone 整个events_vec(retry_snapshot),在大批量事件时可能有短暂的内存峰值,但最多 5 次 retry 后释放,不构成泄漏join_path和build_table_path在 data sink 和 meta sink 中有重复实现,可以考虑抽到 common 模块