Skip to content

Display data partition repair progress by providing a new SQL statement#17637

Open
zerolbsony wants to merge 3 commits into
apache:masterfrom
zerolbsony:manually-trigger-repair-data-partiton-progress
Open

Display data partition repair progress by providing a new SQL statement#17637
zerolbsony wants to merge 3 commits into
apache:masterfrom
zerolbsony:manually-trigger-repair-data-partiton-progress

Conversation

@zerolbsony

@zerolbsony zerolbsony commented May 11, 2026

Copy link
Copy Markdown
Contributor

Use the sql to display progress:
SHOW REPAIR DATA PARTITION TABLE PROGRESS

@CRZbulabula CRZbulabula left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feature — the end-to-end plumbing follows the existing statement pipeline well. I left inline comments on specific lines. One important item cannot be anchored inline because the file is not touched by this PR, so I'm raising it here:

[Must fix] iotdb-core/antlr/src/main/antlr4/org/apache/iotdb/db/qp/sql/IdentifierParser.g4 (keyWords rule, ~line 191)

The new PROGRESS lexer token added in SqlLexer.g4 is not registered in the keyWords rule. REPAIR, PARTITION, and TABLE are all present there, but PROGRESS is missing. As a result, any existing identifier or timeseries node named progress (e.g. SELECT progress FROM root.x or root.sg.d.progress) will fail to parse after this change — a backward-incompatible regression. Please add | PROGRESS to the list, alphabetically between PROCESSLIST and PROCESSOR:

    | PROCESSLIST
    | PROGRESS
    | PROCESSOR

final LoadManager currentLoadManager =
loadManager == null ? env.getConfigManager().getLoadManager() : loadManager;

final Set<TDataNodeConfiguration> targetDataNodes = new HashSet<>(allDataNodes);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Thread-safety] Unsynchronized cross-thread read of procedure fields.

This method runs on the RPC handler thread, but allDataNodes is a non-volatile ArrayList that the procedure-executor thread reassigns on every step (allDataNodes = dataNodeManager.getRegisteredDataNodes()) and clears in rollbackState (allDataNodes.clear()). Copying it via new HashSet<>(allDataNodes) while another thread structurally modifies or reassigns it can throw ConcurrentModificationException or observe a half-published reference. The same concern applies to skipDataNodes/failedDataNodes, which are reassigned with = new HashSet<>() in executeFromState/getInitialState. A transient exception here would fail the user's SHOW ... PROGRESS even though the repair itself is healthy.

Please snapshot these fields under a lock, make them volatile, or wrap the read so getProgress degrades to UNKNOWN instead of throwing.

try {
Object response =
SyncDataNodeClientPool.getInstance()
.sendSyncRequestToDataNodeWithGivenRetry(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Performance] Synchronous fan-out RPC inside a status query.

This issues a synchronous RPC with up to MAX_RETRY_COUNT (3) retries to every target DataNode, sequentially, on the RPC handler thread. On a large cluster with one slow/unreachable DataNode, a SHOW ... PROGRESS query can block for roughly N × retries × timeout.

The REQUEST_PARTITION_TABLES_HEART_BEAT state already polls per-DataNode generation progress — consider caching the latest per-DataNode progress on the procedure and reading it here, which avoids the extra fan-out entirely and is both faster and race-free. If the synchronous fan-out is kept, please bound or parallelize it.

return 0.95;
case WRITE_PARTITION_TABLE_TO_CONSENSUS:
return 0.99;
default:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Maintainability] Silent default branch.

The default branch maps any unhandled state to 0.0. If a new DataPartitionTableIntegrityCheckProcedureState is added later, progress will silently report 0% for it. Please add a LOG.warn here to catch enum drift.

.orElseGet(
() ->
new TShowRepairDataPartitionTableProgressResp(
RpcUtils.getStatus(TSStatusCode.SUCCESS_STATUS), "IDLE", 100.0)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Semantics] IDLE returns progress = 100.0.

When no procedure is running, this returns IDLE with progress = 100.0. "Idle at 100%" is ambiguous — a user who has never triggered a repair will see 100% progress. Suggest returning 0.0 (or a distinct sentinel) for the idle case, or at minimum a comment clarifying the semantics.

public TShowRepairDataPartitionTableProgressResp showRepairDataPartitionTableProgress() {
TSStatus status = confirmLeader();
if (status.getCode() != TSStatusCode.SUCCESS_STATUS.getStatusCode()) {
return new TShowRepairDataPartitionTableProgressResp(status, "UNKNOWN", 0.0)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Maintainability] Magic state strings duplicated across modules.

The state string is produced ad hoc in three different places: "UNKNOWN" here, "IDLE" in PartitionManager.showRepairDataPartitionTableProgress, and the raw enum name() in DataPartitionTableIntegrityCheckProcedure.getProgress. Please define these as shared constants or a small enum so the state contract is single-sourced and testable, rather than relying on matching string literals across modules.

return resp;
}

switch (currentGenerator.getStatus()) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Possible NPE] Re-reading volatile currentGenerator without a local copy.

currentGenerator is volatile, but this method reads it multiple times (getStatus(), getProgress(), getErrorMessage()) after the null-check above. The generation-with-heartbeat path sets currentGenerator = null on completion, so a concurrent null-assignment between the null-check and this switch would NPE. Please capture final DataPartitionTableGenerator generator = currentGenerator; once at the top and operate on the local.

}

@Test
public void testShowRepairDataPartitionTableProgress() {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Test coverage] Add an integration test for progress reporting.

This parser-level test is good, but it's the only coverage. Please add an integration test that exercises the feature end-to-end, covering: (a) SHOW REPAIR DATA PARTITION TABLE PROGRESS while a REPAIR DATA PARTITION TABLE procedure is running (asserting a 0–100 Progress(%) and a non-terminal Status); (b) the idle case when no procedure exists; and (c) the non-leader / UNKNOWN branch. This would also guard the concurrency and DataNode-RPC fan-out paths flagged in the other comments. Per the repo conventions, scope the new IT to only this feature (-Dit.test=ClassName) and place it with the existing repair-partition ITs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants