feat: Add array#422
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds array-job support across CLI, protobufs, request filters, reply post-filtering, display formatting (including array-summary rows), and cbatch array submission handling; introduces JobId selector parsing/formatting and selector-driven filtering for job/step/array-task scopes. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant CLI
participant Client
participant Ctld
participant Formatter
CLI->>Client: Parse selectors (ParseJobIdSelectorList) / configureJobIdSelectors
Client->>Ctld: Send QueryJobsInfoRequest with FilterIds & FilterArrayTaskIds
Ctld-->>Client: Reply with JobInfoList
Client->>Client: applyArrayAwareJobFilter / applyArrayAwareItemFilter
Client->>Formatter: formatJobIdForDisplay / insertArraySummaryRows
Formatter-->>CLI: Render table/json output
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
internal/cacct/job_selector.go (1)
79-81: Remove unreachable nil-guard foritem.job.
item.jobis always populated at construction, so this branch is dead code and can be dropped to simplify the filter path.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/cacct/job_selector.go` around lines 79 - 81, Remove the dead nil-guard that checks item.job in the filter loop: delete the conditional block "if item.job == nil { continue }" (used in the job selection/filtering code) because item.job is always initialized at construction; simply let the loop proceed without that check and adjust any surrounding logic/comments accordingly.internal/cqueue/output.go (1)
45-85: Consider the semantic meaning of the "Nodes" column in summary rows.For individual jobs, the "Nodes" column shows
NodeNum(compute nodes). In summary rows, it displays the taskcount(line 70). While the "ArraySummary" status and "ArrayTasks=%d" in NodeList/Reason provide context, users familiar with the column semantics may find this confusing.This is a minor UX consideration—the current behavior is functional.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/cqueue/output.go` around lines 45 - 85, The "Nodes"/"NodeNum" cell in buildArraySummaryRow currently shows the task count (info.count) which conflicts with the per-job semantics that "Nodes" represents compute nodes; update the "Nodes","NodeNum" case in buildArraySummaryRow to use the sample's node count (e.g., info.sample.NodeNum) and only fall back to info.count if sample.NodeNum is not set/zero, and keep the NodeList/Reason case as "ArrayTasks=%d" to preserve task-count context; modify the switch case for "Nodes","NodeNum" accordingly within buildArraySummaryRow.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/cacct/cacct.go`:
- Around line 82-87: The JSON path returns before the array-task-aware filtering
is applied, so selectors like "123_4" are ignored; move or apply the array-task
selector earlier by ensuring configureJobIdSelectors/selectors from
ParseJobIdSelectorList are applied to reply.JobInfoList before any early JSON
return (when FlagJson is set), i.e. call
applyArrayAwareItemFilter(reply.JobInfoList) or otherwise filter
reply.JobInfoList using the predicate built by buildFilterIdsFromSelectors()
prior to serializing/outputting JSON (also apply the same change for the other
output path noted around the later return).
- Around line 333-346: The grouping key in makeArraySummaryGroupKey currently
uses display metadata (account/username/partition/cmdline/SubmitTime) which can
collide for arrays submitted within the same second; change it to include a
stable backend identifier such as job.JobId (or job.ArrayJobId if that exists)
and guard for a nil/empty JobId; update makeArraySummaryGroupKey to append
job.JobId to the fmt.Sprintf key and return false if JobId is nil/empty. Also
adjust the logic referenced around lines 421-425 (the code that picks/surfaces a
JobId for the grouped summary) to rely on that stable identifier instead of
taking the largest JobId from a collapsed group.
In `@internal/cacct/cmd.go`:
- Line 130: Update the help text string that currently contains "%j/%JobID
- Display the ID of the job (array jobs use jobid_arraytaskid; steps use
jobid_arraytaskid.stepid)" to clarify non-array step IDs: change wording so it
distinguishes array vs non-array forms (e.g., "array jobs use jobid_arraytaskid;
non-array jobs use jobid; step IDs append .stepid — array step:
jobid_arraytaskid.stepid; non-array step: jobid.stepid"). Locate and modify the
usage/help literal that contains "%j/%JobID" in internal/cacct/cmd.go (the
command help/usage string) to use this clarified wording.
---
Nitpick comments:
In `@internal/cacct/job_selector.go`:
- Around line 79-81: Remove the dead nil-guard that checks item.job in the
filter loop: delete the conditional block "if item.job == nil { continue }"
(used in the job selection/filtering code) because item.job is always
initialized at construction; simply let the loop proceed without that check and
adjust any surrounding logic/comments accordingly.
In `@internal/cqueue/output.go`:
- Around line 45-85: The "Nodes"/"NodeNum" cell in buildArraySummaryRow
currently shows the task count (info.count) which conflicts with the per-job
semantics that "Nodes" represents compute nodes; update the "Nodes","NodeNum"
case in buildArraySummaryRow to use the sample's node count (e.g.,
info.sample.NodeNum) and only fall back to info.count if sample.NodeNum is not
set/zero, and keep the NodeList/Reason case as "ArrayTasks=%d" to preserve
task-count context; modify the switch case for "Nodes","NodeNum" accordingly
within buildArraySummaryRow.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 99538f0c-5ed1-4e3f-b127-4af35bbe7738
📒 Files selected for processing (17)
internal/cacct/cacct.gointernal/cacct/cmd.gointernal/cacct/job_selector.gointernal/cbatch/cbatch.gointernal/cbatch/cmd.gointernal/cbatch/line.gointernal/cqueue/cmd.gointernal/cqueue/cqueue.gointernal/cqueue/flags.gointernal/cqueue/format.gointernal/cqueue/job_selector.gointernal/cqueue/output.gointernal/cqueue/step.gointernal/cwrapper/slurm.gointernal/util/job_id_selector.gointernal/util/string.goprotos/PublicDefs.proto
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/cbatch/cbatch.go`:
- Around line 578-588: The summary formatting incorrectly requires
len(reply.JobIdList) > 1 and assumes reply.JobIdList[0] is a parent; instead
change the logic so that if job.ArrayIndexStart and job.ArrayIndexEnd are set
you always print the array summary (use
util.ConvertSliceToString(reply.JobIdList, ", ") for the allocated ids) and
remove any use of reply.JobIdList[0] as a parent identifier; if a true parent id
is needed rely on an explicit parent field in the reply proto rather than
inferring from list order.
In `@internal/ccancel/ccancel.go`:
- Around line 55-77: The code flattens selectors into filterIds and
filterArrayTaskIds and thus loses step↔array-task pairing (selectors), so update
the assembly to detect when the same JobId would end up in both filterIds and
filterArrayTaskIds and reject the request: while iterating selectors, track
per-JobId whether a step and/or an array-task was added (using the existing
filterIds and filterArrayTaskIds maps), and if you detect that both a StepId and
an ArrayTaskId would be associated with the same JobId, return an error from
this function (rather than populating req.FilterIds/req.FilterArrayTaskIds) with
a clear message referencing the conflicting JobId; alternatively, if you prefer
to preserve pairing long-term, instead modify the RPC to accept selector tuples,
but until the proto changes, implement the rejection check around
selectors/filterIds/filterArrayTaskIds and short-circuit with an error.
In `@internal/ccancel/cmd.go`:
- Around line 60-63: Replace the ad-hoc regex validation in
internal/ccancel/cmd.go with the shared parser: call
util.ParseJobIdSelectorList(args[0]) (instead of regexp.MatchString) and on
error return util.NewCraneErr(util.ErrorCmdArg, err.Error()) so validation is
centralized and consistent with internal/util/job_id_selector.go; also update
the command's Use string (the ccancel command's Use variable) to document the
array/step syntax (<job_id>[_<array_task_id>][.<step_id>] and comma-separated
lists) so --help shows the canonical selector form.
In `@protos/Crane.proto`:
- Around line 187-189: The CancelJobReply must be extended to report results at
the array-task granularity: update the CancelJobReply (and any nested result
type like the existing per-job/per-step result message) to include an optional
uint32 array_task_id (or add a new repeated message ArrayTaskResult with fields
job_id, step_id, optional array_task_id, status/ok flag and optional error
string) so that cancel results can unambiguously represent 123_4 and capture
per-array-task partial failures; update any code that populates or reads
CancelJobReply to fill/consume the new array_task_id/error fields when
filter_array_task_ids (map<uint32, ArrayTaskIds>) is used.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 926d532f-7016-42f0-891c-9d53a48018ce
📒 Files selected for processing (5)
internal/cbatch/cbatch.gointernal/ccancel/ccancel.gointernal/ccancel/cmd.goprotos/Crane.protoprotos/PublicDefs.proto
| filterIds := make(map[uint32]*protos.JobStepIds) | ||
| filterArrayTaskIds := make(map[uint32]*protos.ArrayTaskIds) | ||
|
|
||
| for _, sel := range selectors { | ||
| if _, ok := filterIds[sel.JobId]; !ok { | ||
| filterIds[sel.JobId] = &protos.JobStepIds{} | ||
| } | ||
| if sel.StepId != nil { | ||
| filterIds[sel.JobId].Steps = append(filterIds[sel.JobId].Steps, *sel.StepId) | ||
| } | ||
|
|
||
| if sel.ArrayTaskId != nil { | ||
| if _, ok := filterArrayTaskIds[sel.JobId]; !ok { | ||
| filterArrayTaskIds[sel.JobId] = &protos.ArrayTaskIds{} | ||
| } | ||
| filterArrayTaskIds[sel.JobId].ArrayTaskIds = append( | ||
| filterArrayTaskIds[sel.JobId].ArrayTaskIds, *sel.ArrayTaskId) | ||
| } | ||
| } | ||
|
|
||
| req.FilterIds = filterIds | ||
| if len(filterArrayTaskIds) > 0 { | ||
| req.FilterArrayTaskIds = filterArrayTaskIds |
There was a problem hiding this comment.
This request assembly loses array-task/step pairing.
Flattening selectors into map[job_id]step_ids plus map[job_id]array_task_ids makes different inputs indistinguishable. For example, 123_4.5 and 123_4,123.5 both serialize to the same request, even though one is a paired selector and the other is two independent filters. For a destructive command, that means the server can only guess what should be cancelled. Please either send selectors as tuples in the RPC, or reject any input that produces both step and array-task filters for the same job until the API can preserve that association.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/ccancel/ccancel.go` around lines 55 - 77, The code flattens
selectors into filterIds and filterArrayTaskIds and thus loses step↔array-task
pairing (selectors), so update the assembly to detect when the same JobId would
end up in both filterIds and filterArrayTaskIds and reject the request: while
iterating selectors, track per-JobId whether a step and/or an array-task was
added (using the existing filterIds and filterArrayTaskIds maps), and if you
detect that both a StepId and an ArrayTaskId would be associated with the same
JobId, return an error from this function (rather than populating
req.FilterIds/req.FilterArrayTaskIds) with a clear message referencing the
conflicting JobId; alternatively, if you prefer to preserve pairing long-term,
instead modify the RPC to accept selector tuples, but until the proto changes,
implement the rejection check around selectors/filterIds/filterArrayTaskIds and
short-circuit with an error.
| // Per-job array task ID filters. Key is job_id, value is list of | ||
| // array_task_ids to cancel. | ||
| map<uint32, ArrayTaskIds> filter_array_task_ids = 9; |
There was a problem hiding this comment.
CancelJobReply still can’t report array-task results.
Once the request can target job_id_array_task_id, the existing reply shape is no longer precise enough: it only returns job IDs and step IDs, so ccancel 123_4 cannot be reported unambiguously, and partial failures per array task have nowhere to go. That ambiguity flows straight into both the normal output path and --json. Please extend the reply with array-aware result items before shipping this selector syntax.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@protos/Crane.proto` around lines 187 - 189, The CancelJobReply must be
extended to report results at the array-task granularity: update the
CancelJobReply (and any nested result type like the existing per-job/per-step
result message) to include an optional uint32 array_task_id (or add a new
repeated message ArrayTaskResult with fields job_id, step_id, optional
array_task_id, status/ok flag and optional error string) so that cancel results
can unambiguously represent 123_4 and capture per-array-task partial failures;
update any code that populates or reads CancelJobReply to fill/consume the new
array_task_id/error fields when filter_array_task_ids (map<uint32,
ArrayTaskIds>) is used.
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (2)
internal/cbatch/cbatch.go (1)
578-587:⚠️ Potential issue | 🟠 MajorDon't infer the parent array ID from
JobIdList[0].Line 582 still assumes the first returned ID is the array parent, but this reply shape only gives you a list. Depending on backend ordering, that can mislabel the first task ID or double-report the parent. Unless the proto adds an explicit parent field, keep printing the allocated IDs here instead of synthesizing a parent job id.
Possible fix
if len(reply.JobIdList) > 0 { + jobIdListString := util.ConvertSliceToString(reply.JobIdList, ", ") if job.ArrayIndexStart != nil && job.ArrayIndexEnd != nil { - // Array job: backend returns single parent job_id. - // Tasks will be expanded at scheduling time. - fmt.Printf("Submitted array job %d, array range [%d-%d].\n", - reply.JobIdList[0], + fmt.Printf("Submitted array job ids: %s, array range [%d-%d].\n", + jobIdListString, *job.ArrayIndexStart, *job.ArrayIndexEnd) } else { - jobIdListString := util.ConvertSliceToString(reply.JobIdList, ", ") fmt.Printf("Job id allocated: %s.\n", jobIdListString) } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/cbatch/cbatch.go` around lines 578 - 587, The code incorrectly treats reply.JobIdList[0] as an array parent ID; update the array branch so it does not synthesize a parent ID but instead prints the actual allocated IDs like the non-array case. Replace the fmt.Printf that references reply.JobIdList[0] with a call that converts reply.JobIdList to a string (use util.ConvertSliceToString(reply.JobIdList, ", ")) and print that alongside the array range when job.ArrayIndexStart and job.ArrayIndexEnd are set, leaving reply.JobIdList usage consistent and avoiding any assumed parent ID.protos/Crane.proto (1)
187-189:⚠️ Potential issue | 🟠 Major
CancelJobReplystill cannot report array-task results.Once this request can target selectors like
123_4, the reply below still only carries job IDs and step IDs, so normal output and--jsoncannot distinguish per-task successes or failures. Please make the cancel result schema array-task-aware before shipping this path.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@protos/Crane.proto` around lines 187 - 189, CancelJobReply currently only reports job_id/step_id and cannot represent per-array-task outcomes; update the proto to include array-task-aware results (e.g., add a repeated ArrayTaskCancelResult or a map<uint32, ArrayTaskCancelResults> field to CancelJobReply) and define a new message (ArrayTaskCancelResult) that includes array_task_id, success/failure status, and optional error string; ensure any server code that populates CancelJobReply (handlers that reference filter_array_task_ids and CancelJobReply) is updated to fill the new array-task fields so CLI/JSON output can distinguish per-task successes/failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/cbatch/cbatch.go`:
- Around line 131-138: The current ParseArrayRangeSpec only accepts "start-end"
and thus breaks compatibility with Slurm array syntax; update
ParseArrayRangeSpec to parse full Slurm forms (ranges with optional step like
"1-10:2", comma-separated lists like "1,3,5", and optional concurrency spec
"%N") and return a canonical representation (e.g., explicit start/end/step or
expanded index list) that the callers can consume; adjust callers that set
FlagArray / job.ArrayIndexStart / job.ArrayIndexEnd (the two callers that invoke
ParseArrayRangeSpec) to use the new representation (or convert lists to
start/end/step) or, if you prefer a smaller change, add clear validation and
user-facing error explaining supported subset and a migration hint.
In `@internal/ccontrol/ccontrol.go`:
- Around line 53-61: The loop that processes selectors in
internal/ccontrol/ccontrol.go currently ignores sel.StepId and thus widens
selectors like "123.4" to "123"; update the loop to mirror
internal/cqueue/job_selector.go by detecting selectors with sel.StepId != nil
and rejecting them immediately (returning or propagating an error) instead of
treating them as a plain job selector; ensure the error message clearly
references the offending selector and use the same error handling path as the
surrounding function so callers get a fast-fail when a .step selector is
provided.
In `@internal/cqueue/job_selector.go`:
- Around line 103-109: The display helpers use the child record's JobId instead
of the effective parent job id for array children, so change
formatJobIdForDisplay and formatStepIdForDisplay to use the ParentJobId when
present: for formatJobIdForDisplay, compute effectiveJobId := if job.ParentJobId
!= "" then job.ParentJobId else job.JobId and pass that with job.ArrayTaskId to
util.FormatJobIdWithArray; for formatStepIdForDisplay, compute effectiveParentId
:= if parentJob.ParentJobId != "" then parentJob.ParentJobId else
parentJob.JobId and pass that with parentJob.ArrayTaskId and stepInfo.StepId to
util.FormatStepIdWithArray so displayed IDs match the selector keys.
In `@protos/Crane.proto`:
- Around line 282-284: Modify the ModifyJobReply to be array-task-aware instead
of collapsing results by parent job id: add per-array-task result structures
(e.g., a repeated message or map keyed by job_id that lists individual
array_task_id entries with status/error fields) and update the RPC response
fields referenced by internal/ccontrol/ccontrol.go (where reply.ModifiedJobs and
reply.NotModifiedJobs are inspected) so the controller can report
successes/failures for specific job_id_arraytask_id selectors (preserve existing
parent-level fields if needed but provide new per-task fields to avoid merging
123_4 and 123_5 into one result).
---
Duplicate comments:
In `@internal/cbatch/cbatch.go`:
- Around line 578-587: The code incorrectly treats reply.JobIdList[0] as an
array parent ID; update the array branch so it does not synthesize a parent ID
but instead prints the actual allocated IDs like the non-array case. Replace the
fmt.Printf that references reply.JobIdList[0] with a call that converts
reply.JobIdList to a string (use util.ConvertSliceToString(reply.JobIdList, ",
")) and print that alongside the array range when job.ArrayIndexStart and
job.ArrayIndexEnd are set, leaving reply.JobIdList usage consistent and avoiding
any assumed parent ID.
In `@protos/Crane.proto`:
- Around line 187-189: CancelJobReply currently only reports job_id/step_id and
cannot represent per-array-task outcomes; update the proto to include
array-task-aware results (e.g., add a repeated ArrayTaskCancelResult or a
map<uint32, ArrayTaskCancelResults> field to CancelJobReply) and define a new
message (ArrayTaskCancelResult) that includes array_task_id, success/failure
status, and optional error string; ensure any server code that populates
CancelJobReply (handlers that reference filter_array_task_ids and
CancelJobReply) is updated to fill the new array-task fields so CLI/JSON output
can distinguish per-task successes/failures.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a3727b22-2c0e-4625-bd2f-0bf41926adcc
📒 Files selected for processing (6)
internal/cbatch/cbatch.gointernal/ccontrol/ccontrol.gointernal/cqueue/flags.gointernal/cqueue/job_selector.goprotos/Crane.protoprotos/PublicDefs.proto
🚧 Files skipped from review as they are similar to previous changes (1)
- protos/PublicDefs.proto
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
protos/PublicDefs.proto (2)
243-249: Non-sequential field numbering may cause maintenance confusion.The field numbers are declared out of order:
array_index_start = 53,array_index_end = 54,array_index_stride = 56, thenarray_task_id = 55. While valid in protobuf, this ordering is unconventional and could lead to confusion during future maintenance.Consider reordering the declarations to match the field number sequence, or renumber
array_task_idto 57 to maintain sequential ordering.🔧 Suggested reordering
// Job array spec for this submission (inclusive range). optional uint32 array_index_start = 53; optional uint32 array_index_end = 54; - optional uint32 array_index_stride = 56; // The concrete index of this task after expansion. optional uint32 array_task_id = 55; + optional uint32 array_index_stride = 56;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@protos/PublicDefs.proto` around lines 243 - 249, The protobuf field ordering for the job array spec is non-sequential: in PublicDefs.proto the fields array_index_start (53), array_index_end (54), array_index_stride (56) and array_task_id (55) are out of numeric order; fix by either reordering the field declarations so array_task_id appears after array_index_end and before array_index_stride to match numeric sequence, or renumber array_task_id to 57 (and update any dependent code/comments) so all field numbers are sequential; ensure you update only the declaration order/number and do not change field names or semantics in the message containing array_index_start, array_index_end, array_index_stride, and array_task_id.
438-443: Non-sequential field numbering: field 35 declared before 34.Similar to
JobToCtld, the field declarations are out of order (array_task_id = 35beforetask_node_list = 34). Consider swapping the declaration order or adjusting field numbers for consistency.The
task_node_listdocumentation is clear and useful—documenting thattask_node_list[task_id]maps to the node index innodelist.🔧 Suggested reordering
- optional uint32 array_task_id = 35; - // task_node_list[task_id] = index of the node in nodelist that owns this task. // Size equals ntasks. Provides the authoritative task_id -> node_idx mapping. repeated uint32 task_node_list = 34; + + optional uint32 array_task_id = 35;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@protos/PublicDefs.proto` around lines 438 - 443, The protobuf fields are out-of-order: optional uint32 array_task_id (array_task_id) is declared with tag 35 before repeated uint32 task_node_list (task_node_list) with tag 34; reorder the declarations so the lower tag appears first or renumber tags to be sequential (e.g., swap the two declarations or change array_task_id to 34 and task_node_list to 35) while keeping the existing field names and comments intact to preserve semantics and documentation.internal/util/string.go (1)
203-240: Consider precompiling the regex forParseArrayRangeSpec.The regex is compiled on every function call. While this is unlikely to be a performance bottleneck given typical usage, it would be more efficient to precompile it at package level, consistent with patterns used elsewhere in this file (e.g.,
ValidHostListRegexinCheckNodeList).♻️ Suggested refactor
+var arrayRangeSpecRegex = regexp.MustCompile(`^(\d+)-(\d+)(?::(\d+))?$`) + // ParseArrayRangeSpec parses the array syntax: start-end or start-end:stride. func ParseArrayRangeSpec(spec string) (uint32, uint32, uint32, error) { cleaned := strings.TrimSpace(spec) - // Match start-end or start-end:stride - re := regexp.MustCompile(`^(\d+)-(\d+)(?::(\d+))?$`) - result := re.FindStringSubmatch(cleaned) + result := arrayRangeSpecRegex.FindStringSubmatch(cleaned) if result == nil {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/util/string.go` around lines 203 - 240, ParseArrayRangeSpec currently compiles its regex on every call; extract and precompile the regexp at package level (e.g., add a var like parseArrayRangeSpecRe = regexp.MustCompile(`^(\d+)-(\d+)(?::(\d+))?$`) alongside other package regexes) and then replace the local re := regexp.MustCompile(...) in ParseArrayRangeSpec with a reference to that package-level variable so the function reuses the precompiled regex.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/cbatch/cbatch.go`:
- Around line 131-139: Array task count ignores the stride when computing
arrayCount; update the calculation in the cmd.go logic that computes arrayCount
to account for job.ArrayIndexStride. Specifically, when computing arrayCount
(the variable currently assigned from job.ArrayIndexEnd and
job.ArrayIndexStart), read job.ArrayIndexStride (defaulting to 1 if nil or zero)
and compute arrayCount using the formula (end - start) / stride + 1 so the
number of tasks reflects the stride; ensure you use uint64 arithmetic consistent
with existing casts and keep
ParseArrayRangeSpec/job.ArrayIndexStart/job.ArrayIndexEnd/job.ArrayIndexStride
identifiers to locate and update the code.
---
Nitpick comments:
In `@internal/util/string.go`:
- Around line 203-240: ParseArrayRangeSpec currently compiles its regex on every
call; extract and precompile the regexp at package level (e.g., add a var like
parseArrayRangeSpecRe = regexp.MustCompile(`^(\d+)-(\d+)(?::(\d+))?$`) alongside
other package regexes) and then replace the local re := regexp.MustCompile(...)
in ParseArrayRangeSpec with a reference to that package-level variable so the
function reuses the precompiled regex.
In `@protos/PublicDefs.proto`:
- Around line 243-249: The protobuf field ordering for the job array spec is
non-sequential: in PublicDefs.proto the fields array_index_start (53),
array_index_end (54), array_index_stride (56) and array_task_id (55) are out of
numeric order; fix by either reordering the field declarations so array_task_id
appears after array_index_end and before array_index_stride to match numeric
sequence, or renumber array_task_id to 57 (and update any dependent
code/comments) so all field numbers are sequential; ensure you update only the
declaration order/number and do not change field names or semantics in the
message containing array_index_start, array_index_end, array_index_stride, and
array_task_id.
- Around line 438-443: The protobuf fields are out-of-order: optional uint32
array_task_id (array_task_id) is declared with tag 35 before repeated uint32
task_node_list (task_node_list) with tag 34; reorder the declarations so the
lower tag appears first or renumber tags to be sequential (e.g., swap the two
declarations or change array_task_id to 34 and task_node_list to 35) while
keeping the existing field names and comments intact to preserve semantics and
documentation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 677958c1-75d3-48be-b4df-f0b377de5781
📒 Files selected for processing (3)
internal/cbatch/cbatch.gointernal/util/string.goprotos/PublicDefs.proto
|
|
||
| // Per-job array task ID filters. Key is job_id, value is list of | ||
| // array_task_ids to cancel. | ||
| map<uint32, ArrayTaskIds> filter_array_task_ids = 9; |
There was a problem hiding this comment.
这个的解析太复杂了,我们可以直接让JobStepIds支持array id就行了
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (5)
internal/cacct/job_selector.go (1)
9-12: Global state may cause issues in concurrent or test scenarios.Package-level variables for selector state can lead to race conditions if
cacctoperations are ever parallelized, and make unit testing harder due to shared state leakage between tests.Consider encapsulating this state in a struct that is passed through the call chain, or ensure tests always call
resetJobIdSelectors()in setup/teardown.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/cacct/job_selector.go` around lines 9 - 12, The package-level variables jobIdSelectors and hasArrayTaskSelectors create shared mutable global state; replace them by encapsulating selector state into a struct (e.g., JobSelectorState with fields JobIdSelectors and HasArrayTaskSelectors) and thread that struct through the functions that currently read/write jobIdSelectors/hasArrayTaskSelectors (or alternatively ensure every test/setup calls resetJobIdSelectors() to clear state), updating callers to accept the new state parameter and removing the globals to avoid races and test leakage.internal/cqueue/job_selector.go (2)
9-14: Global state may cause issues in concurrent or test scenarios.Same concern as
internal/cacct/job_selector.go- package-level variables can lead to race conditions and test isolation issues.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/cqueue/job_selector.go` around lines 9 - 14, The package-level variables jobIdSelectors, hasArrayTaskSelectors, allJobSelections, and arrayTaskIdSelections create global mutable state; refactor by introducing a JobSelector struct (e.g., type JobSelector struct { jobIdSelectors []util.JobIdSelector; hasArrayTaskSelectors bool; allJobSelections map[uint32]struct{}; arrayTaskIdSelections map[uint32]map[uint32]struct{} }) and move all functions that read/modify those globals to be methods on *JobSelector, update constructors to initialize the maps, and change call sites to create and pass a JobSelector instance instead of relying on globals to eliminate races and improve test isolation.
47-55: Clarify the semantics ofnilvalues in the filter map.The function sets
filterIds[selector.JobId] = nilwhich differs from the cacct version that uses&protos.JobStepIds{Steps: []uint32{}}for "all steps". Ifnilis intentionally used to mean "job exists, fetch all data", add a comment. Otherwise, consider using an emptyJobStepIdsfor consistency.func buildFilterIdsFromJobSelectors() map[uint32]*protos.JobStepIds { filterIds := make(map[uint32]*protos.JobStepIds) for _, selector := range jobIdSelectors { if _, ok := filterIds[selector.JobId]; !ok { - filterIds[selector.JobId] = nil + // nil indicates "all steps" for this job + filterIds[selector.JobId] = nil } } return filterIds }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/cqueue/job_selector.go` around lines 47 - 55, The current buildFilterIdsFromJobSelectors function uses nil values for map entries (filterIds[selector.JobId] = nil) which is ambiguous compared to the cacct version that uses &protos.JobStepIds{Steps: []uint32{}} to mean “all steps”; update buildFilterIdsFromJobSelectors to either (A) keep nil intentionally but add a concise comment above the function explaining that a nil *protos.JobStepIds signals “job exists — fetch all steps” and ensure any callers check for nil accordingly, or (B) change the map value to a non-nil empty struct for consistency by setting filterIds[selector.JobId] = &protos.JobStepIds{Steps: []uint32{}} so semantics match cacct; reference buildFilterIdsFromJobSelectors, jobIdSelectors and protos.JobStepIds when making the change.protos/PublicDefs.proto (1)
240-247: Field numbers are out of order in JobToCtld.Field 55 (
array_task_id) is defined after field 56 (array_index_stride). While protobuf allows this, it reduces readability and can confuse maintainers. Consider reordering for clarity.// Job array spec for this submission (inclusive range). optional uint32 array_index_start = 53; optional uint32 array_index_end = 54; - optional uint32 array_index_stride = 56; + optional uint32 array_index_stride = 55; // The concrete array task index of this job after expansion. - optional uint32 array_task_id = 55; + optional uint32 array_task_id = 56;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@protos/PublicDefs.proto` around lines 240 - 247, In the JobToCtld message reorder the fields so numeric order is logical: move optional uint32 array_task_id (field 55) to appear before optional uint32 array_index_stride (field 56) so the sequence is array_index_start (53), array_index_end (54), array_task_id (55), array_index_stride (56); update only the field ordering in the message definition (JobToCtld) to improve readability without changing field numbers or types.internal/cqueue/output.go (1)
61-64: Semantic mismatch: "Nodes" column displays array task count.For array summary rows, the "Nodes" column shows
info.count(number of array tasks), not the number of nodes. This may confuse users expecting node counts. Consider using a different representation or clarifying in the column value.case "Nodes", "NodeNum": - row[i] = strconv.Itoa(info.count) + row[i] = "-" // Node count varies per task; see NodeList/Reason for task count case "NodeList/Reason", "NodeList": row[i] = fmt.Sprintf("ArrayTasks=%d", info.count)Alternatively, keep as-is if this matches the expected behavior from similar tools (e.g., Slurm's
squeue).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/cqueue/output.go` around lines 61 - 64, The "Nodes"/"NodeNum" switch currently uses info.count (array task count) which is semantically wrong for node counts; update the cases in the switch handling "Nodes"/"NodeNum" (and consider "NodeList/Reason"/"NodeList") so that when rendering an array-summary row you show the actual node count (e.g., len(info.NodeList) or info.nodeCount) or a clear placeholder like "-" if node information is not available, instead of info.count; locate the switch in internal/cqueue/output.go and replace the use of info.count for "Nodes"/"NodeNum" with the proper node-count field or computed length, and ensure "NodeList"/"NodeList/Reason" still represents array task details (or clarifies via "ArrayTasks=…") consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/cqueue/job_selector.go`:
- Around line 103-109: formatStepIdForDisplay currently dereferences parentJob
without nil-check which can panic; add a defensive nil check in
formatStepIdForDisplay and use parentJob.ArrayJobId and parentJob.ArrayTaskId
only when parentJob != nil, otherwise pass an empty/default arrayJobId and
arrayTaskId into util.ResolveArrayJobId and util.FormatStepIdWithArray so the
function behaves safely when parentJob is nil.
---
Nitpick comments:
In `@internal/cacct/job_selector.go`:
- Around line 9-12: The package-level variables jobIdSelectors and
hasArrayTaskSelectors create shared mutable global state; replace them by
encapsulating selector state into a struct (e.g., JobSelectorState with fields
JobIdSelectors and HasArrayTaskSelectors) and thread that struct through the
functions that currently read/write jobIdSelectors/hasArrayTaskSelectors (or
alternatively ensure every test/setup calls resetJobIdSelectors() to clear
state), updating callers to accept the new state parameter and removing the
globals to avoid races and test leakage.
In `@internal/cqueue/job_selector.go`:
- Around line 9-14: The package-level variables jobIdSelectors,
hasArrayTaskSelectors, allJobSelections, and arrayTaskIdSelections create global
mutable state; refactor by introducing a JobSelector struct (e.g., type
JobSelector struct { jobIdSelectors []util.JobIdSelector; hasArrayTaskSelectors
bool; allJobSelections map[uint32]struct{}; arrayTaskIdSelections
map[uint32]map[uint32]struct{} }) and move all functions that read/modify those
globals to be methods on *JobSelector, update constructors to initialize the
maps, and change call sites to create and pass a JobSelector instance instead of
relying on globals to eliminate races and improve test isolation.
- Around line 47-55: The current buildFilterIdsFromJobSelectors function uses
nil values for map entries (filterIds[selector.JobId] = nil) which is ambiguous
compared to the cacct version that uses &protos.JobStepIds{Steps: []uint32{}} to
mean “all steps”; update buildFilterIdsFromJobSelectors to either (A) keep nil
intentionally but add a concise comment above the function explaining that a nil
*protos.JobStepIds signals “job exists — fetch all steps” and ensure any callers
check for nil accordingly, or (B) change the map value to a non-nil empty struct
for consistency by setting filterIds[selector.JobId] = &protos.JobStepIds{Steps:
[]uint32{}} so semantics match cacct; reference buildFilterIdsFromJobSelectors,
jobIdSelectors and protos.JobStepIds when making the change.
In `@internal/cqueue/output.go`:
- Around line 61-64: The "Nodes"/"NodeNum" switch currently uses info.count
(array task count) which is semantically wrong for node counts; update the cases
in the switch handling "Nodes"/"NodeNum" (and consider
"NodeList/Reason"/"NodeList") so that when rendering an array-summary row you
show the actual node count (e.g., len(info.NodeList) or info.nodeCount) or a
clear placeholder like "-" if node information is not available, instead of
info.count; locate the switch in internal/cqueue/output.go and replace the use
of info.count for "Nodes"/"NodeNum" with the proper node-count field or computed
length, and ensure "NodeList"/"NodeList/Reason" still represents array task
details (or clarifies via "ArrayTasks=…") consistently.
In `@protos/PublicDefs.proto`:
- Around line 240-247: In the JobToCtld message reorder the fields so numeric
order is logical: move optional uint32 array_task_id (field 55) to appear before
optional uint32 array_index_stride (field 56) so the sequence is
array_index_start (53), array_index_end (54), array_task_id (55),
array_index_stride (56); update only the field ordering in the message
definition (JobToCtld) to improve readability without changing field numbers or
types.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9fce1098-4be6-4fec-8c38-90195097a006
📒 Files selected for processing (9)
internal/cacct/cacct.gointernal/cacct/job_selector.gointernal/cbatch/cbatch.gointernal/cqueue/job_selector.gointernal/cqueue/output.gointernal/cqueue/step.gointernal/util/job_id_selector.goprotos/Crane.protoprotos/PublicDefs.proto
✅ Files skipped from review due to trivial changes (1)
- internal/cacct/cacct.go
🚧 Files skipped from review as they are similar to previous changes (3)
- internal/cqueue/step.go
- internal/cbatch/cbatch.go
- internal/util/job_id_selector.go
| func formatStepIdForDisplay(stepInfo *protos.StepInfo, parentJob *protos.JobInfo) string { | ||
| return util.FormatStepIdWithArray( | ||
| util.ResolveArrayJobId(stepInfo.JobId, parentJob.ArrayJobId), | ||
| parentJob.ArrayTaskId, | ||
| stepInfo.StepId, | ||
| ) | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check how formatStepIdForDisplay is called to verify if parentJob can be nil
rg -n "formatStepIdForDisplay" --type=go -C3Repository: PKUHPC/CraneSched-FrontEnd
Length of output: 1328
🏁 Script executed:
#!/bin/bash
# Find StepData structure definition
rg -n "type StepData" --type=go -A 10Repository: PKUHPC/CraneSched-FrontEnd
Length of output: 721
🏁 Script executed:
#!/bin/bash
# Check the context around the call site in step.go at line 86
sed -n '70,100p' internal/cqueue/step.goRepository: PKUHPC/CraneSched-FrontEnd
Length of output: 846
🏁 Script executed:
#!/bin/bash
# Check the context around the call site in step.go at line 133
sed -n '120,145p' internal/cqueue/step.goRepository: PKUHPC/CraneSched-FrontEnd
Length of output: 591
🏁 Script executed:
#!/bin/bash
# Search for other potential calls to formatStepIdForDisplay that might not have been caught
rg -n "formatStepIdForDisplay" --type=goRepository: PKUHPC/CraneSched-FrontEnd
Length of output: 369
🏁 Script executed:
#!/bin/bash
# Find where stepDataList is populated
rg -n "stepDataList" --type=go -B 5 | head -100Repository: PKUHPC/CraneSched-FrontEnd
Length of output: 4863
Add nil check for parentJob parameter.
formatStepIdForDisplay accesses parentJob.ArrayJobId and parentJob.ArrayTaskId without checking if parentJob is nil. While stepData.job comes from populated slice data, the pointer type allows for nil values. Add a defensive check:
Suggested fix
func formatStepIdForDisplay(stepInfo *protos.StepInfo, parentJob *protos.JobInfo) string {
+ if parentJob == nil {
+ return fmt.Sprintf("%d.%d", stepInfo.JobId, stepInfo.StepId)
+ }
return util.FormatStepIdWithArray(
util.ResolveArrayJobId(stepInfo.JobId, parentJob.ArrayJobId),
parentJob.ArrayTaskId,
stepInfo.StepId,
)
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func formatStepIdForDisplay(stepInfo *protos.StepInfo, parentJob *protos.JobInfo) string { | |
| return util.FormatStepIdWithArray( | |
| util.ResolveArrayJobId(stepInfo.JobId, parentJob.ArrayJobId), | |
| parentJob.ArrayTaskId, | |
| stepInfo.StepId, | |
| ) | |
| } | |
| func formatStepIdForDisplay(stepInfo *protos.StepInfo, parentJob *protos.JobInfo) string { | |
| if parentJob == nil { | |
| return fmt.Sprintf("%d.%d", stepInfo.JobId, stepInfo.StepId) | |
| } | |
| return util.FormatStepIdWithArray( | |
| util.ResolveArrayJobId(stepInfo.JobId, parentJob.ArrayJobId), | |
| parentJob.ArrayTaskId, | |
| stepInfo.StepId, | |
| ) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/cqueue/job_selector.go` around lines 103 - 109,
formatStepIdForDisplay currently dereferences parentJob without nil-check which
can panic; add a defensive nil check in formatStepIdForDisplay and use
parentJob.ArrayJobId and parentJob.ArrayTaskId only when parentJob != nil,
otherwise pass an empty/default arrayJobId and arrayTaskId into
util.ResolveArrayJobId and util.FormatStepIdWithArray so the function behaves
safely when parentJob is nil.
8f79957 to
d87dfea
Compare
Summary by CodeRabbit
New Features
Documentation
Behavior Changes