[typist] Go type consistency audit: duplicate clusters and the any-to-typed migration #34644
Closed
This discussion was automatically closed because it expired on 2026-05-26T12:50:43.916Z.
Hey team, I just ran a type-consistency sweep over the Go code in pkg/ (839 files, 575 struct definitions, ~1900 any occurrences in 325 non-test files) and there are some really nice wins available. The big-picture story is two-fold. First, the pkg/cli log-analysis pipeline has three slightly different shapes for "one workflow run plus its analyses" that have to be hand-copied between each other, and the pkg/workflow SafeOutput family has ~25 nearly identical config structs. Second, there's already a half-finished migration from map[string]any → typed structs (WorkflowData.ParsedTools, WorkflowStep, RuntimesConfig); finishing it would knock out a large fraction of the remaining any surface.

Nothing here is urgent; this is a "where would refactoring time pay off most" map. Below is the full breakdown with file:line citations, organized by impact.
Full Analysis Report
Duplicated Type Definitions
Summary Statistics
Scope: .go files under pkg/ (839 files, 575 struct definitions)

Cluster 1: SafeOutput entity-config family - Near duplicates
Type: Near duplicate (~25 structs differ only by docstring + a tiny field set)
Impact: High — largest concentration of duplicated declarations in the repo
Representative locations:
pkg/workflow/add_labels.go:10 - AddLabelsConfig { BaseSafeOutputConfig; SafeOutputTargetConfig; SafeOutputFilterConfig; Allowed []string; Blocked []string }
pkg/workflow/remove_labels.go:10 - RemoveLabelsConfig { ...same three embeds; Allowed []string; Blocked []string } (byte-identical apart from the type name)
pkg/workflow/assign_to_user.go:10 - AssignToUserConfig { ...same three embeds; Allowed; Blocked; UnassignFirst *string }
pkg/workflow/unassign_from_user.go:10, pkg/workflow/set_issue_type.go:10, pkg/workflow/set_issue_field.go:8, pkg/workflow/hide_comment.go:10, pkg/workflow/link_sub_issue.go:10, pkg/workflow/assign_milestone.go:10, pkg/workflow/mark_pull_request_as_ready_for_review.go:10, pkg/workflow/merge_pull_request.go:8, pkg/workflow/resolve_pr_review_thread.go:12, pkg/workflow/reply_to_pr_review_comment.go:11, pkg/workflow/dispatch_workflow.go:10, pkg/workflow/noop.go:10, pkg/workflow/call_workflow.go:13, pkg/workflow/create_check_run.go:14, pkg/workflow/create_agent_session.go:10, pkg/workflow/autofix_code_scanning_alert.go:10, pkg/workflow/add_reviewer.go:10, pkg/workflow/push_to_pull_request_branch.go:14, pkg/workflow/submit_pr_review.go:15, pkg/workflow/update_release.go:10
Recommendation: Introduce a single shared SafeOutputAllowBlockConfig embeddable (the three existing base embeds + Allowed []string + Blocked []string). Then AddLabelsConfig / RemoveLabelsConfig collapse to type AddLabelsConfig = SafeOutputAllowBlockConfig (or thin embeds). Estimated effort: 4-6 hours. Benefit: ~150 lines of boilerplate gone and the per-tool parse*Config helpers can share more.

Cluster 2: ProcessedRun / RunSummary / DownloadResult - Near duplicates
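The three structs in this cluster repeat the same analysis fields, and Go struct embedding is one way to share them. The following is a minimal sketch, not the actual pkg/cli code: the field types are simplified placeholders, only two of the analysis fields are shown, RunID stands in for the real Run field, and toProcessedRun is a hypothetical helper.

```go
package main

import "fmt"

// Placeholder analysis types; the real definitions in pkg/cli are richer.
type FirewallAnalysis struct{ Blocked int }
type TokenUsageSummary struct{ TotalInputTokens int }

// RunAnalyses groups analysis fields shared by every run shape
// (only two of the shared fields are shown here).
type RunAnalyses struct {
	FirewallAnalysis *FirewallAnalysis
	TokenUsage       *TokenUsageSummary
}

// ProcessedRun keeps only its role-specific fields next to the embed.
type ProcessedRun struct {
	RunID int64 // assumption: stand-in for the real Run field
	RunAnalyses
}

// DownloadResult adds download bookkeeping on top of the same embed.
type DownloadResult struct {
	RunID    int64
	LogsPath string
	Cached   bool
	RunAnalyses
}

// toProcessedRun (hypothetical) copies all shared analyses with one assignment
// instead of one assignment per field.
func toProcessedRun(d DownloadResult) ProcessedRun {
	return ProcessedRun{RunID: d.RunID, RunAnalyses: d.RunAnalyses}
}

func main() {
	d := DownloadResult{
		RunID:    42,
		LogsPath: "/tmp/logs",
		RunAnalyses: RunAnalyses{
			FirewallAnalysis: &FirewallAnalysis{Blocked: 3},
			TokenUsage:       &TokenUsageSummary{TotalInputTokens: 1200},
		},
	}
	p := toProcessedRun(d)
	// Embedded fields are promoted, so call sites keep reading p.FirewallAnalysis.
	fmt.Println(p.FirewallAnalysis.Blocked, p.TokenUsage.TotalInputTokens)
}
```

Because embedded fields are promoted, existing readers such as p.FirewallAnalysis would keep compiling; only composite literals that set the shared fields by name would need to go through the RunAnalyses key.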
Type: Near duplicate (same conceptual record, three different shapes)
Impact: High - these are central to the pkg/cli logs pipeline and currently require hand-copying between shapes
Locations:
pkg/cli/logs_models.go:90 - ProcessedRun { Run, AwContext, TaskDomain, BehaviorFingerprint, AgenticAssessments, AccessAnalysis, FirewallAnalysis, PolicyAnalysis, RedactedDomainsAnalysis, MissingTools, MissingData, Noops, MCPFailures, MCPToolUsage, TokenUsage, GitHubRateLimitUsage, JobDetails } (Run plus 16 analysis fields)
pkg/cli/logs_models.go:203 - RunSummary { CLIVersion, RunID, ProcessedAt, ...same 16+ analyses; ArtifactsList }
pkg/cli/logs_models.go:229 - DownloadResult { Run, Metrics, ...13 of the same analyses; Error, Skipped, Cached, LogsPath }
13 of the analysis fields appear verbatim in all three structs.
Recommendation: Extract type RunAnalyses struct { AwContext *AwContext; TaskDomain *TaskDomainInfo; BehaviorFingerprint *BehaviorFingerprint; AgenticAssessments []AgenticAssessment; AccessAnalysis *DomainAnalysis; FirewallAnalysis *FirewallAnalysis; RedactedDomainsAnalysis *RedactedDomainsAnalysis; MissingTools []MissingToolReport; MissingData []MissingDataReport; Noops []NoopReport; MCPFailures []MCPFailureReport; MCPToolUsage *MCPToolUsageData; TokenUsage *TokenUsageSummary; GitHubRateLimitUsage *GitHubRateLimitUsage; JobDetails []JobInfoWithDuration } and embed it. The three outer types keep only the role-specific fields (CLIVersion/ProcessedAt for RunSummary; Error/Skipped/Cached/LogsPath for DownloadResult). Estimated effort: 2-3 hours. Benefit: the manual field-copying between the three structs disappears.

Cluster 3: MCPServerStats / MCPServerHealthDetail / MCPServerCrossRunHealth - Semantic duplicates
Locations:
pkg/cli/audit_report.go:197 - MCPServerStats { ServerName, RequestCount, ToolCallCount, TotalInputSize, TotalOutputSize, AvgDuration, ErrorCount }
pkg/cli/audit_expanded.go:88 - MCPServerHealthDetail { ServerName, RequestCount, ToolCalls, ErrorCount, ErrorRate, ErrorRateStr, AvgLatency, Status }
pkg/cli/audit_cross_run.go:89 - MCPServerCrossRunHealth { ServerName, RunsConnected, TotalRuns, TotalCalls, TotalErrors, ErrorRate, Unreliable }
Recommendation: Define a small MCPServerCoreStats { ServerName; Requests; ToolCalls; Errors } shared by pkg/cli/audit_* files and embed it in the three role-specific extensions. Impact: medium - clearer code in three audit reports, easier to keep them in sync.

Cluster 4: WorkflowTrialResult / TrialArtifacts - Near duplicates
Locations:
pkg/cli/trial_types.go:6 - WorkflowTrialResult { WorkflowName; RunID; SafeOutputs map[string]any; AgenticRunInfo map[string]any; AdditionalArtifacts map[string]any; Timestamp time.Time }
pkg/cli/trial_support.go:19 - TrialArtifacts { SafeOutputs map[string]any; AgenticRunInfo map[string]any; AdditionalArtifacts map[string]any }
Recommendation: type WorkflowTrialResult struct { TrialArtifacts; WorkflowName; RunID; Timestamp }. 30-minute change.

Cluster 5: PRInfo / PullRequest - Semantic duplicates
Locations:
pkg/cli/pr_command.go:27 - PRInfo { Number, Title, Body, State, HeadSHA, BaseBranch, HeadBranch, SourceRepo, TargetRepo, AuthorLogin }
pkg/cli/pr_automerge.go:21 - PullRequest { Number, Title, IsDraft, Mergeable, CreatedAt, UpdatedAt }
Both represent a GitHub PR in the same package, and both unmarshal gh pr --json output.
Recommendation: Merge into a single PullRequest with the union of fields + omitempty JSON tags so each command requests only what it needs from gh pr --json.

Cluster 6: AccessLogSummary / FirewallLogSummary - Near duplicates
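One option for this cluster is a generic summary parameterized by the per-workflow value type. The sketch below assumes Go 1.18+ generics; the placeholder analysis types, the Record method, and the choice of the AllowedRequests/BlockedRequests naming are illustrative assumptions, not the existing pkg/cli code.

```go
package main

import "fmt"

// Placeholder per-workflow value types; the real ones live in pkg/cli.
type DomainAnalysis struct{ Domains []string }
type FirewallAnalysis struct{ Requests int }

// LogDomainSummary holds the fields both summaries have in common.
// T is the per-workflow analysis type.
type LogDomainSummary[T any] struct {
	TotalRequests   int
	AllowedRequests int
	BlockedRequests int
	AllowedDomains  []string
	BlockedDomains  []string
	ByWorkflow      map[string]*T
}

// Record (hypothetical) counts one request and appends its domain to the
// matching list; it does not deduplicate domains.
func (s *LogDomainSummary[T]) Record(domain string, allowed bool) {
	s.TotalRequests++
	if allowed {
		s.AllowedRequests++
		s.AllowedDomains = append(s.AllowedDomains, domain)
		return
	}
	s.BlockedRequests++
	s.BlockedDomains = append(s.BlockedDomains, domain)
}

// AccessLogSummary needs nothing beyond the shared fields, so an alias is enough.
type AccessLogSummary = LogDomainSummary[DomainAnalysis]

// FirewallLogSummary embeds the shared part and adds its one extra field.
type FirewallLogSummary struct {
	LogDomainSummary[FirewallAnalysis]
	RequestsByDomain map[string]int
}

func main() {
	var access AccessLogSummary
	access.Record("example.com", true)

	fw := FirewallLogSummary{RequestsByDomain: map[string]int{}}
	fw.Record("blocked.test", false)
	fw.RequestsByDomain["blocked.test"]++

	fmt.Println(access.AllowedRequests, fw.BlockedRequests, fw.TotalRequests)
}
```

If generics feel heavy for two call sites, normalizing the field names alone gets most of the readability benefit; the generic form mainly pays off when helpers such as Record are shared.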
Locations (same file):
pkg/cli/logs_report_firewall.go:12 - AccessLogSummary { TotalRequests, AllowedCount, BlockedCount, AllowedDomains, BlockedDomains, ByWorkflow map[string]*DomainAnalysis }
pkg/cli/logs_report_firewall.go:22 - FirewallLogSummary { TotalRequests, AllowedRequests, BlockedRequests, AllowedDomains, BlockedDomains, RequestsByDomain, ByWorkflow map[string]*FirewallAnalysis }
Recommendation: A shared LogDomainSummary parameterized by the per-workflow value type, or simply normalize the field names (AllowedRequests/BlockedRequests is more descriptive) and share most fields.

Cluster 7: Audit Diff *Summary cluster - Near duplicates
Locations (same file):
pkg/cli/audit_diff.go:50 - FirewallDiffSummary { NewDomainCount, RemovedDomainCount, StatusChangeCount, VolumeChangeCount, HasAnomalies, AnomalyCount }
pkg/cli/audit_diff.go:240 - MCPToolsDiffSummary { NewToolCount, RemovedToolCount, ChangedToolCount, HasAnomalies, AnomalyCount }
pkg/cli/audit_diff.go:299 - ToolCallsDiffSummary { NewToolCount, RemovedToolCount, ChangedToolCount, Run1TotalCalls, Run2TotalCalls }
Recommendation: Generic DiffSummary[T] struct { New, Removed, Changed int; HasAnomalies bool; AnomalyCount int } and per-type extensions for totals. Standardize on "New/Removed/Changed" instead of "NewDomain/NewTool".

Cluster 8: Engine identity (AuditEngineConfig / EngineConfig / AwInfo) - Semantic duplicates
Locations:
pkg/cli/audit_expanded.go:21 - AuditEngineConfig { EngineID, EngineName, Model, Version, CLIVersion, FirewallVersion, MCPServers, TriggerEvent, Repository }
pkg/workflow/engine.go:29 - EngineConfig (workflow runtime config)
pkg/workflow/engine_definition.go:178 - ResolvedEngineTarget
Field names diverge between layers (ID vs EngineID), forcing manual mapping in extractEngineConfigWithInferredEngine.
Recommendation: Introduce EngineIdentity { EngineID, EngineName, Model, Version } in pkg/types, embed in both workflow.EngineConfig and cli.AuditEngineConfig. Eliminates the manual mapping.

Cluster 9: Workflow representation triple - Near duplicates
Locations (all in pkg/cli):
pkg/cli/add_workflow_resolution.go:21 - ResolvedWorkflow { Spec *WorkflowSpec; Content []byte; SourceInfo *FetchedWorkflow; ... } (the struct comment explicitly notes Content is a "convenience accessor" duplicating SourceInfo.Content)
pkg/cli/fetch.go:35 - FetchedWorkflow { Content []byte; CommitSHA; IsLocal; SourcePath; ConvertedFromJSON; JSONConversionWarnings }
pkg/cli/spec.go:33 - WorkflowSpec
pkg/cli/jsonworkflow_to_markdown.go:137 - GeneratedWorkflow
pkg/cli/workflows.go:61 - GitHubWorkflow { ID, Name, Path, State }
Recommendation: Drop the duplicated Content []byte from ResolvedWorkflow (always read through SourceInfo); rename GitHubWorkflow → GitHubWorkflowListItem (different concept: GitHub API listing payload).

Cluster 10: ModelTokenUsage / ModelTokenUsageRow - Near duplicates
Locations:
pkg/cli/token_usage.go:66 - ModelTokenUsage { Provider, InputTokens, OutputTokens, CacheReadTokens, CacheWriteTokens, Requests, DurationMs, ResponseBytes, EffectiveTokens }
pkg/cli/token_usage.go:79 - ModelTokenUsageRow { Model, Provider, InputTokens, OutputTokens, CacheReadTokens, CacheWriteTokens, EffectiveTokens, Requests, AvgDuration }
Recommendation: Drop ModelTokenUsageRow; render the map directly with a display helper that formats DurationMs at print time.

Cluster 11: TokenUsageEntry / TokenUsageSummary / TokenUsageDiff token fields - Near duplicates
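For this cluster, a single value type for the five token classes lets the entry, summary, and diff structs share arithmetic as well as field names. This is a rough sketch under stated assumptions: the Add, Sub, Delta, and summarize helpers, the Model field, and the Totals field name are all hypothetical and not taken from pkg/cli.

```go
package main

import "fmt"

// TokenCounts is the one place where token classes are listed.
type TokenCounts struct {
	Input, Output, CacheRead, CacheWrite, Effective int
}

// Add returns the field-wise sum of two counts.
func (t TokenCounts) Add(o TokenCounts) TokenCounts {
	return TokenCounts{
		Input:      t.Input + o.Input,
		Output:     t.Output + o.Output,
		CacheRead:  t.CacheRead + o.CacheRead,
		CacheWrite: t.CacheWrite + o.CacheWrite,
		Effective:  t.Effective + o.Effective,
	}
}

// Sub returns the field-wise difference t - o.
func (t TokenCounts) Sub(o TokenCounts) TokenCounts {
	return TokenCounts{
		Input:      t.Input - o.Input,
		Output:     t.Output - o.Output,
		CacheRead:  t.CacheRead - o.CacheRead,
		CacheWrite: t.CacheWrite - o.CacheWrite,
		Effective:  t.Effective - o.Effective,
	}
}

// TokenUsageEntry embeds the counts; Model is an illustrative extra field.
type TokenUsageEntry struct {
	Model string
	TokenCounts
}

// TokenUsageSummary stores totals as a value of the same type.
type TokenUsageSummary struct{ Totals TokenCounts }

// TokenUsageDiff holds one TokenCounts per run instead of ten paired fields.
type TokenUsageDiff struct{ Run1, Run2 TokenCounts }

// Delta reports how Run2 changed relative to Run1.
func (d TokenUsageDiff) Delta() TokenCounts { return d.Run2.Sub(d.Run1) }

// summarize folds all entries into a summary.
func summarize(entries []TokenUsageEntry) TokenUsageSummary {
	var s TokenUsageSummary
	for _, e := range entries {
		s.Totals = s.Totals.Add(e.TokenCounts)
	}
	return s
}

func main() {
	s := summarize([]TokenUsageEntry{
		{Model: "model-a", TokenCounts: TokenCounts{Input: 100, Output: 40}},
		{Model: "model-b", TokenCounts: TokenCounts{Input: 20, CacheRead: 5}},
	})
	d := TokenUsageDiff{Run1: TokenCounts{Input: 100}, Run2: s.Totals}
	fmt.Printf("%+v\n%+v\n", s.Totals, d.Delta())
}
```

With this shape, a new token class would mean one new field in TokenCounts plus one line each in Add and Sub; the entry, summary, and diff types would not change.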
Locations:
pkg/cli/token_usage.go:23 - TokenUsageEntry { InputTokens, OutputTokens, CacheReadTokens, CacheWriteTokens, EffectiveTokens, ... }
pkg/cli/token_usage.go:51 - TokenUsageSummary { TotalInputTokens, TotalOutputTokens, TotalCacheReadTokens, TotalCacheWriteTokens, TotalEffectiveTokens, ... }
pkg/cli/audit_diff.go:250 - TokenUsageDiff { Run1InputTokens, Run2InputTokens, Run1OutputTokens, Run2OutputTokens, Run1CacheReadTokens, Run2CacheReadTokens, Run1CacheWriteTokens, Run2CacheWriteTokens, Run1EffectiveTokens, Run2EffectiveTokens, ... }
Recommendation: type TokenCounts struct { Input, Output, CacheRead, CacheWrite, Effective int }. Embed in entry, use as value in summary totals, use a Run1, Run2 TokenCounts pair in the diff. Adding a sixth token class today requires editing three places; after this change, one.

Cluster 12: AddCommentConfig (deprecated empty) - Dead-code leftover
Location: pkg/workflow/add_comment.go:10 - type AddCommentConfig struct{} with comment saying "deprecated, use AddCommentsConfig"
Recommendation: Delete after verifying no callers, or alias type AddCommentConfig = AddCommentsConfig. 5-minute change.

Untyped Usages
Summary Statistics
interface{} usages: 0 (the codebase uses any exclusively, which is good)
any usages (non-test): ~1900 across 325 files
map[string]any usages (non-test): dominant pattern; top-50 files account for >700 occurrences
Files with heaviest untyped usage:
pkg/parser/import_field_extractor.go - 74 any
pkg/workflow/compiler_safe_outputs_handlers.go - 52 any
pkg/parser/schema_suggestions.go - 45 any
pkg/parser/mcp.go - 42 any
pkg/workflow/tools.go, pkg/workflow/role_checks.go - 32 any each
pkg/workflow/trigger_parser.go - 31 any
Important context: The project has already invested in pkg/typeutil/convert.go (Lookup/Parse/Convert helpers) and pkg/workflow/tools_types.go (a Tools struct) explicitly to migrate away from map[string]any. The recommendations below align with that migration.

Category 1: any in function parameters (runtime type is consistent)

Example 1: getGitHub* accessor family
pkg/workflow/mcp_github_config.go:138, 163, 182, 194, 204, 264, 284, 414, 532
func getGitHubToken(githubTool any) string, getGitHubLockdown(githubTool any) bool, getGitHubType(githubTool any) string, ... (about a dozen accessors)
if toolConfig, ok := githubTool.(map[string]any); ok { ... } - the value is always the GitHub tool sub-map from frontmatter
func getGitHubToken(githubTool *GitHubToolConfig) string - WorkflowData.ParsedTools.GitHub is already typed

Example 2: PermissionsParser
pkg/workflow/permissions_parser.go:244 - func NewPermissionsParserFromValue(permissionsValue any) *PermissionsParser
GitHubActionsPermissionsConfig / GitHubAppPermissionsConfig already exist in pkg/workflow/frontmatter_types.go
FrontmatterConfig.PermissionsTyped

Example 3: Job-dependency helpers
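The job-dependency helpers deal with a needs value that is either a single string or a list. A string-or-slice type can absorb that difference once, at decode time. The sketch below uses encoding/json only to stay within the standard library; the real frontmatter is YAML, so the equivalent hook of whichever YAML library the project uses would be needed instead. The job names, parseJob, and dependsOn are illustrative assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StringOrSlice accepts either "a" or ["a", "b"] and always exposes a slice.
type StringOrSlice []string

// UnmarshalJSON normalizes both accepted shapes into a []string.
func (s *StringOrSlice) UnmarshalJSON(data []byte) error {
	if string(data) == "null" {
		return nil // leave the slice empty
	}
	var one string
	if err := json.Unmarshal(data, &one); err == nil {
		*s = StringOrSlice{one}
		return nil
	}
	var many []string
	if err := json.Unmarshal(data, &many); err != nil {
		return fmt.Errorf("needs must be a string or a list of strings: %w", err)
	}
	*s = many
	return nil
}

// JobConfig shows only the field relevant to dependency checks.
type JobConfig struct {
	Needs StringOrSlice `json:"needs"`
}

// parseJob (hypothetical) decodes one job definition.
func parseJob(raw string) (JobConfig, error) {
	var job JobConfig
	err := json.Unmarshal([]byte(raw), &job)
	return job, err
}

// dependsOn replaces the per-call type assertions with a plain loop.
func dependsOn(job JobConfig, name string) bool {
	for _, n := range job.Needs {
		if n == name {
			return true
		}
	}
	return false
}

func main() {
	single, _ := parseJob(`{"needs": "activation"}`)
	list, _ := parseJob(`{"needs": ["pre_activation", "agent"]}`)
	fmt.Println(dependsOn(single, "activation"), dependsOn(list, "agent"))
}
```

Under this shape the three jobDependsOn* helpers could become one-line calls to dependsOn with different job names.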
pkg/workflow/compiler_jobs.go:55, 72, 92 - jobDependsOnPreActivation(jobConfig map[string]any) bool, jobDependsOnActivation, jobDependsOnAgent
jobConfig["needs"] then asserts to either []any or string
type JobConfig struct { Needs StringOrSlice; ... }

Example 4: Dependabot value normalizers
pkg/workflow/dependabot.go:543, 570, 601 - dependabotToAnySlice(value any) ([]any, bool), dependabotToStringAnyMap(value any) (map[string]any, bool), isYAMLNullOrEmptyScalar(value any) bool
DependabotUpdate struct (matching the documented schema) and unmarshal directly via yaml.Unmarshal once; have helpers operate on the struct
(value, ok) plumbing path

Example 5: OTLP header normalizer
pkg/workflow/observability_otlp.go:30 - normalizeOTLPHeadersForEndpoint(raw any, endpoint string) string
string, map[string]any)
type OTLPHeaders struct { Raw string; Map map[string]string } (or a sum type) parsed once at the frontmatter boundary

Example 6: Workflow inputs extractor
pkg/workflow/workflow_inputs_extractor.go:40 - extractInputsFromParsedWorkflow(workflow map[string]any, trigger string) map[string]any
workflow["on"][trigger]["inputs"], returning empty map on each failure
OnSection / WorkflowInputSpec (the project already has parser.FrontmatterResult and FrontmatterConfig.On)

Category 2: map[string]any struct fields

Example 1: WorkflowData carries 5 untyped maps
pkg/workflow/compiler_types.go:483, 524, 537, 540, 551, 555
Tools map[string]any, Jobs map[string]any, Features map[string]any, Runtimes map[string]any, RawFrontmatter map[string]any, ResolvedMCPServers map[string]any
Tools already has a parallel ParsedTools *Tools field, so Tools is a candidate for removal once readers migrate to ParsedTools
Runtimes already has *RuntimesConfig defined in frontmatter_types.go:21
Features only holds bool+string - map[string]FeatureValue (a typed sum) would be enough
Jobs should become map[string]JobConfig
any usages stem from this one struct holding untyped maps

Example 2: Step lists held as []map[string]any
pkg/workflow/compiler_types.go:499, 568, 746 - WorkflowData.OnSteps, EngineConfigSteps, SecretMaskingConfig.Steps
[]WorkflowStep - WorkflowStep already exists in pkg/workflow/step_types.go:18 with MapToStep / ToMap converters

Example 3: TriggerIR filters
pkg/workflow/trigger_parser.go:23, 29 - Filters map[string]any; AdditionalEvents map[string]any
Filters carries branches/paths/tags/labels - a fixed set
type TriggerFilters struct { Branches []string; BranchesIgnore []string; Paths []string; PathsIgnore []string; Tags []string; TagsIgnore []string; Types []string; Labels []string }
any references in trigger_parser.go becomes field access

Example 4: Import-input maps
pkg/parser/import_field_extractor.go:60, pkg/parser/import_bfs.go:53, 59 - importAccumulator.importInputs, nestedImportEntry.inputs, importBFSState.visitedInputs all map[string]any
type ImportInputValues map[string]InputValue where InputValue is a small sum (string/int/bool); marshalling already JSON-encodes them in marshalImportInputValue
import_field_extractor.go (the #1 hotspot at 74 any) and import_bfs.go (22 any) both rely on this - typing the inputs would cascade

Example 5: SafeOutputs handler-config builders
pkg/workflow/compiler_safe_outputs_handlers.go:6 (handler registry) and pkg/workflow/compiler_safe_outputs_builder.go:112 - type handlerBuilder func(*SafeOutputsConfig) map[string]any
map[string]any keyed by string field names
HandlerConfig struct (or a union of well-known structs); the JSON shape stays the same via json.Marshal
any in one file)

Example 6: GitHub MCP guard policies
pkg/workflow/mcp_github_config.go:284, 414 - getGitHubGuardPolicies(githubTool any) map[string]any, deriveSafeOutputsGuardPolicyFromGitHub(githubTool any) map[string]any
GuardPolicy { Rules []GuardRule } explicitly

Category 3: any struct fields

Example 1: WorkflowStep.ContinueOnError
pkg/workflow/step_types.go:28 - ContinueOnError any // Can be bool or string expression
type BoolOrExpr struct { Bool *bool; Expr string } with custom YAML/JSON unmarshal - there's precedent: TemplatableInt32 and TemplatableBool already exist for the same pattern
Templatable* family

Example 2: FrontmatterConfig.Checkout
pkg/workflow/frontmatter_parsing.go:54 (config.Checkout.(bool))
Checkout any (single obj, array of objs, or false-to-disable)
CheckoutValue union type with custom UnmarshalJSON - the parser already calls ParseCheckoutConfigs(config.Checkout), so the union can be parsed at unmarshal time
.(bool) branch

Example 3: ThreatDetectionConfig.Steps and PostSteps
pkg/workflow/threat_detection.go:19, 20 - both []any
[]WorkflowStep

Example 4: SafeOutputsConfig top-level steps
pkg/workflow/compiler_types.go:692 Steps []any and :746 Steps []map[string]any
[]WorkflowStep
map[string]any before marshalling

Category 4: Untyped constants with semantic meaning
The constants package is already well-typed (pkg/constants/constants.go uses time.Duration, fs.FileMode, semantic types like LineLength, CommandPrefix, WorkflowID). Remaining candidates:

Example 1: Network ports as int literals
pkg/constants/constants.go:100-130 - DefaultMCPGatewayPort = 8080, ClaudeLLMGatewayPort = 10000, etc.
type NetworkPort uint16 and const DefaultMCPGatewayPort NetworkPort = 8080

Example 2: Budget integers
pkg/constants/constants.go:263, 266 - DefaultMaxEffectiveTokens int64 = 25000000, DefaultMaxRuns = 500
type TokenBudget int64, type RunCount int

Example 3: MCP type strings
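For the MCP type strings, a string-backed type with a single parsing function keeps validation at the boundary so the rest of the code can pass a typed value around. This is a sketch, not the pkg/parser implementation: the three string values come from the audit, while the constant names other than MCPTypeStdio and the ParseMCPType helper are assumptions.

```go
package main

import "fmt"

// MCPType is a string-backed enum for MCP server transport types.
type MCPType string

const (
	MCPTypeStdio MCPType = "stdio"
	MCPTypeHTTP  MCPType = "http"  // constant name is an assumption
	MCPTypeLocal MCPType = "local" // constant name is an assumption
)

// ParseMCPType (hypothetical) validates raw input once, at the boundary.
func ParseMCPType(s string) (MCPType, error) {
	switch t := MCPType(s); t {
	case MCPTypeStdio, MCPTypeHTTP, MCPTypeLocal:
		return t, nil
	default:
		return "", fmt.Errorf("unknown MCP type %q", s)
	}
}

func main() {
	for _, raw := range []string{"stdio", "grpc"} {
		t, err := ParseMCPType(raw)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", t)
	}
}
```

Functions that previously took a string and re-checked it could then take an MCPType parameter, which makes an unvalidated value harder to pass by accident.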
pkg/parser/mcp.go:21 - var ValidMCPTypes = []string{"stdio", "http", "local"} plus a stringly-typed IsMCPType switch
type MCPType string; const ( MCPTypeStdio MCPType = "stdio"; ... )
IsMCPType(s string) into a type-safe enum

Example 4: AI engine identifier
pkg/workflow/compiler_types.go:486 - AI string // "claude" or "codex"
type EngineID string; const ( EngineClaude EngineID = "claude"; EngineCodex EngineID = "codex"; ... )

Acceptable any uses (intentional - do not change)
pkg/parser/schema_suggestions.go, pkg/parser/schema_compiler.go - operate on parsed JSON schema (arbitrary user-supplied structure)
pkg/parser/yaml_import.go, frontmatter_hash.go, frontmatter_content.go - root YAML/JSON deserialization sinks
pkg/typeutil/convert.go - purpose-built helpers for heterogeneous values; this is the place where any legitimately lives
pkg/workflow/yaml.go:371 formatYAMLValue - generic YAML formatter, must accept any node value
pkg/linters/* - AST analyzers receive ast.Node, intrinsically polymorphic
pkg/cli/codemod_* (~40 files) - codemod framework operates on raw YAML nodes; any here is structural, not domain
pkg/workflow/expression_extraction.go:505 marshalImportInputValue(value any) string - generic JSON marshaller for arbitrary input values

Recommended Action Plan (Prioritized)
Priority 1 — Highest leverage
WorkflowData.ParsedTools migration (pkg/workflow/compiler_types.go:483). Remove the parallel Tools map[string]any field. This unlocks typing the entire getGitHub* accessor family (pkg/workflow/mcp_github_config.go) and similar Claude/Codex/Gemini tool expanders.
RunAnalyses embeddable to deduplicate ProcessedRun / RunSummary / DownloadResult in pkg/cli/logs_models.go. ~2-3 hours, removes hand-copying between three structs.
pkg/parser/import_field_extractor.go (the single hottest any file at 74 occurrences). Cascades to import_bfs.go.

Priority 2 - Significant cleanup
SafeOutputAllowBlockConfig base + thin per-entity wrappers. ~25 structs in pkg/workflow/ lose their duplicated bodies.
[]WorkflowStep everywhere []map[string]any is currently used (compiler_types.go:499, 568, 746, threat_detection.go:19, 20). The type and converters already exist.
TriggerIR.Filters as TriggerFilters struct in pkg/workflow/trigger_parser.go.
TokenCounts (Cluster 11) - eliminates triple maintenance.

Priority 3 - Polish
PRInfo + PullRequest (Cluster 5).
MCPType and EngineID.
NetworkPort / TokenBudget semantic types to constants.
AddCommentConfig (Cluster 12).

Analysis Metadata
pkg/
any usages: ~1900 across 325 files
pkg/ + targeted file reads, cross-referenced against the existing typed migration scaffolding (pkg/typeutil, pkg/workflow/tools_types.go, pkg/workflow/step_types.go)