[typist] Go Type Consistency Analysis #33290
Closed
Replies: 1 comment
-
|
This discussion has been marked as outdated by Typist - Go Type Analysis. A newer discussion is available at Discussion #33529. |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
-
Executive Summary
Analyzed 812 non-test Go files under
pkg/covering ~722 type definitions. The codebase has clearly been migrating from untyped maps to strongly-typed structs (semantic types likeEngineName,JobName,StepID,MCPServerIDare in place, andParsedTools *Toolsis being introduced alongside the legacyTools map[string]any), so the remaining issues are concentrated in specific in-flight transitions rather than scattered widely.Two high-impact themes emerge: (1) the GitHub MCP accessor family in
pkg/workflow/mcp_github_config.gostill takesgithubTool anyand re-asserts tomap[string]anyeven though*GitHubToolConfigis already populated onWorkflowData.ParsedTools.GitHub; (2) a cluster of three structs inpkg/cli/logs_models.go(ProcessedRun,RunSummary,DownloadResult) duplicates ~15 fields each describing an analyzed workflow run. Lower-priority opportunities include grouping 11*Outputand 11Artifact*string constants under semantic types (OutputName,ArtifactName) to match the existing convention.pkg/, non-test)anyoccurrences (raw)constdeclarationsFull Analysis Report
Duplicated Type Definitions
Summary Statistics
Cluster 1 —
ProcessedRun/RunSummary/DownloadResult(near, high)Locations:
pkg/cli/logs_models.go:90—type ProcessedRun struct {pkg/cli/logs_models.go:203—type RunSummary struct {pkg/cli/logs_models.go:229—type DownloadResult struct {All three carry the same 15+ analysis fields (
WorkflowRun,AwContext,TaskDomain,BehaviorFingerprint,AccessAnalysis,FirewallAnalysis,RedactedDomainsAnalysis,MissingTools,MissingData,Noops,MCPFailures,MCPToolUsage,TokenUsage,GitHubRateLimitUsage,JobDetails). Each adds 1–3 variant-specific extras:RunSummaryaddsCLIVersion/RunID/ProcessedAt/Metrics/ArtifactsList;DownloadResultaddsError/Skipped/Cached/LogsPath/Metrics;ProcessedRunaddsPolicyAnalysis.Recommendation: Extract a shared
RunAnalysiscarrying the common fields and haveRunSummary/DownloadResultembed it plus their distinguishing fields. Eliminates ~40 lines of duplicated field declarations and locks the JSON shape in one place.Cluster 2 —
TokenClassWeightsvstokenClassWeights(exact, high)Locations:
pkg/types/token_weights.go:7—type TokenClassWeights struct {pkg/cli/effective_tokens.go:46—type tokenClassWeights struct {Both define identical 5
float64fields (Input,CachedInput,Output,Reasoning,CacheWrite). The only delta is JSON tag style (hyphens vs underscores). The cli copy is package-private and used only to deserialize the embeddedmodel_multipliers.json.Recommendation: Drop the private
tokenClassWeightsand reusepkg/types.TokenClassWeights. If the on-disk underscore form must stay, add a small wrapper or alternateUnmarshalJSON.Cluster 3 —
experimentStateJSON/ExperimentState&experimentRunRecord/ExperimentRunRecord(near, high)Locations:
pkg/cli/audit_report_experiments.go:35—type experimentStateJSON struct {pkg/cli/audit_report_experiments.go:42—type experimentRunRecord struct {pkg/cli/experiments_command.go:31—type ExperimentState struct {pkg/cli/experiments_command.go:37—type ExperimentRunRecord struct {Both files deserialize the same
state.jsonwritten bypick_experiment.cjs.ExperimentState↔experimentStateJSONhave the same 2 fields (Counts,Runs);ExperimentRunRecord↔experimentRunRecordhave the same 3 fields (RunID,Timestamp,Assignments). One pair is exported; the other is package-private in a sibling file.Recommendation: Delete the private duplicates in
audit_report_experiments.goand import the exported types — or hoist both to a newpkg/cli/experiments_types.go.View remaining duplicate clusters (9)
Cluster 4 —
CacheMemoryToolConfig/RepoMemoryToolConfig/CommentMemoryToolConfig(exact, medium)pkg/workflow/tools_types.go:400—type CacheMemoryToolConfig struct { Raw any }pkg/workflow/repo_memory.go:62—type RepoMemoryToolConfig struct { Raw any }pkg/workflow/tools_types.go:407—type CommentMemoryToolConfig struct { Raw any }Identical single-field wrappers (
Raw any "yaml:-") used as parse-time placeholders that defer to dedicated parsers. Differ only by name.Recommendation: Define a single
RawToolConfig struct{ Raw any }and either alias each name or just reuse the one type.Cluster 5 —
AccessLogSummaryvsFirewallLogSummary(near, medium)pkg/cli/logs_report_firewall.go:12—type AccessLogSummary struct {pkg/cli/logs_report_firewall.go:22—type FirewallLogSummary struct {Both summarize aggregated log requests across runs. Same shape (
TotalRequests,Allowed*,Blocked*,AllowedDomains,BlockedDomains,ByWorkflow), but field names differ (AllowedCountvsAllowedRequests) andFirewallLogSummaryaddsRequestsByDomain. TheaggregateDomainStatshelper already abstracts the difference via a callback.Recommendation: Unify into one
LogSummarywith an optionalRequestsByDomainand consistent count field names.Cluster 6 —
DomainAnalysisvsFirewallAnalysis(near, medium)pkg/cli/access_log.go:33—type DomainAnalysis struct {pkg/cli/firewall_log.go:131—type FirewallAnalysis struct {Both embed
DomainBucketsand addTotalRequests+ allowed/blocked counters. Field names differ (AllowedCount/BlockedCountvsAllowedRequests/BlockedRequests);FirewallAnalysisaddsRequestsByDomain. Both implement the sameLogAnalysisinterface with near-identicalAddMetricsmethods.Recommendation: Promote a shared
DomainCounters(TotalRequests,Allowed,Blocked) intoDomainBuckets, then makeFirewallAnalysisits strict superset.Cluster 7 —
ResolutionFailurevsGHAWManifestResolutionFailure(exact, medium)pkg/actionpins/actionpins.go:74—type ResolutionFailure struct { Repo, Ref, ErrorType }pkg/workflow/safe_update_manifest.go:39—type GHAWManifestResolutionFailure struct { Repo, Ref, ErrorType }Same 3 fields representing an action-ref pinning failure. The workflow variant adds JSON tags for lock-file serialization; they are populated from each other across package boundaries.
Recommendation: Add JSON tags to
pkg/actionpins.ResolutionFailureand deleteGHAWManifestResolutionFailure.Cluster 8 —
WorkflowStatusvsWorkflowListItem(near, low)pkg/cli/list_workflows_command.go:20—type WorkflowListItem struct {pkg/cli/status_command.go:24—type WorkflowStatus struct {WorkflowStatusis a strict superset ofWorkflowListItem(sameWorkflow/EngineID/Compiled/Labels/Onwith identical console tags, plusStatus/TimeRemaining/RunStatus/RunConclusion).Recommendation: Embed
WorkflowListItemintoWorkflowStatusso the JSON shape stays a strict extension and the list/status commands share the same row marshaller.Cluster 9 —
FirewallDiffSummary/MCPToolsDiffSummary/ToolCallsDiffSummary(near, low)pkg/cli/audit_diff.go:50—type FirewallDiffSummary struct {pkg/cli/audit_diff.go:240—type MCPToolsDiffSummary struct {pkg/cli/audit_diff.go:299—type ToolCallsDiffSummary struct {All three are small summary structs with parallel
New/Removed/Changedcounters plus a couple of variant-specific extras (e.g.,HasAnomalies,VolumeChangeCount).Recommendation: Introduce a shared
DiffCounts(NewCount,RemovedCount,ChangedCount) embedded inside each summary, keeping variant-specific extras as separate fields.Cluster 10 —
PullRequestvsPRInfo(semantic, low)pkg/cli/pr_automerge.go:21—type PullRequest struct {pkg/cli/pr_command.go:27—type PRInfo struct {Both model a GitHub PR decoded from
gh proutput.PRInfocarries more metadata (Body, branches, repos, author);PullRequestcarriesMergeable/CreatedAt/UpdatedAt. Both live in the same package, hinting at parallel evolution.Recommendation: Define one
PullRequestsuperset in a sharedpr_types.gowithomitemptyon all optional fields.Cluster 11 —
WorkflowRunInfo/ListWorkflowRunsOptions/WorkflowRunResult(near, low)pkg/cli/run_workflow_tracking.go:19—type WorkflowRunInfo struct {pkg/cli/logs_github_api.go:168—type ListWorkflowRunsOptions struct {pkg/cli/run_workflow_execution.go:46—type WorkflowRunResult struct {Three workflow-run-centric structs scattered across
run_*.goandlogs_*.gowith overlapping run-identity fields (run id, workflow name, status). The codebase already has the heavierWorkflowRuninlogs_models.go.Recommendation: Audit which fields are actually consumed by each caller, then either reuse
WorkflowRunor compose around it. At minimum group these under a singlerun_types.goso the duplication is visible.Cluster 12 — Build-tag pairs (false positive, low)
pkg/workflow/repository_features_validation.go:58↔repository_features_validation_wasm.go:5(RepositoryFeatures)pkg/console/spinner.go:96↔spinner_wasm.go:10(SpinnerWrapper)pkg/console/progress.go:31↔progress_wasm.go:7(ProgressBar)Intentional native-vs-WASM build-tag variants. Listed here only to prevent confusion. Keep as-is.
Untyped Usages
Summary Statistics
interface{}occurrences (literal text): 0 — the codebase consistently usesanyanyoccurrences: 1,437 — most are legitimate YAML/JSON polymorphic boundariesconstdeclarations: 309 — many are already typed via semantic aliases; ~22 in one file are the main remaining opportunityCategory 1 —
func_paramfor GitHub MCP accessors (high)GitHub MCP accessor functions take
githubTool anyand immediately assert tomap[string]any, despite a parsed*GitHubToolConfigexisting in parallel onWorkflowData.ParsedTools.GitHub. All call sites pass the samedata.Tools["github"]. Migrating eliminates ~11 redundant assertions in one file plus ~7 call-site asserts.Examples:
pkg/workflow/mcp_github_config.go:138—func getGitHubType(githubTool any) string*GitHubToolConfigfunc getGitHubType(cfg *GitHubToolConfig) stringtoolConfig.(map[string]any)then readstoolConfig["type"]/toolConfig["mode"]— both already exist ascfg.Type/cfg.Mode.pkg/workflow/mcp_github_config.go:163—func getGitHubToken(githubTool any) string→func getGitHubToken(cfg *GitHubToolConfig) string. Reads onlytoolConfig["github-token"].(string).pkg/workflow/mcp_github_config.go:182—func getGitHubLockdown(githubTool any) bool→func getGitHubLockdown(cfg *GitHubToolConfig) bool.pkg/workflow/mcp_github_config.go:264—func getGitHubAllowedTools(githubTool any) []string→func getGitHubAllowedTools(cfg *GitHubToolConfig) []string. WalkstoolConfig["allowed"]as[]anyof strings.pkg/workflow/mcp_renderer_github.go:17—RenderGitHubMCP(yaml, githubTool any, ...)→RenderGitHubMCP(yaml, cfg *GitHubToolConfig, ...). Delegates entirely to thegetGitHub*accessors above; tightening cascades.Category 2 —
struct_fieldfor dual Tools storage (high)WorkflowData.Tools map[string]anyis the legacy untyped tools map; a strongly-typedParsedTools *Toolsalready exists alongside it as part of an in-progress migration (see the comment(NEW: parsed from Tools map)). Many consumers still read the untyped map.pkg/workflow/compiler_types.go:482—Tools map[string]anyParsedTools *Toolson the next lineToolsmap and route all readers throughParsedTools(or expose accessors on*Tools).data.Tools["github"].(map[string]any).Category 3 —
type_assertion_patternondata.Tools[<name>](high)map[string]anyassertions ondata.Tools[<name>]recur acrosscompiler_difc_proxy.go,workflow_github_app.go,compiler_github_mcp_steps.go,compiler_yaml.go, andmcp_github_config.go— each call walks the same legacy map for the GitHub tool config.pkg/workflow/compiler_difc_proxy.go:85—githubTool, hasGitHub := data.Tools["github"]; ... return len(getGitHubGuardPolicies(githubTool)) > 0→ usedata.ParsedTools.GitHubdirectly.pkg/workflow/mcp_github_config.go:85—if toolConfig, ok := githubTool.(map[string]any); ok { _, hasGitHubApp := toolConfig["github-app"]; ... }→return cfg != nil && cfg.GitHubApp != nil. The same assert-then-lookup-by-key pattern repeats 11 times in this file with hard-coded YAML key strings.Category 4 —
untyped_constfor step-output names (medium)11 step-output name constants in
pkg/constants/job_constants.go:207-217are declared as plain untyped strings, despite the file defining semantic typesJobName,StepID, andMCPServerIDfor the analogous identifiers a few lines above.pkg/constants/job_constants.go:207—const IsTeamMemberOutput = "is_team_member"type OutputName string; const IsTeamMemberOutput OutputName = "is_team_member"${{ steps.<id>.outputs.<output> }}expressions next toStepIDconstants; anOutputNamealias matches the existing convention and prevents mixing withJobName/StepID.pkg/constants/job_constants.go:217—const ActivatedOutput = "activated"→const ActivatedOutput OutputName = "activated". Group-migrate all 11*Outputconstants in this range.Category 5 —
untyped_constfor artifact names and filenames (medium)11 artifact-name and filename constants in
pkg/constants/job_constants.go:90-170are plain untyped strings, even though the file already defines analogous semantic types for other identifier classes.pkg/constants/job_constants.go:95—const AgentArtifactName = "agent"→type ArtifactName string; const AgentArtifactName ArtifactName = "agent". CoversSafeOutputArtifactName,AgentArtifactName,DetectionArtifactName,ActivationArtifactName,ExperimentArtifactName,SafeOutputItemsArtifactName,SarifArtifactName— same domain, same APIs.pkg/constants/job_constants.go:105—const AgentOutputFilename = "agent_output.json"→type ArtifactFilename string; const AgentOutputFilename ArtifactFilename = "agent_output.json". CoversAgentOutputFilename,SafeOutputsFilename,TokenUsageFilename,GithubRateLimitsFilename,OtelJsonlFilename,OtlpExportErrorsFilename,TemporaryIdMapFilename,SarifFileName.Deliberately not flagged
frontmatter map[string]anyandtools map[string]anyparameters (~1,400 occurrences) — YAML pass-through boundaries serving multiple polymorphic schemas.containsTrigger(any),normalizeOTLPHeadersForEndpoint(any),marshalImportInputValue(any),containsExpression(any),validateImportInputType(any)— legitimate sum-type / recursive-traversal patterns over heterogeneous YAML nodes.CacheMemoryToolConfig.Raw any,CommentMemoryToolConfig.Raw any— explicitly documented as polymorphic (bool/object/array) with dedicated parsers.url_constants.go,engine_constants.go,version_constants.go, top ofconstants.go— already typed (URL,DocURL,EngineName,Version,JobName,StepID,MCPServerID,CommandPrefix,LineLength,WorkflowID,FileMode).Refactoring Recommendations
Priority 1 — Tighten GitHub MCP accessor signatures (high, ~half day)
Steps:
func getGitHub*(githubTool any, ...)inpkg/workflow/mcp_github_config.goto takecfg *GitHubToolConfig.toolConfig.(map[string]any)+toolConfig["<key>"]lookups with struct field reads (cfg.Type,cfg.Mode,cfg.GitHubToken,cfg.Allowed.ToStringSlice(), ...).compiler_difc_proxy.go,workflow_github_app.go,compiler_github_mcp_steps.go,compiler_yaml.go,mcp_renderer_github.goto passdata.ParsedTools.GitHubinstead ofdata.Tools["github"].go test ./...and verify generated YAML is byte-identical.Benefit: removes ~11 redundant type assertions and ~16 hard-coded YAML key strings in one file; pulls the GitHub tool config fully onto the typed
ParsedToolsrail.Priority 2 — Consolidate run-analysis structs in
logs_models.go(high, half to one day)Steps:
ProcessedRun,RunSummary,DownloadResultinto aRunAnalysisstruct.RunAnalysisin each of the three and keep only the distinguishing fields.go testand a diff over a sample report.Benefit: removes ~40 lines of duplicated declarations; one place to add new analysis dimensions.
Priority 3 — Delete cross-file private duplicates (high, < 1 hour each)
pkg/cli/effective_tokens.go:tokenClassWeights→ reusepkg/types.TokenClassWeights(add an alternateUnmarshalJSONif the underscore tag form must stay on disk).pkg/cli/audit_report_experiments.go:experimentStateJSON/experimentRunRecord→ importExperimentState/ExperimentRunRecordfromexperiments_command.go.pkg/workflow/safe_update_manifest.go:GHAWManifestResolutionFailure→ add JSON tags topkg/actionpins.ResolutionFailureand reuse.Priority 4 — Migrate untyped constants to semantic types (medium, half a day)
Steps:
type OutputName stringand re-type all*Outputconstants inpkg/constants/job_constants.go:207-217.type ArtifactName stringandtype ArtifactFilename stringand re-type the corresponding constants inpkg/constants/job_constants.go:90-170.${{ steps.... }}template builders — they accept the underlying string).Benefit: matches the existing semantic-alias convention (
JobName,StepID,MCPServerID); prevents mixing constant kinds at call sites.Implementation Checklist
*GitHubToolConfigRunAnalysisinpkg/cli/logs_models.gotokenClassWeightsduplicate inpkg/cli/effective_tokens.goexperimentStateJSON/experimentRunRecordduplicates inpkg/cli/audit_report_experiments.goGHAWManifestResolutionFailureintopkg/actionpins.ResolutionFailureOutputName,ArtifactName,ArtifactFilenamesemantic typesRawToolConfig, unify*LogSummary, embedWorkflowListIteminWorkflowStatus, sharedDiffCountsin audit_diff.goWorkflowRun/WorkflowRunInfo/WorkflowRunResultfamily for consolidationgo test ./...and full generation diff after each priorityAnalysis Metadata
pkg/, excluding_test.go)References:
Beta Was this translation helpful? Give feedback.
All reactions