[refactor] Semantic function clustering: minor cohesion cleanups (no duplication found) #38980

@github-actions

Overview

This is a semantic function-clustering analysis of all non-test .go files under pkg/ (913 source files, ~5,432 function definitions), focused on pkg/workflow/ (400 files) and pkg/cli/ (319 files).

Headline: the codebase is unusually well-modularized. No actionable copy-paste duplication was found in pkg/workflow/ or pkg/cli/. The package is strongly concern-partitioned (*_validation.go holds validators, *_parser.go holds parsers, etc.). The findings below are a small number of low-risk cohesion improvements — not bugs, not duplication.

Summary

  • Source files analyzed: 913 (non-test, under pkg/)
  • Function definitions parsed: ~5,432 (4,234 plain functions, 1,198 methods)
  • Actionable duplicate implementations: 0
  • Scattered-helper false positives explained: build-tag twins, delegating shims, linter testdata/ fixtures
  • Concrete cohesion opportunities: 3 files + 1 naming collision
  • Status: ✅ healthy organization
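
The split between plain functions and methods in the summary can be reproduced in principle with the standard go/ast package. The following is a minimal sketch, not the analyzer that produced this report: countFuncDecls, declCounts, and the sample source are hypothetical names invented for illustration, and it counts only top-level declarations (function literals are ignored).

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// declCounts holds the two tallies (hypothetical type for this sketch).
type declCounts struct {
	Plain   int // top-level functions without a receiver
	Methods int // functions declared with a receiver
}

// countFuncDecls parses one Go source file and tallies its
// top-level function declarations by kind.
func countFuncDecls(src string) (declCounts, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, parser.SkipObjectResolution)
	if err != nil {
		return declCounts{}, err
	}
	var c declCounts
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok {
			continue // imports, types, vars, consts
		}
		if fn.Recv != nil {
			c.Methods++
		} else {
			c.Plain++
		}
	}
	return c, nil
}

// sample is made-up input, not code from the repository.
const sample = `package demo

type T struct{}

func (T) Name() string { return "t" }
func parseThing() {}
func validateThing() error { return nil }
`

func main() {
	c, err := countFuncDecls(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("plain=%d methods=%d\n", c.Plain, c.Methods)
}
```

Running the same function over every non-test .go file under pkg/ and summing the results would give totals comparable to the ones listed above, assuming the original analysis also counted only top-level declarations.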

Findings

1. Outlier functions (mild file-cohesion opportunities)

Three files mix a secondary concern with their primary purpose. None are urgent.

a. pkg/workflow/js.go — JS lexer embedded among asset getters

The file is otherwise dedicated to Get*Script() asset accessors, but lines 71–391 contain a full JavaScript comment-stripper / tokenizer:

  • removeJavaScriptComments (js.go:71), removeJavaScriptCommentsFromLine (js.go:100)
  • isInsideStringLiteral (js.go:210), isInsideRegexLiteral (js.go:256)
  • canStartRegexLiteral (js.go:322), canStartRegexLiteralAt (js.go:327)
  • extractWordBefore (js.go:366), isLetter (js.go:386), isDigit (js.go:391)

Recommendation: extract the lexer helpers into js_lexer.go (or js_minify.go), leaving js.go for the script getters. Improved discoverability; behavior unchanged.

b. pkg/workflow/model_identifier.go — validators in a noun-named file

Six of the file's 14 functions are validate* helpers, yet the file is named for an identifier type. The sibling model_alias_validation.go already follows the *_validation.go convention:

  • validateProviderToken (model_identifier.go:231), validateModelToken (:251), validateModelGlobToken (:265), validateBareName (:283), validateParamKey (:327), validateParamValue (:343)

Recommendation: consider moving the validate* group into a model_validation.go (or merge with existing model validation), keeping parsing in model_identifier.go.

c. pkg/workflow/compiler_orchestrator_workflow.go — least cohesive by verb-spread

Mixes validateWorkflow* (4), mergeImported*/mergeWorkflow* (5), plus extract*/build*/parse*/format*Error. Expected for an orchestrator, but the validateWorkflowBuildContext / ...ModelAliasMap / ...EngineSettings / ...ToolConfigurations cluster reads like it could live beside the other validators. Lowest priority.

2. Naming collision (readability/navigation hazard)

extractToolsFromFrontmatter is defined in two packages with different signatures and contracts (so NOT a duplicate — just a grep/navigation hazard):

  • pkg/parser/content_extractor.go:60: func(map[string]any) (string, error) (returns merged tools+mcp-servers as a JSON string)
  • pkg/workflow/frontmatter_extraction_metadata.go:304: func(map[string]any) map[string]any (returns only the tools field)

Recommendation: rename one (e.g. the parser copy to extractToolsAndMCPAsJSON) to disambiguate.
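
To make the difference in contracts concrete, here is a hedged sketch of the two signatures side by side after the suggested rename. Only the signatures and the one-line descriptions come from the findings above; the function bodies, the flat merge strategy, and the sample frontmatter are assumptions written for illustration and are not the repository's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// extractToolsAndMCPAsJSON stands in for the pkg/parser variant under the
// suggested new name. Assumption: "merged" means a flat union of the
// "tools" and "mcp-servers" maps; the real merge rules may differ.
func extractToolsAndMCPAsJSON(frontmatter map[string]any) (string, error) {
	merged := map[string]any{}
	for _, key := range []string{"tools", "mcp-servers"} {
		section, ok := frontmatter[key].(map[string]any)
		if !ok {
			continue // key missing or not a map
		}
		for name, cfg := range section {
			merged[name] = cfg
		}
	}
	out, err := json.Marshal(merged)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

// extractToolsFromFrontmatter stands in for the pkg/workflow variant:
// it returns only the "tools" field, or nil when absent.
func extractToolsFromFrontmatter(frontmatter map[string]any) map[string]any {
	tools, _ := frontmatter["tools"].(map[string]any)
	return tools
}

func main() {
	// Made-up frontmatter for illustration.
	fm := map[string]any{
		"tools":       map[string]any{"bash": true},
		"mcp-servers": map[string]any{"search": true},
	}
	merged, err := extractToolsAndMCPAsJSON(fm)
	if err != nil {
		panic(err)
	}
	fmt.Println(merged)
	fmt.Println(len(extractToolsFromFrontmatter(fm)))
}
```

With distinct names, a search for either function lands on exactly one definition, and the return type is hinted at by the name itself.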

Why the "scattered helper" signal was a false positive

Repeated function names explained (all benign)

Within pkg/cli/, every plain function name is unique (zero repeats). Within pkg/workflow/, the only name in 3+ files is init(). The apparent cross-file repeats elsewhere resolve to intentional patterns:

  • _wasm.go build-tag twins — same name, real impl + no-op stub, mutually exclusive via //go:build (e.g. RunGH in github_cli.go vs github_cli_wasm.go; validatePipPackages/validateUvPackages real vs stub returning nil). Must NOT be merged.
  • Delegating shims (already centralized, good): LevenshteinDistance (pkg/parser/schema_suggestions.go:245 is a 1-line delegate to stringutil.LevenshteinDistance); SanitizeName (pkg/workflow/strings.go:98 delegates to pkg/stringutil/sanitize.go:69, with SanitizeOptions a type alias).
  • Linter testdata/ fixtures: good/bad/suppressed/doWork repeated across pkg/linters/*/testdata/src/... are deliberate test fixtures.
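
The delegating-shim pattern named in the list above can be sketched in a single file. This is an illustration under stated assumptions: sharedLevenshteinDistance is a hypothetical stand-in for the shared helper in pkg/stringutil (its body here is an ordinary two-row dynamic-programming implementation written for this sketch), and LevenshteinDistance plays the role of the one-line package-local delegate.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// sharedLevenshteinDistance is a stand-in for the centralized helper.
// It computes edit distance over runes using two rolling rows.
func sharedLevenshteinDistance(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		curr := make([]int, len(rb)+1)
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev = curr
	}
	return prev[len(rb)]
}

// LevenshteinDistance is the shim: same name as the shared helper,
// no logic of its own, so there is nothing to de-duplicate.
func LevenshteinDistance(a, b string) int {
	return sharedLevenshteinDistance(a, b)
}

func main() {
	fmt.Println(LevenshteinDistance("kitten", "sitting")) // prints 3
}
```

A name-frequency scan reports the shim and the helper as a repeated name, but a body comparison shows that one of them contains only a call to the other, which is why this pattern counts as a false positive.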

Verb-prefix clusters in pkg/workflow (well-organized)

Dominant clusters over ~2,471 workflow functions: get (296), build (204), validate (201), parse (200), extract (181), generate (170). These six account for just over half of all workflow functions (1,252 of ~2,471) and map cleanly onto the compile pipeline. The accessor family (is 113, has 54, set 39) and the pipeline family are the two coherent super-groups. Files like parser.go, tools_parser.go, trigger_parser.go, and compiler_validators.go are cleanly single-verb; no reorganization is needed.
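
Verb-prefix clustering of this kind can be approximated with go/parser plus a small camelCase splitter. The sketch below is an assumption about how such a tally might be built, not the tool behind this report: verbPrefix, countVerbPrefixes, and the sample source are hypothetical, and the prefix rule (everything before the first uppercase letter after position 0, lowercased) is one plausible choice among several.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
	"strings"
	"unicode"
)

// verbPrefix returns the leading camelCase word of a function name,
// lowercased, so "validateModelToken" and "ValidateX" both map to "validate".
func verbPrefix(name string) string {
	runes := []rune(name)
	end := len(runes)
	for i := 1; i < len(runes); i++ {
		if unicode.IsUpper(runes[i]) {
			end = i
			break
		}
	}
	return strings.ToLower(string(runes[:end]))
}

// countVerbPrefixes tallies verb prefixes over the top-level function
// declarations (plain functions and methods) of one source file.
func countVerbPrefixes(src string) (map[string]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, parser.SkipObjectResolution)
	if err != nil {
		return nil, err
	}
	counts := map[string]int{}
	for _, decl := range file.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok {
			counts[verbPrefix(fn.Name.Name)]++
		}
	}
	return counts, nil
}

// sample is made-up input, not code from the repository.
const sample = `package demo

func validateModelToken() {}
func validateParamKey() {}
func parseIdentifier() {}
func GetScript() string { return "" }
func isLetter(r rune) bool { return false }
`

func main() {
	counts, err := countVerbPrefixes(sample)
	if err != nil {
		panic(err)
	}
	verbs := make([]string, 0, len(counts))
	for v := range counts {
		verbs = append(verbs, v)
	}
	sort.Strings(verbs) // stable, alphabetical output
	for _, v := range verbs {
		fmt.Printf("%s: %d\n", v, counts[v])
	}
}
```

Grouping the resulting counts per file rather than per package would give a rough verb-spread measure of the kind used to flag compiler_orchestrator_workflow.go.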

Recommendations (all optional, low-risk)

  1. (Priority 1) Extract JS lexer helpers from js.go into js_lexer.go — clearest cohesion win, mechanical move.
  2. (Priority 2) Rename one extractToolsFromFrontmatter to remove the cross-package name collision.
  3. (Priority 3) Optionally relocate the validate* group out of model_identifier.go.

No consolidation, generics, or duplicate-merge work is warranted — the codebase already centralizes shared logic via shim delegation to pkg/stringutil, pkg/fileutil, etc.

Analysis Metadata

  • Detection method: function-name frequency analysis + verb-prefix clustering + body comparison of repeated names
  • Source files analyzed: 913 (non-test, pkg/)
  • Function definitions parsed: ~5,432
  • Actionable duplicates: 0
  • Cohesion opportunities: 3 files + 1 naming collision
  • Analysis date: 2026-06-13

References: run 27450245053

Generated by 🔧 Semantic Function Refactoring · 352.4 AIC · ⌖ 15 AIC · ⊞ 9.8K

  • expires on Jun 14, 2026, 4:20 PM UTC-08:00
