Summary
Currently, CodeTour does not support repositories that use Git submodules.
When a project contains submodules, tour steps that reference files inside
submodule directories cannot properly track code changes, because the ref
field only records the parent repository's commit SHA, not the individual
submodule commits.
Motivation
Git submodules are widely used in large projects to organize code across
multiple repositories. A common example is a monorepo-style setup where:
project-root/
├── .gitmodules
├── src/          # parent repo code
└── libs/
    ├── foo/      # submodule A (has its own commit history)
    └── bar/      # submodule B (has its own commit history)
When creating a CodeTour for such a project, steps may reference files
from both the parent repo and submodules. However, since each submodule
maintains its own independent commit history, the current single ref
field is insufficient to accurately pin the exact version of code being
referenced in submodule files.
This means that after code changes, the "view at ref" feature breaks
for submodule files, making the tour outdated and unreliable.
Expected Behavior
CodeTour should:
- Detect submodules in the repository by reading .gitmodules
- Automatically record each submodule's commit SHA when a tour is
  created or a step is added
- Correctly restore the referenced version of submodule files when
  viewing a tour step, using the submodule's own commit SHA
Proposed Schema Change
To maintain backward compatibility, I suggest adding an optional
submoduleRefs field at the tour level:
Current schema:
{
  "title": "My Tour",
  "ref": "abc1234", // parent repo commit SHA
  "steps": [
    { "file": "src/main.ts", "line": 10, "description": "..." },
    { "file": "libs/foo/bar.ts", "line": 42, "description": "..." }
  ]
}
Proposed schema:
{
  "title": "My Tour",
  "ref": "abc1234", // parent repo commit SHA (unchanged)
  "submoduleRefs": { // NEW: per-submodule commit SHAs
    "libs/foo": "def5678",
    "libs/bar": "ghi9012"
  },
  "steps": [
    { "file": "src/main.ts", "line": 10, "description": "..." },
    { "file": "libs/foo/bar.ts", "line": 42, "description": "..." }
  ]
}
Why this approach:
- ✅ Fully backward compatible (existing .tour files are unaffected)
- ✅ Aligns with how Git natively tracks submodules
  (parent repo stores submodule commit SHA in its tree object)
- ✅ Minimal schema change
- ✅ No need for step-level changes; submodule membership can be
  inferred from file path + .gitmodules
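To make the backward-compatibility claim concrete, here is a minimal TypeScript sketch of how the optional field could be typed and read. The `Tour` and `TourStep` interfaces and the `getSubmoduleRefs` helper are hypothetical names for illustration, not CodeTour's actual definitions.

```typescript
// Hypothetical types for illustration; not CodeTour's actual definitions.
interface TourStep {
  file: string;
  line: number;
  description: string;
}

interface Tour {
  title: string;
  ref?: string; // parent repo commit SHA
  submoduleRefs?: Record<string, string>; // proposed: submodule path -> commit SHA
  steps: TourStep[];
}

// Reads the proposed field, treating its absence as "no submodules pinned".
function getSubmoduleRefs(tour: Tour): Record<string, string> {
  return tour.submoduleRefs ?? {};
}

// A tour written before the change has no submoduleRefs key at all.
const legacyTour: Tour = JSON.parse(
  '{"title":"My Tour","ref":"abc1234","steps":[{"file":"src/main.ts","line":10,"description":"..."}]}'
);

// A tour written after the change carries the extra map.
const newTour: Tour = JSON.parse(
  '{"title":"My Tour","ref":"abc1234","submoduleRefs":{"libs/foo":"def5678"},"steps":[]}'
);
```

Because the field is optional and defaults to an empty map when read, existing tours would load unchanged under this sketch.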
Rough Implementation Idea
- On tour/step creation:
  - Parse .gitmodules to get all submodule paths
  - For each step's file path, check if it falls under a submodule path
  - Run git rev-parse HEAD inside each relevant submodule directory
  - Store the results in submoduleRefs
- On tour playback (view at ref):
  - For a given step's file, determine if it belongs to a submodule
  - If yes, use submoduleRefs[submodulePath] as the ref for that file
  - If no, use the existing top-level ref as before
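The two flows above could be sketched roughly as follows. This is only an illustration under stated assumptions: all function names are hypothetical, the .gitmodules parser handles only simple `path = ...` lines, nested submodules are not covered, and `getHead` is an injected callback standing in for running git rev-parse HEAD inside the submodule directory. Falling back to the top-level ref when a submodule has no recorded entry is also an assumption, chosen so that older tours keep their current behavior.

```typescript
// Extracts submodule paths from the text of a .gitmodules file.
// Assumption: only plain "path = <value>" lines are handled.
function parseSubmodulePaths(gitmodules: string): string[] {
  const paths: string[] = [];
  for (const line of gitmodules.split(/\r?\n/)) {
    const match = /^\s*path\s*=\s*(.+?)\s*$/.exec(line);
    if (match) paths.push(match[1]);
  }
  return paths;
}

// Returns the submodule path containing filePath, or undefined for
// parent-repo files. The longest matching path wins.
function findSubmodule(filePath: string, submodulePaths: string[]): string | undefined {
  let best: string | undefined;
  for (const p of submodulePaths) {
    const isInside = filePath.indexOf(p + "/") === 0;
    if (isInside && (best === undefined || p.length > best.length)) best = p;
  }
  return best;
}

// Creation: record one SHA per submodule that the step files touch.
// getHead is a hypothetical stand-in for `git rev-parse HEAD` in that directory.
function collectSubmoduleRefs(
  stepFiles: string[],
  submodulePaths: string[],
  getHead: (submodulePath: string) => string
): Record<string, string> {
  const refs: Record<string, string> = {};
  for (const file of stepFiles) {
    const sub = findSubmodule(file, submodulePaths);
    if (sub !== undefined && !Object.prototype.hasOwnProperty.call(refs, sub)) {
      refs[sub] = getHead(sub);
    }
  }
  return refs;
}

// Playback: pick the submodule's SHA when one is recorded, else the top-level ref.
function resolveRef(
  file: string,
  tour: { ref?: string; submoduleRefs?: Record<string, string> },
  submodulePaths: string[]
): string | undefined {
  const refs = tour.submoduleRefs ?? {};
  const sub = findSubmodule(file, submodulePaths);
  const pinned = sub !== undefined ? refs[sub] : undefined;
  return pinned ?? tour.ref;
}
```

Matching on `p + "/"` rather than the bare path keeps a directory such as libs/foobar from being mistaken for the libs/foo submodule.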
Additional Considerations
- If a submodule has not been initialized (git submodule update --init),
  CodeTour could show a warning rather than failing silently
- submoduleRefs could potentially be updated incrementally as new steps
  are added to a tour
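One possible way to detect the uninitialized case is to inspect the output of git submodule status, which prefixes the SHA with `-` for submodules that have not been initialized. The sketch below only parses that text; the function name is hypothetical, and actually running the command and surfacing the warning are left out.

```typescript
// Returns the paths of submodules that `git submodule status` reports as
// uninitialized (lines whose SHA is prefixed with "-").
// Assumption: statusOutput is the raw stdout of that command.
function findUninitializedSubmodules(statusOutput: string): string[] {
  const result: string[] = [];
  for (const line of statusOutput.split(/\r?\n/)) {
    const match = /^-[0-9a-f]+ (.+)$/.exec(line);
    if (match) result.push(match[1]);
  }
  return result;
}
```

A caller could compare the returned paths against the submodules a tour references and warn only for those.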
Willingness to Contribute
I am willing to submit a Pull Request for this feature if the maintainers
agree with the proposed direction. Happy to discuss any design concerns
before starting implementation.