Commit eaf06b3
Copilot and pelikhan authored
optimize(glossary-maintainer): prompt trim, turn guardrail, batch reads, haiku sub-agent for term discovery (#40353)
* Initial plan

* optimize glossary-maintainer: prompt trim, turn guardrail, batch reads, sub-agent for term discovery

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

* address review: clarify sub-agent invocation, term criteria, and batch-read intent

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
1 parent 2a5482d commit eaf06b3

2 files changed

Lines changed: 34 additions & 87 deletions

.github/workflows/glossary-maintainer.lock.yml

Lines changed: 1 addition & 1 deletion

.github/workflows/glossary-maintainer.md

Lines changed: 33 additions & 86 deletions
Original file line number | Diff line number | Diff line change
@@ -112,10 +112,7 @@ Use the `search` tool to find relevant documentation with natural language queri
112112
- When identifying relevant files — use it to narrow down which pages cover a feature or concept
113113
- When understanding a term — query to find authoritative documentation describing it
114114

115-
Example queries:
116-
- `search("safe-outputs create-pull-request options")`
117-
- `search("engine configuration copilot")`
118-
- `search("cache-memory persistent storage")`
115+
Example: `search("safe-outputs create-pull-request options")`
119116

120117
Always read the returned file paths to get full content — `search` returns paths, not content.
121118

@@ -133,14 +130,17 @@ Use Serena to:
133130

134131
## Task Steps
135132

133+
**Turn limit**: If you have completed more than 55 turns without successfully calling `create-pull-request` or `noop`, stop immediately. Call `noop` with a brief status note describing how far you got. Do not continue processing.
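
The boundary in the turn limit above is "more than 55", so turn 55 itself is still allowed. A minimal sketch of that condition, purely for illustration: the limit is an instruction the agent follows, and the `should_stop` helper and `TURN_LIMIT` variable are hypothetical names, not part of the workflow.

```bash
# Illustrative only: the limit is a prompt instruction the agent follows,
# not something this script enforces. TURN_LIMIT mirrors the 55 in the text.
TURN_LIMIT=55

# should_stop TURNS TERMINAL_CALLED
# Succeeds (exit 0) when more than TURN_LIMIT turns have been completed and
# neither create-pull-request nor noop has been called successfully.
should_stop() {
  local turns=$1 terminal_called=$2
  [ "$terminal_called" != "yes" ] && [ "$turns" -gt "$TURN_LIMIT" ]
}

should_stop 55 no || echo "turn 55: keep going"
should_stop 56 no && echo "turn 56: call noop and stop"
```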
134+
136135
### 1. Determine Scan Scope
137136

138-
The pre-step has already determined the scan scope. Read it from the file:
137+
The pre-step has already determined the scan scope. Read all context files in one step (the glossary is read here early for efficiency, so Step 4 can work with already-loaded content):
139138

140139
```bash
141-
cat /tmp/gh-aw/agent/scan-scope.txt # "daily" or "weekly"
142-
cat /tmp/gh-aw/agent/recent-commits.txt # pre-fetched commit list
143-
cat /tmp/gh-aw/agent/doc-changes.txt # commits that touched docs
140+
cat /tmp/gh-aw/agent/scan-scope.txt \
141+
/tmp/gh-aw/agent/recent-commits.txt \
142+
/tmp/gh-aw/agent/doc-changes.txt \
143+
docs/src/content/docs/reference/glossary.md
144144
```
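
One caveat with a single `cat` over several files is that the output carries no markers between files, so the boundary between the commit list and the glossary has to be inferred. A minimal sketch of a variant that labels each file, assuming the same paths as the block above; the `read_context` helper and its `=====` header format are hypothetical, not part of the workflow.

```bash
# Hypothetical helper (not part of the workflow): print each file under a
# header line so it is clear where one file ends and the next begins.
read_context() {
  local f
  for f in "$@"; do
    printf '===== %s =====\n' "$f"
    if [ -r "$f" ]; then
      cat "$f"
    else
      # Report a missing file instead of aborting the whole read.
      printf '(missing)\n'
    fi
  done
}

# Usage with the paths from the step above, still as one command:
# read_context /tmp/gh-aw/agent/scan-scope.txt \
#   /tmp/gh-aw/agent/recent-commits.txt \
#   /tmp/gh-aw/agent/doc-changes.txt \
#   docs/src/content/docs/reference/glossary.md
```

This keeps the batch read to one step while making each file's extent explicit.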
145145

146146
- **`weekly`** (Monday): Full scan — review changes from the last 7 days
@@ -161,38 +161,18 @@ Check your cache to avoid duplicate work:
161161

162162
### 3. Scan Recent Changes
163163

164-
Based on the scope (daily or weekly):
165-
166-
**Use QMD search first** — for each changed area or feature name, run `search` to discover whether existing documentation already covers it before deciding if a new glossary term is needed:
167-
- e.g., `search("cache-memory workflow persistence")` to check for existing docs before adding a term
168-
- e.g., `search("MCP server configuration tools")` to find all documentation on a concept
169-
170-
**Use GitHub tools sparingly** — prefer the pre-fetched files above:
171-
- Use `get_commit` for detailed diff of specific commit SHAs from `recent-commits.txt` (at most 20 commits)
172-
- Use `search_pull_requests` to find merged PRs from the timeframe (at most 10 PRs)
173-
- Use `pull_request_read` to inspect specific PR changes — pass `method: get_files` or `method: get_diff` as the operation
164+
Use the `discover-terms` sub-agent to identify new technical terms from the pre-fetched files. Invoke it by name and pass these file paths as context:
165+
- `/tmp/gh-aw/agent/scan-scope.txt`
166+
- `/tmp/gh-aw/agent/recent-commits.txt`
167+
- `/tmp/gh-aw/agent/doc-changes.txt`
174168

175-
**Look for new terminology in `docs/**/*.{md,mdx}` (and nowhere else)**
176-
- New configuration fields in frontmatter (YAML keys)
177-
- New CLI commands or flags
178-
- New tool names or MCP servers
179-
- New concepts or features
180-
- Technical acronyms (MCP, CLI, YAML, etc.)
181-
- Specialized terminology (safe-outputs, frontmatter, engine, etc.)
169+
The sub-agent returns a JSON array of candidate terms with context and source references. Collect this list for use in Steps 4–7.
182170

183171
### 4. Review Current Glossary
184172

185-
Read the current glossary:
173+
The current glossary was already read in Step 1 along with the scope files.
186174

187-
```bash
188-
cat docs/src/content/docs/reference/glossary.md
189-
```
190-
191-
**For each candidate term, use `search` to find documentation that describes it** — this provides authoritative context for writing accurate definitions and reveals whether any documentation page already explains the term:
192-
- e.g., `search("safe-outputs create-pull-request")` to find pages describing that feature
193-
- e.g., `search("engine configuration copilot")` to find all documentation on engines
194-
- e.g., `search("cache-memory persistent storage")` to find documentation on memory tools
195-
- Read the returned file paths for full context before writing definitions
175+
**For each candidate term, use `search` as described in Available Tools above** to find documentation that describes it — this provides authoritative context for writing accurate definitions. Read the returned file paths for full context before writing definitions.
196176

197177
**Check for:**
198178
- Terms that are missing from the glossary
@@ -289,46 +269,9 @@ This prevents duplicate work and helps track progress.
289269

290270
### 9. Create Pull Request
291271

292-
If you made any changes to the glossary:
293-
294-
1. **Use safe-outputs create-pull-request** to create a PR
295-
2. **Include in the PR description**:
296-
- Whether this was an incremental (daily) or full (weekly) scan
297-
- List of terms added
298-
- List of terms updated
299-
- Summary of recent changes that triggered the updates
300-
- Links to relevant commits or PRs
301-
302-
**PR Title Format**:
303-
- Daily: `[docs] Update glossary - daily scan`
304-
- Weekly: `[docs] Update glossary - weekly full scan`
305-
306-
**PR Description Template**:
307-
```markdown
308-
## Glossary Updates - [Date]
309-
310-
### Scan Type
311-
- [ ] Incremental (daily - last 24 hours)
312-
- [ ] Full scan (weekly - last 7 days)
313-
314-
### Terms Added
315-
- **Term Name**: Brief explanation of why it was added
316-
317-
### Terms Updated
318-
- **Term Name**: What changed and why
319-
320-
### Changes Analyzed
321-
- Reviewed X commits from [timeframe]
322-
- Analyzed Y merged PRs
323-
- Processed Z new features
324-
325-
### Related Changes
326-
- Commit SHA: Brief description
327-
- PR #NUMBER: Brief description
328-
329-
### Notes
330-
[Any additional context or terms that need manual review]
331-
```
272+
If you made any changes to the glossary, use **safe-outputs create-pull-request** with:
273+
- **Title**: `[docs] Update glossary - daily scan` or `[docs] Update glossary - weekly full scan`
274+
- **Description**: scan type (daily/weekly), terms added, terms updated, and relevant commits or PRs
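
The title choice above depends only on the scan scope. A small sketch of that mapping, assuming `scan-scope.txt` holds exactly `daily` or `weekly` as the earlier version of Step 1 noted; the `pr_title` helper is a hypothetical name used for illustration.

```bash
# Hypothetical helper: map the scan scope to the PR title format above.
pr_title() {
  case "$1" in
    daily)  printf '%s\n' '[docs] Update glossary - daily scan' ;;
    weekly) printf '%s\n' '[docs] Update glossary - weekly full scan' ;;
    *)      printf 'unknown scan scope: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Example: read the scope written by the pre-step, stripping whitespace.
# scope=$(tr -d '[:space:]' < /tmp/gh-aw/agent/scan-scope.txt)
# pr_title "$scope"
```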
332275

333276
### 10. Handle Edge Cases
334277

@@ -337,17 +280,6 @@ If you made any changes to the glossary:
337280
- **Unclear terms**: If a term is ambiguous, add it with a note that it needs review
338281
- **Conflicting definitions**: If a term has multiple meanings, note both in the definition
339282

340-
## Guidelines
341-
342-
- **Be Selective**: Only add terms that genuinely need explanation
343-
- **Be Accurate**: Ensure definitions match actual implementation
344-
- **Be Consistent**: Follow existing glossary style and structure
345-
- **Be Complete**: Don't leave terms partially defined
346-
- **Be Clear**: Write for users who are learning, not experts
347-
- **Follow Structure**: Maintain alphabetical order within sections
348-
- **Use Cache**: Track your work to avoid duplicates
349-
- **Link Appropriately**: Add references to related documentation
350-
351283
## Constraints
352284

353285
To keep this workflow efficient, adhere to these hard limits:
@@ -371,3 +303,18 @@ To keep this workflow efficient, adhere to these hard limits:
371303
Good luck! Your work helps users understand GitHub Agentic Workflows terminology.
372304

373305
{{#runtime-import shared/noop-reminder.md}}
306+
307+
## agent: `discover-terms`
308+
---
309+
model: claude-haiku-4.5
310+
description: Scans recent commits and doc changes to identify new technical terms
311+
---
312+
Read the provided commit log and doc-change files. Fetch diffs for at most 20 commits using get_commit. For each commit, identify new technical terms introduced in user-facing docs (docs/**/*.md, docs/**/*.mdx). Look for:
313+
- New configuration fields in frontmatter (YAML keys)
314+
- New CLI commands or flags
315+
- New tool names or MCP servers
316+
- New concepts or features
317+
- Technical acronyms (MCP, CLI, YAML, etc.)
318+
- Specialized terminology (safe-outputs, frontmatter, engine, etc.)
319+
320+
Only include terms from `docs/**/*.{md,mdx}` files, not internal code or comments. Return a JSON array: [{"term": "...", "context": "...", "source": "commit:<SHA> or pr:<N>"}]. If no new terms, return [].
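
Because the sub-agent's reply is free-form model output, it may be worth checking its shape before Steps 4–7 consume it. A hedged sketch, assuming `jq` is available in the runner; the `validate_terms` helper and the sample entry are made up for illustration and are not part of the commit.

```bash
# Hypothetical check, assuming jq is installed: succeed only if stdin is a
# JSON array whose entries all carry the term, context, and source keys.
validate_terms() {
  jq -e 'type == "array" and all(.[]; type == "object" and has("term") and has("context") and has("source"))' >/dev/null
}

# Made-up sample in the documented shape; the SHA is a placeholder.
sample='[{"term": "safe-outputs", "context": "mentioned in a docs page", "source": "commit:abc1234"}]'
printf '%s\n' "$sample" | validate_terms && echo "valid"

# The empty array returned when there are no new terms also passes.
printf '[]\n' | validate_terms && echo "valid (no new terms)"
```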
