Commit 4eca853

feat(skill): BM25-based smart skill retrieval for large catalogues
When >80 skills are installed, the system prompt switches from a full listing to a compact name-only format. The model discovers skills via the Skill tool's new action:"search" endpoint backed by a BM25 index.

Three tiers by skill count:
- ≤ 80: legacy full listing (prompt-cache optimal)
- 81-300: compact name + description + search
- > 300: names only + search required

Key changes:
- skill/search.ts: BM25 index with synonym expansion (zero deps)
- skill/registry.ts: auto-detect tier, lazy content loading
- skill/parser.ts: parseSkillMetaFromFile (frontmatter-only)
- skill-tool.ts: search action on existing Skill tool
- system.md: search-first workflow instructions

Performance (measured with 1,530 real skills):
- System prompt: 118K → 8.4K tokens (93% reduction)
- Startup memory: 88MB → 4MB (95% reduction)
- Search latency: 0.0-0.2ms per query
- Lazy content load: 0.4ms per skill activation

Closes #725
1 parent f874251 commit 4eca853
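
The commit message describes a BM25 index in skill/search.ts, but that file is not part of the diff shown on this page. As a rough illustration of the ranking idea only, here is a minimal BM25 sketch. Everything in it is an assumption for illustration: the class name `TinyBm25`, the `Doc` and `Hit` shapes, the tokenizer, and the `K1`/`B` constants (common textbook defaults). It also omits the synonym expansion the commit mentions.

```typescript
// Hypothetical minimal BM25 index; NOT the commit's skill/search.ts.
interface Doc { name: string; text: string }
interface Hit { name: string; score: number }

const K1 = 1.2; // term-frequency saturation (assumed default)
const B = 0.75; // document-length normalisation (assumed default)

// Lowercase and split into alphanumeric tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

class TinyBm25 {
  private docs: { name: string; tf: Map<string, number>; len: number }[] = [];
  private df = new Map<string, number>(); // term -> number of docs containing it
  private avgLen = 0;

  build(docs: readonly Doc[]): void {
    this.docs = [];
    this.df = new Map();
    let total = 0;
    for (const doc of docs) {
      const tokens = tokenize(`${doc.name} ${doc.text}`);
      const tf = new Map<string, number>();
      for (const t of tokens) tf.set(t, (tf.get(t) ?? 0) + 1);
      tf.forEach((_count, term) => {
        this.df.set(term, (this.df.get(term) ?? 0) + 1);
      });
      this.docs.push({ name: doc.name, tf, len: tokens.length });
      total += tokens.length;
    }
    this.avgLen = this.docs.length === 0 ? 0 : total / this.docs.length;
  }

  search(query: string, limit = 5): Hit[] {
    const terms = Array.from(new Set(tokenize(query)));
    const n = this.docs.length;
    const hits: Hit[] = [];
    for (const doc of this.docs) {
      let score = 0;
      for (const term of terms) {
        const f = doc.tf.get(term) ?? 0;
        if (f === 0) continue;
        const df = this.df.get(term) ?? 0;
        // Non-negative IDF variant: ln(1 + (N - df + 0.5) / (df + 0.5)).
        const idf = Math.log(1 + (n - df + 0.5) / (df + 0.5));
        const norm = f + K1 * (1 - B + (B * doc.len) / this.avgLen);
        score += idf * ((f * (K1 + 1)) / norm);
      }
      if (score > 0) hits.push({ name: doc.name, score });
    }
    return hits.sort((a, b) => b.score - a.score).slice(0, limit);
  }
}
```

Skills that share no term with the query are dropped rather than returned with a zero score, which keeps result lists short for a model that only needs a handful of candidates.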

11 files changed

Lines changed: 625 additions & 14 deletions

.changeset/skill-search-bm25.md

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+---
+"@moonshot-ai/kimi-code": minor
+---
+
+Add BM25-based skill search for large catalogues. When >80 skills are installed, the system prompt switches from a full listing to a compact name-only format and the model discovers skills via the Skill tool's new `action: "search"` endpoint. Startup memory reduced ~95% via lazy content loading.

packages/agent-core/src/profile/default/system.md

Lines changed: 5 additions & 2 deletions
@@ -148,9 +148,12 @@ Skills are grouped by scope (`Project`, `User`, `Extra`, `Built-in`) so you can
 
 ## How to use skills
 
-Identify the skills that are likely to be useful for the tasks you are currently working on, read the skill file for detailed instructions, guidelines, scripts and more.
+When you need a skill, follow this two-step process:
 
-Only read skill details when needed to conserve the context window.
+1. **Search**: Call the `Skill` tool with `action: "search"` and relevant keywords to find matching skills. The search returns ranked results instantly.
+2. **Load**: Once you identify the right skill from search results, call the `Skill` tool with `action: "load"` and the skill name to load its full instructions into context.
+
+Only read skill details when needed to conserve the context window. Do NOT guess skill names; always search first when the skill listing above does not contain enough detail.
 
 # Ultimate Reminders
 
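
The two-step workflow in system.md can be pictured as a pair of tool calls. The sketch below is an assumption-heavy illustration: only `action: "search"` and `action: "load"` come from the commit, while the `query`, `limit`, and `name` field names, the `SkillToolCall` type, and the `loadCallFor` helper are hypothetical (skill-tool.ts is not shown in this diff).

```typescript
// Hypothetical argument shapes for the Skill tool. Only the `action`
// values are taken from the commit; every other field name is assumed.
type SkillToolCall =
  | { action: 'search'; query: string; limit?: number }
  | { action: 'load'; name: string };

// Step 1: search with keywords taken from the user's request.
const step1: SkillToolCall = { action: 'search', query: 'pdf table extraction' };

// Step 2: load the top-ranked result by name, never a guessed name.
function loadCallFor(results: readonly { name: string }[]): SkillToolCall | undefined {
  const top = results[0];
  return top === undefined ? undefined : { action: 'load', name: top.name };
}

const step2 = loadCallFor([{ name: 'pdf-extract' }]);
```

Returning `undefined` when the search comes back empty mirrors the prompt's rule that the model should not fall back to guessing a skill name.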

packages/agent-core/src/skill/index.ts

Lines changed: 1 addition & 0 deletions
@@ -2,4 +2,5 @@ export * from './builtin';
 export * from './parser';
 export * from './registry';
 export * from './scanner';
+export * from './search';
 export * from './types';

packages/agent-core/src/skill/parser.ts

Lines changed: 55 additions & 0 deletions
@@ -1,3 +1,4 @@
+import { createReadStream } from 'node:fs';
 import { readFile } from 'node:fs/promises';
 import path from 'pathe';
 
@@ -8,6 +9,13 @@ import type { SkillDefinition, SkillMetadata, SkillSource } from './types';
 import { isSupportedSkillType } from './types';
 import { escapeXmlTags } from '../utils/xml-escape';
 
+/**
+ * Sentinel stored in SkillDefinition.content when only frontmatter was
+ * parsed at startup. renderSkillPrompt() checks for this to decide
+ * whether to lazy-load the full body from disk.
+ */
+export const LAZY_CONTENT_SENTINEL = '\u0000LAZY';
+
 export class FrontmatterError extends Error {
   constructor(message: string, cause?: unknown) {
     super(message);
@@ -79,6 +87,53 @@ export async function parseSkillFromFile(options: ParseSkillOptions): Promise<Sk
   return parseSkillText({ ...options, text });
 }
 
+/**
+ * Read only the frontmatter from a SKILL.md file, setting `content` to `LAZY_CONTENT_SENTINEL`.
+ * The body is not parsed here; callers can load it later via
+ * `readFile` + `parseSkillText` when the full content is actually needed.
+ *
+ * This avoids loading the full body of thousands of SKILL files into memory
+ * at startup when only the index (name, description) is needed.
+ */
+export async function parseSkillMetaFromFile(options: ParseSkillOptions): Promise<SkillDefinition> {
+  const stream = createReadStream(options.skillMdPath, { encoding: 'utf8', highWaterMark: 4096 });
+  let buffer = '';
+  let fenceCount = 0;
+
+  try {
+    for await (const chunk of stream) {
+      buffer += chunk;
+      const fences = buffer.match(/^---\s*$/gm);
+      if (fences !== null && fences.length >= 2) {
+        fenceCount = 2;
+        break;
+      }
+    }
+  } finally {
+    stream.close();
+  }
+
+  if (fenceCount < 2) {
+    return parseSkillFromFile(options);
+  }
+
+  // M1 fix: find second fence with line-anchored regex (not indexOf)
+  const lines = buffer.split(/\r?\n/);
+  let offset = 0;
+  let fencesFound = 0;
+  for (const line of lines) {
+    if (/^---\s*$/.test(line)) {
+      fencesFound++;
+      if (fencesFound === 2) break;
+    }
+    offset += line.length + 1;
+  }
+
+  const frontmatterOnly = buffer.slice(0, offset + 3);
+  const result = parseSkillText({ ...options, text: frontmatterOnly });
+  return { ...result, content: LAZY_CONTENT_SENTINEL };
+}
+
 export function parseFrontmatter(text: string): ParsedFrontmatter {
   const lines = text.split(/\r?\n/);
   if (lines[0]?.trim() !== FENCE) {
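
One detail in `parseSkillMetaFromFile` may deserve a second look: the offset is accumulated as `line.length + 1` after splitting on `/\r?\n/`, which appears to assume one-character line endings. With CRLF files the computed offset would fall short by one character per preceding line, so the slice could cut into the closing fence. A possible alternative, sketched below under the assumption that a pure string helper is acceptable here, lets the regex engine report the true end position. The name `sliceFrontmatter` is hypothetical and not part of the commit.

```typescript
/**
 * Hypothetical helper: returns `text` up to and including the second
 * line-anchored `---` fence, or undefined if fewer than two fences exist.
 * Using the regex's own lastIndex avoids recomputing offsets by hand, so
 * LF and CRLF inputs are handled the same way.
 */
function sliceFrontmatter(text: string): string | undefined {
  // A fence is a whole line of `---` with optional trailing blanks;
  // `\r?` absorbs the carriage return of a CRLF line ending.
  const fence = /^---[ \t]*\r?$/gm;
  let found = 0;
  while (fence.exec(text) !== null) {
    found += 1;
    // With the `g` flag, lastIndex points just past the current match.
    if (found === 2) return text.slice(0, fence.lastIndex);
  }
  return undefined;
}
```

When the input arrives in chunks, as in the streaming loop above, a buffer that ends partway through a line can make the trailing `$` match early, so a caller would probably want to run this only on complete lines or re-check after the next chunk arrives.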

packages/agent-core/src/skill/registry.ts

Lines changed: 94 additions & 9 deletions
@@ -1,11 +1,28 @@
-import { expandSkillParameters, skillArgumentNames } from './parser';
+import { readFileSync } from 'node:fs';
+
+import { LAZY_CONTENT_SENTINEL, expandSkillParameters, skillArgumentNames, parseSkillMetaFromFile, parseSkillText } from './parser';
 import { discoverSkills, type DiscoverSkillsOptions } from './scanner';
+import { SkillSearchIndex, type SkillSearchResult } from './search';
 import type { SkillDefinition, SkillRoot, SkillSource, SkippedSkill } from './types';
 import { isInlineSkillType, normalizeSkillName } from './types';
 import { escapeXmlAttr } from '../utils/xml-escape';
 
 const LISTING_DESC_MAX = 250;
 
+/**
+ * Above this threshold, getModelSkillListing() switches to a compact
+ * listing and tells the model to use the `Skill` tool's `action: "search"`.
+ * At or below it, the legacy full listing is injected into the system prompt
+ * (cheaper for prompt caching with small catalogues).
+ */
+const COMPACT_LISTING_THRESHOLD = 80;
+
+/**
+ * Above this threshold, the compact listing drops descriptions entirely
+ * and lists only skill names.
+ */
+const NAMES_ONLY_LISTING_THRESHOLD = 300;
+
 export class SkillNotFoundError extends Error {
   readonly skillName: string;
 
@@ -30,6 +47,9 @@ export class SkillRegistry {
   private readonly discoverImpl: typeof discoverSkills;
   private readonly onWarning: (message: string, cause?: unknown) => void;
   readonly sessionId?: string;
+  private readonly searchIndex = new SkillSearchIndex();
+
+  private indexDirty = false;
 
   constructor(options: SkillRegistryOptions = {}) {
     this.discoverImpl = options.discover ?? discoverSkills;
@@ -42,8 +62,13 @@ export class SkillRegistry {
       if (!this.roots.includes(root.path)) this.roots.push(root.path);
     }
 
+    // Only parse frontmatter at startup (name, description, whenToUse).
+    // The full body is loaded on demand when renderSkillPrompt() is called.
+    // This saves ~95% memory for large skill catalogues.
+
     const skills = await this.discoverImpl({
       roots,
+      parse: parseSkillMetaFromFile,
       onWarning: this.onWarning,
       onSkippedByPolicy: (skill) => this.skipped.push(skill),
       onDiscoveredSkill: (skill) => {
@@ -54,6 +79,10 @@ export class SkillRegistry {
     for (const skill of skills) {
       this.byName.set(normalizeSkillName(skill.name), skill);
     }
+
+    // Build the BM25 search index so the model can discover skills
+    // via the `Skill` tool's `action: "search"` instead of scanning a full listing.
+    this.searchIndex.build(this.listInvocableSkills());
   }
 
   registerBuiltinSkill(skill: SkillDefinition): void {
@@ -64,6 +93,7 @@ export class SkillRegistry {
     const key = normalizeSkillName(skill.name);
     if (options.replace === true || !this.byName.has(key)) {
       this.byName.set(key, skill);
+      this.indexDirty = true;
     }
     this.indexPluginSkill(skill, options);
   }
@@ -88,8 +118,22 @@ export class SkillRegistry {
   }
 
   renderSkillPrompt(skill: SkillDefinition, rawArgs: string): string {
+    // Lazy content loading: when only frontmatter was parsed at startup,
+    // `content` holds LAZY_CONTENT_SENTINEL. Read the full file now (sync, only for activated skills).
+    let content = skill.content;
+    if (content === LAZY_CONTENT_SENTINEL && skill.path.length > 0) {
+      const text = readFileSync(skill.path, 'utf8');
+      const full = parseSkillText({
+        skillMdPath: skill.path,
+        skillDirName: skill.dir.split('/').pop() ?? skill.dir,
+        source: skill.source,
+        text,
+      });
+      content = full.content;
+    }
+
     const argumentNames = skillArgumentNames(skill.metadata);
-    const content = expandSkillParameters(skill.content, rawArgs, {
+    content = expandSkillParameters(content, rawArgs, {
       skillDir: skill.dir,
       sessionId: this.sessionId,
       argumentNames,
@@ -129,16 +173,47 @@ export class SkillRegistry {
     return rendered.length === 0 ? 'No skills' : rendered;
   }
 
+  /**
+   * Search skills by free-text query. Delegates to the BM25 index.
+   * Lazily rebuilds the index if skills were registered since the last build.
+   */
+  searchSkills(query: string, limit?: number): readonly SkillSearchResult[] {
+    if (this.indexDirty) {
+      this.searchIndex.build(this.listInvocableSkills());
+      this.indexDirty = false;
+    }
+    return this.searchIndex.search(query, limit);
+  }
+
   getModelSkillListing(): string {
-    const lines = ['DISREGARD any earlier skill listings. Current available skills:'];
-    const listing = renderGroupedSkills(
-      this.listInvocableSkills().filter((skill) => skill.metadata.isSubSkill !== true),
-      formatModelSkill,
+    const invocable = this.listInvocableSkills().filter(
+      (skill) => skill.metadata.isSubSkill !== true,
     );
-    if (listing.length > 0) {
-      lines.push(listing);
+
+    // Auto-detect: small catalogue → legacy full listing.
+    // Large catalogue → compact/names-only + search-first.
+    if (invocable.length <= COMPACT_LISTING_THRESHOLD) {
+      const lines = ['DISREGARD any earlier skill listings. Current available skills:'];
+      const listing = renderGroupedSkills(invocable, formatModelSkill);
+      if (listing.length > 0) lines.push(listing);
+      return lines.length === 1 ? '' : lines.join('\n');
     }
-    return lines.length === 1 ? '' : lines.join('\n');
+
+    // Tier 2+3: Large catalogue, search-first.
+    const count = invocable.length;
+    const format = count > NAMES_ONLY_LISTING_THRESHOLD
+      ? formatNameOnlySkill
+      : formatCompactSkill;
+    const lines = [
+      `You have access to ${String(count)} registered skills.`,
+      'To find relevant skills, call the `Skill` tool with `action: "search"` and keywords from the user\'s request.',
+      'Do NOT guess skill names; always search first, then load with `action: "load"`.',
+      '',
+      'Skill names by scope:',
+    ];
+    const listing = renderGroupedSkills(invocable, format);
+    if (listing.length > 0) lines.push(listing);
+    return lines.join('\n');
   }
 }
 
@@ -182,6 +257,16 @@ function formatModelSkill(skill: SkillDefinition): readonly string[] {
   return lines;
 }
 
+/** Compact format: name + 80-char description, no path. */
+function formatCompactSkill(skill: SkillDefinition): readonly string[] {
+  return [`- ${skill.name}: ${truncate(skill.description, 80)}`];
+}
+
+/** Minimal format: name only. Used for catalogues > 300 skills. */
+function formatNameOnlySkill(skill: SkillDefinition): readonly string[] {
+  return [`- ${skill.name}`];
+}
+
 function truncate(value: string, max: number): string {
   return value.length > max ? value.slice(0, max) : value;
 }
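
The tier boundaries are easy to get off by one, so it may help to see them isolated from the rendering code. The sketch below reuses the two threshold values from the diff; the `ListingTier` type and the `listingTier` function are hypothetical names introduced only for illustration and do not appear in the commit.

```typescript
// Threshold values copied from registry.ts in this diff.
const COMPACT_LISTING_THRESHOLD = 80;
const NAMES_ONLY_LISTING_THRESHOLD = 300;

// Hypothetical names for the three tiers described in the commit message.
type ListingTier = 'full' | 'compact' | 'names-only';

// Mirrors the comparisons in getModelSkillListing():
// `<= 80` keeps the legacy listing, `> 300` drops descriptions.
function listingTier(count: number): ListingTier {
  if (count <= COMPACT_LISTING_THRESHOLD) return 'full';
  return count > NAMES_ONLY_LISTING_THRESHOLD ? 'names-only' : 'compact';
}
```

Under these comparisons a catalogue of exactly 80 skills still gets the full listing and exactly 300 still gets descriptions, which matches the "≤ 80 / 81-300 / > 300" tiers in the commit message.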
