Commit b0333ee

Add theme momentum coverage quality metrics
1 parent 92b7d5e commit b0333ee

6 files changed

Lines changed: 54 additions & 2 deletions

README.md

Lines changed: 17 additions & 2 deletions
@@ -84,6 +84,19 @@ plugin contract.
 6. Downstream runtimes must treat the artifact as advisory context only until a
    separate deterministic policy engine explicitly consumes it.

+
+## Name and Horizon Boundary
+
+The repository name remains acceptable for now because this repo owns the
+long-horizon AI shadow context and cross-sector theme research artifacts.
+Short/medium/long final recommendations are produced by
+`QuantAdvisorResearch`, not by this repository.
+
+If the theme-momentum layer later becomes broader than AI context, a future
+rename such as `LongHorizonResearchSignals` can be considered, but that should
+be a separate migration because GitHub repo links, cross-repo checkout paths,
+and documentation references would all need updates.
+
 ## GitHub Configuration

 The model API keys are centralized in `CodexAuditBridge`; do not add
@@ -263,5 +276,7 @@ failures are recorded in `data_quality.missing_price_symbols` by default;
 `--strict-downloads` turns those into hard failures.

 The snapshot records fixed 12-1m, 6-1m, and 3m momentum windows, breadth, risk
-penalties, top symbols per theme, and a policy block that keeps the artifact
-research-only.
+penalties, top symbols per theme, source metadata, and a policy block that keeps
+the artifact research-only. `data_quality.coverage` now records configured
+symbol count, priced symbol count, price coverage ratio, and symbols with
+insufficient price history.
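
A downstream reader could use the new fields roughly as follows. This is a minimal sketch rather than code from the commit: `coverage_warnings` and the `min_ratio` default of 0.9 are hypothetical, and the function only reports warnings, in keeping with the artifact being research-only. In the snapshot built by this commit, `data_quality.coverage` holds `insufficient_history_symbol_count`, while the symbol list itself sits at `data_quality.insufficient_history_symbols`.

```python
from typing import Any


def coverage_warnings(snapshot: dict[str, Any], min_ratio: float = 0.9) -> list[str]:
    """Return human-readable warnings about price coverage in a snapshot.

    Hypothetical helper: the name and the min_ratio threshold are assumptions,
    not part of the repository.
    """
    data_quality = snapshot.get("data_quality", {})
    coverage = data_quality.get("coverage", {})
    warnings: list[str] = []

    # The ratio is None when no symbols are configured, so handle that first.
    ratio = coverage.get("price_coverage_ratio")
    if ratio is None:
        warnings.append("price coverage ratio unavailable")
    elif ratio < min_ratio:
        warnings.append(f"price coverage {ratio:.2f} is below {min_ratio:.2f}")

    # The symbol list lives next to `coverage`, not inside it.
    insufficient = data_quality.get("insufficient_history_symbols", [])
    if insufficient:
        warnings.append("insufficient price history: " + ", ".join(insufficient))
    return warnings


# Illustrative snapshot fragment mirroring the missing-price test case.
example = {
    "data_quality": {
        "coverage": {
            "configured_symbol_count": 2,
            "priced_symbol_count": 1,
            "price_coverage_ratio": 0.5,
            "insufficient_history_symbol_count": 0,
        },
        "missing_price_symbols": ["SMCI"],
        "insufficient_history_symbols": [],
    }
}
print(coverage_warnings(example))
```

Returning a list of messages instead of raising keeps the check advisory; a stricter caller could still choose to fail when the list is non-empty.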

README.zh-CN.md

Lines changed: 8 additions & 0 deletions
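@@ -68,6 +68,13 @@ data/output/signal_history/2026-05-28.json
 5. All AI-generated artifacts must keep `mode=shadow` and pass local schema validation.
 6. Downstream systems may treat the artifact only as advisory context until a separate deterministic policy engine explicitly consumes it.

+
+## Name and Horizon Boundary
+
+Renaming the repository is not recommended for now. `AiLongHorizonSignalPipelines` still accurately describes this repository's core responsibility: maintaining the long-horizon AI shadow context and cross-sector theme research artifacts. The final short-, medium-, and long-term recommendations are generated in `QuantAdvisorResearch` and are not output directly by this repository.
+
+If the theme-momentum layer later expands clearly into a more general research-signal repository, a rename such as `LongHorizonResearchSignals` can be evaluated separately. That would require migrating GitHub repository links, cross-repo checkout paths, and documentation references, so it should not be bundled with this data-quality enhancement.
+
 ## GitHub Configuration

 The model API keys are centralized in `CodexAuditBridge`; do not put `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` in this repository.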
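@@ -216,6 +223,7 @@ python scripts/build_theme_momentum_snapshot.py \
 - `theme_ranks`: theme rank, momentum score, breadth, risk penalty, and top symbols within each theme
 - `methodology`: fixed windows and weights, to support later walk-forward replay
 - `policy`: states explicitly that this is a research ranking; order placement and position allocation are not allowed
+- `data_quality.coverage`: configured symbol count, priced symbol count, price coverage ratio, and symbols with insufficient price history

 Current fixed windows:
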
docs/architecture.md

Lines changed: 9 additions & 0 deletions
@@ -138,3 +138,12 @@ The artifact is point-in-time research context. It ranks themes and highlights
 strong members inside a theme, but it does not encode orders, target weights, or
 execution policy. Future replay must consume saved snapshots rather than
 recomputing old theme ranks with revised constituents or revised weights.
+
+## Repository Name Decision
+
+Keep `AiLongHorizonSignalPipelines` for now. The short/medium/long final
+recommendation buckets live in `QuantAdvisorResearch`; this repository still
+provides long-horizon AI shadow context plus cross-sector theme artifacts. A
+rename to `LongHorizonResearchSignals` may be reasonable later, but it should be
+a deliberate migration because cross-repository workflow checkouts and public
+links would need updates.

scripts/build_theme_momentum_snapshot.py

Lines changed: 2 additions & 0 deletions
@@ -107,7 +107,9 @@ def main() -> int:
                 "ranked_theme_count": snapshot["summary"]["ranked_theme_count"],
                 "priced_symbol_count": snapshot["summary"]["priced_symbol_count"],
                 "top_theme_ids": snapshot["summary"]["top_theme_ids"],
+                "price_coverage_ratio": snapshot["data_quality"]["coverage"]["price_coverage_ratio"],
                 "missing_price_symbols": snapshot["data_quality"]["missing_price_symbols"],
+                "insufficient_history_symbols": snapshot["data_quality"].get("insufficient_history_symbols", []),
             }
         )
     )

src/ai_long_horizon_signal_pipelines/theme_momentum.py

Lines changed: 14 additions & 0 deletions
@@ -133,6 +133,11 @@ def build_theme_momentum_snapshot(
         for symbol, symbol_rows in sorted(rows_by_symbol.items())
         if symbol_rows
     }
+    exposure_symbols = sorted({symbol.upper() for symbol in exposures})
+    priced_exposure_symbols = [symbol for symbol in exposure_symbols if symbol in symbol_scores]
+    insufficient_history_symbols = sorted(
+        symbol for symbol in priced_exposure_symbols if symbol_scores[symbol]["momentum_score"] is None
+    )
     latest_dates = [parse_price_date(item["as_of"]) for item in symbol_scores.values()]
     snapshot_as_of = (as_of_date or max(latest_dates)).isoformat() if latest_dates or as_of_date else dt.date.today().isoformat()

@@ -231,7 +236,16 @@ def build_theme_momentum_snapshot(
         },
         "theme_ranks": theme_ranks,
         "data_quality": {
+            "coverage": {
+                "configured_symbol_count": len(exposure_symbols),
+                "priced_symbol_count": len(priced_exposure_symbols),
+                "price_coverage_ratio": round_optional(
+                    len(priced_exposure_symbols) / len(exposure_symbols) if exposure_symbols else None
+                ),
+                "insufficient_history_symbol_count": len(insufficient_history_symbols),
+            },
             "missing_price_symbols": sorted(missing_price_symbols),
+            "insufficient_history_symbols": insufficient_history_symbols,
             "unranked_themes": sorted(unranked_themes),
         },
         "policy": {

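The coverage arithmetic in the two hunks above can be exercised in isolation. The following is a sketch under stated assumptions, not the repository's implementation: `summarize_coverage` is a hypothetical wrapper, `round_optional` here is a stand-in for the repository helper of the same name (its real rounding precision is not shown in the diff), and `exposures` is assumed to be a mapping keyed by symbol.

```python
from typing import Any, Mapping, Optional


def round_optional(value: Optional[float], digits: int = 6) -> Optional[float]:
    """Stand-in for the repository helper; the 6-digit precision is an assumption."""
    return None if value is None else round(value, digits)


def summarize_coverage(
    exposures: Mapping[str, Any],
    symbol_scores: Mapping[str, Mapping[str, Any]],
) -> dict[str, Any]:
    """Hypothetical wrapper around the coverage logic added in this commit."""
    # Configured universe: every exposure symbol, upper-cased and de-duplicated.
    exposure_symbols = sorted({symbol.upper() for symbol in exposures})
    # A symbol counts as priced as soon as it has a score entry at all.
    priced = [symbol for symbol in exposure_symbols if symbol in symbol_scores]
    # Priced, but without enough history to produce a momentum score.
    insufficient = sorted(
        symbol for symbol in priced if symbol_scores[symbol]["momentum_score"] is None
    )
    # Guard against an empty universe instead of dividing by zero.
    ratio = len(priced) / len(exposure_symbols) if exposure_symbols else None
    return {
        "coverage": {
            "configured_symbol_count": len(exposure_symbols),
            "priced_symbol_count": len(priced),
            "price_coverage_ratio": round_optional(ratio),
            "insufficient_history_symbol_count": len(insufficient),
        },
        "insufficient_history_symbols": insufficient,
    }


# Illustrative inputs: SMCI has no prices, NVDA has prices but no momentum score.
result = summarize_coverage(
    {"mu": 1.0, "SMCI": 1.0, "nvda": 1.0},
    {"MU": {"momentum_score": 0.4}, "NVDA": {"momentum_score": None}},
)
print(result)
```

One consequence of this design is that a symbol with too little history still counts toward `priced_symbol_count`, so `price_coverage_ratio` does not drop for it; only `insufficient_history_symbol_count` and the symbol list reveal the gap.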
tests/test_theme_momentum.py

Lines changed: 4 additions & 0 deletions
@@ -63,6 +63,7 @@ def test_theme_momentum_ranks_strong_broad_theme_first() -> None:
     assert [item["symbol"] for item in ranked[0]["top_symbols"]] == ["MU", "HBM2"]
     assert ranked[0]["momentum_score"] > ranked[1]["momentum_score"]
     assert snapshot["policy"]["execution_allowed"] is False
+    assert snapshot["data_quality"]["coverage"]["price_coverage_ratio"] == 1.0


 def test_theme_momentum_records_missing_price_coverage() -> None:
@@ -86,5 +87,8 @@ def test_theme_momentum_records_missing_price_coverage() -> None:
     snapshot = build_theme_momentum_snapshot(rows, themes=themes, exposures=exposures)

     assert snapshot["data_quality"]["missing_price_symbols"] == ["SMCI"]
+    assert snapshot["data_quality"]["coverage"]["configured_symbol_count"] == 2
+    assert snapshot["data_quality"]["coverage"]["priced_symbol_count"] == 1
+    assert snapshot["data_quality"]["coverage"]["price_coverage_ratio"] == 0.5
     assert snapshot["theme_ranks"][0]["component_count"] == 2
     assert snapshot["theme_ranks"][0]["priced_symbol_count"] == 1
