Commit b1b4979

Support structured signal context bias
1 parent d9461b9 commit b1b4979

7 files changed

Lines changed: 175 additions & 18 deletions

README.md

Lines changed: 23 additions & 0 deletions
@@ -215,6 +215,29 @@ data/output/signal_history/YYYY-MM-DD.json
 All artifacts must remain shadow-only. They cannot encode broker orders, target
 quantities, or live allocation overrides.
 
+`candidate_bias` and `theme_bias` may use either the legacy compact form:
+
+```json
+{"MU": "watch"}
+```
+
+or the structured audit form:
+
+```json
+{
+  "MU": {
+    "bias": "watch",
+    "confidence": 0.55,
+    "linked_themes": ["hbm_memory"],
+    "rationale": "Shadow context only; not a trade instruction."
+  }
+}
+```
+
+`symbol_bias` is optional and uses the same structured shape for symbol-specific
+long-horizon context. Downstream Advisor code treats these fields as context and
+still blocks orders, target quantities, and portfolio weights.
+
 ## Replay Contract
 
 Historical validation should replay stored signal artifacts instead of asking a
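
Review note: the README hunk above allows two shapes for the same field, so a consumer has to handle both. The following is a minimal sketch of one way to read them uniformly. `normalize_bias_entry` is a hypothetical helper introduced here for illustration and is not part of this commit; the field names are taken from the README example.

```python
from typing import Any, Mapping


def normalize_bias_entry(entry: Any) -> dict[str, Any]:
    """Return a structured dict for a compact string or a structured mapping.

    Hypothetical helper for illustration; not part of the repository.
    """
    if isinstance(entry, str):
        # Legacy compact form: the string is the bias label itself.
        return {"bias": entry}
    if isinstance(entry, Mapping):
        # Structured form: copy so the caller's artifact is not mutated.
        return dict(entry)
    raise TypeError(f"unsupported bias entry type: {type(entry).__name__}")


compact = normalize_bias_entry("watch")
structured = normalize_bias_entry({"bias": "watch", "confidence": 0.55})
```

Both results expose a `bias` key, so downstream code that only reads context can ignore which form the artifact used.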

README.zh-CN.md

Lines changed: 21 additions & 0 deletions
@@ -178,6 +178,27 @@ data/output/signal_history/YYYY-MM-DD.json
 
 All artifacts must remain shadow-only. They cannot encode broker orders, target quantities, or live allocation overrides.
 
+`candidate_bias` and `theme_bias` support two forms. The legacy compact form is still accepted:
+
+```json
+{"MU": "watch"}
+```
+
+The structured form, which is better suited to auditing, is also supported:
+
+```json
+{
+  "MU": {
+    "bias": "watch",
+    "confidence": 0.55,
+    "linked_themes": ["hbm_memory"],
+    "rationale": "Shadow context only; not a trade instruction."
+  }
+}
+```
+
+`symbol_bias` is an optional field that uses the same structure to express long-horizon context for a single symbol. The downstream Advisor treats these fields only as context and still prohibits orders, target share counts, and portfolio weights.
+
 ## Replay Contract
 
 Historical validation must replay saved signal artifacts rather than let the model regenerate past judgments. The current example policy is intentionally conservative:

docs/architecture.md

Lines changed: 5 additions & 4 deletions
@@ -106,16 +106,17 @@ The taxonomy intentionally covers multiple durable sectors:
 - consumer platforms, industrial automation, EV/auto, and crypto infrastructure
 
 Theme membership is static research context. A symbol is not added to a theme
-just because it is hot this month. Monthly AI output may express `theme_bias`,
-but downstream consumers must keep that output shadow-only and replay saved
-artifacts point-in-time.
+just because it is hot this month. Monthly AI output may express `theme_bias`
+and optional `symbol_bias`; both can use structured values with bias,
+confidence, linked themes, rationale, and risk flags. Downstream consumers must
+keep that output shadow-only and replay saved artifacts point-in-time.
 
 This is the anti-overfit boundary:
 
 1. Define universe and theme exposure before looking at future returns.
 2. Save every AI theme judgment as an artifact.
 3. Replay only saved artifacts; never regenerate old model judgments.
-4. Treat theme bias as context, not as execution or allocation.
+4. Treat theme and symbol bias as context, not as execution or allocation.
 
 ## Horizon Boundary
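
Review note: the "replay saved artifacts point-in-time" rule in this hunk can be illustrated with a small loader. This is a sketch under assumptions, not the repository's replay code: `load_artifact_as_of` is a hypothetical name, and the only thing taken from the source is the `YYYY-MM-DD.json` file naming shown in the README hunk headers.

```python
import json
from datetime import date
from pathlib import Path
from typing import Any, Optional


def load_artifact_as_of(history_dir: Path, as_of: date) -> Optional[dict[str, Any]]:
    """Load the newest saved artifact dated on or before `as_of`, else None.

    Hypothetical helper; it never regenerates a judgment, it only reads files.
    """
    candidates = []
    for path in history_dir.glob("*.json"):
        try:
            artifact_date = date.fromisoformat(path.stem)
        except ValueError:
            continue  # skip files that are not named YYYY-MM-DD.json
        if artifact_date <= as_of:
            candidates.append((artifact_date, path))
    if not candidates:
        return None
    _, newest = max(candidates)
    return json.loads(newest.read_text(encoding="utf-8"))
```

Filtering on the file date before loading keeps a replay for a given day from seeing any artifact written after that day.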

examples/latest_signal.example.json

Lines changed: 65 additions & 4 deletions
@@ -58,10 +58,42 @@
     "downstream_use": "Shadow context only; deterministic policy must explicitly opt in before any future use."
   },
   "theme_bias": {
-    "ai_compute": "watch",
-    "hbm_memory": "watch",
-    "ai_server_infrastructure": "watch",
-    "foundry_semicap": "watch",
+    "ai_compute": {
+      "bias": "watch",
+      "confidence": 0.42,
+      "horizon": "1-3 years",
+      "rationale": "AI infrastructure demand is durable but requires valuation and capex review.",
+      "risk_flags": [
+        "valuation_sensitive"
+      ]
+    },
+    "hbm_memory": {
+      "bias": "positive",
+      "confidence": 0.58,
+      "horizon": "1-3 years",
+      "rationale": "HBM and high-end memory remain linked to AI server buildout.",
+      "risk_flags": [
+        "memory_cycle_risk"
+      ]
+    },
+    "ai_server_infrastructure": {
+      "bias": "watch",
+      "confidence": 0.5,
+      "horizon": "1-3 years",
+      "rationale": "AI server infrastructure demand remains visible but margins and order conversion need review.",
+      "risk_flags": [
+        "margin_risk"
+      ]
+    },
+    "foundry_semicap": {
+      "bias": "watch",
+      "confidence": 0.46,
+      "horizon": "1-3 years",
+      "rationale": "Foundry and semiconductor capital spending are long-cycle context themes.",
+      "risk_flags": [
+        "capex_cycle_risk"
+      ]
+    },
     "defense_aerospace": "watch",
     "healthcare_policy": "neutral",
     "energy_security": "neutral",
@@ -92,5 +124,34 @@
     "XLF": [
       "financial_market_infrastructure"
     ]
+  },
+  "symbol_bias": {
+    "MU": {
+      "bias": "watch",
+      "confidence": 0.54,
+      "linked_themes": [
+        "hbm_memory",
+        "ai_compute"
+      ],
+      "rationale": "HBM exposure is positive context, but memory cycle and valuation still need confirmation."
+    },
+    "INTC": {
+      "bias": "watch",
+      "confidence": 0.45,
+      "linked_themes": [
+        "foundry_semicap",
+        "ai_compute"
+      ],
+      "rationale": "Foundry and domestic semiconductor policy are long-horizon context, execution risk remains high."
+    },
+    "DELL": {
+      "bias": "watch",
+      "confidence": 0.47,
+      "linked_themes": [
+        "ai_server_infrastructure",
+        "ai_compute"
+      ],
+      "rationale": "AI server demand is relevant, while margin and backlog conversion need review."
+    }
   }
 }
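
Review note: every `linked_themes` value in the example points at a key that also appears in `theme_bias`. The validator in this commit does not appear to enforce that link, so the sketch below shows one way an audit script might check it. `unknown_linked_themes` is a hypothetical helper, and the `XYZ` symbol and `not_a_theme` id are made up for the demonstration.

```python
from typing import Any, Mapping


def unknown_linked_themes(signal: Mapping[str, Any]) -> dict[str, list[str]]:
    """Map each symbol to linked themes that have no entry in theme_bias.

    Hypothetical audit helper; not part of the repository.
    """
    known = set(signal.get("theme_bias", {}))
    problems: dict[str, list[str]] = {}
    for symbol, entry in signal.get("symbol_bias", {}).items():
        if not isinstance(entry, Mapping):
            continue  # the compact string form carries no linked_themes
        missing = [t for t in entry.get("linked_themes", []) if t not in known]
        if missing:
            problems[symbol] = missing
    return problems


signal = {
    "theme_bias": {"hbm_memory": {"bias": "positive"}, "ai_compute": "watch"},
    "symbol_bias": {
        "MU": {"bias": "watch", "linked_themes": ["hbm_memory", "ai_compute"]},
        "XYZ": {"bias": "watch", "linked_themes": ["not_a_theme"]},  # made-up entry
    },
}
report = unknown_linked_themes(signal)
```

An empty report would mean every symbol-level link resolves to a theme that the same artifact describes.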

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -17,5 +17,5 @@ test = ["pytest>=8"]
 where = ["src"]
 
 [tool.pytest.ini_options]
-pythonpath = ["src"]
+pythonpath = ["src", "."]
 testpaths = ["tests"]

src/research_signal_context_pipelines/schema.py

Lines changed: 29 additions & 9 deletions
@@ -69,6 +69,13 @@ def _require_string_list(value: Any, name: str, *, allow_empty: bool = False) ->
     return result
 
 
+def _require_number_0_1(value: Any, name: str) -> None:
+    if not isinstance(value, (int, float)) or isinstance(value, bool):
+        raise SignalValidationError(f"{name} must be numeric")
+    if value < 0 or value > 1:
+        raise SignalValidationError(f"{name} must be between 0 and 1")
+
+
 def validate_signal(payload: Mapping[str, Any]) -> None:
     missing = [key for key in REQUIRED_TOP_LEVEL_KEYS if key not in payload]
     if missing:
@@ -94,17 +101,15 @@ def validate_signal(payload: Mapping[str, Any]) -> None:
 
     if "theme_bias" in payload:
         _validate_bias_mapping(_require_mapping(payload["theme_bias"], "theme_bias"), "theme_bias")
+    if "symbol_bias" in payload:
+        _validate_bias_mapping(_require_mapping(payload["symbol_bias"], "symbol_bias"), "symbol_bias")
     if "symbol_theme_exposure" in payload:
         symbol_theme_exposure = _require_mapping(payload["symbol_theme_exposure"], "symbol_theme_exposure")
         for symbol, theme_ids in symbol_theme_exposure.items():
             _require_string(symbol, "symbol_theme_exposure key")
             _require_string_list(theme_ids, f"symbol_theme_exposure[{symbol!r}]")
 
-    confidence = payload["confidence"]
-    if not isinstance(confidence, (int, float)) or isinstance(confidence, bool):
-        raise SignalValidationError("confidence must be numeric")
-    if confidence < 0 or confidence > 1:
-        raise SignalValidationError("confidence must be between 0 and 1")
+    _require_number_0_1(payload["confidence"], "confidence")
 
     evidence = _require_mapping(payload["evidence"], "evidence")
     _require_string_list(evidence.get("sources"), "evidence.sources")
@@ -121,7 +126,22 @@ def validate_signal(payload: Mapping[str, Any]) -> None:
 def _validate_bias_mapping(mapping: Mapping[str, Any], name: str) -> None:
     for key, bias in mapping.items():
         _require_string(key, f"{name} key")
-        if bias not in ALLOWED_BIAS_VALUES:
-            raise SignalValidationError(
-                f"{name}[{key!r}] must be one of: {', '.join(sorted(ALLOWED_BIAS_VALUES))}"
-            )
+        _validate_bias_value(bias, f"{name}[{key!r}]")
+
+
+def _validate_bias_value(value: Any, name: str) -> None:
+    if isinstance(value, str):
+        bias = value
+    else:
+        raw = _require_mapping(value, name)
+        bias = _require_string(raw.get("bias"), f"{name}.bias")
+        if "confidence" in raw:
+            _require_number_0_1(raw["confidence"], f"{name}.confidence")
+        for optional_key in ("rationale", "horizon"):
+            if optional_key in raw:
+                _require_string(raw[optional_key], f"{name}.{optional_key}")
+        for optional_list_key in ("risk_flags", "linked_themes"):
+            if optional_list_key in raw:
+                _require_string_list(raw[optional_list_key], f"{name}.{optional_list_key}", allow_empty=True)
+    if bias not in ALLOWED_BIAS_VALUES:
+        raise SignalValidationError(f"{name} must be one of: {', '.join(sorted(ALLOWED_BIAS_VALUES))}")
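
Review note: `_require_number_0_1` excludes `bool` explicitly because `bool` is a subclass of `int` in Python, so `True` would otherwise be accepted as `1`. The standalone sketch below mirrors the same two conditions to make that behavior visible. `is_unit_interval_number` is a hypothetical stand-in that returns a flag instead of raising; it is not the repository function.

```python
def is_unit_interval_number(value: object) -> bool:
    """Mirror the two checks in the diff, returning a flag instead of raising."""
    if not isinstance(value, (int, float)) or isinstance(value, bool):
        # bool is a subclass of int, so it needs its own exclusion.
        return False
    # Same comparison shape as the diff: reject only when outside [0, 1].
    return not (value < 0 or value > 1)


checks = {
    "0.55": is_unit_interval_number(0.55),
    "1": is_unit_interval_number(1),
    "True": is_unit_interval_number(True),
    "1.5": is_unit_interval_number(1.5),
    "'0.5'": is_unit_interval_number("0.5"),
    "nan": is_unit_interval_number(float("nan")),
}
```

One consequence of this comparison shape is that `float("nan")` is accepted, because both `nan < 0` and `nan > 1` are false. Python's `json.loads` parses a bare `NaN` token by default, so an explicit `math.isnan` check may be worth considering if artifacts can contain it.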

tests/test_signal_validation.py

Lines changed: 31 additions & 0 deletions
@@ -53,9 +53,40 @@ def test_signal_accepts_optional_theme_bias_and_exposure() -> None:
     validate_signal(payload)
 
 
+def test_signal_accepts_structured_theme_and_symbol_bias() -> None:
+    payload = load_example()
+    payload["theme_bias"] = {
+        "hbm_memory": {
+            "bias": "positive",
+            "confidence": 0.62,
+            "horizon": "1-3 years",
+            "rationale": "HBM demand remains a long-horizon research context.",
+            "risk_flags": ["cycle_risk"],
+        }
+    }
+    payload["symbol_bias"] = {
+        "MU": {
+            "bias": "watch",
+            "confidence": 0.55,
+            "linked_themes": ["hbm_memory"],
+            "rationale": "Symbol-level shadow context remains watch-only.",
+        }
+    }
+
+    validate_signal(payload)
+
+
 def test_signal_rejects_invalid_theme_bias() -> None:
     payload = load_example()
     payload["theme_bias"] = {"hbm_memory": "hot"}
 
     with pytest.raises(SignalValidationError, match="theme_bias"):
         validate_signal(payload)
+
+
+def test_signal_rejects_invalid_structured_bias_confidence() -> None:
+    payload = load_example()
+    payload["symbol_bias"] = {"MU": {"bias": "watch", "confidence": 1.5}}
+
+    with pytest.raises(SignalValidationError, match="confidence"):
+        validate_signal(payload)
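
Review note: `pytest.raises(..., match=...)` tests its pattern against the exception text with `re.search`, so `match="confidence"` passes for any message containing that word. The sketch below rebuilds the message that the new rejection test should trigger, using the f-string shapes visible in the `schema.py` hunks, and shows which patterns would match. It is an illustration only and does not import the package.

```python
import re

# Message shape taken from the validator in this commit:
# the nested name is f"{name}[{key!r}]" plus ".confidence".
key = "MU"
name = f"symbol_bias[{key!r}].confidence"
message = f"{name} must be between 0 and 1"

matches_confidence = re.search("confidence", message) is not None
matches_theme_bias = re.search("theme_bias", message) is not None
```

Because the pattern is a regular expression and not a literal, a stricter test could match on `r"symbol_bias\['MU'\]\.confidence"` to pin the failure to the nested field instead of any message that mentions confidence.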
