Skip to content

Commit 14a03ea

Browse files
enriquephlclaude
andcommitted
chore(model-config): unify affinity/pde defaults on gemini-flash-lite
affinity_evaluation now defaults to google/gemini-3.1-flash-lite, with claude-haiku-4.5 demoted to the last fallback (after deepseek-v4-flash): it's cheaper, more NSFW-tolerant, and unifies the per-task model chain with memory/insight extraction. Comment updated to match. Also moves the reserved pde_decision task to gemini-3.1-flash-lite (with a commented haiku fallback, max_tokens 300 -> 400) and bumps chat_companion max_tokens 1500 -> 1600. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent ca681f4 commit 14a03ea

1 file changed

Lines changed: 10 additions & 8 deletions

File tree

examples/model_config.toml

Lines changed: 10 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,7 @@
1313
model = "x-ai/grok-4.20"
1414
fallback = ["thedrummer/cydonia-24b-v4.1", "z-ai/glm-4.7-flash"]
1515
temperature = 0.8
16-
max_tokens = 1500
16+
max_tokens = 1600
1717
# Companion replies only render `content`; the reasoning channel is ignored,
1818
# so disable it to save latency/cost. Same shape as OpenRouter's `reasoning`
1919
# object — set `{ exclude = true }` instead if you want it generated but hidden.
@@ -58,19 +58,21 @@ max_tokens = 800
5858
# decision/classification layer (fast, strong instruction-following); the
5959
# structured-extraction tasks above use gemini-flash-lite instead.
6060
[tasks.pde_decision]
61-
model = "anthropic/claude-haiku-4.5"
61+
model = "google/gemini-3.1-flash-lite"
62+
#fallback = ["deepseek/deepseek-v4-flash", "anthropic/claude-haiku-4.5"]
6263
temperature = 0.3
63-
max_tokens = 300
64+
max_tokens = 400
6465

6566
# Per-turn six-axis affinity evaluation (post-process, semantic axes).
6667
# Scores how a single chat turn should move warmth/trust/intimacy plus the
6768
# content nudges to intrigue/tension, as small per-turn deltas. Runs only on
68-
# Reply turns, after the SSE stream — never blocks the chat response. Haiku
69-
# 4.5 is fast with reliable structured JSON; gemini-flash-lite +
70-
# deepseek-v4-flash keep it alive if Anthropic is down.
69+
# Reply turns, after the SSE stream — never blocks the chat response.
70+
# gemini-3.1-flash-lite leads here: cheaper, more NSFW-tolerant, and unifies
71+
# the model chain with the extraction tasks above. deepseek-v4-flash +
72+
# claude-haiku-4.5 fallback keep it alive if Google is down.
7173
[tasks.affinity_evaluation]
72-
model = "anthropic/claude-haiku-4.5"
73-
fallback = ["google/gemini-3.1-flash-lite", "deepseek/deepseek-v4-flash"]
74+
model = "google/gemini-3.1-flash-lite"
75+
fallback = ["deepseek/deepseek-v4-flash", "anthropic/claude-haiku-4.5"]
7476
temperature = 0.3
7577
max_tokens = 400
7678

0 commit comments

Comments
 (0)