Commit 7fde821
committed
llm_runner: plumb prefill temperature
Session-based serving drives generation as prefill plus token steps instead of one monolithic generate call. For that path to be correct, the first sampled token produced during prefill must honor the same sampling inputs as the rest of the decode loop; otherwise requests using temperature can silently start greedily and then switch behavior on later tokens.
This threads optional temperature through TextPrefiller and exposes the existing TextTokenGenerator logit-processor application so token-step callers can reuse the same sampling preparation as generate(). The goal is to remove a divergence point before session-backed serving starts depending on these primitives.
Default behavior remains greedy, so existing callers that do not pass temperature keep the same semantics. The added tests focus on the new non-default path and on sharing the logit-processor logic rather than duplicating it.1 parent d7ca5db commit 7fde821
4 files changed
Lines changed: 52 additions & 20 deletions
File tree
- extension/llm/runner
- test
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
79 | 79 | | |
80 | 80 | | |
81 | 81 | | |
82 | | - | |
83 | | - | |
| 82 | + | |
| 83 | + | |
84 | 84 | | |
85 | 85 | | |
86 | 86 | | |
| |||
112 | 112 | | |
113 | 113 | | |
114 | 114 | | |
115 | | - | |
| 115 | + | |
116 | 116 | | |
117 | | - | |
| 117 | + | |
118 | 118 | | |
119 | 119 | | |
120 | 120 | | |
| |||
217 | 217 | | |
218 | 218 | | |
219 | 219 | | |
220 | | - | |
221 | | - | |
| 220 | + | |
| 221 | + | |
222 | 222 | | |
223 | 223 | | |
224 | 224 | | |
225 | 225 | | |
226 | | - | |
227 | | - | |
| 226 | + | |
| 227 | + | |
228 | 228 | | |
229 | 229 | | |
230 | 230 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
28 | 28 | | |
29 | 29 | | |
30 | 30 | | |
31 | | - | |
| 31 | + | |
| 32 | + | |
32 | 33 | | |
33 | 34 | | |
34 | 35 | | |
| |||
54 | 55 | | |
55 | 56 | | |
56 | 57 | | |
57 | | - | |
58 | | - | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
59 | 67 | | |
60 | 68 | | |
61 | 69 | | |
| |||
65 | 73 | | |
66 | 74 | | |
67 | 75 | | |
68 | | - | |
| 76 | + | |
69 | 77 | | |
70 | 78 | | |
71 | 79 | | |
72 | 80 | | |
73 | 81 | | |
74 | | - | |
| 82 | + | |
| 83 | + | |
75 | 84 | | |
76 | 85 | | |
77 | 86 | | |
| |||
92 | 101 | | |
93 | 102 | | |
94 | 103 | | |
95 | | - | |
| 104 | + | |
| 105 | + | |
96 | 106 | | |
97 | 107 | | |
98 | 108 | | |
| |||
128 | 138 | | |
129 | 139 | | |
130 | 140 | | |
131 | | - | |
| 141 | + | |
| 142 | + | |
132 | 143 | | |
133 | 144 | | |
134 | 145 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
32 | 32 | | |
33 | 33 | | |
34 | 34 | | |
| 35 | + | |
| 36 | + | |
35 | 37 | | |
36 | 38 | | |
37 | 39 | | |
38 | 40 | | |
39 | | - | |
| 41 | + | |
| 42 | + | |
40 | 43 | | |
41 | 44 | | |
42 | 45 | | |
43 | 46 | | |
44 | 47 | | |
45 | 48 | | |
| 49 | + | |
| 50 | + | |
46 | 51 | | |
47 | 52 | | |
48 | 53 | | |
49 | 54 | | |
50 | | - | |
| 55 | + | |
| 56 | + | |
51 | 57 | | |
52 | 58 | | |
53 | 59 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
55 | 55 | | |
56 | 56 | | |
57 | 57 | | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
58 | 70 | | |
59 | 71 | | |
60 | 72 | | |
| |||
126 | 138 | | |
127 | 139 | | |
128 | 140 | | |
129 | | - | |
130 | | - | |
131 | | - | |
| 141 | + | |
132 | 142 | | |
133 | 143 | | |
134 | 144 | | |
| |||
180 | 190 | | |
181 | 191 | | |
182 | 192 | | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
183 | 198 | | |
184 | 199 | | |
185 | 200 | | |
| |||
0 commit comments