Commit 4344f00
cuda: add per-session mutable state rebinding
Local agent serving needs to host multiple logical conversations on one CUDA-resident model without multiplying the model weights. Loading one AOTI module per conversation is not viable for large local models, while sharing the default mutable state across conversations would let KV/recurrent/conv buffers bleed between users.
This adds the CUDA-private foundation for separating those concerns: weights remain owned by the loaded AOTI container, while mutable buffer FQNs can be registered as per-session state and rebound before execution. The path is fail-closed and dormant until a model opts in by creating a mutable-state context and validating coverage, so existing CUDA models keep their current behavior.
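The register, validate, and rebind flow described above can be sketched roughly as follows. This is a host-only illustration, not the code in this commit: `MutableStateContext`, `SessionState`, `register_fqn`, `validate_coverage`, `rebind`, and the `kv_cache` FQN are all hypothetical names, and a `std::vector<float>` stands in for a device allocation so the example runs without a GPU.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical names throughout; this is not the backend's actual API.
using Buffer = std::vector<float>;  // stand-in for a CUDA device buffer

// One logical conversation: its own copy of every mutable buffer.
struct SessionState {
  std::unordered_map<std::string, Buffer> buffers;  // keyed by buffer FQN
};

class MutableStateContext {
 public:
  // Register a mutable buffer FQN as per-session state.
  void register_fqn(const std::string& fqn) {
    registered_.insert(fqn);
    validated_ = false;  // coverage must be re-checked after any change
  }

  // Coverage holds only if every mutable buffer of the model is registered.
  bool validate_coverage(const std::set<std::string>& model_mutable_fqns) {
    validated_ = !registered_.empty() && registered_ == model_mutable_fqns;
    return validated_;
  }

  // Point every registered FQN at this session's buffer before execution.
  // Fails closed: nothing is rebound unless validation passed and the
  // session supplies a buffer for every registered FQN.
  bool rebind(SessionState& session) {
    if (!validated_) return false;
    for (const auto& fqn : registered_) {
      if (session.buffers.find(fqn) == session.buffers.end()) return false;
    }
    for (const auto& fqn : registered_) {
      bound_[fqn] = &session.buffers.at(fqn);
    }
    return true;
  }

  // Buffer currently bound to an FQN, or nullptr if none.
  Buffer* bound(const std::string& fqn) const {
    auto it = bound_.find(fqn);
    return it == bound_.end() ? nullptr : it->second;
  }

 private:
  std::set<std::string> registered_;
  std::unordered_map<std::string, Buffer*> bound_;
  bool validated_ = false;
};

// Returns true when two sessions stay isolated across rebinds.
bool demo_sessions_are_isolated() {
  MutableStateContext ctx;
  ctx.register_fqn("kv_cache");
  SessionState a, b;
  a.buffers["kv_cache"] = Buffer(4, 0.0f);
  b.buffers["kv_cache"] = Buffer(4, 0.0f);

  if (ctx.rebind(a)) return false;  // dormant until coverage is validated
  const std::set<std::string> model_fqns = {"kv_cache"};
  if (!ctx.validate_coverage(model_fqns)) return false;

  if (!ctx.rebind(a)) return false;
  (*ctx.bound("kv_cache"))[0] = 1.0f;  // "execution" writes session a
  if (!ctx.rebind(b)) return false;
  (*ctx.bound("kv_cache"))[0] = 2.0f;  // "execution" writes session b

  return a.buffers["kv_cache"][0] == 1.0f && b.buffers["kv_cache"][0] == 2.0f;
}
```

In this sketch the weights never appear: only the mutable buffers are swapped per session, which mirrors the ownership split in the commit message. Checking the whole session before touching any binding is one way to keep a failed rebind from leaving a mix of two sessions' buffers bound.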
The branch also wires the new source and unit coverage into both Buck and CMake so the primitive can land independently before any model-specific engine consumes it.
1 parent d7ca5db commit 4344f00
6 files changed
Lines changed: 1223 additions & 1 deletion
File tree
- backends/cuda
- runtime
- test