Commit 2ef5950
Cross-load packed weight cache reuse for XNNPACK
Summary:
Add cross-load reuse + multi-PTE safety to the file-backed packed weight cache (D106673663). The first PTE in a session calls `save_packed_index()` to append a trailer; subsequent process launches mmap the file and pre-populate `name_to_packed_data_metadata_` so `look_up()` hits for every saved weight and `xnn_create_runtime` skips packing entirely.
## Cache file format
```
[packed data regions]   (written by reserve_space)
[index entries]         (written by save_packed_index)
  each: name_len(4B) | name(N) | file_offset(8B) | data_size(8B)
[footer: 20 bytes]
  index_start(8B) | entry_count(4B) | magic "XPWC"(4B) | version(4B)
```
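To make the layout concrete, here is a minimal sketch of a reader that walks the footer and index of an in-memory copy of the file. It is not the code from this commit: `IndexEntry`, `read_at`, and `read_index` are hypothetical names, native byte order is assumed, and the `version` field is left unchecked because the commit message does not state its value.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; only the field order and widths come from
// the layout above. Native byte order is an assumption.
struct IndexEntry {
  std::string name;
  uint64_t file_offset;
  uint64_t data_size;
};

constexpr size_t kFooterSize = 20;  // 8 + 4 + 4 + 4

// Copy a fixed-width value out of the buffer without alignment assumptions.
template <typename T>
T read_at(const std::vector<uint8_t>& buf, size_t pos) {
  T value;
  std::memcpy(&value, buf.data() + pos, sizeof(T));
  return value;
}

// Parses the trailer of an in-memory cache image. Returns false (leaving
// *out empty) if the footer or any index entry is missing or truncated.
bool read_index(const std::vector<uint8_t>& file,
                std::vector<IndexEntry>* out) {
  out->clear();
  if (file.size() < kFooterSize) return false;
  const size_t footer = file.size() - kFooterSize;
  // The magic sits after index_start (8B) and entry_count (4B).
  if (std::memcmp(file.data() + footer + 12, "XPWC", 4) != 0) return false;

  const uint64_t index_start = read_at<uint64_t>(file, footer);
  const uint32_t entry_count = read_at<uint32_t>(file, footer + 8);
  if (index_start > footer) return false;

  std::vector<IndexEntry> entries;
  size_t pos = static_cast<size_t>(index_start);
  for (uint32_t i = 0; i < entry_count; ++i) {
    if (footer - pos < 4) return false;
    const uint32_t name_len = read_at<uint32_t>(file, pos);
    pos += 4;
    // Need name_len bytes of name plus 16 bytes of offset and size.
    if (footer - pos < 16 || footer - pos - 16 < name_len) return false;
    IndexEntry e;
    e.name.assign(reinterpret_cast<const char*>(file.data() + pos), name_len);
    pos += name_len;
    e.file_offset = read_at<uint64_t>(file, pos);
    e.data_size = read_at<uint64_t>(file, pos + 8);
    pos += 16;
    entries.push_back(std::move(e));
  }
  *out = std::move(entries);
  return true;
}
```

Putting the footer at a fixed distance from the end of the file lets a reader find the index without scanning the packed data regions.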
## Lifecycle invariants
- `cache_loaded_` gate: `load_packed_cache()` runs at most once per process per path. Subsequent PTE inits for the same path reopen the write fd without re-reading the trailer.
- `from_load` flag: persistent entries (loaded from trailer or promoted on save) skip `delete_packed_data` cleanup. This keeps the mmap region and metadata alive across PTE unload/reload, so the next init hits the cache instead of repacking. Without this, every PTE destroy/recreate cycle appended a fresh copy to the file (~450 MB per cycle).
- No-op save short-circuit: `save_packed_index` returns early when no new `reserve_space` happened since the last save, avoiding the mtime churn that previously made the cache file look modified on every model load.
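The three invariants above can be illustrated with a toy in-memory model. This is a sketch under assumptions, not the committed implementation: `ToyCache`, `Entry`, `dirty_`, and `next_offset_` are hypothetical, and the boolean return values exist only to make the behavior observable.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>

// Toy model; method names follow the commit message, everything else
// (types, members, return values) is hypothetical.
struct Entry {
  size_t offset;
  size_t size;
  bool from_load;  // persistent: loaded from trailer or promoted on save
};

class ToyCache {
 public:
  // Gate: only the first call reads the trailer; later calls are no-ops.
  bool load_packed_cache(const std::unordered_map<std::string, Entry>& trailer) {
    if (cache_loaded_) return false;
    for (const auto& kv : trailer) {
      entries_[kv.first] = Entry{kv.second.offset, kv.second.size, true};
      next_offset_ = std::max(next_offset_, kv.second.offset + kv.second.size);
    }
    cache_loaded_ = true;
    return true;
  }

  // New packed data always extends the file and marks the index stale.
  void reserve_space(const std::string& name, size_t size) {
    entries_[name] = Entry{next_offset_, size, false};
    next_offset_ += size;
    dirty_ = true;
  }

  // Short-circuit: nothing is written unless reserve_space ran since the
  // last save. Saved entries are promoted to persistent.
  bool save_packed_index() {
    if (!dirty_) return false;
    for (auto& kv : entries_) kv.second.from_load = true;
    dirty_ = false;
    return true;
  }

  // Persistent entries survive cleanup; returns true if an entry was removed.
  bool delete_packed_data(const std::string& name) {
    auto it = entries_.find(name);
    if (it == entries_.end() || it->second.from_load) return false;
    entries_.erase(it);
    return true;
  }

  bool look_up(const std::string& name) const {
    return entries_.count(name) > 0;
  }

 private:
  std::unordered_map<std::string, Entry> entries_;
  size_t next_offset_ = 0;
  bool cache_loaded_ = false;
  bool dirty_ = false;
};
```

In this model, an entry that has been loaded or saved keeps answering `look_up` after `delete_packed_data`, which is the property that prevents repacking on a PTE reload.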
## Multi-PTE behavior
- Multiple PTEs (or methods that don't share weights) in the same model load share one cache file. Each PTE's `reserve_space` extends the file; `finalize_for_runtime` msyncs only newly added regions; `save_packed_index` writes one trailer covering all PTEs at the end of the load.
- Sibling PTEs that opt out of the mmap path (caller passes empty `packed_cache_path`) early-return from `initialize_for_runtime` and fall through to heap allocation, without touching the singleton's PLLM state.
- Cross-model coexistence relies on caller-side discipline: only models that opt in set a non-empty cache path. Setting different non-empty paths concurrently is not supported by this singleton design.
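One way to picture "msyncs only newly added regions" is a watermark that records how far the shared file has already been flushed. The sketch below is bookkeeping only, with no real file or mapping; `SharedCacheFile`, `Range`, and the `synced_end_` member are hypothetical, and the commit may track new regions differently.

```cpp
#include <cstddef>

// Toy bookkeeping only; no real file or mmap is involved. Method names
// follow the commit message, everything else is hypothetical.
struct Range {
  size_t begin;
  size_t end;
  bool empty() const { return begin == end; }
};

class SharedCacheFile {
 public:
  // Each PTE appends its packed weights at the current end of the file and
  // gets back the offset where its region starts.
  size_t reserve_space(size_t bytes) {
    const size_t offset = file_end_;
    file_end_ += bytes;
    return offset;
  }

  // Returns the byte range added since the previous call, i.e. the only
  // part that still needs flushing. A real msync call would also need a
  // page-aligned start address.
  Range finalize_for_runtime() {
    const Range fresh{synced_end_, file_end_};
    synced_end_ = file_end_;
    return fresh;
  }

 private:
  size_t file_end_ = 0;     // total bytes reserved by all PTEs so far
  size_t synced_end_ = 0;   // everything below this offset was flushed
};
```

With two PTEs reserving 100 and 50 bytes in turn, the first finalize covers bytes 0 to 100 and the second only bytes 100 to 150, so earlier regions are never flushed twice.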
## Caller change
`XNNPACKBackend::init` always calls `set_packed_cache_path` (with empty string for non-opted-in PTEs). This keeps the singleton path in sync with the current PTE instead of inheriting a sibling's path.
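The effect of setting the path unconditionally can be shown with a small stand-in for the singleton. This is a simplified sketch: `ToyWeightsCache` and `init_pte` are hypothetical, and the real `initialize_for_runtime` does more than report which allocation path is taken.

```cpp
#include <string>

// Hypothetical stand-in for the weights-cache singleton.
class ToyWeightsCache {
 public:
  void set_packed_cache_path(const std::string& path) { path_ = path; }

  // Returns true when the mmap-backed path would be used, false when the
  // PTE falls through to heap allocation (empty path means opted out).
  bool initialize_for_runtime() const { return !path_.empty(); }

 private:
  std::string path_;
};

// Stand-in for the backend init: the path is set on every call, including
// the empty string, so a PTE never inherits the path a sibling left behind.
bool init_pte(ToyWeightsCache& cache, const std::string& packed_cache_path) {
  cache.set_packed_cache_path(packed_cache_path);
  return cache.initialize_for_runtime();
}
```

If `init_pte` skipped the setter for empty paths, an opted-out PTE initialized after an opted-in one would see the sibling's path and take the mmap route by accident.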
## Test Plan
```
buck2 test fbcode//executorch/backends/xnnpack/test:test_xnn_weights_cache # 5 pass
buck2 build fbsource//xplat/executorch/backends/xnnpack:xnnpack_backendApple
buck2 build fbsource//xplat/executorch/backends/xnnpack:xnnpack_backend
buck2 build fbcode//executorch/backends/xnnpack:xnnpack_backend
```
On device (iOS Stella build, PLLM + Llama3 runner):
- Cold start: load `(1184 entries)` from cache, `reserve_mmap=0` for cached weights
- Cache file size stable at ~593 MB across PLLM unload/reload cycles
- `app_peak ~700 MB` (vs ~2.5 GB pre-fix)
- `compressed ~100 MB` (vs ~1.7 GB pre-fix)
Differential Revision: D106717093
1 parent 73f61cf commit 2ef5950
7 files changed
Lines changed: 1004 additions & 119 deletions
File tree
- backends/xnnpack
- runtime
- test/runtime