Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
51 changes: 51 additions & 0 deletions docs/tutorials/export-pytorch-model.md
Original file line number Diff line number Diff line change
Expand Up @@ -106,3 +106,54 @@ torch.onnx.export(CustomInverse(), (x,), f, custom_opsets={"com.microsoft": 5})
```

Note that you can export a custom op to any version >= the opset version used at registration.

## Handling External Data Files

When exporting PyTorch models using `torch.onnx.export` with opset 17+,
PyTorch may automatically split the export into two files:

- `model.onnx` — the model graph structure
- `model.onnx.data` — the weight tensors stored as external data

Both files must be present in the same directory for ONNX Runtime to load
the model correctly. `InferenceSession` will fail if `model.onnx.data` is missing.

```python
# Fails if model.onnx.data is not in the same directory
session = onnxruntime.InferenceSession("model.onnx")
```

### Option A — Models under 2 GB: merge into a single file

For models whose weights fit within protobuf's 2 GB limit, the split files can be
merged into a single self-contained `.onnx` file. This simplifies distribution
to Docker images, cloud buckets, and API endpoints:

```python
import onnx

model = onnx.load("model.onnx", load_external_data=True)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot what do you recommend for files > 2Gb? (not supported by protobuf)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the PR to address this — added Option B covering models ≥ 2 GB: keep both files co-located; ONNX Runtime resolves .data relative to the .onnx path at load time. Also includes the GCS/S3 upload pattern.

onnx.save(model, "model_single.onnx", save_as_external_data=False)
```

> **Note:** Requires the `onnx` package (`pip install onnx`). Both `model.onnx`
> and `model.onnx.data` must be in your working directory before running this.

### Option B — Models 2 GB and above: deploy as a file pair

When model weights exceed protobuf's 2 GB limit, merging is not possible.
The `.onnx.data` file is the ONNX spec's mechanism for handling this case.
Both files must be co-located at inference time:

```python
# Both model.onnx and model.onnx.data must be in the same directory
session = onnxruntime.InferenceSession("/path/to/model/model.onnx")
# ONNX Runtime resolves model.onnx.data relative to the .onnx file location
```

For cloud storage (e.g. GCS, S3), upload both files to the same prefix and
download both to the same local directory before loading:

```bash
gsutil cp model.onnx model.onnx.data gs://your-bucket/models/
```