Name: torch
Version: 2.8.0
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3-Clause
Location: /opt/miniconda3/envs/executorch/lib/python3.11/site-packages
Requires: filelock, fsspec, jinja2, networkx, sympy, typing-extensions
Required-by: executorch, timm, torchaudio, torchdata, torchsr, torchvision
---
Name: transformers
Version: 4.47.1
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: transformers@huggingface.co
License: Apache 2.0 License
Location: /opt/miniconda3/envs/executorch/lib/python3.11/site-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by:
---
Name: coremltools
Version: 9.0b1
Summary: Community Tools for Core ML
Home-page: https://github.com/apple/coremltools
Author: Apple Inc.
Author-email: coremltools@apple.com
License: BSD
Location: /opt/miniconda3/envs/executorch/lib/python3.11/site-packages
Editable project location: /opt/miniconda3/envs/executorch/lib/python3.11/site-packages
Requires: attrs, cattrs, numpy, packaging, protobuf, pyaml, sympy, tqdm
Required-by: executorch
🐞Describing the bug
I'm roughly following this guide on LLM exporting. I adjusted the input names to be able to use it with this HF demo. I also added W8A8 quantization to reduce space. The model takes 1.4Gb saved. by my estimations, all the activations shouldn't take more than 1Gb. Nevertheless, when calling "predict" on IPhone or IPad (iOS18 both, M1 chip IPad pro and IPhone 15 pro, both have ~8Gb of RAM), it OOMs with memory profiler showing a 10Gb malloc call inside some metal dispatch call. It also happens when using the
.mlpackageGUI benchmarking feature. It also uses a bunch of RAM (~12Gb) when benchmarking on iOS15 using that feature.To Reproduce
System environment (please complete the following information):
Converted on
macOS 15.6.1 Apple M3 Pro 18Gb.Ran on:
macOS 15.6.1 Apple M3 Pro 18Gb.iPadOS 18.6.2 iPad Pro M1iOS 18.6.2 iPhone 15 ProAdditional context