This guide covers how to add new models to the repository and contribute changes.
Before contributing:

- Legal approval is required before publishing new models. Submit a request at go/genairequest.
- Join the access lists (required for GitHub repo access): `qai-hub-users` and `qai-hub-models-contributors`.
- Clone the repository and set up your environment:

  ```shell
  git clone https://github.com/qualcomm/ai-hub-models
  cd ai-hub-models
  python scripts/build_and_test.py install_deps
  source qaihm-dev/bin/activate
  pre-commit install
  ```

- Create a branch using the format: `dev/<username>/<branch_name>`
- `model_id` - The folder name (e.g., `yolov7`, `ddrnet23_slim`)
- `model_name` - The published display name in info.yaml (e.g., "YOLOv7", "DDRNet23-Slim")
Each model lives in `qai_hub_models/models/<model_id>/` and requires:

| File | Purpose |
|---|---|
| `model.py` | PyTorch model inheriting from `BaseModel` |
| `app.py` | End-to-end application with pre/post-processing (can be inherited from `models/_shared/`) |
| `demo.py` | CLI demo running the app on sample data |
| `test.py` | Unit tests for the model |
| `info.yaml` | Metadata for the public website |
Note: Many models extend shared implementations from `models/_shared/`. For example, segmentation models can inherit from `CityscapesSegmentor` and use the shared `SegmentationApp`. Check existing models solving similar tasks before writing everything from scratch.
Sample models to reference:

- Regular model: `ddrnet23_slim` - segmentation model using shared base class
- Collection model: `whisper_tiny` - encoder-decoder with separately compiled components
| File | Purpose |
|---|---|
| `requirements.txt` | Model-specific dependencies (pinned versions required) |
| `code-gen.yaml` | Custom options for export.py generation |
These are created by running codegen scripts:

| File | Generated By |
|---|---|
| `README.md`, `export.py`, `evaluate.py`, `test_generated.py` | `python qai_hub_models/scripts/run_codegen.py -m <model_id>` (evaluate.py is only generated if the model defines `eval_datasets()` and `get_evaluator()`) |
| `perf.yaml`, `numerics.yaml`, `release-assets.yaml` | Scorecard / CI (generated weekly, not included in your PR) |
Model-specific dependencies not in the base environment. All packages must be pinned to exact versions (e.g., `torch==2.0.1`).

When adding dependencies:

- Check `qai_hub_models/requirements.txt` first - don't duplicate
- Check `global_requirements.txt` - use the same version if possible
- If a different version is required, set `global_requirements_incompatible: true` in `code-gen.yaml`
- After adding new packages, run `python qai_hub_models/scripts/generate_global_requirements.py`
- If the package is not in `global_requirements.txt`, you can choose any version as long as its dependencies are compatible with existing global requirements
Additional code-gen.yaml options for complex dependency scenarios:

- `global_requirements_incompatible` - Set when versions differ from global requirements
- `pip_pre_build_reqs` - Packages that must be installed before the model's requirements (for broken build dependencies)
- `pip_install_flags` - Extra pip flags needed when installing (requires `global_requirements_incompatible: true`)
The model file defines the PyTorch module that will be compiled and run on-device.
All models must:

- Inherit from `BaseModel` (which inherits `torch.nn.Module`)
- Include `MODEL_ID = __name__.split(".")[-2]` before the class definition
- Implement the required methods listed below
| Method | Description |
|---|---|
| `from_pretrained(cls)` | Classmethod to load pretrained weights. All arguments must have defaults. |
| `get_input_spec()` | Static method returning an `InputSpec` dict of `{input_name: (shape, dtype)}`. For image inputs, follow the standard: RGB format with values in range [0, 1]. |
| `get_output_names()` | Static method returning a list of output tensor names |
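As a structural sketch, the required methods for a hypothetical classifier might look like this. In the real file, `BaseModel` comes from `qai_hub_models` and inherits `torch.nn.Module`; a bare stand-in class and illustrative names are used here so the sketch is self-contained:

```python
# Hypothetical model.py skeleton; BaseModel is a stand-in for the real base class.
class BaseModel:
    pass

MODEL_ID = "mymodel"  # in the repo: MODEL_ID = __name__.split(".")[-2]

class MyModel(BaseModel):
    @classmethod
    def from_pretrained(cls, weights: str = "default") -> "MyModel":
        # All arguments must have defaults so tooling can instantiate the model.
        return cls()

    @staticmethod
    def get_input_spec(height: int = 224, width: int = 224) -> dict:
        # {input_name: (shape, dtype)}; image inputs are RGB with values in [0, 1]
        return {"image": ((1, 3, height, width), "float32")}

    @staticmethod
    def get_output_names() -> list[str]:
        return ["class_logits"]
```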
These have default implementations but can be overridden:
| Method | Description |
|---|---|
| `_sample_inputs_impl()` | Provide real sample inputs instead of random data. Important for more accurate PSNR data, since export.py runs a single sample inference and reports the PSNR difference between torch and device. |
| `_get_input_spec_for_instance()` | Instance-specific input spec (when shapes depend on instance vars) |
| `get_channel_last_inputs()` / `get_channel_last_outputs()` | Inputs/outputs to transpose for on-device performance. Highly recommended for 4D tensors (especially image tensors). |
| `get_hub_compile_options()` / `get_hub_profile_options()` | Custom AI Hub Workbench flags |
| `get_unsupported_reason()` | Mark specific device attributes that can't be supported. Rarely defined in practice; only necessary if a specific Hexagon version is required for advanced models. |
| `eval_datasets()` | List of dataset names for evaluation (must be names listed in `datasets/__init__.py`). If unset, the model will not support eval. |
| `get_evaluator()` | Return an evaluator instance for accuracy measurement. If unset, the model will not support eval. |
| `calibration_dataset_name()` | Dataset for quantization calibration. If unset, the model will not support quantization. |
| `get_hub_litemp_percentage(precision)` | Percentage (0-100) of layers to keep in higher precision for mixed precision |
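For example, an image model whose 4D NCHW input is named `image` might override the channel-last hooks like this (a hedged sketch; the class name and tensor names are illustrative, and the `BaseModel` plumbing is omitted):

```python
# Illustrative overrides of two optional hooks. In a real model.py these
# methods live on the BaseModel subclass.
class MySegmentationModel:
    @staticmethod
    def get_channel_last_inputs() -> list[str]:
        # Transpose the 4D image input to NHWC for better on-device performance.
        return ["image"]

    @staticmethod
    def get_channel_last_outputs() -> list[str]:
        # The 4D segmentation mask output benefits from the same layout.
        return ["mask"]
```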
When your model depends on external code (e.g., from a GitHub repo or third-party package), follow this preference order:

1. **Add to requirements.txt (preferred)** - Add the dependency to `requirements.txt`. Prefer PyPI packages when available:

   ```
   timm==0.9.2
   ```

   If not on PyPI, install directly from GitHub by adding to `code-gen.yaml`:

   ```yaml
   additional_pip_requirements:
     - git+https://github.com/owner/repo.git@v1.0.0
   ```

2. **Monkeypatching** - If you need to modify external code behavior, use monkeypatching rather than copying source (see Appendix: Monkeypatching).

3. **SourceAsRoot (last resort)** - Only use `SourceAsRoot` when the above approaches are not possible (e.g., the code is not installable, requires heavy modifications, or has incompatible dependencies). This utility clones the external repo to the user's machine and sets up the Python environment as if running from that repo's source root. It also lets you apply patches to the cloned source. Avoid it when possible.
For pretrained weights:

- Library weights - Many libraries provide pretrained options
- Download links - Use URLs from model READMEs
- S3 upload - Upload to the `qaihub-public-assets` bucket (see Uploading Assets to S3)
CollectionModel - For models compiled as separate assets (e.g., encoder-decoder):

- Main class inherits from `CollectionModel`
- Each component inherits from `BaseModel`
- See `whisper_tiny` for an example

BasePrecompiledModel - For models where only compiled assets are published (no PyTorch source).
The app enables end-to-end usage with user-friendly I/O formats.
- Define an `App` class that takes a `Callable` in `__init__`. The callable should represent a model. We use `Callable` (rather than a specific type) because the PyTorch implementation may be replaced with an equivalent implementation that uses a different runtime.
- Implement a `predict()` method for inference
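A minimal sketch of that shape (names are hypothetical; real apps do actual pre/post-processing with the shared utilities):

```python
from typing import Any, Callable

class MyModelApp:
    """Hypothetical app wrapping any callable that matches the model signature."""

    def __init__(self, model: Callable):
        # Stored as a generic Callable so the torch model can later be swapped
        # for an on-device runtime with the same inputs/outputs.
        self.model = model

    def predict(self, raw_input: Any) -> Any:
        net_input = self._preprocess(raw_input)   # e.g., PIL -> RGB [0, 1] tensor
        net_output = self.model(net_input)
        return self._postprocess(net_output)      # e.g., logits -> probabilities

    def _preprocess(self, raw_input: Any) -> Any:
        return raw_input  # identity in this sketch

    def _postprocess(self, net_output: Any) -> Any:
        return net_output  # identity in this sketch

# Any callable works as the "model":
app = MyModelApp(lambda x: x * 2)
assert app.predict(3) == 6
```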
Check these files in `qai_hub_models/utils/` before implementing your own:

| File | Purpose |
|---|---|
| `image_processing.py` | Image preprocessing and conversion (e.g., `app_to_net_image_inputs`) |
| `bounding_box_processing.py` | NMS and bounding box utilities for object detection |
| `draw.py` | Drawing utilities for visualization (points, connections, boxes) |
| `display.py` | Display and save image utilities |
| `asset_loaders.py` | Loading assets from URLs, S3, Google Drive |
| `transpose_channel.py` | Channel transposition utilities (NCHW <-> NHWC) |
Also check qai_hub_models/models/_shared/ for shared app implementations for common tasks (segmentation, classification, object detection, etc.).
Follow existing patterns for consistency. For image inputs to the model, follow the standard: RGB format with values normalized to range [0, 1].
| Task | Input Format | Output Format |
|---|---|---|
| Image processing | `PIL.Image.Image` | `PIL.Image.Image` |
| Video processing | filepath string | filepath string |
| Classification | image/tensor | tensor of probabilities (post-softmax) |
| Object detection | image | list of bounding boxes (post-NMS) |
The demo runs the app on sample data and presents results. See ddrnet23_slim/demo.py for an example.
- Parse input arguments (model options, `--eval-mode`, `--device`, `--output-dir`, etc.)
- Initialize the model
- Load and preprocess inputs
- Initialize and run the app
- Display or save results (behind an `is_test` flag for unit tests)
Sample data is typically stored in S3 and loaded using asset utilities.
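A skeleton following those steps (argument names come from this guide; the model and app are hypothetical stand-ins, and the real demo would be invoked under `if __name__ == "__main__"`):

```python
import argparse

def main(argv=None, is_test: bool = False):
    # 1. Parse input arguments
    parser = argparse.ArgumentParser()
    parser.add_argument("--output-dir", default=None)
    args = parser.parse_args(argv)

    # 2-4. Initialize the model, load inputs, run the app (stand-ins here)
    model = lambda x: x          # e.g., MyModel.from_pretrained()
    result = model("sample")     # e.g., MyModelApp(model).predict(sample_input)

    # 5. Display or save results, skipped during unit tests
    if not is_test:
        print("result saved to", args.output_dir)
    return args
```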
| Test | Required | Purpose |
|---|---|---|
| `test_task` | Yes | PyTorch model accuracy on sample input |
| `test_demo` | Yes | Demo runs without exceptions |
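In outline, test.py usually looks like this (assertions here are placeholders with local stand-ins; real tests run the actual model and compare against golden data):

```python
# Sketch of test.py; model/demo imports are replaced by stand-ins so it runs alone.

def _demo_main(is_test: bool = False) -> None:
    # Stand-in for the model's demo.py main()
    pass

def test_task() -> None:
    # Run the PyTorch model on a known sample and check the output.
    model = lambda x: x + 1        # stand-in for MyModel.from_pretrained()
    assert model(1) == 2           # real tests check accuracy against golden data

def test_demo() -> None:
    # The demo should run end-to-end without raising.
    _demo_main(is_test=True)
```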
Metadata about the model for the public website.
- See `ddrnet23_slim/info.yaml` for an example
- See `QAIHMModelInfo` in `qai_hub_models/models/_configs/info_yaml.py` for field details
Key fields:

- `name`, `id`, `status` - Model identity (`id` must match the folder name). Set `status: pending` for new models; the scorecard will automatically update this to `public` once the model passes validation.
- `headline`, `description`, `use_case`, `domain` - Public-facing description
- `tags`, `applicable_scenarios`, `related_models` - Categorization
- `form_factors` - Target devices (Phone, Tablet, IoT)
- `technical_details` - Model specs (some auto-filled by script)
- `license_type`, `source_repo`, `research_paper` - Attribution
- `labels_file` - Path to labels file for classification models
Auto-fill size and parameter count:

```shell
python qai_hub_models/scripts/autofill_info_yaml.py -m <model_id>
```

Options for generating export.py, evaluate.py, and test_generated.py.
- See `ddrnet23_slim/code-gen.yaml` for a basic example
- See `yolov7/code-gen.yaml` for an example with `disabled_paths`
- See `QAIHMModelCodeGen` in `qai_hub_models/models/_configs/code_gen_yaml.py` for all options
Common options:

- `supported_precisions` - List of precisions to enable (float, w8a8, w8a16, w4a16, etc.)
- `has_on_target_demo` - Set to true if the model supports on-device demo
- `disabled_paths` - Disable specific precision/runtime combinations with a reason
- `global_requirements_incompatible` - Set to true if the model needs different package versions
To enable quantized precisions (w8a8, w8a16, etc.) for a model:
Create or reuse a dataset in qai_hub_models/datasets/. Inherit from BaseDataset and implement the required methods.
Required methods:
| Method | Description |
|---|---|
| `__init__(self, split, ...)` | Initialize with `DatasetSplit.TRAIN` or `DatasetSplit.VAL` |
| `__len__(self)` | Return the number of samples |
| `__getitem__(self, idx)` | Return (input_tensor, ground_truth) for a sample |
| `_download_data(self)` | Download the dataset to `self.dataset_path` |
| `default_samples_per_job()` | Static method returning the default batch size for inference jobs |
| `get_dataset_metadata()` | Return `DatasetMetadata(link, split_description)` for the website |
Optional methods:
| Method | Description |
|---|---|
| `_validate_data(self)` | Validate downloaded data (default: check path exists) |
| `collate_fn(batch)` | Custom collation for DataLoader |
See imagenette.py for an example dataset implementation.
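A self-contained sketch of that interface with toy in-memory data (`BaseDataset` and `DatasetSplit` come from `qai_hub_models.datasets`; stand-ins are used here, and `get_dataset_metadata()` plus real download logic are omitted):

```python
from enum import Enum

class DatasetSplit(Enum):  # stand-in for qai_hub_models' DatasetSplit
    TRAIN = 0
    VAL = 1

class MyDataset:  # in the repo this would inherit BaseDataset
    def __init__(self, split: DatasetSplit = DatasetSplit.VAL):
        self.split = split
        # Toy samples; a real dataset reads files from self.dataset_path
        self._samples = [([0.0, 0.1], 1), ([0.2, 0.3], 0)]

    def __len__(self) -> int:
        return len(self._samples)

    def __getitem__(self, idx: int):
        # Returns (input_tensor, ground_truth)
        return self._samples[idx]

    def _download_data(self) -> None:
        pass  # a real implementation downloads into self.dataset_path

    @staticmethod
    def default_samples_per_job() -> int:
        # Default batch size for on-device inference jobs
        return 100
```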
Create or reuse an evaluator in qai_hub_models/evaluators/. Inherit from BaseEvaluator and implement the required methods.
Required methods:
| Method | Description |
|---|---|
| `add_batch(self, output, gt)` | Accumulate metrics for a batch of model outputs vs ground truth |
| `reset(self)` | Reset accumulated state |
| `get_accuracy_score(self)` | Return a single float accuracy (higher is better) |
| `formatted_accuracy(self)` | Return a formatted string with accuracy and units |
Optional methods:
| Method | Description |
|---|---|
| `get_metric_metadata(self)` | Return `MetricMetadata` for website publishing |
See classification_evaluator.py for an example evaluator implementation.
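As a toy instance of that interface, a top-1 classification evaluator might look like this (the `BaseEvaluator` inheritance and `get_metric_metadata` are omitted; the metric, class name, and formatting are illustrative):

```python
class MyEvaluator:  # in the repo this would inherit BaseEvaluator
    def __init__(self) -> None:
        self.reset()

    def add_batch(self, output, gt) -> None:
        # output: predicted labels for a batch; gt: ground-truth labels
        self.correct += sum(int(o == g) for o, g in zip(output, gt))
        self.total += len(gt)

    def reset(self) -> None:
        self.correct = 0
        self.total = 0

    def get_accuracy_score(self) -> float:
        # Single float, higher is better
        return self.correct / self.total if self.total else 0.0

    def formatted_accuracy(self) -> str:
        return f"{100 * self.get_accuracy_score():.1f}% top-1 accuracy"
```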
Add these methods to model.py:

```python
@staticmethod
def eval_datasets() -> list[str]:
    return ["<dataset_name>"]

@staticmethod
def calibration_dataset_name() -> str:
    return "<dataset_name>"

def get_evaluator(self) -> BaseEvaluator:
    return YourEvaluator(...)  # Return an instance
```

Add supported precisions to code-gen.yaml (defaults to float only if not specified):

```yaml
supported_precisions:
  - float
  - w8a8
  - w8a16
```

Regenerate code and run the evaluation:

```shell
python qai_hub_models/scripts/run_codegen.py -m <model_id>
python -m qai_hub_models.models.<model_id>.evaluate --precision w8a8
```

Accuracy drop from float should be reasonable (10 points or less). Consider mixed precision (e.g., w8a8_mixed_int16) if accuracy is too low.
For model checkpoints or test data not available via public URLs:
1. **Authenticate** - Run `python scripts/build_and_test.py validate_aws_credentials` (prompts for password)
2. **Upload** - Use the AWS profile `qaihm`:

   ```shell
   aws s3 cp <local_file> s3://qaihub-public-assets/qai-hub-models/models/<model_id>/v1/ --profile qaihm
   ```

3. **Set permissions** - Grant public-read access when uploading
4. **Reference in code** - Set `MODEL_ASSET_VERSION = 1` in model.py
5. **Versioning** - Assets cannot be deleted. For new versions, create `v2/`, `v3/`, etc. and update `MODEL_ASSET_VERSION`.
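To see how the version ties together, the constant in model.py mirrors the `vN/` folder in the bucket path (the asset filename here is illustrative):

```python
MODEL_ID = "mymodel"        # hypothetical model folder name
MODEL_ASSET_VERSION = 1     # bump to 2 when uploading new assets to v2/

# Bucket layout from this guide: .../models/<model_id>/v<version>/<file>
asset_url = (
    "s3://qaihub-public-assets/qai-hub-models/models/"
    f"{MODEL_ID}/v{MODEL_ASSET_VERSION}/checkpoint.pt"
)
```

In practice, assets are fetched through the helpers in `asset_loaders.py` rather than by hand-building URLs.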
Before submitting a PR:
```shell
# Run codegen for your model
python qai_hub_models/scripts/run_codegen.py -m <model_id>

# Auto-fill info.yaml
python qai_hub_models/scripts/autofill_info_yaml.py -m <model_id>

# Run all pre-commit hooks
pre-commit run --all-files

# Run package unit tests
python scripts/build_and_test.py test_qaihm
```

Run model-specific tests:

```shell
# Test export (install model dependencies first per README)
python -m qai_hub_models.models.<model_id>.export --target-runtime tflite --chipset qualcomm-snapdragon-8gen3

# Test evaluation (if available)
python -m qai_hub_models.models.<model_id>.evaluate
```

- `export.py` should produce a model that profiles successfully on device (for all added precisions)
- `evaluate.py` should produce good results in torch and on-device for all supported precisions
You can run these tools manually, but they also run automatically via pre-commit hooks:
- Ruff - Linting and formatting: `ruff check --fix` and `ruff format`
- mypy - Type checking: `mypy qai_hub_models/`
Hooks run automatically and include:
- License header insertion (BSD-3)
- YAML validation, trailing whitespace, large file detection
- Ruff check + format
- mypy type checking
Don't import `numba`, `xtcocotools`, or `git` directly - use the `qai_hub_models.extern.*` wrappers instead.
For questions, reach out to the default PR reviewers or ping the Teams channel.
When external model code needs modification for on-device compilation, prefer monkeypatching over copying source. This keeps the codebase maintainable and makes it clear what was changed.
Common pattern:
- Create a `model_patches.py` file with replacement functions
- Import the original class/module
- Replace the method: `OriginalClass.method = patched_method`
Example - replacing a forward method (gkt):
```python
# In model_patches.py - define the patched function
def KernelAttention_forward(self, q, k, v, skip=None, mask=None):
    # Modified implementation for on-device compatibility
    ...

# In model.py - apply the patch after importing
from external_repo import KernelAttention
from .model_patches import KernelAttention_forward

KernelAttention.forward = KernelAttention_forward
```

Example - patching multiple components (sam2):
```python
# Patch module-level functions
import sam2.modeling.backbones.hieradet as hieradet
hieradet.window_partition = window_partition_5d

# Patch class methods with functools.partial
sam2.sam_prompt_encoder._embed_points = functools.partial(
    patched_embed_points, sam2.sam_prompt_encoder
)

# Replace entire submodules
for block in model.blocks:
    block.mlp = PatchedMLP(block.mlp)
```

When to monkeypatch:
- Replacing operations unsupported by QNN (e.g., certain einsum patterns, dynamic shapes)
- Optimizing tensor layouts for on-device performance (e.g., 6D → 5D tensors)
- Removing GPU-specific code paths
