
Contributing to AI Hub Models


This guide covers how to add new models to the repository and contribute changes.

Prerequisites

Before contributing:

  1. Legal approval is required before publishing new models. Submit a request at go/genairequest.

  2. Join the access lists (required for GitHub repo access): qai-hub-users and qai-hub-models-contributors.

  3. Clone the repository and set up your environment:

    git clone https://github.com/qualcomm/ai-hub-models
    cd ai-hub-models
    python scripts/build_and_test.py install_deps
    source qaihm-dev/bin/activate
    pre-commit install
  4. Create a branch using the format: dev/<username>/<branch_name>

Terminology

  • model_id - The folder name (e.g., yolov7, ddrnet23_slim)
  • model_name - The published display name in info.yaml (e.g., "YOLOv7", "DDRNet23-Slim")

Model Directory Structure

Each model lives in qai_hub_models/models/<model_id>/ and requires:

File Purpose
model.py PyTorch model inheriting from BaseModel
app.py End-to-end application with pre/post-processing (can be inherited from models/_shared/)
demo.py CLI demo running the app on sample data
test.py Unit tests for the model
info.yaml Metadata for the public website

Note: Many models extend shared implementations from models/_shared/. For example, segmentation models can inherit from CityscapesSegmentor and use the shared SegmentationApp. Check existing models solving similar tasks before writing everything from scratch.

Sample models to reference:

  • Regular model: ddrnet23_slim - segmentation model using shared base class
  • Collection model: whisper_tiny - encoder-decoder with separately compiled components

Optional Files

File Purpose
requirements.txt Model-specific dependencies (pinned versions required)
code-gen.yaml Custom options for export.py generation

Auto-generated Files

These are created by running codegen scripts:

File Generated By
README.md, export.py, evaluate.py, test_generated.py python qai_hub_models/scripts/run_codegen.py -m <model_id> (evaluate.py only generated if model defines eval_datasets() and get_evaluator())
perf.yaml, numerics.yaml, release-assets.yaml Scorecard / CI (generated weekly, not included in your PR)

1. requirements.txt

Model-specific dependencies not in the base environment.

All packages must be pinned to exact versions (e.g., torch==2.0.1).

When adding dependencies:

  1. Check qai_hub_models/requirements.txt first - don't duplicate
  2. Check global_requirements.txt - use the same version if possible
  3. If a different version is required, set global_requirements_incompatible: true in code-gen.yaml
  4. After adding new packages, run python qai_hub_models/scripts/generate_global_requirements.py
  5. If the package is not in global_requirements.txt, you can choose any version as long as its dependencies are compatible with existing global requirements
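As a concrete (illustrative) example, a model-specific requirements.txt pins every package to an exact version; the package names below are placeholders:

```text
# qai_hub_models/models/<model_id>/requirements.txt
timm==0.9.2          # matches the version in global_requirements.txt
easydict==1.10       # not in global requirements, so any compatible pin works
```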

Additional code-gen.yaml options for complex dependency scenarios:

  • global_requirements_incompatible - Set when versions differ from global requirements
  • pip_pre_build_reqs - Packages that must be installed before the model's requirements (for broken build dependencies)
  • pip_install_flags - Extra pip flags needed when installing (requires global_requirements_incompatible: true)

2. model.py

The model file defines the PyTorch module that will be compiled and run on-device.

Basic Structure

All models must:

  • Inherit from BaseModel (which inherits torch.nn.Module)
  • Include MODEL_ID = __name__.split(".")[-2] before the class definition
  • Implement the required methods listed below

Required Methods

Method Description
from_pretrained(cls) Classmethod to load pretrained weights. All arguments must have defaults.
get_input_spec() Static method returning InputSpec dict of {input_name: (shape, dtype)}. For image inputs, follow the standard: RGB format with values in range [0, 1].
get_output_names() Static method returning list of output tensor names
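The required methods can be sketched as follows. This is a minimal illustration: BaseModel here is a stand-in for the real base class (which inherits torch.nn.Module), and the input names and shapes are hypothetical.

```python
class BaseModel:  # stand-in for the real qai_hub_models base class
    pass


MODEL_ID = "my_model"  # in a real model file: __name__.split(".")[-2]


class MyModel(BaseModel):
    @classmethod
    def from_pretrained(cls) -> "MyModel":
        # All arguments must have defaults so from_pretrained() works bare
        return cls()

    @staticmethod
    def get_input_spec(height: int = 224, width: int = 224):
        # InputSpec dict of {input_name: (shape, dtype)};
        # image inputs follow the standard: RGB, values in [0, 1]
        return {"image": ((1, 3, height, width), "float32")}

    @staticmethod
    def get_output_names() -> list[str]:
        return ["class_logits"]
```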

Optional Methods

These have default implementations but can be overridden:

Method Description
_sample_inputs_impl() Provide real sample inputs instead of random data. Important for meaningful PSNR numbers, since export.py runs a single sample inference and reports the PSNR between torch and on-device outputs.
_get_input_spec_for_instance() Instance-specific input spec (when shapes depend on instance vars)
get_channel_last_inputs() / get_channel_last_outputs() Inputs/outputs to transpose for on-device performance. Highly recommended for 4D tensors (especially image tensors).
get_hub_compile_options() / get_hub_profile_options() Custom AI Hub Workbench flags
get_unsupported_reason() Mark specific device attributes that can't be supported. Rarely defined in practice; only necessary if a specific Hexagon version is required for advanced models.
eval_datasets() List of dataset names for evaluation (must be names listed in datasets/__init__.py). If unset, model will not support eval.
get_evaluator() Return evaluator instance for accuracy measurement. If unset, model will not support eval.
calibration_dataset_name() Dataset for quantization calibration. If unset, model will not support quantization.
get_hub_litemp_percentage(precision) Percentage (0-100) of layers to keep in higher precision for mixed precision

Loading External Model Code

When your model depends on external code (e.g., from a GitHub repo or third-party package), follow this preference order:

  1. Add to requirements.txt (Preferred) - Add the dependency to requirements.txt. Prefer PyPI packages when available:

    timm==0.9.2
    

    If not on PyPI, install directly from GitHub by adding to code-gen.yaml:

    additional_pip_requirements:
      - git+https://github.com/owner/repo.git@v1.0.0
  2. Monkeypatching - If you need to modify external code behavior, use monkeypatching rather than copying source (see Appendix: Monkeypatching).

  3. SourceAsRoot (Last Resort) - Use SourceAsRoot only when the approaches above are not possible (e.g., the code is not installable, requires heavy modifications, or has incompatible dependencies). This utility clones the external repo to the user's machine and sets up the Python environment as if running from that repo's source root; it can also apply patches to the cloned source.

For pretrained weights:

  1. Library weights - Many libraries provide pretrained options
  2. Download links - Use URLs from model READMEs
  3. S3 upload - Upload to qaihub-public-assets bucket (see Uploading Assets to S3)

Special Model Types

CollectionModel - For models compiled as separate assets (e.g., encoder-decoder):

  • Main class inherits from CollectionModel
  • Each component inherits from BaseModel
  • See whisper_tiny for an example

BasePrecompiledModel - For models where only compiled assets are published (no PyTorch source).


3. app.py

The app enables end-to-end usage with user-friendly I/O formats.

Requirements

  • Define an App class whose __init__ takes a Callable representing the model. A Callable (rather than a concrete type) is used because the PyTorch implementation may later be swapped for an equivalent implementation running on a different runtime.
  • Implement a predict() method for inference
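A minimal sketch of this shape, with hypothetical names and stand-in pre/post-processing (real apps use the utilities listed below):

```python
from typing import Callable


class MyTaskApp:
    """Illustrative App skeleton; not any model's real API."""

    def __init__(self, model: Callable):
        # Any callable works: a torch.nn.Module, or a wrapper around an
        # on-device inference session with the same input/output signature.
        self.model = model

    def predict(self, pixels: list[int]) -> list[float]:
        x = [p / 255.0 for p in pixels]    # preprocess: scale RGB to [0, 1]
        raw = self.model(x)                # run the wrapped model
        return [round(v, 3) for v in raw]  # postprocess to friendly output
```

Because __init__ only requires a callable, the same app wraps either the PyTorch model or an on-device runtime without code changes.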

Useful Utilities

Check these files in qai_hub_models/utils/ before implementing your own:

File Purpose
image_processing.py Image preprocessing and conversion (e.g., app_to_net_image_inputs)
bounding_box_processing.py NMS and bounding box utilities for object detection
draw.py Drawing utilities for visualization (points, connections, boxes)
display.py Display and save image utilities
asset_loaders.py Loading assets from URLs, S3, Google Drive
transpose_channel.py Channel transposition utilities (NCHW <-> NHWC)

Also check qai_hub_models/models/_shared/ for shared app implementations for common tasks (segmentation, classification, object detection, etc.).

I/O Conventions

Follow existing patterns for consistency. For image inputs to the model, follow the standard: RGB format with values normalized to range [0, 1].

Task Input Format Output Format
Image processing PIL.Image.Image PIL.Image.Image
Video processing filepath string filepath string
Classification image/tensor tensor of probabilities (post-softmax)
Object detection image list of bounding boxes (post-NMS)



4. demo.py

The demo runs the app on sample data and presents results. See ddrnet23_slim/demo.py for an example.

Standard Pattern

  1. Parse input arguments (model options, --eval-mode, --device, --output-dir, etc.)
  2. Initialize the model
  3. Load and preprocess inputs
  4. Initialize and run the app
  5. Display or save results (guarded by an is_test flag so unit tests can run the demo without UI)

Sample data is typically stored in S3 and loaded using asset utilities.
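The five steps above can be sketched like this; the --image flag and the stubbed model are illustrative, not a real model's CLI:

```python
import argparse


def demo_main(argv=None, is_test: bool = False):
    # 1. Parse input arguments
    parser = argparse.ArgumentParser(description="Run the model demo")
    parser.add_argument("--image", default="sample.jpg")
    parser.add_argument("--output-dir", default=None)
    args = parser.parse_args(argv)

    model = lambda x: f"prediction for {x}"  # 2. initialize the model (stub)
    inputs = args.image                      # 3. load and preprocess inputs
    result = model(inputs)                   # 4. initialize and run the app
    if not is_test:                          # 5. display/save results,
        print(result)                        #    skipped in unit tests
    return result
```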


5. test.py

Standard Tests

Test Required Purpose
test_task Yes PyTorch model accuracy on sample input
test_demo Yes Demo runs without exceptions

6. info.yaml

Metadata about the model for the public website.

  • See ddrnet23_slim/info.yaml for an example
  • See QAIHMModelInfo in qai_hub_models/models/_configs/info_yaml.py for field details

Key fields:

  • name, id, status - Model identity (id must match folder name). Set status: pending for new models. The scorecard will automatically update this to public once the model passes validation.
  • headline, description, use_case, domain - Public-facing description
  • tags, applicable_scenarios, related_models - Categorization
  • form_factors - Target devices (Phone, Tablet, IoT)
  • technical_details - Model specs (some auto-filled by script)
  • license_type, source_repo, research_paper - Attribution
  • labels_file - Path to labels file for classification models

See QAIHMModelInfo in info_yaml.py or an example like ddrnet23_slim/info.yaml for the full list of available fields.
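An illustrative skeleton of the key fields (values are placeholders; the real schema is defined by QAIHMModelInfo):

```yaml
name: My Model
id: my_model            # must match the folder name
status: pending         # scorecard flips this to public after validation
headline: One-line summary for the website.
domain: Computer Vision
use_case: Image Classification
tags: []
form_factors:
  - Phone
  - Tablet
license_type: bsd-3-clause
source_repo: https://github.com/owner/repo
```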

Auto-fill size and parameter count:

python qai_hub_models/scripts/autofill_info_yaml.py -m <model_id>

7. code-gen.yaml (Optional)

Options for generating export.py, evaluate.py, and test_generated.py.

Common options:

  • supported_precisions - List of precisions to enable (float, w8a8, w8a16, w4a16, etc.)
  • has_on_target_demo - Set to true if the model supports on-device demo
  • disabled_paths - Disable specific precision/runtime combinations with reason
  • global_requirements_incompatible - Set to true if model needs different package versions

Adding Quantization Support

To enable quantized precisions (w8a8, w8a16, etc.) for a model:

1. Add a Dataset

Create or reuse a dataset in qai_hub_models/datasets/. Inherit from BaseDataset and implement the required methods.

Required methods:

Method Description
__init__(self, split, ...) Initialize with DatasetSplit.TRAIN or DatasetSplit.VAL
__len__(self) Return the number of samples
__getitem__(self, idx) Return (input_tensor, ground_truth) for a sample
_download_data(self) Download dataset to self.dataset_path
default_samples_per_job() Static method returning default batch size for inference jobs
get_dataset_metadata() Return DatasetMetadata(link, split_description) for website

Optional methods:

Method Description
_validate_data(self) Validate downloaded data (default: check path exists)
collate_fn(batch) Custom collation for DataLoader

See imagenette.py for an example dataset implementation.
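The method shapes can be sketched as follows; BaseDataset is a stand-in here so the skeleton is visible without the real package, and the toy in-memory samples replace actual downloads:

```python
class BaseDataset:  # stand-in for qai_hub_models.datasets.common.BaseDataset
    pass


class MyDataset(BaseDataset):
    def __init__(self, split: str = "val"):
        self.split = split
        # Toy in-memory samples; a real dataset reads from self.dataset_path
        self._samples = [([0.1, 0.9], 1), ([0.8, 0.2], 0)]

    def __len__(self) -> int:
        return len(self._samples)

    def __getitem__(self, idx):
        return self._samples[idx]  # (input_tensor, ground_truth)

    def _download_data(self) -> None:
        pass  # download and unpack the dataset into self.dataset_path

    @staticmethod
    def default_samples_per_job() -> int:
        return 100  # default batch size per inference job (illustrative)
```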

2. Add an Evaluator

Create or reuse an evaluator in qai_hub_models/evaluators/. Inherit from BaseEvaluator and implement the required methods.

Required methods:

Method Description
add_batch(self, output, gt) Accumulate metrics for a batch of model outputs vs ground truth
reset(self) Reset accumulated state
get_accuracy_score(self) Return single float accuracy (higher is better)
formatted_accuracy(self) Return formatted string with accuracy and units

Optional methods:

Method Description
get_metric_metadata(self) Return MetricMetadata for website publishing

See classification_evaluator.py for an example evaluator implementation.
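A hedged sketch of the evaluator interface; BaseEvaluator is a stand-in and the metric (plain label accuracy) is illustrative:

```python
class BaseEvaluator:  # stand-in for the real qai_hub_models base class
    pass


class LabelAccuracyEvaluator(BaseEvaluator):
    def __init__(self):
        self.reset()

    def add_batch(self, output, gt):
        # output: predicted labels; gt: ground-truth labels for one batch
        self.correct += sum(int(o == g) for o, g in zip(output, gt))
        self.total += len(gt)

    def reset(self):
        self.correct = 0
        self.total = 0

    def get_accuracy_score(self) -> float:
        # Single float, higher is better
        return self.correct / self.total if self.total else 0.0

    def formatted_accuracy(self) -> str:
        return f"{100 * self.get_accuracy_score():.1f}% (label accuracy)"
```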

3. Update the Model

Add these methods to model.py:

@staticmethod
def eval_datasets() -> list[str]:
    return ["<dataset_name>"]

@staticmethod
def calibration_dataset_name() -> str:
    return "<dataset_name>"

def get_evaluator(self) -> BaseEvaluator:
    return YourEvaluator(...)  # Return an instance

4. Update code-gen.yaml

Add supported precisions (defaults to float only if not specified):

supported_precisions:
  - float
  - w8a8
  - w8a16

5. Run Codegen and Test

python qai_hub_models/scripts/run_codegen.py -m <model_id>
python -m qai_hub_models.models.<model_id>.evaluate --precision w8a8

Accuracy drop from float should be reasonable (10 points or less). Consider mixed precision (e.g., w8a8_mixed_int16) if accuracy is too low.


Uploading Assets to S3

For model checkpoints or test data not available via public URLs:

  1. Authenticate: Run python scripts/build_and_test.py validate_aws_credentials (prompts for password)

  2. Upload: Use AWS profile qaihm:

    aws s3 cp <local_file> s3://qaihub-public-assets/qai-hub-models/models/<model_id>/v1/ --profile qaihm
  3. Set permissions: Grant public-read access when uploading

  4. Reference in code: Set MODEL_ASSET_VERSION = 1 in model.py

  5. Versioning: Assets cannot be deleted. For new versions, create v2/, v3/, etc. and update MODEL_ASSET_VERSION.


Verification Workflow

Before submitting a PR:

# Run codegen for your model
python qai_hub_models/scripts/run_codegen.py -m <model_id>

# Auto-fill info.yaml
python qai_hub_models/scripts/autofill_info_yaml.py -m <model_id>

# Run all pre-commit hooks
pre-commit run --all-files

# Run package unit tests
python scripts/build_and_test.py test_qaihm

Run model-specific tests:

# Test export (install model dependencies first per README)
python -m qai_hub_models.models.<model_id>.export --target-runtime tflite --chipset qualcomm-snapdragon-8gen3

# Test evaluation (if available)
python -m qai_hub_models.models.<model_id>.evaluate

  • export.py should produce a model that profiles successfully on device (for all added precisions)
  • evaluate.py should produce good results in torch and on-device for all supported precisions

Code Quality

Linting, Formatting, and Type Checking

You can run these tools manually, but they also run automatically via pre-commit hooks:

  • Ruff - Linting and formatting: ruff check --fix and ruff format
  • mypy - Type checking: mypy qai_hub_models/

Pre-commit Hooks

Hooks run automatically and include:

  • License header insertion (BSD-3)
  • YAML validation, trailing whitespace, large file detection
  • Ruff check + format
  • mypy type checking

Import Restrictions

Do not import numba, xtcocotools, or git directly; use the qai_hub_models.extern.* wrappers instead.


Support

For questions, reach out to the default PR reviewers or ping the Teams channel.


Appendix: Monkeypatching

When external model code needs modification for on-device compilation, prefer monkeypatching over copying source. This keeps the codebase maintainable and makes it clear what was changed.

Common pattern:

  1. Create a model_patches.py file with replacement functions
  2. Import the original class/module
  3. Replace the method: OriginalClass.method = patched_method

Example - replacing a forward method (gkt):

# In model_patches.py - define the patched function
def KernelAttention_forward(self, q, k, v, skip=None, mask=None):
    # Modified implementation for on-device compatibility
    ...

# In model.py - apply the patch after importing
from external_repo import KernelAttention
from .model_patches import KernelAttention_forward

KernelAttention.forward = KernelAttention_forward

Example - patching multiple components (sam2):

import functools

import sam2.modeling.backbones.hieradet as hieradet

# Patch module-level functions
hieradet.window_partition = window_partition_5d

# Patch class methods with functools.partial
sam2.sam_prompt_encoder._embed_points = functools.partial(
    patched_embed_points, sam2.sam_prompt_encoder
)

# Replace entire submodules
for block in model.blocks:
    block.mlp = PatchedMLP(block.mlp)

When to monkeypatch:

  • Replacing operations unsupported by QNN (e.g., certain einsum patterns, dynamic shapes)
  • Optimizing tensor layouts for on-device performance (e.g., 6D → 5D tensors)
  • Removing GPU-specific code paths
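The pattern can be exercised end to end with a toy stand-in for the external class (all names here are illustrative):

```python
class ExternalBlock:
    """Stand-in for a class imported from external model code."""

    def forward(self, x):
        raise NotImplementedError("pretend this op fails to compile on-device")


def patched_forward(self, x):
    # Device-friendly replacement implementation (illustrative)
    return x * x


# Apply the patch once, right after importing the external class;
# every existing and future instance picks up the new method.
ExternalBlock.forward = patched_forward
```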