Contributing to Awkward Array

Thank you for your interest in contributing! We're eager to see your ideas and look forward to working with you.

This document describes the technical procedures we follow in this project. It should also be stressed that as members of the Scikit-HEP community, we are all obliged to maintain a welcoming, harassment-free environment. See the Code of Conduct for details.

Where to start

The front page for the Awkward Array project is its GitHub README. This leads directly to tutorials and reference documentation that you may have already seen. It also includes instructions for compiling for development.

Reporting issues

The first thing you should do if you want to fix something is to submit an issue through GitHub. That way, we can all see it and maybe one of us or a member of the community knows of a solution that could save you the time spent fixing it. If you "assign yourself" to the issue (top of right side-bar), you can signal your intent to fix it in the issue report.

Contributing a pull request

Feel free to open pull requests in GitHub from your forked repo when you start working on the problem. We recommend opening the pull request early so that we can see your progress and communicate about it. (Note that you can git commit --allow-empty to make an empty commit and start a pull request before you even have new code.)

Please make the pull request a draft to indicate that it is in an incomplete state and shouldn't be merged until you click "ready for review."

AI-assisted contributions

We welcome the use of AI tools as part of the development process. They can be valuable aids for drafting, refactoring, documentation, and exploration. However, contributions to Awkward Array require human judgment, contextual understanding, and familiarity with the project’s structure, goals, and standards.

When using AI tools:

You remain the author of the contribution. Review, understand, and test all AI-assisted code or documentation before submitting it under your name. You should be able to explain and defend the changes on request.
Avoid fully automated submissions. Issues or pull requests generated end-to-end by automated tools, without meaningful human review or intent, are not appropriate.
Be respectful of reviewers’ time. Ensure that both the content of the PR and its description reflect your own understanding. Reviewers should not be expected to infer authorship or unknowingly interact with an AI during review.
Disclose significant AI assistance. If AI tools were used for a substantial portion of the contribution, please note this in the PR description. (This guide is an example of that: we used ChatGPT to help with the writing, and in this comment, we acknowledge that fact.)

Contributors are responsible for the correctness, maintainability, and long-term impact of all submitted changes, regardless of whether AI tools were used.

Pull requests that do not meet these expectations may be closed without review.

Getting your pull request reviewed

Currently, we have four regular reviewers of pull requests:

Ianna Osborne (ianna)
Peter Fackeldey (pfackeldey)
Andres Rios Tascon (ariostas)
Iason Krommydas (ikrommyd)

You can request a review from one of us or just comment in GitHub that you want a review and we'll see it. Only one review is required before a pull request can be merged. We'll work with you to get it into shape.

If you're waiting for a response and haven't heard in a few days, it's possible that we forgot/got distracted/thought someone else was reviewing it/thought we were waiting on you, rather than you waiting on us—just write another comment to remind us.

Becoming a regular committer

If you want to contribute frequently, we'll grant you write access to the scikit-hep/awkward repo itself. This is more convenient than pull requests from forked repos.

Git practices

Unless you ask us not to, we might commit directly to your pull request as a way of communicating what needs to be changed. That said, most of the commits on a pull request are from a single author: corrections and suggestions are exceptions.

Therefore, we prefer git branches to be named with your GitHub userid, such as ianna/write-contributing-md.

The titles of pull requests (and therefore the merge commit messages) should follow the Conventional Commits conventions. Mostly, this means prefixing the title with one of these words and a colon:

feat: new feature
fix: bug-fix
perf: code change that improves performance
refactor: code change that neither fixes a bug nor adds a feature
style: changes that do not affect the meaning of the code
test: adding missing tests or correcting existing tests
build: changes that affect the build system or external dependencies
docs: documentation only changes
ci: changes to our CI configuration files and scripts
chore: other changes that don't modify src or test files
revert: reverts a previous commit
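
As an illustration of the title convention above, here is a minimal sketch of a local check you could run on a title before opening a pull request. It is not part of the project's tooling: the function name `title_type` and the example titles are hypothetical, and the check assumes only the plain `prefix: summary` form (a prefix from the list above, a colon, a space, and a non-empty summary).

```python
import re
from typing import Optional

# Prefixes copied from the list in this guide.
PREFIXES = (
    "feat", "fix", "perf", "refactor", "style", "test",
    "build", "docs", "ci", "chore", "revert",
)

# Assumption: a valid title is "<prefix>: <summary>" with a single space
# after the colon and a non-empty summary.
TITLE_RE = re.compile(r"^(?P<type>" + "|".join(PREFIXES) + r"): (?P<summary>\S.*)$")


def title_type(title: str) -> Optional[str]:
    """Return the prefix of a well-formed title, or None if it does not match."""
    match = TITLE_RE.match(title)
    return match.group("type") if match else None


# Hypothetical titles, for illustration only.
print(title_type("fix: handle empty arrays in reducers"))  # fix
print(title_type("Handle empty arrays"))                   # None
```

A title that returns `None` here is missing one of the listed prefixes or the colon-and-space separator.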

Almost all pull requests are merged with the "squash and merge" feature, so details about commit history within a pull request are hidden from the main branch's history. Feel free, therefore, to commit with any frequency you're comfortable with.

It is unnecessary to manually edit (rebase) commit history within a pull request.

Building and testing locally

The installation procedure for developers is described briefly on the front page, and in more detail here.

Awkward Array is shipped as two packages: awkward and awkward-cpp. The awkward-cpp package contains the compiled C++ components required for performance, and awkward is only Python code. If you do not need to modify any C++ (the usual case), then awkward-cpp can simply be installed using pip or conda.

Subsequent steps require the generation of code and data files (kernel specification, header-only includes). This can be done with the prepare nox session:

nox -s prepare

Details

The prepare session accepts flags to specify exact generation targets, e.g.

nox -s prepare -- --tests --docs

This can reduce the time taken to perform the preparation step in the event that only the package-building step is needed.

nox also lets us reuse the virtualenvs that it creates for each session with the -R flag, eliminating the dependency reinstall time:

nox -R -s prepare

Installing the `awkward-cpp` package

The C++ components can be installed by building the awkward-cpp package:

python -m pip install ./awkward-cpp

Details

If you are working on the C++ components of Awkward Array, it might be more convenient to skip the build isolation step, which involves creating an isolated build environment. First, you must install the build requirements:

python -m pip install "scikit-build-core[pyproject,color]" pybind11 ninja cmake

Then the installation can be performed without build isolation:

python -m pip install --no-build-isolation --check-build-dependencies ./awkward-cpp

Installing the `awkward` package

With awkward-cpp installed, an editable installation of the pure-python awkward package can be performed with

python -m pip install -e .

Testing the installed packages

Finally, let's run the integration test suite to ensure that everything's working as expected:

python -m pytest -n auto tests

For more fine-grained testing, we also have tests of the low-level kernels, which can be invoked with

python -m pytest -n auto awkward-cpp/tests-spec
python -m pytest -n auto awkward-cpp/tests-cpu-kernels

This assumes that the nox -s prepare session ran the --tests target.

Furthermore, if you have an Nvidia GPU and CuPy installed, you can run the CUDA tests with

python -m pytest tests-cuda-kernels
python -m pytest tests-cuda

Unit tests for the kernels

For even finer-grained testing, you can also run additional unit tests that provide more coverage of all the low-level kernels.

For Python Kernels:

python -m pytest -n auto awkward-cpp/tests-spec-explicit

For CPU Kernels:

python -m pytest -n auto awkward-cpp/tests-cpu-kernels-explicit

For CUDA Kernels:

python -m pytest tests-cuda-kernels-explicit

Building wheels

Sometimes it's convenient to build a wheel for the awkward-cpp package, so that subsequent re-installs do not require the package to be rebuilt. The build package can be used to do this, though care must be taken to specify the current Python interpreter in pipx:

pipx run --python=$(which python) build --wheel awkward-cpp

The built wheel will then be available in awkward-cpp/dist.

Automatic formatting and linting

The Awkward Array project uses pre-commit to handle formatters and linters. This automatically checks (and may push corrections to) your pull request's git branch.

To respond more quickly to pre-commit's feedback, it can help to install it and run it locally. Once it is installed, run

pre-commit run -a

to test all of your files. If you leave off the -a, it will run only on currently staged changes.

Automated tests

As stated above, we use pytest to verify the correctness of the code, and GitHub will reject a pull request if either pre-commit or pytest fails (red "X"). All tests must pass for a pull request to be accepted.

Note that if a pull request doesn't modify code, only the documentation tests will run. That's okay: documentation-only pull requests only need the documentation tests to pass.

Testing practices

Unless you're refactoring code, such that your changes are fully tested by the existing test suite, new code should be accompanied by new tests. Our testing suite is organized by GitHub issue or pull request number: that is, test file names are

tests/test_XXXX-yyyy.py

where XXXX is either the number of the issue your pull request fixes or the number of the pull request and yyyy is descriptive text, often the same as the git branch. This makes it easier to run your test in isolation:

python -m pytest tests/test_XXXX-yyyy.py

and it makes it easier to figure out why a particular test was added. The easiest way to make a new testing file is to copy an existing one and replace its test_zzzz functions with your own. The previous tests should also give you a sense of the way we test things and the kinds of things that are constrained in tests.
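
The naming pattern above can be sketched as a small helper that builds the path of a new test file. This is only an illustration, not project tooling: the function name `make_test_path` is hypothetical, and both the zero-padding to four digits and the lowercase, hyphen-separated description are assumptions read off the `test_XXXX-yyyy.py` template.

```python
import re


def make_test_path(number: int, description: str) -> str:
    """Build a test file path from an issue/PR number and descriptive text."""
    # Assumption: the descriptive part is lowercase words joined by hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    # Assumption: the number is zero-padded to at least four digits (XXXX).
    return f"tests/test_{number:04d}-{slug}.py"


# Hypothetical issue number and description, for illustration only.
print(make_test_path(1234, "Fix broadcasting bug"))
# tests/test_1234-fix-broadcasting-bug.py
```

The resulting path can be passed directly to `python -m pytest` to run the new test in isolation.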

Building documentation locally

Documentation is automatically built by each pull request. You usually won't need to build the documentation locally, but if you do, this section describes how.

We use Sphinx to generate documentation. You may need to install some additional packages:

To build documentation locally, first prepare the generated data files with

nox -s prepare

Details

Only the --headers and --docs flags are actually required at the time of writing. These can be passed with:

nox -s prepare -- --docs --headers

Then, use nox to run the various documentation build steps

nox -s docs

This command executes multiple custom Python scripts (some require a working internet connection), in addition to using Sphinx and Doxygen to generate the required browser viewable documentation.

To view the built documentation, open

docs/_build/html/index.html

from the root directory of the project in your preferred web browser, or serve the build directory over HTTP, e.g.

python -m http.server 8080 --directory docs/_build/html/

Before re-building documentation, you might want to delete the files that were generated to create viewable documentation. A simple command to remove all of them is

rm -rf docs/reference/generated docs/_build docs/_static/doxygen

There is also a cache in the docs/_build/.jupyter_cache directory for Jupyter Book, which can be removed.

The main branch

The Awkward Array main branch must be kept in an unbroken state. There are two reasons for this: so that developers can work independently on known-to-be-working states and so that users can test the latest changes (usually to see if the bug they've discovered is fixed by a potential correction).

The main branch is also never far from the latest released version. We usually deploy patch releases (z in a version number like x.y.z) within days of a bug-fix.

Committing directly to main is not allowed except for

updating the pyproject.toml file to increase the version number, which should be independent of pull requests
updating documentation or non-code files
unprecedented emergencies

and only by the reviewing team.

The main-v1 branch

The main-v1 branch was split from main just before Awkward 1.x code was removed, so it exists to make 1.10.x bug-fix releases. These commits must be drawn from main-v1, not main, and pull requests must target main-v1 (not the GitHub default). A single commit cannot be applied to both main and main-v1 because they have diverged too much. If a bug-fix needs to be applied to both (unlikely), it will have to be reimplemented on both.

Releases

Currently, only one person can deploy releases:

Ianna Osborne (ianna)

There are two kinds of releases: (1) awkward-cpp updates, which occur only when the C++ is updated (rare) and involve compilation on many platforms (which takes hours), and (2) awkward updates, which can happen with any bug-fix. The releases listed in GitHub are awkward releases, not awkward-cpp.

If you need your merged pull request to be deployed in a release, just ask!

`awkward-cpp` releases

To make an awkward-cpp release:

A commit to main should increase the version number in awkward-cpp/pyproject.toml and the corresponding dependency in pyproject.toml. This ensures that awkward-cpp and awkward remain in sync.
The Deploy C++ GitHub Actions workflow should be manually triggered.
A git tag awkward-cpp-{version} should be created for the new version epoch.

`awkward` releases

To make an awkward release:

A commit to main should increase the version number in pyproject.toml.
A new GitHub release must be published. "Generate release notes" produces categorized notes automatically: a workflow labels each PR type/<type> from its conventional-commit title, and .github/release.yml groups those labels into sections.
A docs/switcher.json entry must be added for new minor/major versions.

Pushes that modify docs/switcher.json on main will automatically be synchronised with AWS.

Nightly wheels

Nightly wheels of awkward-cpp and awkward are built and published to the Scientific Python Nightly Wheels Anaconda Cloud organization. As the awkward-cpp and awkward nightly wheels do not include version control system information, they will have the same version numbers as the last released versions on the public PyPI. To avoid resolution conflicts when installing the nightly wheels, it is recommended to first install awkward-cpp and awkward from PyPI to get all of their dependencies, then uninstall awkward-cpp and awkward and install the nightly wheels from the Scientific Python nightly index.

python -m pip install --upgrade awkward
python -m pip uninstall --yes awkward awkward-cpp
python -m pip install --upgrade --extra-index-url https://pypi.anaconda.org/scientific-python-nightly-wheels/simple awkward

Guidelines for Dependency and Version Changes

To ensure stability and reproducibility across the Awkward ecosystem:

Collaborative Review Required
- Pull requests that affect core dependencies (e.g., Numba, Pandas, PyArrow, NumPy) or Python version support must be discussed with maintainers before merging.
- This avoids unilateral decisions that could break critical workflows.
Compatibility Transparency
- If a dependency is not yet available for a new Python release, note this clearly in the PR description and documentation.
- Example: “Supports Python 3.14, but Numba-dependent features are disabled until upstream support is released.”
Testing Expectations
- CI should run against all supported Python versions.
- Tests that depend on unsupported features (e.g., Numba on Python 3.14) must be marked with xfail so contributors see the limitation.
Version Pinning
- Pin or constrain versions of critical dependencies to prevent silent breakages from upstream changes.
- Floating versions may be allowed for secondary utilities, but core scientific packages should be locked.
Communication
- Use GitHub Discussions or Issues to propose changes before opening a PR that alters compatibility.
- This ensures consensus and avoids surprises for downstream users.
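
The "Testing Expectations" item above could look roughly like the following in a test file. This is a sketch under stated assumptions rather than a pattern taken from the existing test suite: the constant `NUMBA_UNSUPPORTED`, the test name, and the Python 3.14 cut-off are hypothetical and simply mirror the example used in these guidelines.

```python
import sys

import pytest

# Hypothetical condition mirroring the example above: treat Numba as
# unsupported on Python 3.14 and later until upstream support is released.
NUMBA_UNSUPPORTED = sys.version_info >= (3, 14)


@pytest.mark.xfail(
    NUMBA_UNSUPPORTED,
    reason="Numba does not yet support this Python version",
    strict=False,
)
def test_numba_dependent_feature():
    # Skip (rather than fail) when Numba is not installed at all.
    numba = pytest.importorskip("numba")

    @numba.njit
    def add_one(x):
        return x + 1

    assert add_one(1) == 2
```

With `strict=False`, the test is reported as an expected failure while the limitation holds and as an unexpected pass once upstream support arrives, which is a visible cue to remove the marker.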

Example PR Checklist

Have I discussed this change with maintainers?
Did I document limitations (e.g., missing Numba support)?
Are CI tests updated to reflect compatibility status?
Did I pin or constrain critical dependencies?
