Thanks for your interest in contributing to dspy-opt!
When submitting a PR, please fill out the PR template. If the PR fixes an issue, don't forget to link the PR to the issue!
Clone the repository and create the Python virtual environment:
uv sync --all-extras --dev

Activate the virtual environment:
source .venv/bin/activate

Once the Python virtual environment is set up, you can run the pre-commit hooks using:
pre-commit run --all-files

The project includes a Makefile with common development tasks. Run make help to see all available targets.
make lint-fmt # Format code and auto-fix lint issues
make lint-check # Check formatting and lint without modifying files
make lint-style # Lint with ruff (check only)
make lint-typing # Type check with mypy
make lint-typos # Check for typos
make lint-all # Format, lint, and type check

make test # Run unit tests (excluding integration)
make test-cov # Run tests with coverage
make test-ci # Run tests with coverage + XML/junit output for CI
make cov # Run tests and generate coverage reports (xml, html)

make security-bandit # Run Bandit security scan
make security-audit # Run pip-audit dependency vulnerability scan
make security # Run all security scans

make sync # Sync project and install dependencies
make clean # Clean build artifacts and caches

For code style, we recommend the PEP 8 style guide.
For docstrings we use Google format.
We use ruff for code formatting and static code analysis. Ruff checks a range of lint rules, including those from flake8. The pre-commit hooks report errors that you need to fix before submitting a PR.
Last but not least, we use type hints in our code, which are then checked using mypy.
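As an illustration of the conventions above, here is a minimal sketch of a function with type hints and a Google-format docstring. The function itself (top_k_indices) is a hypothetical example and is not part of this project's code.

```python
from collections.abc import Sequence


def top_k_indices(scores: Sequence[float], k: int = 3) -> list[int]:
    """Return the indices of the k highest scores.

    Hypothetical helper, shown only to illustrate docstring and typing style.

    Args:
        scores: Relevance scores, one per retrieved passage.
        k: Maximum number of indices to return. Must be non-negative.

    Returns:
        Indices into ``scores``, ordered from highest to lowest score.
        Tied scores keep their original relative order.

    Raises:
        ValueError: If ``k`` is negative.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    # sorted() is stable, so equal scores stay in input order even with reverse=True.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```

The Args, Returns, and Raises sections are the parts of the Google format that reviewers will most often look for, and the annotations on the signature are what mypy checks.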
We welcome contributions that extend the functionality of this project. Here are a few ways you can contribute:
To add a new dataset:
- Create a new directory in src/dspy_opt (e.g., src/dspy_opt/new_dataset).
- Create the necessary files, following the structure of the existing dataset modules:
  - An indexing script (new_dataset_indexing.py).
  - A RAG module (new_dataset_rag_module.py).
  - Configuration files for different optimizers.
- Implement the pipeline logic in your RAG module, composing the shared utilities from src/dspy_opt/utils as needed.
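The file layout described in these steps could be created by hand or with a small script. The sketch below is one possible way to do it; scaffold_dataset is a hypothetical helper that does not exist in this repository, and it leaves out the optimizer configuration files because their names are not specified here (copy them from an existing dataset module instead).

```python
from pathlib import Path


def scaffold_dataset(package_root: Path, name: str) -> list[Path]:
    """Create a dataset directory containing empty module files.

    Hypothetical helper for illustration; not part of dspy-opt.

    Args:
        package_root: Path to the package directory (src/dspy_opt in this project).
        name: Dataset name, used as the directory name and as the file prefix.

    Returns:
        Paths of the indexing script and RAG module inside the new directory.
    """
    dataset_dir = package_root / name
    dataset_dir.mkdir(parents=True, exist_ok=True)
    files = [
        dataset_dir / f"{name}_indexing.py",    # indexing script
        dataset_dir / f"{name}_rag_module.py",  # RAG module
    ]
    for path in files:
        # touch() creates the file if missing and leaves existing content alone.
        path.touch(exist_ok=True)
    return files


if __name__ == "__main__":
    for created in scaffold_dataset(Path("src/dspy_opt"), "new_dataset"):
        print(created)
```

Running the script from the repository root creates src/dspy_opt/new_dataset with the two empty Python files, ready to be filled in by following an existing dataset module.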
To add a new utility module:
- Create a new dspy.Module in the src/dspy_opt/utils directory.
- Ensure it has a clear and well-defined responsibility.
- Add docstrings and type hints.
- Integrate the new module into one or more RAG pipelines to demonstrate its usage.
To experiment with different language or embedding models, simply change the model names and API keys in the .yml configuration files for the relevant dataset.