River is a Python library for online (streaming) machine learning. All estimators implement incremental learn_one/predict_one methods (or learn_many/predict_many for mini-batch).
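
As a minimal sketch of the one-sample-at-a-time pattern described above, the following stand-alone class mimics the `learn_one`/`predict_one` shape without importing River. `RunningMeanRegressor` and the toy stream are hypothetical names used only for illustration; real River estimators subclass the library's base classes.

```python
class RunningMeanRegressor:
    """Toy online regressor (hypothetical, not part of River).

    Predicts the mean of the targets seen so far.
    """

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0

    def learn_one(self, x: dict, y: float) -> None:
        # Incremental mean update: no past samples are stored.
        self.n += 1
        self.mean += (y - self.mean) / self.n

    def predict_one(self, x: dict) -> float:
        return self.mean


# Toy stream of (features, target) pairs; features are dicts, as in River.
stream = [({"a": 1.0}, 2.0), ({"a": 2.0}, 4.0), ({"a": 3.0}, 6.0)]

model = RunningMeanRegressor()
abs_errors = []
for x, y in stream:
    y_pred = model.predict_one(x)  # predict before the label is revealed
    abs_errors.append(abs(y - y_pred))
    model.learn_one(x, y)  # then update the model with the true label
```

Predicting before learning on each sample means every error is measured on data the model has not yet seen, which is the usual way to evaluate a streaming model.
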
# Install / sync dependencies (also builds Cython + Rust extensions)
uv sync
# Run tests (excludes datasets and slow markers by default, includes doctests)
uv run pytest
# Run a single test file
uv run pytest river/linear_model/test_glm.py
# Run tests in parallel
uv run pytest -n auto
# Run only web-dependent tests (datasets downloads)
uv run pytest -m datasets
# Lint and format (via pre-commit hooks)
uv run pre-commit run --all-files
# Type checking
uv run mypy river
# Build and serve docs locally
uv sync --group docs
make livedoc

All estimators inherit from `base.Estimator` (which inherits from `base.Base`). Key interfaces:
- `Classifier` / `MiniBatchClassifier`: classification
- `Regressor` / `MiniBatchRegressor`: regression
- `Transformer` / `SupervisedTransformer`: feature transformation
- `Clusterer`: clustering
- `DriftDetector` / `BinaryDriftDetector`: concept drift detection
- `Ensemble` / `WrapperEnsemble`: ensemble methods
- `Wrapper`: wrapping other estimators

Estimator conventions:
- `learn_one(x, y)` / `predict_one(x)` is the core online learning interface
- `__init__` parameters need type hints; provide defaults or implement `_unit_test_params()`
- `_unit_test_skips()` returns check names to skip in automated testing
- `_supervised`, `_multiclass`, `_is_stochastic`, `_tags` are special class attributes
- Pipeline composition: `scaler | model` (uses `__or__`), `+` for parallel union
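
The `|` composition mentioned above relies on Python's `__or__` operator overloading. The sketch below shows the general mechanism with hypothetical stand-in classes (`Step`, `MiniPipeline`, `Scale`, `Offset`). It is not River's actual implementation, which lives in the library's `compose` module and handles many more cases.

```python
class Step:
    """Hypothetical base class: gives every step the `|` operator."""

    def __or__(self, other: "Step") -> "MiniPipeline":
        return MiniPipeline(self, other)

    def transform_one(self, x: dict) -> dict:
        raise NotImplementedError


class MiniPipeline(Step):
    """Hypothetical sequential container: applies its steps left to right."""

    def __init__(self, *steps: Step) -> None:
        self.steps = list(steps)

    def __or__(self, other: Step) -> "MiniPipeline":
        # Chaining a | b | c keeps a flat list of steps.
        return MiniPipeline(*self.steps, other)

    def transform_one(self, x: dict) -> dict:
        for step in self.steps:
            x = step.transform_one(x)
        return x


class Scale(Step):
    def __init__(self, factor: float) -> None:
        self.factor = factor

    def transform_one(self, x: dict) -> dict:
        return {k: v * self.factor for k, v in x.items()}


class Offset(Step):
    def __init__(self, delta: float) -> None:
        self.delta = delta

    def transform_one(self, x: dict) -> dict:
        return {k: v + self.delta for k, v in x.items()}


pipeline = Scale(2.0) | Offset(1.0) | Scale(10.0)
result = pipeline.transform_one({"a": 1.0, "b": 3.0})
```

Defining `__or__` on the shared base class is what lets any two steps be combined without each class knowing about the others.
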

Testing:
- `utils.check_estimator(MyEstimator)` automatically discovers and runs validation checks (repr, cloning, pickling, feature robustness, etc.). New estimators must pass all checks.
- Pytest is configured with `--doctest-modules`: docstring examples are executed as tests.
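
Because `--doctest-modules` makes pytest collect docstring examples, an example in a docstring doubles as a test. The sketch below uses a hypothetical helper, `running_mean`, and runs its docstring through the standard-library `doctest` module to show what gets checked. Under pytest the same examples would be collected automatically.

```python
import doctest


def running_mean(values):
    """Return the incrementally updated mean after each value.

    >>> running_mean([2.0, 4.0, 6.0])
    [2.0, 3.0, 4.0]
    >>> running_mean([])
    []
    """
    means = []
    mean = 0.0
    for n, value in enumerate(values, start=1):
        mean += (value - mean) / n
        means.append(mean)
    return means


# Parse the docstring into a doctest and execute its examples directly.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    running_mean.__doc__,
    {"running_mean": running_mean},  # names visible to the examples
    "running_mean",
    None,
    0,
)
results = doctest.DocTestRunner(verbose=False).run(test)
```

If the printed output in a docstring drifts from what the code returns, the example fails, so docstrings need updating alongside behavior changes.
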

Native extensions:
- Cython: `.pyx` files throughout `river/` for performance-critical code
- Rust: `rust_src/lib.rs` via PyO3, exposed as `river.stats._rust_stats`

Contributing:
- When you're done, add an entry to `unreleased.md` if it's relevant to end users.
- Performance matters: if you make a significant change, run a benchmark.
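
One lightweight way to compare performance before and after a change is to time a fixed number of `learn_one` calls. The sketch below is only an assumption about what such a check could look like: `ToyModel` and `bench_learn_one` are hypothetical names, and any benchmark tooling the repository itself provides should take precedence.

```python
import time


class ToyModel:
    """Hypothetical stand-in for the estimator under test."""

    def __init__(self):
        self.total = 0.0

    def learn_one(self, x, y):
        self.total += y * sum(x.values())


def bench_learn_one(make_model, stream, repeats=5):
    """Return the best wall-clock time (seconds) over `repeats` full passes."""
    best = float("inf")
    for _ in range(repeats):
        model = make_model()  # fresh model per pass so runs are comparable
        start = time.perf_counter()
        for x, y in stream:
            model.learn_one(x, y)
        best = min(best, time.perf_counter() - start)
    return best


# Materialize the stream up front so data generation is not timed.
stream = [({"a": float(i), "b": 1.0}, float(i % 2)) for i in range(10_000)]
elapsed = bench_learn_one(ToyModel, stream)
throughput = len(stream) / elapsed  # samples per second
```

Taking the minimum over several passes reduces noise from other processes; run the same script on both versions of the code and compare the throughput figures.
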