Sparv is a text analysis tool run from the command line. The documentation can be found here: https://spraakbanken.gu.se/sparv.
Check the changelog to see what's new!
Sparv is developed by Språkbanken. The source code is available under the MIT license.
If you have any questions, problems or suggestions please contact sb-sparv@svenska.gu.se.
Sparv requires the following:
- A Unix-like environment (e.g. Linux, macOS or Windows Subsystem for Linux). Note: Most of Sparv's features should also work in a native Windows environment, but since we do not test on Windows, we cannot guarantee this.
- Python 3.11 or newer.
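To confirm the Python requirement before installing, a check along the following lines may help. This is only a sketch: the function name `check_python_version` is made up for illustration, and it assumes the interpreter you intend to use is the one reachable as `python3` on your PATH.

```shell
# Hypothetical helper (not part of Sparv): report whether the python3 on PATH
# meets the minimum version of 3.11 stated above.
check_python_version() {
    if ! command -v python3 >/dev/null 2>&1; then
        echo "python3 not found"
    elif python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 11) else 1)'; then
        echo "Python version OK"
    else
        echo "Python 3.11 or newer is required"
    fi
}

check_python_version
```

If the check reports an older version, install a newer Python before continuing.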
Sparv is available on PyPI under the name sparv. Refer to the Sparv user
manual for detailed installation and setup
instructions.
To set up a development environment for Sparv, we recommend using uv to create a virtual environment and install the dependencies.
- Install uv if you don't have it already.
- While in the Sparv project directory, run:
  uv sync
  This will create a virtual environment in the .venv directory and install the dependencies listed in pyproject.toml, including the development dependencies.
- Either activate the virtual environment manually:
  source .venv/bin/activate
  or use uv run <command> to run commands inside the virtual environment without activating it.
Alternatively, you can set up a virtual environment manually using Python's built-in venv module and install the
dependencies using pip:
python3 -m venv .venv
source .venv/bin/activate
pip install -e . && pip install . --group dev
To run the test suite, make sure you have set up the development environment as described above. You also need to have Git LFS installed to get the test data. If you cloned the repository before installing Git LFS, you need to run
git lfs fetch
to download the test data files.
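If you are unsure whether Git LFS is available, a quick check like the one below can be run first. It is a minimal sketch: `check_git_lfs` is a hypothetical helper name, and it relies only on the `git lfs version` subcommand, which fails when the LFS extension (or Git itself) is missing.

```shell
# Hypothetical helper (not part of Sparv): report whether the Git LFS
# extension is installed. It does not download any files.
check_git_lfs() {
    if git lfs version >/dev/null 2>&1; then
        echo "Git LFS is installed"
    else
        echo "Git LFS is missing; install it before fetching the test data"
    fi
}

check_git_lfs
```

If the test data files still look like small text pointers after fetching, `git lfs pull` may be worth trying, since it fetches the objects and also checks them out into the working tree.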
While in the Sparv project directory, you can run the tests using uv:
uv run pytest
Alternatively, if you have activated the virtual environment manually, you can simply run:
pytest
You can run specific tests using the provided markers (e.g. pytest -m swe to run the Swedish tests only) or via substring matching (e.g. pytest -k "not slow" to skip the slow tests).
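The two selection options can also be combined in a single invocation. The sketch below wraps that in a small shell function. The name `run_fast_swedish_tests` is hypothetical, and it assumes the virtual environment is activated so that `pytest` is on your PATH (otherwise prefix the command with `uv run`).

```shell
# Hypothetical convenience wrapper (not part of Sparv): run the tests marked
# "swe" whose names do not contain "slow". Extra arguments are passed to pytest.
run_fast_swedish_tests() {
    pytest -m swe -k "not slow" "$@"
}
```

Call it as `run_fast_swedish_tests`, optionally with extra arguments such as `-v`. To see which markers are registered for the project, run `pytest --markers`.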