CHANGELOG

Unreleased

Migration guide: See docs/migration-guide.md for step-by-step upgrade instructions.

Breaking Changes

evaluate_all() removed — raises DeprecationError at runtime. Replace with the three-step pipeline: predict_dataset() → CanonicalMapper.get_mapped_results_dataframe() → calculate_score_on_df().
entity_mapping parameter removed from SpanEvaluator, TokenEvaluator, and BaseEvaluator — entity mapping is now the responsibility of CanonicalMapper.
compare_by_io parameter removed from evaluator constructors — BIO/BILUO prefix stripping is now performed by CanonicalMapper.
BaseEvaluator.from_dataset() removed — use model.predict_dataset(dataset) directly.
Non-Presidio model wrappers removed: FlairModel, SpacyModel, StanzaModel, AzureAITextAnalyticsWrapper. Add models directly through Presidio to evaluate them.
Minimum Python version raised to 3.11 (was 3.10) — required by numpy >= 2.4.0.
Package manager changed from Poetry to uv — install with uv sync, run with uv run.
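
For callers migrating off evaluate_all(), the three-step replacement named above can be sketched roughly as follows. This is a minimal sketch, not a helper shipped by the library: the function name run_token_evaluation is hypothetical, import paths are not given in this changelog so the objects are passed in rather than imported, and the call shapes are taken only from the method names listed in this file.

```python
def run_token_evaluation(model, dataset, mapper_cls, evaluator):
    """Hypothetical helper chaining the three steps that replace evaluate_all().

    model      -- a model instance exposing predict_dataset()
    dataset    -- a list of InputSample objects
    mapper_cls -- the CanonicalMapper class
    evaluator  -- a TokenEvaluator instance
    """
    # Step 1: run the model and collect the raw predictions DataFrame.
    results_df = model.predict_dataset(dataset)

    # Step 2: map raw labels to canonical entities.
    mapper = mapper_cls.from_dataset(dataset)
    mapped_df = mapper.get_mapped_results_dataframe(results_df)

    # Step 3: score the mapped predictions.
    return evaluator.calculate_score_on_df(mapped_df)
```

For span-level scoring, this changelog lists SpanEvaluator.calculate_score_on_df(per_type, results_df), so the final call would presumably take per_type as well. Labels that the mapper leaves pending may need a manual mapper.map({...}) call before step 3; see docs/migration-guide.md for the authoritative steps.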

New Features

BaseModel.predict_dataset(dataset) — runs the model on a list of InputSample objects and returns a 5-column DataFrame (sentence_id, token, annotation, prediction, start_indices).
CanonicalMapper — replaces EntityMappingHelper with an improved four-tier auto-resolution strategy (EXACT, COUNTRY, FUZZY, PENDING). Key methods:
- CanonicalMapper.from_dataset(dataset) — builds a mapper from dataset labels.
- mapper.get_mapped_results_dataframe(results_df) — applies entity mapping to a predictions DataFrame.
- mapper.get_mapping(mode='html' | 'text') — returns the final {raw_label: canonical | None} dict.
- mapper.map({"LABEL": "CANONICAL"}) — manually resolve pending labels.
- mapper.render_html() — display the resolution audit table in Jupyter.
TokenEvaluator.calculate_score_on_df(results_df) — score token-level predictions from a DataFrame.
SpanEvaluator.calculate_score_on_df(per_type, results_df) — score span-level predictions from a DataFrame.
Ruff — added as the project linter and formatter (ruff.toml at project root).
Pre-commit hooks — ruff format, ruff check, and pytest run automatically before every commit (.pre-commit-config.yaml).
Test reorganisation — tests are now grouped by topic (tests/data_generator/, tests/entity_mapping/, tests/evaluation/, tests/models/, tests/integration/). Integration tests are tagged with pytest.mark.integration.
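
To make the five-column layout listed for predict_dataset() more concrete, here is a small sketch that also illustrates the kind of BIO/BILUO prefix stripping that has moved into CanonicalMapper. It is an illustration only, not the library's implementation: it uses plain dicts instead of a DataFrame so it runs without pandas, the helper names strip_prefix and strip_row are hypothetical, and it assumes one row per token with start_indices holding a character offset.

```python
# Column names as listed for predict_dataset().
COLUMNS = ("sentence_id", "token", "annotation", "prediction", "start_indices")


def strip_prefix(tag: str) -> str:
    """Drop a BIO/BILUO prefix: 'B-PERSON' -> 'PERSON'; 'O' is unchanged."""
    prefix, sep, label = tag.partition("-")
    if sep and prefix in {"B", "I", "L", "U"}:
        return label
    return tag


def strip_row(row: dict) -> dict:
    """Return a copy of the row with prefixes removed from both tag columns."""
    cleaned = dict(row)
    cleaned["annotation"] = strip_prefix(row["annotation"])
    cleaned["prediction"] = strip_prefix(row["prediction"])
    return cleaned


# Illustrative rows for the text "Jane Doe called" (made-up values).
rows = [
    dict(zip(COLUMNS, (0, "Jane", "B-PERSON", "U-PER", 0))),
    dict(zip(COLUMNS, (0, "Doe", "I-PERSON", "O", 5))),
    dict(zip(COLUMNS, (0, "called", "O", "O", 9))),
]
cleaned_rows = [strip_row(r) for r in rows]
```

After stripping, the first row still pairs PERSON with PER; reconciling label names like these is what the entity mapping step is for.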

Deprecations

evaluator.get_results_dataframe() — soft DeprecationWarning emitted at runtime. Replace with model.predict_dataset(dataset).
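
Because the deprecation above is soft, existing calls keep working and the warning is easy to miss. The sketch below shows one way to surface such warnings with the standard warnings module. LegacyEvaluator is a hypothetical stand-in written for this example (it is not the library's evaluator), and find_deprecated_calls is likewise an illustrative helper.

```python
import warnings


class LegacyEvaluator:
    """Hypothetical stand-in that mimics a softly deprecated method."""

    def get_results_dataframe(self):
        # A soft deprecation: warn, then carry on and return a result.
        warnings.warn(
            "get_results_dataframe() is deprecated; "
            "use model.predict_dataset(dataset) instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return []


def find_deprecated_calls(func):
    """Run func and return the DeprecationWarning messages it triggered."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        func()
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]


messages = find_deprecated_calls(LegacyEvaluator().get_results_dataframe)
```

In a test suite, warnings.simplefilter("error", DeprecationWarning) or pytest's -W error::DeprecationWarning option turns these warnings into failures, which can help locate remaining call sites before the method is removed.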

Version 0.2.5

Improvements

Introduced a new evaluator, SpanEvaluator, which compares full spans of annotations and predictions instead of tokens. (#141)
Made the Azure SDK an optional dependency (#116)
Added a DataFrame output to evaluation results (#126)

Bug Fixes

Fixed bugs around plotting and experiment tracking (#140) and around configuring Presidio in the evaluation loop (#155)
Data generation bug fixes (#113)

Version 0.2.0

Breaking changes

Removed notebooks (pseudonymization)
Removed the redundant classes FakerSpan and FakerSpanResult and updated code to use Span and InputSample respectively. Changed SentenceFaker to inherit from Faker instead of using composition.
Removed the functions from_faker_span, from_faker_spans_result, and convert_faker_spans from InputSample: faker spans are now Spans, so there is no need for translation.
Removed PresidioDataGenerator; use PresidioSentenceFaker instead
Removed support for CRF models
Removed the FlairTrainer class; please refer to the official Flair documentation for training Flair models
Removed CRF as the package used is no longer maintained

Improvements

Improved evaluation notebooks: Notebook 4 shows a vanilla Presidio evaluation, notebook 5 shows a more customized Presidio with improved accuracy (#103)
Removed the Pseudonymization notebook as there is a more advanced approach within Presidio (#103)
Added the ability to use generic entities and skip words (#103)
Added the ability to do faster batch predict (#103)
Added sample_id to be able to reproduce the full sample (#103)
Fixed issue with hospital provider networking (#103)

Bug Fixes

Fixed translation of InputSample tags (#88)
Fixed the Presidio wrapper to call predict with a language parameter (#79)

Other Changes

Updates to all classes inheriting from BaseModel, as the predict signature has changed (now containing **kwargs) (#92)
Added Poetry instead of setup.py (#91)
Renamed UsDriverLicenseProvider.driver_license to us_driver_license (#90)
Removed the redundant classes FakerSpan and FakerSpanResult and updated code to use Span and InputSample respectively (#72)
Changed SentenceFaker to inherit from Faker instead of using composition (#72)
Simplified the use of SentenceFaker in the default option (RecordGenerator is instantiated if records are passed, otherwise a SpanGenerator is instantiated) (#72)
Updates to unit tests to support this change (#72)
Updates to poetry to include the config in setup.cfg, setup.py, and pytest.ini (#72)
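
For custom models written against an earlier release, the predict signature change noted above (now containing **kwargs) means overrides should accept extra keyword arguments. A minimal sketch, assuming a simplified interface: BaseModelSketch and KeywordModel are hypothetical stand-ins, the sample is a plain list of tokens rather than an InputSample, and language is used only as an example of a keyword a caller might pass.

```python
class BaseModelSketch:
    """Hypothetical stand-in for the library's BaseModel."""

    def predict(self, sample, **kwargs):
        raise NotImplementedError


class KeywordModel(BaseModelSketch):
    """Toy model that tags tokens found in a fixed list of names."""

    def __init__(self, names):
        self.names = set(names)

    def predict(self, sample, **kwargs):
        # Extra keyword arguments are accepted even when unused, so callers
        # can pass options intended for other model types.
        language = kwargs.get("language", "en")
        if language != "en":
            # This toy model only knows English names.
            return ["O" for _ in sample]
        return ["PERSON" if token in self.names else "O" for token in sample]
```

An override that omits **kwargs would raise a TypeError as soon as a caller passes an option it does not declare, which is the failure this signature change avoids.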
