Migration guide: See docs/migration-guide.md for step-by-step upgrade instructions.
- `evaluate_all()` removed: raises `DeprecationError` at runtime. Replace with the three-step pipeline: `predict_dataset()` → `CanonicalMapper.get_mapped_results_dataframe()` → `calculate_score_on_df()`.
- `entity_mapping` parameter removed from `SpanEvaluator`, `TokenEvaluator`, and `BaseEvaluator`: entity mapping is now the responsibility of `CanonicalMapper`.
- `compare_by_io` parameter removed from evaluator constructors: BIO/BILUO prefix stripping is now performed by `CanonicalMapper`.
- `BaseEvaluator.from_dataset()` removed: use `model.predict_dataset(dataset)` directly.
- Non-Presidio model wrappers removed: `FlairModel`, `SpacyModel`, `StanzaModel`, `AzureAITextAnalyticsWrapper`. Add models directly through Presidio to evaluate them.
- Minimum Python version raised to 3.11 (was 3.10), as required by `numpy >= 2.4.0`.
- Package manager changed from Poetry to uv: install with `uv sync`, run with `uv run`.
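
As a rough illustration of the three-step replacement for `evaluate_all()`, the sketch below chains the calls named in this list. The helper name `run_evaluation` is hypothetical. The model, mapper, and evaluator are passed in rather than imported, because import paths and constructor arguments are not given here. Feeding the mapped DataFrame into `calculate_score_on_df()` is an assumption based on the order of the pipeline.

```python
def run_evaluation(model, dataset, mapper, evaluator):
    """Hypothetical helper chaining the three documented steps.

    model:     any object exposing predict_dataset(dataset)
    mapper:    a CanonicalMapper-like object exposing
               get_mapped_results_dataframe(results_df)
    evaluator: a TokenEvaluator-like object exposing
               calculate_score_on_df(results_df)
    """
    # Step 1: run the model over the dataset to get the predictions DataFrame.
    results_df = model.predict_dataset(dataset)

    # Step 2: map raw labels to canonical ones.
    mapped_df = mapper.get_mapped_results_dataframe(results_df)

    # Step 3: score. Assumption: scoring consumes the mapped DataFrame.
    return evaluator.calculate_score_on_df(mapped_df)
```

A mapper for this helper would typically come from `CanonicalMapper.from_dataset(dataset)`. For span-level scoring, note that `SpanEvaluator.calculate_score_on_df` is listed with an extra `per_type` argument, so the last step would need adjusting.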
- `BaseModel.predict_dataset(dataset)`: runs the model on a list of `InputSample` objects and returns a 5-column DataFrame (`sentence_id`, `token`, `annotation`, `prediction`, `start_indices`).
- `CanonicalMapper`: replaces `EntityMappingHelper` with an improved four-tier auto-resolution strategy (`EXACT`, `COUNTRY`, `FUZZY`, `PENDING`). Key methods:
  - `CanonicalMapper.from_dataset(dataset)`: builds a mapper from dataset labels.
  - `mapper.get_mapped_results_dataframe(results_df)`: applies entity mapping to a predictions DataFrame.
  - `mapper.get_mapping(mode='html' | 'text')`: returns the final `{raw_label: canonical | None}` dict.
  - `mapper.map({"LABEL": "CANONICAL"})`: manually resolves pending labels.
  - `mapper.render_html()`: displays the resolution audit table in Jupyter.
- `TokenEvaluator.calculate_score_on_df(results_df)`: scores token-level predictions from a DataFrame.
- `SpanEvaluator.calculate_score_on_df(per_type, results_df)`: scores span-level predictions from a DataFrame.
- Ruff: added as the project linter and formatter (`ruff.toml` at project root).
- Pre-commit hooks: `ruff format`, `ruff check`, and `pytest` run automatically before every commit (`.pre-commit-config.yaml`).
- Test reorganisation: tests are now grouped by topic (`tests/data_generator/`, `tests/entity_mapping/`, `tests/evaluation/`, `tests/models/`, `tests/integration/`). Integration tests are tagged with `pytest.mark.integration`.
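
To make the mapping step more concrete, here is a small pure-Python sketch of what applying a `{raw_label: canonical | None}` dict to prediction rows could look like. It is not the `CanonicalMapper` implementation. `strip_prefix` and `apply_mapping` are hypothetical names, and plain dicts stand in for DataFrame rows. Mapping both the `annotation` and `prediction` columns, and treating `None` or unknown labels as the outside tag `"O"`, are assumptions made for illustration.

```python
PREFIXES = ("B-", "I-", "L-", "U-")  # BIO/BILUO position prefixes


def strip_prefix(tag):
    """Drop a BIO/BILUO prefix, e.g. 'B-PERSON' becomes 'PERSON'; 'O' is unchanged."""
    return tag[2:] if tag.startswith(PREFIXES) else tag


def apply_mapping(rows, mapping):
    """Return copies of rows with annotation/prediction mapped to canonical labels.

    Assumption: labels mapped to None, or missing from the mapping, become 'O'.
    """
    mapped = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        for column in ("annotation", "prediction"):
            label = strip_prefix(row[column])
            if label != "O":
                label = mapping.get(label) or "O"
            new_row[column] = label
        mapped.append(new_row)
    return mapped


# Rows shaped like the 5-column output described for predict_dataset().
rows = [
    {"sentence_id": 0, "token": "John", "annotation": "B-PER",
     "prediction": "U-PERSON", "start_indices": 0},
    {"sentence_id": 0, "token": "lives", "annotation": "O",
     "prediction": "O", "start_indices": 5},
    {"sentence_id": 0, "token": "here", "annotation": "O",
     "prediction": "B-MISC", "start_indices": 11},
]
mapping = {"PER": "PERSON", "PERSON": "PERSON", "MISC": None}
mapped_rows = apply_mapping(rows, mapping)
```

After mapping, the dataset label `PER` and the model label `PERSON` compare as equal, which is the point of resolving both sides to one canonical vocabulary before scoring.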
- `evaluator.get_results_dataframe()`: soft `DeprecationWarning` emitted at runtime. Replace with `model.predict_dataset(dataset)`.
- Introduced a new evaluator, `SpanEvaluator`, which compares full spans of annotations and predictions instead of tokens. (#141)
- Made the Azure SDK an optional dependency (#116)
- Added a DataFrame output to evaluation results (#126)
- Fixed bugs around plotting and experiment tracking (#140) and around configuring Presidio in the evaluation loop (#155)
- Data generation bug fixes (#113)
- Removed notebooks (pseudonymization)
- Removed redundant classes `FakerSpan` and `FakerSpanResult` and updated code to use `Span` and `InputSample` respectively; changed `SentenceFaker` to inherit from Faker instead of using composition.
- Removed functions `from_faker_span`, `from_faker_spans_result`, and `convert_faker_spans` from `InputSample`; faker spans are now `Span`s, so there is no need for translation.
- Removed `PresidioDataGenerator`; use `PresidioSentenceFaker` instead
- Removed support for CRF models
- Removed the `FlairTrainer` class; please refer to the official Flair documentation for training Flair models
- Removed CRF as the package used is no longer maintained
- Improved evaluation notebooks: Notebook 4 shows a vanilla Presidio evaluation, notebook 5 shows a more customized Presidio with improved accuracy (#103)
- Removed the pseudonymization notebook as there is a more advanced approach within Presidio (#103)
- Added the ability to use generic entities and skip words (#103)
- Added the ability to do faster batch predict (#103)
- Added sample_id to be able to reproduce the full sample (#103)
- Fixed issue with hospital provider networking (#103)
- Fix translation of Input Sample tags (#88)
- Fix Presidio wrapper to call predict with a language parameter (#79)
- Updates to all classes inheriting from `BaseModel`, as the `predict` signature has changed (now containing `**kwargs`) (#92)
- Added Poetry instead of setup.py (#91)
- Rename UsDriverLicenseProvider.driver_license to us_driver_license (#90)
- Removed redundant classes FakerSpan, FakerSpanResult and updated code to use Span and InputSample respectively instead (#72)
- Changed SentenceFaker to inherit from Faker instead of using composition (#72)
- Simplified the use of SentenceFaker in the default option (RecordGenerator is instantiated if records are passed, otherwise a SpanGenerator is instantiated) (#72)
- Updates to unit tests to support this change (#72)
- Updates to poetry to include the config in setup.cfg, setup.py, and pytest.ini (#72)