Roadmap: close the remaining gaps #92

@avayaaggarwal

Summary

Erasus already covers a wide range of unlearning methods. To become the state-of-the-art package for unlearning research, it needs tighter reproducibility, benchmark parity with published results, scalable training support, and better packaging ergonomics.

This issue tracks the highest-leverage work needed to make Erasus the default open-source package for unlearning experiments rather than just a broad collection of implementations.

Why this matters

Researchers and practitioners evaluate unlearning frameworks on four things:

  • breadth of methods
  • credibility of benchmark results
  • ease of reproducing real papers on real models
  • reliability of the package in messy real environments

Erasus is already strong on breadth. The next step is to turn that breadth into reproducible, benchmark-backed, easy-to-run depth.

Roadmap

1. Reproducible real-benchmark harnesses

  • Standardize real benchmark entrypoints for TOFU, MUSE, WMDP, and lm-eval tasks behind one common CLI contract
  • Add config presets for paper-faithful runs on GPT-2, Zephyr-7B, and at least one PEFT-based 7B path
  • Save machine-readable result bundles with metrics, configs, seed, git SHA, model revision, and dataset revision
  • Add a benchmark manifest schema so leaderboard outputs are reproducible and comparable across runs
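
As a rough illustration of the result-bundle idea above, the sketch below collects the listed provenance fields into one JSON document. The `ResultBundle` class, its field names, and `schema_version` are hypothetical and not part of Erasus today; the values in the usage note are placeholders, not real results.

```python
import json
import subprocess
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional


def current_git_sha() -> Optional[str]:
    """Return the current commit SHA, or None outside a git checkout."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return None


@dataclass
class ResultBundle:
    """Hypothetical machine-readable record of one benchmark run."""
    benchmark: str
    strategy: str
    seed: int
    model_revision: str
    dataset_revision: str
    config: Dict[str, object]
    metrics: Dict[str, float]
    # Filled in automatically unless the caller passes an explicit SHA.
    git_sha: Optional[str] = field(default_factory=current_git_sha)
    schema_version: str = "0.1"  # assumed versioning scheme

    def to_json(self) -> str:
        # sort_keys keeps the output stable so bundles diff cleanly.
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

A run would construct something like `ResultBundle(benchmark="tofu", strategy="npo", seed=0, model_revision="<model-commit>", dataset_revision="<dataset-commit>", config={...}, metrics={...})` and write `to_json()` next to its other outputs.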

2. Baseline parity with major unlearning papers

  • Audit implemented strategies against the latest LLM unlearning papers and document missing training details or approximations
  • Add paper-parity benchmark scripts for NPO, SimNPO, RMU, FLAT, UNDIAL, activation steering, DExperts, and delta-unlearning
  • Publish baseline result tables in-repo for at least TOFU, WMDP, and post-unlearning capability tasks
  • Tag each strategy in the docs with a clear status: "implemented", "approximated", or "paper-faithful"

3. Scalable training and inference support

  • Add first-class PEFT support across unlearners with LoRA/QLoRA adapters
  • Add accelerate integration for multi-GPU and gradient accumulation workflows
  • Support 4-bit / 8-bit loading paths for both unlearning and post-unlearning evaluation
  • Add checkpoint resume support for long-running benchmark jobs

4. Stronger verification and privacy evaluation

  • Expand prompt-extraction suites into reusable attack packs with jailbreak, multilingual, paraphrase, and long-context retrieval modes
  • Add calibration and confidence reports to MIA outputs, not just scalar attack scores
  • Add corpus-level memorization reports for forget sets with attribution to exact samples/documents
  • Add a unified privacy report spanning MIA, memorization, extraction, relearning, and RAG leakage
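
One possible shape for an MIA report that goes beyond a scalar score is sketched below: AUC, a bootstrap confidence interval for it, and expected calibration error. The function names and report keys are hypothetical. The sketch assumes attack scores are probabilities in [0, 1] (higher means "more likely a member") and labels are 1 for members and 0 for non-members.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random member scores above a random non-member."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def expected_calibration_error(
    scores: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted gap between mean score and member rate per score bin."""
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for s, y in zip(scores, labels):
        bins[min(int(s * n_bins), n_bins - 1)].append((s, y))
    ece = 0.0
    for b in bins:
        if b:
            conf = sum(s for s, _ in b) / len(b)
            acc = sum(y for _, y in b) / len(b)
            ece += len(b) / len(scores) * abs(conf - acc)
    return ece


def mia_report(
    scores: Sequence[float], labels: Sequence[int],
    n_boot: int = 200, seed: int = 0,
) -> Dict[str, object]:
    """Summarize an attack with AUC, a 95% bootstrap interval, and ECE."""
    rng = random.Random(seed)
    n = len(scores)
    boots: List[float] = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes for AUC
            boots.append(auc([scores[i] for i in idx], ys))
    boots.sort()
    ci: Optional[Tuple[float, float]] = None
    if boots:
        lo = boots[int(0.025 * len(boots))]
        hi = boots[min(int(0.975 * len(boots)), len(boots) - 1)]
        ci = (lo, hi)
    return {
        "auc": auc(scores, labels),
        "auc_ci95": ci,
        "ece": expected_calibration_error(scores, labels),
    }
```

The pairwise AUC is quadratic in the number of samples, which is fine for a sketch but would likely need a rank-based implementation for large forget sets.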

5. Dataset and deletion workflow quality

  • Add canonical dataset wrappers for popular unlearning datasets with version pinning and preprocessing provenance
  • Add machine-readable forget request manifests for sample-, user-, concept-, and document-level deletion settings
  • Add validation utilities that detect overlap/leakage between forget, retain, and eval splits
  • Add documentation for supported deletion granularities and expected benchmark mappings
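
The overlap check between splits could start as simply as the sketch below, which fingerprints each example after light normalization and reports how many fingerprints any two splits share. `find_split_overlap` is a hypothetical helper, and this version only catches exact matches up to case and whitespace; near-duplicates and paraphrases would need fuzzier matching.

```python
import hashlib
import re
from itertools import combinations
from typing import Dict, Iterable, Tuple


def _fingerprint(text: str) -> str:
    """Hash a lowercased, whitespace-collapsed copy of the text."""
    normalized = re.sub(r"\s+", " ", text.strip().lower())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_split_overlap(
    splits: Dict[str, Iterable[str]]
) -> Dict[Tuple[str, str], int]:
    """Map each overlapping pair of split names to its shared-example count."""
    fingerprints = {
        name: {_fingerprint(t) for t in texts}
        for name, texts in splits.items()
    }
    overlaps: Dict[Tuple[str, str], int] = {}
    # Pairs are reported in sorted name order, e.g. ("forget", "retain").
    for a, b in combinations(sorted(fingerprints), 2):
        shared = fingerprints[a] & fingerprints[b]
        if shared:
            overlaps[(a, b)] = len(shared)
    return overlaps
```

An empty result means no exact duplicates were found across splits; a validation utility could fail the run whenever the dictionary is non-empty.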

6. Package ergonomics and modularity

  • Introduce optional dependency extras by surface area such as llm, vision, audio, benchmarks, and ui
  • Make package-level imports consistently lazy so optional integrations do not make `import erasus` brittle
  • Add a plugin/discovery mechanism for third-party strategies, selectors, metrics, and benchmark adapters
  • Standardize structured result objects and serialization across unlearning, evaluation, verification, and benchmarks
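
To make the extras-plus-lazy-import idea concrete, here is a minimal sketch of a helper that defers an optional import until a feature is actually used and points the user at the matching extra when the dependency is missing. `optional_import` is a hypothetical name, and the `erasus[<extra>]` install hint assumes the extras proposed above are published under those names.

```python
import importlib
from types import ModuleType


def optional_import(module_name: str, extra: str) -> ModuleType:
    """Import a module on demand, with an actionable error if it is absent."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Assumed install hint; the extra names are proposals, not shipped.
        raise ImportError(
            f"'{module_name}' is required for this feature. "
            f"Install it with: pip install 'erasus[{extra}]'"
        ) from exc
```

Calling the helper inside the function that needs the dependency, rather than at module top level, keeps the package importable in minimal environments.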

7. Quality gates and developer trust

  • Re-enable the full main test suite in CI and keep it green with coverage enforcement
  • Add smoke tests for package import surfaces under reduced optional-dependency environments
  • Add benchmark regression tests that check output schema, not just execution
  • Add docs pages mapping claims to tests, benchmarks, and implemented modules
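
A benchmark regression test that checks output schema rather than just execution might look roughly like the following. The required field names mirror the provenance items listed under roadmap item 1, but the exact schema, `REQUIRED_FIELDS`, and `schema_errors` are assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical required fields and their expected Python types.
REQUIRED_FIELDS: Dict[str, type] = {
    "metrics": dict,
    "config": dict,
    "seed": int,
    "git_sha": str,
    "model_revision": str,
    "dataset_revision": str,
}


def schema_errors(bundle: dict) -> List[str]:
    """Return human-readable schema violations; an empty list means valid."""
    errors: List[str] = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in bundle:
            errors.append(f"missing field: {name}")
        elif not isinstance(bundle[name], expected):
            errors.append(
                f"{name}: expected {expected.__name__}, "
                f"got {type(bundle[name]).__name__}"
            )
    metrics = bundle.get("metrics")
    if isinstance(metrics, dict):
        for key, value in metrics.items():
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                errors.append(f"metrics.{key}: expected a number")
    return errors
```

A CI test could load each benchmark's output file and assert that `schema_errors(bundle) == []`, so a silently renamed or dropped field fails the build.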

Suggested acceptance criteria

  • A new user can reproduce at least one real run each of TOFU, MUSE, and WMDP from documented commands without editing source files
  • Strategy docs clearly state whether each implementation is approximate or paper-faithful
  • Result bundles are reproducible for a fixed seed and include all provenance needed to compare runs
  • `import erasus` and core workflows succeed cleanly in minimal environments with optional features gated behind extras
  • Verification outputs are consolidated into a single report that is suitable for papers and model-card publication

Nice-to-have follow-ups

  • Public benchmark artifact hosting for result bundles and plots
  • Example notebooks for end-to-end LLM unlearning on real models
  • Automatic model-card generation from benchmark + verification reports
