
ModelLedger

MLOps compliance dashboard -- model lineage tracking and audit report generation in one command.

CI · PyPI · Python 3.10+ · License: MIT · Tests · No External Services

modelledger report --sample -- instant compliance report. No database, no cloud, no setup.


The Problem

AI regulations (EU AI Act, NIST AI RMF) require auditable records of:

  • What data trained each model
  • Which code version produced it
  • How it was evaluated
  • What risks were identified

Most teams track this in spreadsheets, wikis, or not at all. Audits become expensive, retroactive scrambles.

The Fix

ModelLedger generates structured compliance reports from simple JSON data:

modelledger report --sample

Output: a complete Markdown report with model inventory, dataset provenance, experiment history, lineage graph, and automated risk assessment.

Quickstart

pip install modelledger

# Instant demo -- no files needed
modelledger report --sample

# Create your own data file
modelledger init --output my_data.json
# Edit my_data.json with your models/datasets/experiments
modelledger report --data my_data.json

Why ModelLedger?

Approach            Setup           Cost       Offline  Audit-Ready
MLflow              Server + DB     Free/Paid  No       Partial
Weights & Biases    Cloud account   $$         No       Partial
DVC                 Git-based       Free       Yes      No
ModelLedger         pip install     Free       Yes      Yes

CLI Reference

modelledger report

modelledger report --sample                    # Built-in sample data
modelledger report --data my_data.json         # Your data
modelledger report --data d.json --format html # HTML output
modelledger report --data d.json --output r.md # Write to file
Flag            Description
--sample        Generate report from built-in sample data
--data PATH     Path to JSON data file
--format        markdown (default) or html
--framework     Compliance framework mapping (e.g. eu-ai-act)
--output FILE   Write to file instead of stdout
--title TEXT    Custom report title

modelledger inspect

modelledger inspect --data my_data.json

Rich terminal tables showing all models, datasets, and experiments.

modelledger init

modelledger init --output my_data.json

Generate a sample JSON data file as a starting point.

Report Contents

A generated report includes:

  1. Model Inventory -- all registered models with version, framework, metrics, tags
  2. Dataset Registry -- datasets with provenance (path, hash, sample count)
  3. Experiment History -- runs with hyperparameters, metrics, and status
  4. Lineage Graph -- relationships between models, datasets, and experiments
  5. Risk Assessment -- automatic flags for:
    • Models missing dataset references (unknown training data)
    • Models missing git commits (unverifiable code)
    • Experiments without metrics (unevaluated models)
    • Failed experiments (investigate cause)
    • Models with no experiments (never evaluated)

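The risk flags above boil down to simple structural checks over the ledger data. As an illustration only (this is plain Python over the JSON structure documented below, not ModelLedger's actual implementation), the checks might look like:

```python
def risk_flags(data):
    """Derive audit risk flags from a ledger dict with models/datasets/experiments."""
    flags = []
    # Models that appear as model_ref in at least one experiment are "evaluated".
    evaluated = {e.get("model_ref") for e in data.get("experiments", [])}
    for m in data.get("models", []):
        name = m["model_name"]
        if not m.get("dataset_ref"):
            flags.append(f"{name}: unknown training data (no dataset_ref)")
        if not m.get("git_commit"):
            flags.append(f"{name}: unverifiable code (no git_commit)")
        if name not in evaluated:
            flags.append(f"{name}: never evaluated (no experiments)")
    for e in data.get("experiments", []):
        if not e.get("metrics"):
            flags.append(f"{e['experiment_id']}: unevaluated (no metrics)")
        if e.get("status") == "failed":
            flags.append(f"{e['experiment_id']}: failed run, investigate cause")
    return flags
```

Each flag maps one-to-one onto the bullet list above; the generated report renders these in the Risk Assessment section.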
Data Format

{
  "models": [{
    "model_name": "my-model",
    "model_version": "1.0.0",
    "framework": "pytorch",
    "metrics": {"accuracy": 0.95},
    "tags": ["production"],
    "git_commit": "abc123",
    "dataset_ref": "my-dataset"
  }],
  "datasets": [{
    "name": "my-dataset",
    "version": "1.0.0",
    "path": "/data/my-dataset/",
    "hash": "sha256...",
    "num_samples": 10000
  }],
  "experiments": [{
    "experiment_id": "exp-001",
    "model_ref": "my-model",
    "dataset_ref": "my-dataset",
    "metrics": {"accuracy": 0.95},
    "params": {"learning_rate": 0.001},
    "status": "completed"
  }]
}
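Because the format is plain JSON, the cross-references (dataset_ref, model_ref) are easy to sanity-check before generating a report. A minimal sketch, assuming only the field names shown in the schema above (not part of ModelLedger's API):

```python
def check_refs(data):
    """Return a list of dangling references between models, datasets, and experiments."""
    datasets = {d["name"] for d in data.get("datasets", [])}
    models = {m["model_name"] for m in data.get("models", [])}
    problems = []
    for m in data.get("models", []):
        ref = m.get("dataset_ref")
        if ref and ref not in datasets:
            problems.append(f"model {m['model_name']}: unknown dataset_ref {ref!r}")
    for e in data.get("experiments", []):
        if e.get("model_ref") not in models:
            problems.append(f"experiment {e['experiment_id']}: unknown model_ref")
        if e.get("dataset_ref") not in datasets:
            problems.append(f"experiment {e['experiment_id']}: unknown dataset_ref")
    return problems
```

An empty result means every lineage edge in the file resolves, so the generated lineage graph will be complete.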

Library API

from modelledger.core import Ledger, ReportGenerator
from modelledger.models import ModelRecord, DatasetRecord

ledger = Ledger()
ledger.add_model(ModelRecord(
    model_name="my-model",
    model_version="1.0.0",
    framework="pytorch",
    dataset_ref="my-dataset",
))

generator = ReportGenerator(ledger)
report = generator.generate_report()
print(report)

Use Cases

EU AI Act Compliance

Maintain auditable records of training data, model performance, and decision-making processes as required by emerging AI regulations.

Model Auditing

Trace any model back to its training data, code version, and evaluation experiments through the lineage graph.

Team Onboarding

New team members run modelledger inspect --data project_data.json for an instant overview of all models, datasets, and experiments.

Development

pip install -e ".[dev]"
pytest
ruff check src/ tests/
mypy src/


Sample Output

A report generated with modelledger report --data project_data.json --framework eu-ai-act looks like:

ModelLedger Compliance Report
Framework: EU AI Act
Generated: 2025-04-07

Model: customer-churn-v2
Training Data: customer_dataset_v3 (SHA: a8f3c9...)
Code Version: git:abc1234
Evaluation: F1=0.91, AUC=0.94
Risk Level: Limited Risk (Article 6)

Control Status:
✅ Data provenance documented
✅ Model version tracked  
✅ Evaluation recorded
⚠️  Human oversight mechanism — incomplete
❌  Post-market monitoring plan — missing

Compliance Frameworks

Framework      Status
EU AI Act      ✅ Full mapping
NIST AI RMF    ✅ Full mapping
ISO 42001      🔨 In progress


Built with ❤️ by Nirbhay Singh — Cloud & AI Architect

License

MIT. See LICENSE.
