Data Fidelity Policy

Version: 1.0
Date: 2026-01-17
Status: APPROVED

Core Principle

Display what exists, not what we assume.

All OpenAdapt viewers MUST adhere to strict data fidelity standards to maintain user trust and scientific reproducibility.

Policy Statement

1. NEVER Invent Data

PROHIBITED:

# WRONG: Making up descriptions
description = "Click System Settings icon in dock"  # ← WHERE DID THIS COME FROM?

# WRONG: Assuming intent without evidence
action = "User navigates to settings"  # ← DID THEY? HOW DO YOU KNOW?

# WRONG: Filling gaps with plausible values
if not event.description:
    event.description = "Unknown action"  # ← JUST LEAVE IT EMPTY!

REQUIRED:

# RIGHT: Use actual data from source
description = episode["steps"][i]  # ← From episodes.json

# RIGHT: Use raw events if no semantic description
description = f"{event.type} at ({event.x}, {event.y})"  # ← From capture.db

# RIGHT: Be explicit about missing data
description = None  # or ""  # ← Honest about what we don't have

2. ALWAYS Label Provenance

Every piece of displayed data MUST indicate its source:

Data Type | Provenance Label | Example
Hardware Event | RAW | mouse.down at (1248, 701)
ML-Inferred | ML-INFERRED (model, confidence) | ML-INFERRED (GPT-4o, 0.92): "Click Settings icon"
Human-Labeled | HUMAN-LABELED | HUMAN-LABELED: "Turn off Night Shift"
Derived | DERIVED (from: X) | DERIVED (from: 13 mouse events): "13 clicks"
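
The labels in the table above can be produced by a small helper. This is an illustrative sketch, not part of any OpenAdapt API; the function name and signature are assumptions.

```python
def format_provenance_label(provenance, model=None, confidence=None, derived_from=None):
    """Format a provenance badge string per the table above.

    Hypothetical helper for illustration only; the label formats
    match the Provenance Label column.
    """
    if provenance == "raw":
        return "RAW"
    if provenance == "ml_inferred":
        # ML data must always carry model name and confidence
        return f"ML-INFERRED ({model}, {confidence:.2f})"
    if provenance == "human_labeled":
        return "HUMAN-LABELED"
    if provenance == "derived":
        return f"DERIVED (from: {derived_from})"
    raise ValueError(f"Unknown provenance: {provenance}")
```

Keeping label construction in one place makes it harder for a viewer to display ML-inferred text without its model and confidence attached.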

3. Distinguish Data Source vs Data Content

Data Source = Where the data comes from (file path, database, API)
Data Content = What the data says (values, descriptions, metadata)

Example: The Nightshift Recording

Correct Understanding:

  • Source: turn-off-nightshift/episodes.json (REAL file from actual recording)
  • Content: "Click System Settings icon in dock" (ML-generated by GPT-4o)
  • Provenance: ML-INFERRED (GPT-4o, confidence: 0.92)

Incorrect Understanding:

  • Source: Real episodes.json ✓
  • Content: Synthetic/made-up ✗ (It's ML-inferred, not invented!)

4. When in Doubt, Show Raw

If uncertain about the semantic meaning, default to displaying raw event data:

# If we have a semantic description from ML
if episode.get("steps"):
    display = episode["steps"][i]
    provenance = f"ML-INFERRED ({episode['llm_model']}, {episode['boundary_confidence']:.2f})"

# If we only have raw events
elif event.type == "mouse.down":
    display = f"Mouse click at ({event.x}, {event.y})"
    provenance = "RAW"

# If we have neither
else:
    display = None  # Don't display anything
    provenance = None

5. Preserve Metadata

All data MUST preserve its provenance metadata:

Required Metadata Fields:

  • source: Where the data came from (file, DB table, API)
  • provenance: How the data was created (raw, ML, human, derived)
  • timestamp: When the data was created/captured
  • confidence: For ML-inferred data, include model confidence
  • model: For ML-inferred data, include model name/version

Example:

step = ExecutionStep(
    action_details={
        "description": "Click System Settings icon in dock",
        "source": "episodes.json",
        "provenance": "ml_inferred",
        "model": "gpt-4o",
        "confidence": 0.92,
        "timestamp": "2026-01-17T12:00:00.000000",
    }
)
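
The required-fields list above can be enforced with a small validator. A minimal sketch, assuming `action_details` is a plain dict; the function name is hypothetical.

```python
REQUIRED_FIELDS = {"source", "provenance", "timestamp"}
ML_FIELDS = {"model", "confidence"}  # additionally required for ML-inferred data

def validate_metadata(details):
    """Return the sorted list of missing provenance fields.

    Illustrative helper; field names follow the Required Metadata
    Fields list above. An empty return value means the metadata
    is complete.
    """
    missing = REQUIRED_FIELDS - details.keys()
    if details.get("provenance") == "ml_inferred":
        missing |= ML_FIELDS - details.keys()
    return sorted(missing)
```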

Implementation Guidelines

For real_data_loader.py

import json
import sqlite3
from pathlib import Path

def load_real_capture_data(capture_path: Path) -> BenchmarkRun:
    """Load REAL data with proper provenance labeling."""

    # Load episodes (ML-generated semantic data)
    with open(capture_path / "episodes.json") as f:
        episodes_data = json.load(f)

    # Extract ML metadata
    ml_model = episodes_data.get("llm_model", "unknown")
    processing_timestamp = episodes_data.get("processing_timestamp", "unknown")

    tasks, executions = [], []  # task construction elided from this sketch
    for episode in episodes_data["episodes"]:
        for step_text in episode["steps"]:
            executions.append(ExecutionStep(
                action_type="ml_inferred",  # ← Honest provenance
                action_details={
                    "description": step_text,
                    "provenance": "ml_inferred",
                    "model": ml_model,
                    "confidence": episode["boundary_confidence"],
                    "processing_timestamp": processing_timestamp,
                },
                reasoning=f"ML interpretation ({ml_model}): {step_text}",
            ))

    # Also provide raw event access for transparency
    conn = sqlite3.connect(capture_path / "capture.db")
    raw_events = load_raw_events(conn)  # Make raw data available

    return BenchmarkRun(
        tasks=tasks,
        executions=executions,
        config={
            "data_provenance": {
                "episodes_source": str(capture_path / "episodes.json"),
                "episodes_provenance": "ml_inferred",
                "episodes_model": ml_model,
                "raw_events_source": str(capture_path / "capture.db"),
                "raw_events_count": len(raw_events),
            }
        }
    )

For Viewer HTML

<!-- Show provenance badges -->
<div class="oa-action">
    <span class="oa-badge oa-badge-ml" title="Generated by GPT-4o with 92% confidence">
        ML-INFERRED
    </span>
    <span class="oa-action-details">
        Click System Settings icon in dock
    </span>
</div>

<!-- Provide raw data in expandable section -->
<details class="oa-raw-data">
    <summary>View Raw Event Data</summary>
    <pre>
Event Type: mouse.down
Coordinates: (1248.32, 701.73)
Timestamp: 1765672655.397
Button: left
    </pre>
</details>

<!-- Show metadata -->
<div class="oa-metadata">
    <div class="oa-metadata-item">
        <span class="oa-label">Model:</span>
        <span class="oa-value">GPT-4o</span>
    </div>
    <div class="oa-metadata-item">
        <span class="oa-label">Confidence:</span>
        <span class="oa-value">0.92</span>
    </div>
    <div class="oa-metadata-item">
        <span class="oa-label">Processed:</span>
        <span class="oa-value">2026-01-17 12:00:00</span>
    </div>
</div>

CSS for Provenance Badges

/* Provenance badges */
.oa-badge-raw {
    background: var(--oa-info-bg);
    color: var(--oa-info);
}

.oa-badge-ml {
    background: var(--oa-accent-dim);
    color: var(--oa-accent);
}

.oa-badge-human {
    background: var(--oa-success-bg);
    color: var(--oa-success);
}

.oa-badge-derived {
    background: var(--oa-warning-bg);
    color: var(--oa-warning);
}

Handling Missing Data

When Data Doesn't Exist

DO:

  • Show null, None, or empty string
  • Display "No data available"
  • Hide the section entirely

DON'T:

  • Fill with placeholder text like "Unknown"
  • Make assumptions like "Probably clicked"
  • Show "N/A" (implies data should exist but doesn't)
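
Following the rules above, a minimal rendering helper might look like this. The name `render_field` is illustrative, not an OpenAdapt API.

```python
def render_field(value):
    """Return display text for a possibly-missing value.

    Returning None signals "hide this field entirely"; we never
    substitute placeholders like "Unknown" or "N/A".
    """
    if value is None or value == "":
        return None  # caller hides the field or shows "No data available"
    return str(value)
```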

When Data Is Ambiguous

DO:

  • Show all possible interpretations
  • Display confidence scores
  • Provide raw event data

DON'T:

  • Pick the "most likely" option without indicating uncertainty
  • Average or merge ambiguous values
  • Hide low-confidence interpretations
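
A sketch of the DO side: render every candidate interpretation with its confidence instead of silently picking the top one. The `(description, confidence)` pair structure is an assumption for illustration.

```python
def render_interpretations(candidates):
    """Render all candidate interpretations, highest confidence first.

    `candidates` is a list of (description, confidence) pairs.
    Nothing is dropped, merged, or averaged — low-confidence
    interpretations stay visible.
    """
    ranked = sorted(candidates, key=lambda c: -c[1])
    return "\n".join(f"{desc} (confidence: {conf:.2f})" for desc, conf in ranked)
```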

Testing Data Fidelity

Every viewer MUST pass these tests:

Test 1: Trace Data Lineage

def test_data_lineage():
    """Verify every displayed value can be traced to source."""
    viewer = load_viewer("benchmark_viewer.html")
    for i, step in enumerate(viewer.steps):
        description = step.action_details["description"]

        # Can we find this in the source data?
        assert description in episodes_json["steps"] or \
               description == format_raw_event(capture_db_events[i])

Test 2: No Invented Data

def test_no_invented_data():
    """Verify no data was created by the viewer code."""
    viewer_data = extract_displayed_data("benchmark_viewer.html")
    source_data = load_all_source_data()

    for value in viewer_data:
        assert value in source_data.values() or \
               is_derived_from(value, source_data), \
               f"Invented data detected: {value}"

Test 3: Provenance Labels Present

def test_provenance_labels():
    """Verify all data has provenance labels."""
    viewer = load_viewer("benchmark_viewer.html")

    for step in viewer.steps:
        assert "provenance" in step.action_details, \
               f"Missing provenance for step {step.step_number}"

        assert step.action_details["provenance"] in [
            "raw", "ml_inferred", "human_labeled", "derived"
        ], f"Invalid provenance: {step.action_details['provenance']}"

Documentation Requirements

Every data loader MUST document:

  1. What data it loads (files, tables, APIs)
  2. How it transforms data (raw → semantic)
  3. What it DOESN'T invent (explicit list)
  4. Provenance labels used (raw, ML, human, derived)

Example Documentation

def load_real_capture_data(capture_path: Path) -> BenchmarkRun:
    """Load real capture data from openadapt-capture recording.

    DATA SOURCES:
    - capture.db: Raw hardware events (mouse, keyboard, screen)
    - episodes.json: ML-generated semantic episode descriptions

    DATA TRANSFORMATIONS:
    - Raw events → Count statistics (e.g., "1046 mouse moves")
    - Episodes → ExecutionStep objects (pass-through, no modification)

    DATA NOT INVENTED:
    - Step descriptions (from episodes.json, generated by GPT-4o)
    - Action types (from episodes.json)
    - Screenshots (from recording, not generated)

    PROVENANCE:
    - action_type: "ml_inferred" (from episodes.json)
    - model: "gpt-4o" (from episodes.json metadata)
    - confidence: 0.92 (from episodes.json boundary_confidence)
    """

Common Violations

Violation 1: Hiding Provenance

WRONG:

<span>Click System Settings icon in dock</span>

RIGHT:

<span class="oa-badge-ml">ML-INFERRED (GPT-4o, 0.92)</span>
<span>Click System Settings icon in dock</span>

Violation 2: Assuming Intent

WRONG:

# Don't assume what the user was trying to do
description = "User opens settings to change display preferences"

RIGHT:

# Use what the ML model inferred, with confidence
description = episode["description"]  # "User opens System Settings application"
confidence = episode["boundary_confidence"]  # 0.92

Violation 3: Filling Gaps

WRONG:

# Don't make up data for missing screenshots
if not screenshot_path:
    screenshot_path = "placeholder.png"  # ← NO!

RIGHT:

# Be honest about missing data
if not screenshot_path:
    return None  # or display "No screenshot available"

Review Checklist

Before merging any code that displays data, verify:

  • All displayed values traced to source (episodes.json, capture.db, etc.)
  • No hardcoded descriptions invented by code
  • Provenance labels present (RAW, ML-INFERRED, HUMAN-LABELED, DERIVED)
  • ML data includes model name + confidence
  • Missing data shown as missing (not filled with placeholders)
  • Documentation explains data sources and transformations
  • Tests verify no invented data

Questions?

If you're unsure whether something violates data fidelity:

  1. Ask: "Where did this value come from?"
  2. If the answer is "I calculated/inferred/assumed it" → Label as DERIVED or ML-INFERRED
  3. If the answer is "I made it up for demo purposes" → Use ONLY in test data, mark clearly
  4. If the answer is "It's in the source file" → Include source metadata
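
The decision steps above can be captured in a tiny helper. Everything here is hypothetical: the function name and the `origin` categories are assumptions made for illustration, not part of any OpenAdapt API.

```python
def classify_provenance(origin):
    """Map the "where did this value come from?" answer to a label.

    Mirrors the checklist above. Made-up demo values are rejected
    outright — they belong only in clearly marked test data.
    """
    mapping = {
        "source_file": "raw",           # it's in the source file
        "ml_inference": "ml_inferred",  # a model inferred it
        "computed": "derived",          # we calculated it from other data
        "human": "human_labeled",       # a person labeled it
    }
    if origin not in mapping:
        raise ValueError(f"Unknown origin {origin!r}: "
                         "made-up values belong only in marked test data")
    return mapping[origin]
```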

When in doubt, show raw.