The Recording Catalog System provides automatic discovery and indexing of all OpenAdapt recordings, segmentation results, and episodes. It eliminates manual file selection across the ecosystem by providing a centralized SQLite database that tracks all available data.
- Automatic Discovery: Scan directories to find recordings and segmentation results
- Centralized Database: Single source of truth at
~/.openadapt/catalog.db - Zero Configuration: Works out-of-the-box with default paths
- Ecosystem-Wide: All viewers can query the catalog
- Fast Queries: SQLite with indexes for efficient lookups
- Metadata-Rich: Tracks frames, events, duration, timestamps, tags
┌─────────────────────────────────────────────────────────────┐
│ OpenAdapt Ecosystem │
├─────────────────────────────────────────────────────────────┤
│ │
│ openadapt-capture/ │
│ ├── turn-off-nightshift/ │
│ │ ├── capture.db ←────┐ │
│ │ ├── screenshots/ │ │
│ │ └── video.mp4 │ │
│ └── demo_new/ │ │
│ └── capture.db │ Scanner discovers │
│ │ recordings │
│ openadapt-ml/ │ │
│ └── segmentation_output/ │ │
│ ├── turn-off-nightshift_ │ │
│ │ episodes.json ←─────┤ │
│ └── demo_new_episodes.json │ │
│ │ │
│ ┌────────────────┘ │
│ │ │
│ ▼ │
│ ~/.openadapt/catalog.db │
│ (Centralized Catalog) │
│ ├── recordings (9 rows) │
│ ├── segmentation_results (2 rows) │
│ └── episodes (0 rows) │
│ │ │
│ ▼ │
│ ┌──────────────────────────┐ │
│ │ Viewers Query Catalog │ │
│ ├──────────────────────────┤ │
│ │ • Segmentation Viewer │ │
│ │ • Capture Viewer │ │
│ │ • Training Dashboard │ │
│ │ • Retrieval Viewer │ │
│ └──────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Stores metadata about each recording.
CREATE TABLE recordings (
id TEXT PRIMARY KEY, -- Recording identifier (e.g., "turn-off-nightshift")
name TEXT NOT NULL, -- Display name (e.g., "Turn Off Nightshift")
description TEXT, -- Optional description
path TEXT NOT NULL, -- Absolute path to recording directory
created_at REAL NOT NULL, -- Unix timestamp
duration_seconds REAL, -- Recording duration
frame_count INTEGER, -- Number of screenshots
event_count INTEGER, -- Number of events (clicks, keys, etc.)
task_description TEXT, -- Task description from capture
tags TEXT, -- JSON array of tags
metadata TEXT -- JSON for additional metadata
);Tracks segmentation results for recordings.
CREATE TABLE segmentation_results (
id TEXT PRIMARY KEY, -- Unique segmentation ID
recording_id TEXT NOT NULL, -- FK to recordings.id
path TEXT NOT NULL, -- Path to episodes JSON file
created_at REAL NOT NULL, -- Unix timestamp
episode_count INTEGER DEFAULT 0, -- Number of episodes
boundary_count INTEGER DEFAULT 0, -- Number of boundaries
status TEXT DEFAULT 'complete', -- 'complete', 'partial', 'failed'
llm_model TEXT, -- Model used (e.g., "gpt-4o")
metadata TEXT, -- JSON for additional metadata
FOREIGN KEY (recording_id) REFERENCES recordings(id)
);Stores individual episodes within segmentation results.
CREATE TABLE episodes (
id TEXT PRIMARY KEY, -- Unique episode ID
segmentation_result_id TEXT NOT NULL, -- FK to segmentation_results.id
recording_id TEXT NOT NULL, -- FK to recordings.id
name TEXT, -- Episode name
description TEXT, -- Episode description
start_time REAL, -- Start timestamp
end_time REAL, -- End timestamp
start_frame INTEGER, -- Start frame index
end_frame INTEGER, -- End frame index
confidence REAL, -- Segmentation confidence (0-1)
metadata TEXT, -- JSON for additional metadata
FOREIGN KEY (segmentation_result_id) REFERENCES segmentation_results(id),
FOREIGN KEY (recording_id) REFERENCES recordings(id)
);Automatically discover and index recordings from default or specified directories:
# Scan default directories
openadapt-viewer catalog scan
# Scan specific directories
openadapt-viewer catalog scan \
--capture-dir /path/to/openadapt-capture \
--segmentation-dir /path/to/segmentation_output
# Scan multiple directories
openadapt-viewer catalog scan \
--capture-dir /path/to/recordings1 \
--capture-dir /path/to/recordings2 \
--segmentation-dir /path/to/seg_outputDefault paths searched (if no directories specified):
- Recordings:
~/oa/src/openadapt-capture./recordings~/.openadapt/recordings
- Segmentation results:
~/oa/src/openadapt-ml/segmentation_output./segmentation_output~/.openadapt/segmentation_output
View all indexed recordings:
# Human-readable list
openadapt-viewer catalog list
# Include segmentation results
openadapt-viewer catalog list --with-segmentations
# JSON output
openadapt-viewer catalog list --json
# JSON with segmentations
openadapt-viewer catalog list --json --with-segmentationsExample output:
Found 9 recordings:
Turn Off Nightshift
ID: turn-off-nightshift
Path: /Users/abrichr/oa/src/openadapt-capture/turn-off-nightshift
Created: 2025-12-13 19:37
Duration: 60.3s
Frames: 21
Demo New
ID: demo_new
Path: /Users/abrichr/oa/src/openadapt-capture/demo_new
Created: 2025-12-12 15:41
Duration: 13.3s
Frames: 14
Display catalog summary:
openadapt-viewer catalog statsOutput:
Catalog Statistics:
Recordings: 9
Segmentation Results: 2
Episodes: 0
Database: /Users/abrichr/.openadapt/catalog.db
Manually add a recording to the catalog:
openadapt-viewer catalog register /path/to/recording --name "My Recording"Remove entries for files that no longer exist:
openadapt-viewer catalog cleanfrom openadapt_viewer import get_catalog
# Get the global catalog instance
catalog = get_catalog()
# Get all recordings
recordings = catalog.get_all_recordings()
for recording in recordings:
print(f"{recording.name}: {recording.frame_count} frames")
# Get a specific recording
rec = catalog.get_recording("turn-off-nightshift")
if rec:
print(f"Path: {rec.path}")
print(f"Duration: {rec.duration_seconds}s")
# Get segmentation results for a recording
seg_results = catalog.get_segmentation_results("turn-off-nightshift")
for seg in seg_results:
print(f"Episodes: {seg.episode_count}")from openadapt_viewer import scan_and_update_catalog
# Scan default directories
counts = scan_and_update_catalog()
print(f"Indexed {counts['recordings']} recordings")
print(f"Indexed {counts['segmentations']} segmentation results")
# Scan specific directories
counts = scan_and_update_catalog(
capture_dirs=["/path/to/recordings"],
segmentation_dirs=["/path/to/segmentation_output"]
)from openadapt_viewer import get_catalog
catalog = get_catalog()
# Register a recording
recording = catalog.register_recording(
recording_id="my-recording",
name="My Recording",
path="/path/to/recording",
duration_seconds=45.2,
frame_count=50,
event_count=23,
)
# Register segmentation results
seg_result = catalog.register_segmentation(
segmentation_id="my-recording-seg-001",
recording_id="my-recording",
path="/path/to/episodes.json",
episode_count=5,
boundary_count=10,
)from openadapt_viewer import get_catalog
catalog = get_catalog()
# Search by text
results = catalog.search_recordings(query="nightshift")
# Filter by tags
results = catalog.search_recordings(tags=["demo", "testing"])
# Get statistics
stats = catalog.get_stats()
print(f"Total recordings: {stats['recording_count']}")Generate a segmentation viewer with automatic recording dropdown:
# Generate viewer with catalog integration
openadapt-viewer segmentation --output viewer.html
# Auto-load a specific recording
openadapt-viewer segmentation \
--auto-load turn-off-nightshift \
--output viewer.html \
--openThe generated viewer:
- Shows dropdown of all recordings with segmentation results
- Auto-loads the specified recording (if
--auto-loadis used) - Embeds catalog data directly in the HTML
- Works offline (file:// protocol)
from openadapt_viewer import generate_segmentation_viewer
# Generate viewer
output_path = generate_segmentation_viewer(
output_path="segmentation_viewer.html",
auto_load_recording="turn-off-nightshift"
)
print(f"Generated: {output_path}")The viewer generator embeds catalog data as JavaScript:
// Automatically available in generated viewers
window.OPENADAPT_CATALOG = {
recordings: [...],
stats: {...}
};
// Helper API
window.OpenAdaptCatalog.getRecordings()
window.OpenAdaptCatalog.getRecording(recordingId)
window.OpenAdaptCatalog.getSegmentationResults(recordingId)# 1. Scan for recordings (one-time or after new recordings)
openadapt-viewer catalog scan
# 2. Verify catalog
openadapt-viewer catalog list
openadapt-viewer catalog stats
# 3. Generate segmentation viewer
openadapt-viewer segmentation --output viewer.html --open# Option 1: Automatic discovery (recommended)
openadapt-viewer catalog scan
# Option 2: Manual registration
openadapt-viewer catalog register /path/to/new-recording# Scan segmentation output directory
openadapt-viewer catalog scan --segmentation-dir /path/to/segmentation_output
# Regenerate viewer to include new results
openadapt-viewer segmentation --output viewer.html# Remove stale entries (if recordings were deleted)
openadapt-viewer catalog clean
# Re-scan everything
openadapt-viewer catalog scan- Catalog Database:
~/.openadapt/catalog.db - Default Recording Paths:
~/oa/src/openadapt-capture/./recordings/~/.openadapt/recordings/
- Default Segmentation Paths:
~/oa/src/openadapt-ml/segmentation_output/./segmentation_output/~/.openadapt/segmentation_output/
After capturing a recording:
# Automatically register the new recording
openadapt-viewer catalog scan --capture-dir /path/to/openadapt-captureFuture enhancement: Add post-capture hook to auto-register.
After segmentation:
# Index new segmentation results
openadapt-viewer catalog scan --segmentation-dir /path/to/segmentation_outputFuture enhancement: Auto-register results after segmentation completes.
Query catalog to find recordings for evaluation:
from openadapt_viewer import get_catalog
catalog = get_catalog()
recordings = catalog.get_all_recordings()
for rec in recordings:
# Use recording.path for evaluation
run_evaluation(rec.path)- Zero configuration: No server setup required
- Single file: Easy to backup and move
- Fast queries: Efficient for 1000s of recordings
- ACID guarantees: Data integrity
- Legacy compatibility: Similar to original openadapt.db
- User-scoped: Each user has their own catalog
- Persistent: Survives project directory changes
- Conventional: Follows Unix dotfile conventions
- Accessible: All tools can find it
Scan-on-demand was chosen because:
- No background process required
- Simple and reliable
- Works across all platforms
- User controls when indexing happens
- Can be integrated into CI/CD pipelines
File watching can be added later if real-time updates are needed.
For the file:// protocol (offline viewing):
- CORS limitations: Can't fetch JSON files via fetch()
- Offline support: Viewers work without a server
- Portability: Single HTML file contains everything
- Fast loading: No network requests needed
Alternative: Use HTTP server for dynamic loading (future enhancement).
- ~100ms per recording directory
- ~10ms per segmentation JSON file
- Total scan time: ~1-2 seconds for 10 recordings
All queries use indexed lookups:
get_all_recordings(): ~1ms for 100 recordingsget_recording(id): ~0.1ms (primary key lookup)search_recordings(query): ~5ms for 100 recordings (full-text search)
Approximate sizes:
- Empty catalog: 24 KB
- 10 recordings: 32 KB
- 100 recordings: 80 KB
- 1000 recordings: 500 KB
- Auto-register from openadapt-capture post-capture hook
- Auto-register from openadapt-ml after segmentation
- HTTP API server for dynamic catalog queries
- Episode indexing from segmentation JSON files
- File watcher for real-time catalog updates
- Thumbnail generation and caching
- Full-text search on task descriptions
- Tags and custom metadata UI
- Export catalog as JSON for sharing
- Multi-user catalog sync
- Web dashboard for browsing catalog
- Duplicate detection
- Recording validation and health checks
- Disk usage tracking and cleanup recommendations
- Integration with cloud storage (S3, GCS)
# Verify catalog location
openadapt-viewer catalog stats
# Re-scan directories
openadapt-viewer catalog scan
# Check recording structure (must have capture.db)
ls -la /path/to/recording/# Verify file naming (must be *_episodes.json)
ls /path/to/segmentation_output/*_episodes.json
# Re-scan segmentation directory
openadapt-viewer catalog scan --segmentation-dir /path/to/segmentation_output
# Check recording exists first
openadapt-viewer catalog list | grep recording-name# Clean up missing files
openadapt-viewer catalog clean
# Re-scan everything
openadapt-viewer catalog scan# Check catalog file permissions
ls -la ~/.openadapt/catalog.db
# If corrupted, delete and rebuild
rm ~/.openadapt/catalog.db
openadapt-viewer catalog scanWhen adding new features that use recordings:
- Query the catalog instead of hardcoding paths
- Use
get_catalog()for the singleton instance - Call
scan_and_update_catalog()if discovery is needed - Document integration in this file
- openadapt-viewer: Main package containing catalog implementation
- SQLite Documentation: https://www.sqlite.org/docs.html
- Legacy openadapt.db:
/Users/abrichr/oa/src/OpenAdapt/openadapt.db