Add indexed_dataset_version_time OTel gauge#1129

Open
leonhandreke wants to merge 1 commit into main from
leonhandreke/index-freshness-metric

@leonhandreke
Contributor

@leonhandreke leonhandreke commented Apr 29, 2026

Tracks the Unix timestamp of the maximum last_seen across all documents per dataset, set after each successful catalog refresh. We'd ideally use last_export for this, but that's only available in the catalog index before the new index is loaded; once the catalog refresh completes, it's gone. So we derive freshness by max-aggregating last_seen across all indexed documents instead.

Changes

  • SearchProvider.get_index_max_date() — new method on the base class that runs a max aggregation on any date field; works for both ES and OpenSearch via the existing search() abstraction
  • yente/data/metrics.py — new module holding the gauge and update_dataset_version_metric()
  • indexer.py — calls the update function after alias rollover completes
  • tests/test_search_provider.py — covers empty index (→ None) and the max-of-two case
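
For readers without the diff open, the max-aggregation step might look roughly like the sketch below. This is not the code from this PR: the helper names `max_date_query` and `parse_max_date` and the aggregation key `max_date` are made up for illustration. It only relies on the standard behaviour of a `max` aggregation on a date field, which reports `value` in epoch milliseconds, or `null` when no documents match.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def max_date_query(field: str) -> Dict[str, Any]:
    """Build a search body that returns no hits, only the max of a date field.

    Hypothetical helper; the aggregation name "max_date" is an assumption.
    """
    return {"size": 0, "aggs": {"max_date": {"max": {"field": field}}}}


def parse_max_date(response: Dict[str, Any]) -> Optional[datetime]:
    """Extract the aggregated maximum as an aware UTC datetime.

    An empty index yields a null value, which maps to None here.
    """
    value = response.get("aggregations", {}).get("max_date", {}).get("value")
    if value is None:
        return None
    # Date aggregations report epoch milliseconds.
    return datetime.fromtimestamp(value / 1000.0, tz=timezone.utc)
```

The same body should be accepted by both ES and OpenSearch, which is presumably why the method can live on the base class.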

🤖 Generated with Claude Code

#1079

Tracks Unix timestamp of max last_seen across all documents per dataset,
set after each successful catalog refresh. We'd ideally use last_export for
this, but that's only available in the catalog index before the new index
is loaded — once the catalog refresh completes, it's gone. So we derive
freshness by max-aggregating last_seen across all indexed documents instead.

- SearchProvider.get_index_max_date() runs a max aggregation on any date
  field, works for both ES and OpenSearch via the existing search() abstraction
- yente/data/metrics.py holds the gauge and the update function
- Triggered from indexer.py after alias rollover completes

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@leonhandreke leonhandreke requested a review from pudo April 29, 2026 15:52
@leonhandreke
Contributor Author

@pudo how much do you hate this mechanism, and do you have a better idea? Basically, I want a metric that makes it possible to alert on "default older than X". Right now, we can only export stale/not stale (index_version == catalog.version). Alerting on "stale longer than X hours" would be another option, but it's not as elegant. WDYT?
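
To make the "older than X" idea concrete, here is a rough sketch of the check an alert rule would perform against the gauge value. The function name `is_older_than` and the choice to treat a missing value as alert-worthy are assumptions for illustration, not part of this PR.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_older_than(
    version_time: Optional[float],
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if the gauge's Unix timestamp is older than max_age.

    Assumption: a missing value (no gauge reading yet) also counts as stale.
    """
    if version_time is None:
        return True
    now = now or datetime.now(timezone.utc)
    age = now - datetime.fromtimestamp(version_time, tz=timezone.utc)
    return age > max_age
```

With a timestamp gauge the threshold lives entirely in the alert rule, whereas a stale/not-stale flag would need the rule to track how long the flag has been set.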
