Summary
Collapse ISO-8601 datetime strings to integer epoch values inside DictMinimizer and ListMinimizer. Currently only LogMinimizer strips timestamps (aggressive mode), and dict keys like timestamp/created_at get shortened to ts, but the values pass through untouched.
Motivation
ISO-8601 with timezone ("2025-04-17T21:05:33.123Z") burns ~13 tokens on cl100k. The epoch int (1744923933) burns 3-4. Multiplied across every row in a typical API response, this is one of the highest-ROI compressions available, and agents almost always want a number, not a human-readable string.
Reported by a user on Reddit as the single biggest win in their stack.
Proposed behavior
- Detect ISO-8601 values (date, datetime, with/without tz, with/without fractional seconds) in DictMinimizer and ListMinimizer
- Convert to int epoch seconds by default
- Use epoch milliseconds when sub-second precision is present in the source
- Opt-out flag for cases where the agent genuinely needs the string form
- Consider a companion epoch_to_iso reverse helper for downstream tools that want readable dates back
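A minimal sketch of the behavior listed above, to make the seconds-vs-milliseconds rule and the pass-through requirement concrete. The names iso_to_epoch, collapse_timestamps, and the enabled flag are hypothetical, not existing ptk API, and only epoch_to_iso is named in this issue. Naive strings are treated as UTC here purely for illustration; that choice is still an open question.

```python
import re
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Precompiled at module level. Every quantifier is bounded and there is no
# nested repetition, so matching stays linear in the input length.
_ISO_RE = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})"            # date
    r"(?:[T ](\d{2}):(\d{2}):(\d{2})"     # optional time
    r"(?:\.(\d{1,9}))?"                   # optional fractional seconds
    r"(Z|[+-]\d{2}:?\d{2})?)?",           # optional UTC offset
    re.ASCII,
)


def iso_to_epoch(value):
    """Hypothetical helper: ISO-8601 string -> int epoch.

    Returns seconds, or milliseconds when the source carries a fraction.
    Anything that is not a valid ISO-8601 string is returned unchanged.
    """
    if not isinstance(value, str):
        return value
    m = _ISO_RE.fullmatch(value)
    if m is None:
        return value
    year, month, day, hour, minute, second, frac, tz = m.groups()
    try:
        # Assumption: a missing offset is read as UTC.
        dt = datetime(int(year), int(month), int(day),
                      int(hour or 0), int(minute or 0), int(second or 0),
                      tzinfo=timezone.utc)
        if tz and tz != "Z":
            sign = -1 if tz[0] == "-" else 1
            digits = tz[1:].replace(":", "")
            hh, mm = int(digits[:2]), int(digits[2:])
            if hh > 23 or mm > 59:
                return value
            dt -= sign * timedelta(hours=hh, minutes=mm)
    except (ValueError, OverflowError):   # e.g. month 13, hour 25
        return value
    seconds = (dt - _EPOCH) // timedelta(seconds=1)
    if frac is None:
        return seconds
    # Keep millisecond precision; extra digits are truncated.
    return seconds * 1000 + int(frac[:3].ljust(3, "0"))


def epoch_to_iso(epoch, *, millis=False):
    """Reverse helper: int epoch -> ISO-8601 string in UTC."""
    delta = timedelta(milliseconds=epoch) if millis else timedelta(seconds=epoch)
    text = (_EPOCH + delta).isoformat(timespec="milliseconds" if millis else "seconds")
    return text.replace("+00:00", "Z")


def collapse_timestamps(obj, *, enabled=True):
    """Walk nested dicts/lists; enabled=False is the opt-out."""
    if not enabled:
        return obj
    if isinstance(obj, dict):
        return {k: collapse_timestamps(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [collapse_timestamps(v) for v in obj]
    return iso_to_epoch(obj)
```

One consequence of the milliseconds rule is that a consumer cannot tell seconds from milliseconds without looking at the magnitude, which is why the reverse helper in this sketch takes an explicit millis argument.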
Open questions
- Assume UTC for naive datetimes, or skip conversion?
- Handle datetime.datetime / datetime.date Python objects in addition to strings?
- Should this be on by default or behind a flag? (Leaning: on by default — fits the zero-config ethos.)
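To make the first two questions easier to weigh, here is a rough sketch of both naive-datetime policies side by side, applied to Python objects rather than strings. The function name object_to_epoch and the assume_utc parameter are hypothetical. Returning None stands for "skip conversion and leave the value as-is".

```python
from datetime import date, datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def object_to_epoch(value, *, assume_utc=True):
    """Hypothetical helper: datetime/date object -> int epoch seconds.

    Returns None when the value should be skipped: either it is not a
    date-like object, or it is naive and assume_utc is False.
    """
    # datetime must be checked first because it subclasses date.
    if isinstance(value, datetime):
        if value.utcoffset() is None:          # naive datetime
            if not assume_utc:
                return None
            value = value.replace(tzinfo=timezone.utc)
    elif isinstance(value, date):              # dates never carry a tz
        if not assume_utc:
            return None
        value = datetime(value.year, value.month, value.day, tzinfo=timezone.utc)
    else:
        return None
    # Integer division floors, so any sub-second part is dropped.
    return (value - _EPOCH) // timedelta(seconds=1)
```

With assume_utc=False, every date-only value is skipped as well, since a bare date has no offset. That may be stricter than intended and is worth settling alongside the naive-datetime question.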
Scope
- New: ISO-8601 detection regex (precompiled at module level, per project rules)
- Touch: src/ptk/minimizers/_dict.py, src/ptk/minimizers/_list.py
- Tests: unit coverage for tz-aware, tz-naive, fractional seconds, date-only, invalid strings (must pass through), and adversarial ReDoS cases
- Docs: README note + CHANGELOG entry