Cost estimation v1#2489

Merged
onmyraedar merged 52 commits into main from humanize_file_upload
Jun 9, 2026
Conversation

@onmyraedar

Collaborator

No description provided.

onmyraedar added 30 commits June 1, 2026 14:22
So we aren't confused about input_tokens vs. total_input_tokens
- Can generate a report for the user
Before, the override was completely erasing the base estimate description. If you only overrode comment_tokens, there was no way of telling how the answer tokens were computed.
@onmyraedar

Collaborator Author

@greptileai review

@greptile-apps

greptile-apps Bot commented Jun 6, 2026

Contributor

Greptile Summary

This PR introduces a first-pass cost estimation system for EDSL jobs, adding a new edsl/jobs/cost_estimation/ module with per-question-type, per-file, and per-provider token estimators, a reach-probability propagator for skip logic, and a calibration helper that derives token overrides from pilot runs.

  • Token estimation pipeline: JobCostEstimator iterates over all interviews, computes per-question token estimates using pluggable QuestionEstimator and FileStoreEstimator instances (images via tile/patch/crop-unit formulas; PDFs via per-provider empirical rates), applies optional TokenOverride values, reach-weights each question's cost by its forward-propagated branch probability, and returns a JobCostEstimate with detail, markdown, and model/question summary views.
  • Calibration: calibrate_from_results derives percentile-based TokenOverride dicts from a completed pilot Results object, optionally stratified by service/model, using corrected linear-interpolation percentiles.
  • Jobs.estimate_cost(): A new thin wrapper on Jobs exposes the estimator directly on the job object.
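The calibration bullet above mentions linear-interpolation percentiles. The PR's actual implementation is not shown in this conversation, so the following is only a minimal sketch of the general technique, with a hypothetical function name `percentile`, to show why interpolation gives the expected median for even-length samples.

```python
def percentile(values, q):
    """Linear-interpolation percentile of `values`, with q in [0, 100].

    Hypothetical helper for illustration; not the function from the PR.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 <= q <= 100:
        raise ValueError("q must be between 0 and 100")
    ordered = sorted(values)
    # Fractional rank into the sorted list (0-based).
    rank = (len(ordered) - 1) * q / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    # Interpolate between the two neighbouring order statistics.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


median_even = percentile([10, 20, 30, 40], 50)  # midway between 20 and 30
median_odd = percentile([10, 20, 30], 50)       # the middle element
```

A nearest-rank percentile would return 20 or 30 for the even-length sample, which is the kind of off-by-one a calibration step would then bake into every override.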

Confidence Score: 3/5

The cost computation itself is correct, but the markdown summary table presents unweighted token totals alongside reach-weighted costs in a way that will produce misleading results for any job with skip logic.

The core reach-weighting on cost_usd is correctly implemented. However, summary_by_model() sums raw, unweighted token counts while summing reach-weighted costs into the same table, meaning the displayed 'Total input × Input $/M' arithmetic won't match 'Total cost' whenever branch_weights are supplied. This is the primary output surface users will rely on to understand and audit their estimates.
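To make the mismatch concrete, here is a small sketch with made-up numbers. The row layout, the field names, and the price are assumptions for illustration and do not come from `job_cost_estimate.py`.

```python
# Hypothetical per-question rows: raw token counts plus a reach probability.
rows = [
    {"question": "q1", "input_tokens": 1000, "reach": 1.0},
    {"question": "q2", "input_tokens": 3000, "reach": 0.5},  # skipped half the time
]
PRICE_PER_MILLION = 2.0  # assumed input price in USD per million tokens

# What the summary table shows: unweighted tokens next to reach-weighted cost.
total_tokens = sum(r["input_tokens"] for r in rows)
total_cost = sum(
    r["input_tokens"] / 1_000_000 * PRICE_PER_MILLION * r["reach"] for r in rows
)

# What a reader computes when checking the table by hand.
naive_cost = total_tokens / 1_000_000 * PRICE_PER_MILLION

# Reach-weighting the tokens as well makes the arithmetic line up again.
weighted_tokens = sum(r["input_tokens"] * r["reach"] for r in rows)
consistent_cost = weighted_tokens / 1_000_000 * PRICE_PER_MILLION

print(f"table cost: {total_cost:.4f}, hand-checked cost: {naive_cost:.4f}")
```

With these numbers the table would list 4000 input tokens against a cost that corresponds to only 2500 expected tokens, so the hand check overshoots.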

edsl/jobs/cost_estimation/job_cost_estimate.py — specifically the summary_by_model() method and the resulting to_markdown() 'Cost by model' table

Important Files Changed

| Filename | Overview |
| --- | --- |
| edsl/jobs/cost_estimation/job_cost_estimate.py | New result container for cost estimates; contains an inconsistency where summary_by_model sums unweighted token totals alongside reach-weighted costs, which misleads users trying to verify the estimate math. |
| edsl/jobs/cost_estimation/job_cost_estimator.py | Core orchestration for cost estimation; reach-weighted cost computation is correct; docstring incorrectly references a non-existent .assumptions attribute on the return type. |
| edsl/jobs/cost_estimation/image_token_estimators.py | Provider-specific image token estimators for OpenAI (tile and patch), Google, and Anthropic; ZeroDivisionError guard (max(1,...)) added in both _tile_tokens and breakdown paths; patch formula clamping for sub-1-patch dimensions is now correctly handled. |
| edsl/jobs/cost_estimation/cost_estimate_calibration.py | Derives TokenOverride from pilot Results; empty-values guard (if not output_vals: continue) present in the non-by_model path; linear-interpolation percentile now correctly computes medians. |
| edsl/jobs/cost_estimation/file_store_estimator.py | Dispatches file token estimation by MIME type with offloaded-file fallbacks and a dimensions/page-count cache; logic is sound; describe_for_file correctly reflects post-estimation cache state. |
| edsl/jobs/cost_estimation/question_estimators.py | Per-question-type estimators with a pluggable registry; fallback DefaultEstimator emits a warning; all estimator classes expose describe() and repr for transparency. |
| edsl/jobs/cost_estimation/pdf_token_estimators.py | PDF token estimators for OpenAI (additive gpt-4 vs max gpt-5 vs reasoning-only), Anthropic, and Google; empirically derived rates are documented with source references. |
| edsl/jobs/jobs.py | Adds estimate_cost() method that delegates to JobCostEstimator; docstring is accurate and no longer references the non-existent .assumptions attribute. |
| edsl/jobs/cost_estimation/token_override.py | Dataclass for partial token overrides scoped by service/model; specificity() tie-breaking for multi-match scenarios is correct. |
| edsl/jobs/cost_estimation/question_token_estimate.py | Token breakdown dataclass with merge and apply_override helpers; billable flag correctly preserved through overrides. |
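The token_override.py row describes overrides scoped by service/model with a specificity-based tie-break. The real `TokenOverride` fields and the behaviour of its `specificity()` are not shown here, so the class below is a hypothetical stand-in that only sketches the "most-specific match wins" idea.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ScopedOverride:
    """Hypothetical stand-in; field names are assumptions, not the PR's API."""
    answer_tokens: int
    service: Optional[str] = None  # None means "any service"
    model: Optional[str] = None    # None means "any model"

    def matches(self, service: str, model: str) -> bool:
        return self.service in (None, service) and self.model in (None, model)

    def specificity(self) -> int:
        # One point per scoped field, so model+service beats service-only.
        return int(self.service is not None) + int(self.model is not None)


def pick_override(
    overrides: Sequence[ScopedOverride], service: str, model: str
) -> Optional[ScopedOverride]:
    """Return the matching override with the highest specificity, if any."""
    candidates = [o for o in overrides if o.matches(service, model)]
    return max(candidates, key=lambda o: o.specificity(), default=None)


overrides = [
    ScopedOverride(answer_tokens=100),
    ScopedOverride(answer_tokens=200, service="openai"),
    ScopedOverride(answer_tokens=300, service="openai", model="gpt-4o"),
]
chosen = pick_override(overrides, "openai", "gpt-4o")
```

In this sketch, overrides of equal specificity resolve to whichever appears first in the list, because `max` returns the first maximal element; the PR may resolve ties differently.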

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    JCE[JobCostEstimator.estimate_cost] --> RP[_compute_reach_probabilities\nforward propagation]
    JCE --> IL[generate_interviews]
    IL --> IQ[per question: get prompts via FetchInvigilator]
    IQ --> QE[QuestionEstimator.estimate\nper question-type formula]
    IQ --> FSE[FileStoreEstimator.estimate\nimage / PDF / text / audio]
    FSE --> IMG[ImageEstimator\nOpenAI tile/patch · Anthropic · Google]
    FSE --> PDF[PdfEstimator\nOpenAI · Anthropic · Google]
    QE --> FE[QuestionTokenEstimate\nprompt + file + memory + answer + comment + thinking]
    FSE --> FE
    FE --> OV[apply TokenOverride\nmost-specific match wins]
    OV --> COST[cost_usd = compute_cost × reach]
    RP --> COST
    COST --> JCR[JobCostEstimate\n.total_cost_usd · .detail · .to_markdown]

    style COST fill:#f9f,stroke:#333
    style JCR fill:#bbf,stroke:#333
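The flowchart's `_compute_reach_probabilities` step is described only as "forward propagation". As a rough sketch of that idea, the function below assumes a hypothetical representation of skip logic: a dict mapping a question index to `{target_index: probability}`, where an absent entry means "always go to the next question". The PR's actual data structures may differ.

```python
def reach_probabilities(n_questions, branch_weights):
    """Forward-propagate the probability that each question is reached.

    branch_weights: {question_index: {target_index: probability}} (assumed
    format). A target equal to n_questions means the survey ends.
    """
    if n_questions == 0:
        return []
    reach = [0.0] * n_questions
    reach[0] = 1.0  # the first question is always asked
    for i in range(n_questions):
        # Default behaviour: proceed to the next question with certainty.
        targets = branch_weights.get(i, {i + 1: 1.0})
        for target, p in targets.items():
            if target <= i:
                raise ValueError("branches must point forward")
            if target < n_questions:
                reach[target] += reach[i] * p
    return reach


# q0 sends 70% of respondents to q1 and lets 30% skip ahead to q3.
reach = reach_probabilities(4, {0: {1: 0.7, 3: 0.3}})
# Each question's cost would then be scaled by its reach probability.
costs = [0.01, 0.02, 0.02, 0.01]  # made-up per-question costs in USD
expected_total = sum(c * r for c, r in zip(costs, reach))
```

Because questions are visited in order and branches only point forward, each question's reach is final by the time its own outgoing branches are processed.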

Reviews (8): Last reviewed commit: "Fix chars_per_token inconsistency with P..." | Re-trigger Greptile

Comment thread edsl/jobs/cost_estimation/file_store_estimator.py Outdated
- Fix potential ZeroDivisionError with Google estimator
- Fix percentile function; add regression tests
- Fix branch weight description in docstrings
- Fix reference to non-existent "assumptions" attribute in docstring
- Remove debug logs
- Fix potential ZeroDivisionError with OpenAI estimator
- Fix memory plan + branch weight combination issue; add regression test
Comment thread edsl/jobs/cost_estimation/job_cost_estimator.py
onmyraedar marked this pull request as ready for review June 9, 2026 02:27
Comment thread edsl/jobs/cost_estimation/image_token_estimators.py
Comment thread edsl/jobs/cost_estimation/cost_estimate_calibration.py
Comment thread edsl/jobs/cost_estimation/job_cost_estimate.py
onmyraedar merged commit 79ae5c9 into main Jun 9, 2026
23 checks passed
2 participants