Cost estimation v1 by onmyraedar · Pull Request #2489 · expectedparrot/edsl

onmyraedar · 2026-06-06T23:34:20Z

No description provided.

onmyraedar · 2026-06-06T23:34:36Z

@greptileai review

greptile-apps · 2026-06-06T23:39:52Z

Greptile Summary

This PR introduces a first-pass cost estimation system for EDSL jobs, adding a new edsl/jobs/cost_estimation/ module with per-question-type, per-file, and per-provider token estimators, a reach-probability propagator for skip logic, and a calibration helper that derives token overrides from pilot runs.

- Token estimation pipeline: JobCostEstimator iterates over all interviews and computes per-question token estimates using pluggable QuestionEstimator and FileStoreEstimator instances (images via tile/patch/crop-unit formulas; PDFs via per-provider empirical rates). It then applies optional TokenOverride values, reach-weights each question's cost by its forward-propagated branch probability, and returns a JobCostEstimate with detail, markdown, and model/question summary views.
- Calibration: calibrate_from_results derives percentile-based TokenOverride dicts from a completed pilot Results object, optionally stratified by service/model, using corrected linear-interpolation percentiles.
- Jobs.estimate_cost(): A new thin wrapper on Jobs exposes the estimator directly on the job object.
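
To make the reach-weighting step concrete, here is a minimal sketch, not the module's actual implementation. It assumes a hypothetical representation in which each question maps to a dict of `{next_question: probability}`, all jumps point forward in survey order, and `"EOS"` marks the end of the survey. The names `compute_reach` and `expected_cost` and all numbers are illustrative only.

```python
from collections import defaultdict

EOS = "EOS"  # hypothetical end-of-survey marker


def compute_reach(order, transitions):
    """Forward-propagate the probability that each question is reached.

    order: question names in survey order (the first is always asked).
    transitions: {question: {next_question_or_EOS: probability}}; a question
        with no entry falls through to the next question with probability 1.
        Targets are assumed to come later in `order` than their source.
    """
    reach = defaultdict(float)
    reach[order[0]] = 1.0
    for i, q in enumerate(order):
        default_next = order[i + 1] if i + 1 < len(order) else EOS
        for target, p in transitions.get(q, {default_next: 1.0}).items():
            if target != EOS:  # mass that ends the survey reaches no question
                reach[target] += reach[q] * p
    return {q: reach[q] for q in order}


def expected_cost(per_question_cost, reach):
    """Weight each question's full cost by its reach probability."""
    return sum(cost * reach[q] for q, cost in per_question_cost.items())


order = ["q1", "q2", "q3"]
# After q1, 25% of respondents skip q2 and jump straight to q3.
transitions = {"q1": {"q2": 0.75, "q3": 0.25}}
reach = compute_reach(order, transitions)
total = expected_cost({"q1": 0.5, "q2": 1.0, "q3": 0.5}, reach)
print(reach, total)
```

In this toy survey q2 is reached 75% of the time, so only three quarters of its cost counts toward the expected total.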

Confidence Score: 3/5

The cost computation itself is correct, but the markdown summary table presents unweighted token totals alongside reach-weighted costs in a way that will produce misleading results for any job with skip logic.

The core reach-weighting on cost_usd is correctly implemented. However, summary_by_model() sums raw, unweighted token counts while summing reach-weighted costs into the same table, meaning the displayed 'Total input × Input $/M' arithmetic won't match 'Total cost' whenever branch_weights are supplied. This is the primary output surface users will rely on to understand and audit their estimates.

File needing attention: edsl/jobs/cost_estimation/job_cost_estimate.py, specifically the summary_by_model() method and the resulting 'Cost by model' table produced by to_markdown().
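
The mismatch described above can be seen with a small worked example. The numbers are invented for illustration (a hypothetical input price of $2 per million tokens and two questions, one of which is reached only half the time); they are not taken from the PR.

```python
PRICE_PER_M = 2.0  # hypothetical input price in USD per million tokens

# (input_tokens, reach_probability) per question; values are made up.
rows = [(1_000_000, 1.0), (1_000_000, 0.5)]

# What the summary table shows: raw token total next to reach-weighted cost.
unweighted_tokens = sum(tokens for tokens, _ in rows)
weighted_cost = sum(tokens / 1e6 * PRICE_PER_M * reach for tokens, reach in rows)

# A reader multiplying the displayed columns gets a different number.
naive_cost = unweighted_tokens / 1e6 * PRICE_PER_M

# Reach-weighting the token total as well makes the columns agree.
weighted_tokens = sum(tokens * reach for tokens, reach in rows)
consistent_cost = weighted_tokens / 1e6 * PRICE_PER_M

print(f"displayed tokens={unweighted_tokens}, displayed cost=${weighted_cost:.2f}")
print(f"tokens x price=${naive_cost:.2f}, weighted tokens x price=${consistent_cost:.2f}")
```

Here the displayed columns multiply out to $4.00 while the table reports $3.00, which is the kind of discrepancy a user auditing the estimate would hit.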

Important Files Changed

| Filename | Overview |
| --- | --- |
| edsl/jobs/cost_estimation/job_cost_estimate.py | New result container for cost estimates; contains an inconsistency where summary_by_model sums unweighted token totals alongside reach-weighted costs, which misleads users trying to verify the estimate math. |
| edsl/jobs/cost_estimation/job_cost_estimator.py | Core orchestration for cost estimation; reach-weighted cost computation is correct; docstring incorrectly references a non-existent .assumptions attribute on the return type. |
| edsl/jobs/cost_estimation/image_token_estimators.py | Provider-specific image token estimators for OpenAI (tile and patch), Google, and Anthropic; ZeroDivisionError guard (max(1,...)) added in both _tile_tokens and breakdown paths; patch formula clamping for sub-1-patch dimensions is now correctly handled. |
| edsl/jobs/cost_estimation/cost_estimate_calibration.py | Derives TokenOverride from pilot Results; empty-values guard (if not output_vals: continue) present in the non-by_model path; linear-interpolation percentile now correctly computes medians. |
| edsl/jobs/cost_estimation/file_store_estimator.py | Dispatches file token estimation by MIME type with offloaded-file fallbacks and a dimensions/page-count cache; logic is sound; describe_for_file correctly reflects post-estimation cache state. |
| edsl/jobs/cost_estimation/question_estimators.py | Per-question-type estimators with a pluggable registry; fallback DefaultEstimator emits a warning; all estimator classes expose describe() and __repr__ for transparency. |
| edsl/jobs/cost_estimation/pdf_token_estimators.py | PDF token estimators for OpenAI (additive gpt-4 vs max gpt-5 vs reasoning-only), Anthropic, and Google; empirically derived rates are documented with source references. |
| edsl/jobs/jobs.py | Adds estimate_cost() method that delegates to JobCostEstimator; docstring is accurate and no longer references the non-existent .assumptions attribute. |
| edsl/jobs/cost_estimation/token_override.py | Dataclass for partial token overrides scoped by service/model; specificity() tie-breaking for multi-match scenarios is correct. |
| edsl/jobs/cost_estimation/question_token_estimate.py | Token breakdown dataclass with merge and apply_override helpers; billable flag correctly preserved through overrides. |
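
As an illustration of the max(1, ...) guard mentioned for the image estimators, the sketch below clamps dimensions before any division so that an image reporting a zero width or height cannot raise ZeroDivisionError and always counts as at least one tile. This is an assumption about the failure mode, not the PR's code. The function names, the downscale-only rule, and the default constants are placeholders rather than any provider's documented rates.

```python
import math


def tile_count(width, height, tile=512, short_side_target=768):
    """Count tiles after downscaling so the short side is at most the target.

    Clamping with max(1, ...) keeps the scale division safe when image
    metadata reports a zero width or height.
    """
    width, height = max(1, width), max(1, height)
    scale = min(1.0, short_side_target / min(width, height))
    scaled_w = max(1, round(width * scale))
    scaled_h = max(1, round(height * scale))
    return math.ceil(scaled_w / tile) * math.ceil(scaled_h / tile)


def tile_tokens(width, height, base_tokens=85, tokens_per_tile=170):
    """Base cost plus a per-tile cost; both constants are illustrative."""
    return base_tokens + tokens_per_tile * tile_count(width, height)


print(tile_tokens(1024, 1024))  # a square image scaled down to 768 x 768
print(tile_tokens(0, 0))        # degenerate metadata still yields one tile
```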

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    JCE[JobCostEstimator.estimate_cost] --> RP[_compute_reach_probabilities\nforward propagation]
    JCE --> IL[generate_interviews]
    IL --> IQ[per question: get prompts via FetchInvigilator]
    IQ --> QE[QuestionEstimator.estimate\nper question-type formula]
    IQ --> FSE[FileStoreEstimator.estimate\nimage / PDF / text / audio]
    FSE --> IMG[ImageEstimator\nOpenAI tile/patch · Anthropic · Google]
    FSE --> PDF[PdfEstimator\nOpenAI · Anthropic · Google]
    QE --> FE[QuestionTokenEstimate\nprompt + file + memory + answer + comment + thinking]
    FSE --> FE
    FE --> OV[apply TokenOverride\nmost-specific match wins]
    OV --> COST[cost_usd = compute_cost × reach]
    RP --> COST
    COST --> JCR[JobCostEstimate\n.total_cost_usd · .detail · .to_markdown]

    style COST fill:#f9f,stroke:#333
    style JCR fill:#bbf,stroke:#333

Reviews (8): Last reviewed commit: "Fix chars_per_token inconsistency with P..."
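
The "most-specific match wins" step in the flowchart could look roughly like the sketch below. The `Override` class is a hypothetical stand-in for TokenOverride, and the ranking used here (a model match outranks a service match, which outranks a wildcard) is an assumption; the review only states that specificity() breaks ties between multiple matches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Override:
    """Hypothetical stand-in for a token override scoped by service/model."""
    answer_tokens: int
    service: Optional[str] = None  # None matches any service
    model: Optional[str] = None    # None matches any model

    def matches(self, service, model):
        return self.service in (None, service) and self.model in (None, model)

    def specificity(self):
        # Assumed ranking: model match > service match > wildcard.
        return 2 * (self.model is not None) + (self.service is not None)


def pick_override(overrides, service, model):
    """Return the most specific matching override, or None if none match."""
    candidates = [o for o in overrides if o.matches(service, model)]
    return max(candidates, key=Override.specificity, default=None)


overrides = [
    Override(answer_tokens=100),
    Override(answer_tokens=200, service="openai"),
    Override(answer_tokens=300, service="openai", model="gpt-4o"),
]
chosen = pick_override(overrides, "openai", "gpt-4o")
fallback = pick_override(overrides, "anthropic", "some-model")
print(chosen.answer_tokens, fallback.answer_tokens)
```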

onmyraedar added 30 commits June 1, 2026 14:22

Messing with a better estimation system

08a15de

Fix type hints

2e310d3

Improved branch weighting algorithm

d0be13f

Clean up job cost estimator

c59da6e

Small file renaming + import updates

672f015

Allow characters per token overrides; unify logic

d240892

Smarter question estimation based on type

426cc6a

Token dataclass + QTE tests

d4c453c

Rename input_tokens to prompt_tokens

a3d6cd7

So we aren't confused about input_tokens vs. total_input_tokens

Question estimator tests

e9f368c

Reach probability & job cost estimate tests

e53eabc

Fix test

496887d

Fix functional test

6c03358

Add compute test

9209f7a

Add .md method to JobCostEstimator

7d71e92

- Can generate a report for the user

Add credits + model summary to job cost estimate

b1f10ad

Add .describe() methods to estimators for use in generating Markdown file

2f42550

QE describe tests + better descriptions for manual overrides

4977f3d

Take out assumptions section now that we have .describe() methods

27fdb79

Description should reflect that overrides are merged with base estimate

6486c4f

Before, the override was completely erasing the base estimate description. If you only overrode comment_tokens, there was no way of telling how the answer tokens were computed.
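
A minimal sketch of the merged behaviour this commit describes, assuming a hypothetical `Estimate` dataclass and `apply_override` helper (the real classes and field names may differ): only the overridden fields change, and the override note is appended to the base description instead of replacing it.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Estimate:
    """Hypothetical token estimate carrying a human-readable description."""
    answer_tokens: int
    comment_tokens: int
    description: str


def apply_override(base, description_suffix="manual override", **fields):
    """Replace only the overridden fields and append to the description."""
    if not fields:
        return base
    note = ", ".join(f"{name}={value}" for name, value in fields.items())
    return replace(
        base,
        description=f"{base.description}; {description_suffix}: {note}",
        **fields,
    )


base = Estimate(
    answer_tokens=40,
    comment_tokens=60,
    description="answer: longest option length / 4 chars per token",
)
merged = apply_override(base, comment_tokens=0)
print(merged.description)
```

With only comment_tokens overridden, the description still explains how the answer tokens were computed.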

Don't show skip logic warning if the survey has no skip rules

24aa0d5

Estimate clarifications

30d5cc8

Accurate estimator description for offloaded files

65ea28a

Fall back to 1,000 tokens for offloaded files by default

67163a1

Get image dimensions for estimates, when we can

8ed0010

Use proper OpenAI image estimation

59fce5c

Refactor FileStoreEstimator to use type-based classes

3d13dfb

Add AnthropicImageEstimator

e1ce17f

Add GoogleImageEstimator

b3ade3e

Separate file for service-based image estimators

1683b40

onmyraedar added 13 commits June 4, 2026 15:40

Update tests

eb3b3fe

OpenAI PDF estimator v1

99b1a5a

Anthropic PDF estimator v1

372463c

Add Google PDF estimator v1

629dcf5

More PDF algorithm calibration

3815961

Improve file estimates: general improvements, PDFs, images

33229e7

- Add file token averages to question breakdown table, when applicable
- Make estimates more comprehensible

Fix tests

51575d0

Model calibration should be True by default

9ff19f6

Different models should have their own calibrated output estimates.

Calibrate thinking tokens

3aa1c5b

This is especially useful since we don't have good upfront thinking token estimation.

Better skill

8012785

Delete skill (moved to ep-agent)

e4962e6

Merge remote-tracking branch 'origin/main' into humanize_file_upload

10bdb65

Update estimate_remote_job_cost docstring & types

2b533db

greptile-apps Bot reviewed Jun 6, 2026

Comment thread edsl/jobs/cost_estimation/file_store_estimator.py Outdated

onmyraedar added 2 commits June 6, 2026 20:55

Greptile fixes

f642094

- Fix potential ZeroDivisionError with Google estimator
- Fix percentile function; add regression tests
- Fix branch weight description in docstrings
- Fix reference to non-existent "assumptions" attribute in docstring
- Remove debug logs
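
For reference, a linear-interpolation percentile of the kind the review summary mentions can be sketched as follows. This is an illustrative stand-alone function using the same convention as NumPy's default `linear` method, not the function from the PR, and the name `percentile` is an assumption.

```python
def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 100]."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100  # fractional index into `ordered`
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    # Interpolate between the two neighbouring order statistics.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


print(percentile([1, 2, 3, 4], 50))  # median of an even-length sample
print(percentile([4, 1, 3, 2], 100))
```

With an even number of values the median falls halfway between the two middle values, which is the case a truncating index-based percentile gets wrong.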

More small fixes

294b482

- Fix potential ZeroDivisionError with OpenAI estimator
- Fix memory plan + branch weight combination issue; add regression test

greptile-apps Bot reviewed Jun 7, 2026

Comment thread edsl/jobs/cost_estimation/job_cost_estimator.py

onmyraedar added 2 commits June 6, 2026 21:32

Fix reach -> cost impact; add regression test

235bc2a

Fix EOS reach double-count

ada6742

onmyraedar marked this pull request as ready for review June 9, 2026 02:27

greptile-apps Bot reviewed Jun 9, 2026

Comment thread edsl/jobs/cost_estimation/image_token_estimators.py

Comment thread edsl/jobs/cost_estimation/cost_estimate_calibration.py

onmyraedar added 3 commits June 8, 2026 22:49

Add image estimator tests; ensure minimum of one patch

e67270e

Skip calibration if all values are None

f9d01a1

Fix chars_per_token inconsistency with PDF estimator

b33a2fc

greptile-apps Bot reviewed Jun 9, 2026

Comment thread edsl/jobs/cost_estimation/job_cost_estimate.py

Include reach probabilities in summary; add regression test

46204a7

rbyh approved these changes Jun 9, 2026

onmyraedar merged commit 79ae5c9 into main Jun 9, 2026
23 checks passed

onmyraedar merged 52 commits into main from humanize_file_upload

2 participants
