
feat(deepseek): add DeepSeek V4 Pro and V4 Flash model metadata #26380

Open
neo1027144-creator wants to merge 1 commit into BerriAI:litellm_internal_staging from neo1027144-creator:feat/deepseek-v4-model-metadata

Conversation

@neo1027144-creator

Summary

Add model pricing, context window, and capability metadata for the new DeepSeek V4 model series.

Models Added

| Model | Context | Max Output | Input (cache miss) | Input (cache hit) | Output |
|---|---|---|---|---|---|
| deepseek-v4-flash | 1,000,000 | 384,000 | $0.14/M | $0.028/M | $0.28/M |
| deepseek-v4-pro | 1,000,000 | 384,000 | $0.74/M | $0.14/M | $0.48/M |

Capabilities

Both models support:

  • ✅ Function calling / Tool choice
  • ✅ Reasoning (thinking mode)
  • ✅ Response schema
  • ✅ Prompt caching
  • ✅ Native streaming
  • ✅ System messages
  • ✅ Assistant prefill

Changes

  • model_prices_and_context_window.json: Added 4 entries (bare + provider-prefixed for each model)
  • tests/test_litellm/test_deepseek_model_metadata.py: Added 14 mock tests verifying V4 metadata correctness
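
For reference, entries in model_prices_and_context_window.json store costs per token, so the per-million prices above divide by 1e6. A sketch of what the deepseek-v4-flash entry could look like (field names inferred from existing entries quoted in this PR and from the advertised capabilities; the actual entry in the diff may differ):

```json
{
  "deepseek-v4-flash": {
    "max_tokens": 384000,
    "max_input_tokens": 1000000,
    "max_output_tokens": 384000,
    "input_cost_per_token": 1.4e-07,
    "cache_read_input_token_cost": 2.8e-08,
    "cache_creation_input_token_cost": 0.0,
    "output_cost_per_token": 2.8e-07,
    "litellm_provider": "deepseek",
    "mode": "chat",
    "supports_function_calling": true,
    "supports_tool_choice": true,
    "supports_reasoning": true,
    "supports_response_schema": true,
    "supports_prompt_caching": true,
    "supports_system_messages": true,
    "supports_assistant_prefill": true
  }
}
```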

Notes

  • deepseek-chat and deepseek-reasoner are being deprecated by DeepSeek, mapping to deepseek-v4-flash non-thinking and thinking modes respectively.
  • Pricing sourced from official DeepSeek API docs: https://api-docs.deepseek.com/quick_start/pricing

AI Disclosure

This PR was authored with AI assistance.

Add model pricing, context window, and capability metadata for the new DeepSeek V4 model series:

- deepseek-v4-flash: 1M context, 384K output, $0.14/M input, $0.28/M output

- deepseek-v4-pro: 1M context, 384K output, $0.74/M input, $0.48/M output

Both models support function calling, tool choice, reasoning, response schema, prompt caching, and streaming.

Entries added with both bare and provider-prefixed keys.

Source: https://api-docs.deepseek.com/quick_start/pricing
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@veria-ai

veria-ai Bot commented Apr 24, 2026

Low: No security issues found

This PR adds static model metadata entries (pricing, context window, capability flags) for DeepSeek V4 Pro and V4 Flash to a JSON configuration file, along with corresponding tests. No executable code, no user input handling, no secrets, no auth changes.


Status: 0 open
Risk: 1/10

Posted by Veria AI · 2026-04-24T03:49:36.997Z

@greptile-apps

greptile-apps Bot commented Apr 24, 2026

Greptile Summary

This PR adds pricing and capability metadata for two new DeepSeek V4 models (deepseek-v4-flash and deepseek-v4-pro) to model_prices_and_context_window.json, along with 14 tests verifying the entries. The main concern before merging is a pricing inconsistency for deepseek-v4-pro: the PR description table lists $0.74/M input and $0.48/M output, but the JSON encodes $1.74/M and $3.48/M respectively — a meaningful billing discrepancy that should be verified against the official DeepSeek pricing docs before merging.

Confidence Score: 3/5

Not safe to merge until the deepseek-v4-pro pricing values are verified against official docs — the description and JSON diverge by more than 2× on both input and output cost per token.

A P1 pricing mismatch on deepseek-v4-pro (description says $0.74/M input / $0.48/M output, JSON encodes $1.74/M / $3.48/M) must be resolved before merging, as it will either over-charge or under-charge users. The JSON values look internally self-consistent (2× output ratio, same as v4-flash), but cannot be confirmed without checking the live DeepSeek pricing page.

model_prices_and_context_window.json — the deepseek-v4-pro input/output cost fields need verification.
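
The internal-consistency observation above (both entries encode output at exactly 2× input) can be checked directly; a minimal sketch with the per-token values as quoted in the review comment, not read from the actual JSON:

```python
# Per-token costs as the review comment reports them from the JSON diff.
entries = {
    "deepseek-v4-flash": {"input_cost_per_token": 1.4e-07, "output_cost_per_token": 2.8e-07},
    "deepseek-v4-pro": {"input_cost_per_token": 1.74e-06, "output_cost_per_token": 3.48e-06},
}

# Both entries should show the same output/input ratio of 2x.
for name, entry in entries.items():
    ratio = entry["output_cost_per_token"] / entry["input_cost_per_token"]
    assert abs(ratio - 2.0) < 1e-9, f"{name}: unexpected ratio {ratio}"
```

Consistency of the ratio supports the "typo in the description" hypothesis but, as the reviewer notes, does not substitute for checking the official pricing page.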

Important Files Changed

Filename Overview
model_prices_and_context_window.json Adds 4 model entries (bare + provider-prefixed) for deepseek-v4-flash and deepseek-v4-pro; v4-flash pricing matches description, but v4-pro JSON prices ($1.74/M in, $3.48/M out) differ significantly from the PR description table ($0.74/M in, $0.48/M out) — needs verification.
tests/test_litellm/test_deepseek_model_metadata.py Adds 14 tests verifying V4 metadata existence and values; tests are mock-safe (no network calls) but load JSON from disk via path construction rather than using litellm.model_cost like the existing test patterns.

Reviews (1): Last reviewed commit: "feat(deepseek): add DeepSeek V4 Pro and ..."

Comment on lines +9665 to +9670
    "supports_response_schema": true,
    "supports_system_messages": true,
    "supports_tool_choice": true
},
"deepseek-v4-pro": {
    "cache_creation_input_token_cost": 0.0,

P1 Pricing mismatch between description and JSON for deepseek-v4-pro

The PR description table states input_cost_per_token = $0.74/M and output_cost_per_token = $0.48/M, but the JSON contains:

  • input_cost_per_token: 1.74e-06 ($1.74/M)
  • output_cost_per_token: 3.48e-06 ($3.48/M)

The JSON values look internally consistent (same 2× ratio as v4-flash, and output > input), suggesting the PR description contains typos ($0.74 instead of $1.74, and $0.48 instead of $3.48). However, please verify these values against the official DeepSeek pricing page before merging to avoid billing users at the wrong rate.

Comment on lines +188 to +196
class TestDeepSeekV4ModelMetadata:
    """Verify that DeepSeek V4 Flash and V4 Pro model entries exist
    and have correct metadata in the main JSON file."""

    def _load_main_json(self) -> dict:
        main_path = os.path.join(
            os.path.dirname(os.path.dirname(litellm.__file__)),
            "model_prices_and_context_window.json",
        )

P2 Tests bypass litellm.model_cost and load JSON directly from disk

The existing TestBareModelFallback class above uses litellm.model_cost (in-memory, already loaded), but the new _load_main_json() method reconstructs a filesystem path on every test call using os.path.dirname. This is fragile — it silently resolves to a different file if the package is installed as a wheel, or when the repo is checked out in an unexpected structure. Consider using litellm.model_cost consistently:

def _get_entry(self, key: str) -> dict:
    return litellm.model_cost[key]
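
Whichever loading approach the tests settle on, the core assertions reduce to per-token ↔ per-million conversions. A minimal self-contained sketch, with a sample entry inlined using the advertised v4-flash prices in place of litellm.model_cost:

```python
# Sample entry standing in for litellm.model_cost["deepseek-v4-flash"];
# per-token values derived from the advertised per-million prices.
entry = {
    "max_input_tokens": 1_000_000,
    "max_output_tokens": 384_000,
    "input_cost_per_token": 1.4e-07,        # $0.14 per 1M tokens (cache miss)
    "cache_read_input_token_cost": 2.8e-08, # $0.028 per 1M tokens (cache hit)
    "output_cost_per_token": 2.8e-07,       # $0.28 per 1M tokens
}

def usd_per_million(cost_per_token: float) -> float:
    """Convert a per-token cost back to the per-million figure in pricing docs."""
    return round(cost_per_token * 1_000_000, 6)

assert entry["max_input_tokens"] == 1_000_000
assert usd_per_million(entry["input_cost_per_token"]) == 0.14
assert usd_per_million(entry["cache_read_input_token_cost"]) == 0.028
assert usd_per_million(entry["output_cost_per_token"]) == 0.28
```

The round() guards against float noise when multiplying back up (e.g. 1.4e-07 * 1e6 need not be exactly 0.14 in binary floating point).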

@@ -9641,6 +9641,56 @@
"supports_system_messages": true,
"supports_tool_choice": false
},

P2 cache_creation_input_token_cost: 0.0 — verify free cache writes

Both v4 entries set cache_creation_input_token_cost to 0.0. Most DeepSeek models do charge for writing to cache (e.g., existing deepseek-r1 entries). If DeepSeek V4 genuinely offers free cache creation, this is fine, but it should be explicitly confirmed against the pricing page to avoid under-billing.

@codecov

codecov Bot commented Apr 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


