Fix Anthropic temperature for newer Claude models by johnjosephhorton · Pull Request #2490 · expectedparrot/edsl

johnjosephhorton · 2026-06-09T17:16:12Z

Summary

Normalize outbound Anthropic API temperature to 1.0 for Claude models released after Opus 4.6
Preserve the user-configured Model.temperature and serialized parameters for EDSL/cache behavior
Add Anthropic service tests covering affected and legacy model names

Closes #2458

Testing

pytest -q tests/inference_services/test_anthropic_service.py
pytest -q tests/inference_services

Before production

Test this with local Anthropic inference using a newer affected Claude model and a legacy Claude model before moving to production.

greptile-apps · 2026-06-09T17:23:28Z

Greptile Summary

This PR primarily adds temperature normalization for newer Anthropic Claude models (post-Opus 4.6), intercepting the outbound API call to force temperature=1.0 while preserving the user-configured value for EDSL serialization and caching. It also bundles several unrelated changes: the default object store location moves from a stable platformdirs path to CWD-relative .edsl_objects/, the typer version constraint is widened, and jobs.py now raises JobsRunError instead of silently falling back to local execution when remote inference is unavailable.

Temperature fix (anthropic_service.py): _requires_temperature_one uses an 8-digit date check (priority) plus a family/version check to decide whether to override temperature with 1.0 in the API payload; model.temperature and model.parameters[\"temperature\"] are left unchanged.
Jobs remote-inference guard (jobs.py): _remote_results now raises JobsRunError when use_remote_inference() returns False and disable_remote_inference is not set — changing the default behavior for users without EP credentials from a silent local fallback to an error.
Object store root (store.py): Default storage moves from platformdirs.user_data_dir(\"edsl\")/objects to Path.cwd()/\".edsl_objects\", making the store project-scoped but no longer persistent across different working directories.

Confidence Score: 3/5

The temperature fix itself is correct and well-tested, but the bundled jobs.py change silently removes the local-execution fallback that users without EP credentials currently rely on.

The jobs.py change removes the silent local fallback when remote inference is unavailable. Because offload_execution defaults to True and disable_remote_inference defaults to False, any user calling job.run() without EP remote inference configured will now get a JobsRunError where they previously ran locally. The docstring for run() still promises local fallback, making the behavior change easy to miss. The temperature normalization logic in anthropic_service.py is correct and the accompanying tests are thorough, but the test file is missing coverage of undated model strings that exercise the version-based branch directly.

edsl/jobs/jobs.py — the removal of the silent local fallback in _remote_results needs explicit verification that the default user journey (no EP credentials, no flags) is handled gracefully.

Important Files Changed

Filename	Overview
edsl/inference_services/services/anthropic_service.py	Adds `_requires_temperature_one` and `_api_temperature` class methods to normalize temperature to 1.0 for newer Claude models; logic is sound but the date-first branch can override the opus-4-6 version exemption for dated model names released after the cutoff.
edsl/jobs/jobs.py	Changes `_remote_results` to raise `JobsRunError` when remote inference is unavailable and `disable_remote_inference` is not set — a breaking behavioral change for users without EP credentials who previously fell back silently to local execution.
tests/inference_services/test_anthropic_service.py	New test file covering temperature normalization for affected and legacy models; misses versionless model-name cases (e.g. `claude-opus-4-6`, `claude-sonnet-4-6`) that exercise the version-based branch independently of date matching.
tests/jobs/test_Jobs.py	Adds two tests explicitly validating the new `_remote_results` behavior: error on unavailable remote with `offload_execution=True`, and successful local execution with `disable_remote_inference=True`.
edsl/object_store/store.py	Replaces stable `platformdirs`-based `DEFAULT_ROOT` class attribute with a CWD-relative `default_root()` static method; changes default object store location from a persistent OS-standard path to `.edsl_objects/` in the current working directory.

Sequence Diagram

sequenceDiagram
    participant User
    participant Jobs
    participant RemoteHandler
    participant AnthropicService

    User->>Jobs: job.run()
    Jobs->>RemoteHandler: use_remote_inference(disable_remote_inference)
    alt remote available
        RemoteHandler-->>Jobs: True
        Jobs->>Jobs: run remotely
        Jobs-->>User: Results
    else remote unavailable
        RemoteHandler-->>Jobs: False
        alt "disable_remote_inference=True"
            Jobs-->>User: (None, None) local execution
        else "disable_remote_inference=False default"
            Jobs-->>User: raises JobsRunError
        end
    end

    User->>AnthropicService: async_execute_model_call(prompt)
    AnthropicService->>AnthropicService: _requires_temperature_one(model_name)
    alt model after Opus 4.6 cutoff
        AnthropicService->>AnthropicService: _api_temperature returns 1.0
    else legacy model
        AnthropicService->>AnthropicService: _api_temperature returns user temperature
    end
    AnthropicService->>AnthropicService: "client.messages.create(temperature=api_temp)"
    AnthropicService-->>User: response model.temperature preserved

_{Reviews (1): Last reviewed commit: "Fix Anthropic temperature for newer mode..." | Re-trigger Greptile}

greptile-apps · 2026-06-09T17:23:31Z

        else:
-            return None, None
+            if self.run_config.parameters.disable_remote_inference:
+                return None, None
+
+            from .exceptions import JobsRunError
+
+            raise JobsRunError(
+                "Remote execution was requested, but remote inference is not "
+                "available. Check EXPECTED_PARROT_URL, EXPECTED_PARROT_API_KEY, "
+                "and the remote inference setting. To run locally, pass "
+                "disable_remote_inference=True or offload_execution=False."
+            )


Silent local-execution fallback is now removed by default

Previously, when use_remote_inference() returned False (e.g. no EP API key, connection error, or user's remote-inference setting disabled), the code returned (None, None) and the caller fell through to local execution. Now a JobsRunError is raised for the same condition unless disable_remote_inference=True is explicitly set. Because offload_execution defaults to True (line 111 of data_structures.py) and disable_remote_inference defaults to False, any user who calls job.run() without EP credentials will now get an error instead of running locally. The existing docstring in run() still says "If remote inference is not available, it will run locally", which is now incorrect.

greptile-apps · 2026-06-09T17:23:32Z

+def test_requires_temperature_one_for_models_after_opus_46():
+    assert AnthropicService._requires_temperature_one(
+        "claude-sonnet-4-6-20260217"
+    )
+    assert AnthropicService._requires_temperature_one("claude-opus-4-7-20260416")
+    assert AnthropicService._requires_temperature_one("claude-opus-4-7")
+    assert not AnthropicService._requires_temperature_one("claude-opus-4-6-20260205")
+    assert not AnthropicService._requires_temperature_one("claude-opus-4-5-20251124")
+    assert not AnthropicService._requires_temperature_one("claude-3-5-sonnet-20241022")


Missing test cases for versionless model names

The test suite validates dated model strings (e.g. claude-opus-4-6-20260205) but omits the undated variants that users commonly pass. For example, claude-opus-4-6 (no date) should return False and claude-sonnet-4-6 (no date) should return True per the version-based branch — but neither is exercised. Without these tests, a regression in the else branch of _requires_temperature_one (the path reached when no 8-digit date is present) would go undetected.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

greptile-apps · 2026-06-09T17:23:33Z

+    @classmethod
+    def _requires_temperature_one(cls, model_name: str) -> bool:
+        """Return whether Anthropic only accepts temperature=1.0 for this model."""
+        model_name = model_name.lower()
+        date_match = re.search(r"(?<!\d)(\d{8})(?!\d)", model_name)
+        if date_match:
+            return int(date_match.group(1)) > cls._temperature_deprecation_date
+
+        version_match = re.search(
+            r"claude-(?P<family>opus|sonnet|haiku)-(?P<major>\d+)-(?P<minor>\d+)",
+            model_name,
+        )
+        if not version_match:
+            return False
+
+        version = (
+            int(version_match.group("major")),
+            int(version_match.group("minor")),
+        )
+        if version > cls._temperature_deprecation_version:
+            return True
+
+        family = version_match.group("family")
+        return version == cls._temperature_deprecation_version and family != "opus"


Date-first check can override version-family exemption for opus-4-6

When a model name includes an 8-digit date, the date check short-circuits before the family/version logic. This means a hypothetical claude-opus-4-6-20260206 (any opus-4-6 snapshot released the day after the cutoff) would return True via the date path, contradicting the version-based branch that explicitly exempts opus-4-6 regardless of date. In practice this edge case is unlikely, but the two branches can yield opposite answers for the same logical model depending solely on whether a date suffix is present.

johnjosephhorton added 5 commits May 24, 2026 09:20

Throw an error if remove inference was requested but not possible

692bc54

Merge branch 'main' of github.com:expectedparrot/edsl

65a5475

Merge branch 'main' of github.com:expectedparrot/edsl

6783aac

Merge branch 'main' of github.com:expectedparrot/edsl

f1a5ef8

Fix Anthropic temperature for newer models

5bce224

greptile-apps Bot reviewed Jun 9, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix Anthropic temperature for newer Claude models#2490

Fix Anthropic temperature for newer Claude models#2490
johnjosephhorton wants to merge 5 commits into
mainfrom
fix/anthropic-temperature-deprecation-2458

johnjosephhorton commented Jun 9, 2026

Uh oh!

greptile-apps Bot commented Jun 9, 2026

Uh oh!

greptile-apps Bot Jun 9, 2026

Uh oh!

greptile-apps Bot Jun 9, 2026

Uh oh!

greptile-apps Bot Jun 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

johnjosephhorton commented Jun 9, 2026

Summary

Testing

Before production

Uh oh!

greptile-apps Bot commented Jun 9, 2026

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant