fix: only retry transient I/O errors in should_retry #2235

Open
suhr25 wants to merge 3 commits into conda:main from suhr25:fix/should-retry-non-transient-io-errors

@suhr25
Contributor

@suhr25 suhr25 commented Mar 17, 2026

Description

Improve ExtractError::should_retry() to only retry transient I/O errors. Currently, permanent errors like PermissionDenied and NotFound are retried, causing slow failures and confusing logs during installs. This change makes it fail fast and resolves the existing TODO.

Fixes

Retries are now limited to transient errors (e.g. ConnectionReset, TimedOut), while permanent errors like PermissionDenied, NotFound, and CouldNotCreateDestination no longer retry, reducing delays and improving error clarity.
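
As a rough illustration of the classification described above, here is a minimal sketch. It is not the actual rattler implementation: the helper name `is_transient_kind` is hypothetical, and the list contains only the error kinds named in this PR, which may not match the full set in the diff.

```rust
use std::io::ErrorKind;

/// Hypothetical helper (not the rattler code): returns true only for I/O
/// error kinds that could plausibly succeed on a later attempt.
fn is_transient_kind(kind: ErrorKind) -> bool {
    matches!(
        kind,
        ErrorKind::ConnectionReset
            | ErrorKind::TimedOut
            | ErrorKind::BrokenPipe
            | ErrorKind::UnexpectedEof
    )
}
```

With an allow-list like this, any kind not named (including PermissionDenied and NotFound) is treated as permanent by default.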

How Has This Been Tested?

Tests were updated to reflect the new behavior. Also verified manually with a read-only directory, which now fails immediately without retries.
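
The fail-fast behavior can be checked by counting attempts. The sketch below assumes a simplified retry loop; `run_with_retry` is a hypothetical helper, not the retry machinery used in rattler, and it omits any backoff delay between attempts.

```rust
use std::io::{Error, ErrorKind};

/// Hypothetical retry helper: runs `op` at least once and at most
/// `max_attempts` times, retrying only when the error kind looks transient.
/// Returns the final result together with the number of attempts made.
fn run_with_retry<T>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, Error>,
) -> (Result<T, Error>, u32) {
    let mut attempts = 0;
    loop {
        attempts += 1;
        match op() {
            Ok(value) => return (Ok(value), attempts),
            Err(err) => {
                // Only these kinds are treated as transient in this sketch.
                let transient = matches!(
                    err.kind(),
                    ErrorKind::ConnectionReset | ErrorKind::TimedOut
                );
                // Stop on permanent errors or once the budget is exhausted.
                if !transient || attempts >= max_attempts {
                    return (Err(err), attempts);
                }
            }
        }
    }
}
```

An operation that always returns PermissionDenied stops after one attempt, while one that always returns TimedOut uses the full attempt budget.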

AI Disclosure

  • This PR includes AI-assisted content
  • All AI-generated contributions were tested
  • I take full responsibility for the changes

Tools: ChatGPT, Claude

Checklist

  • Tests updated
  • No breaking changes
  • Small, self-contained change
  • Improves real-world behavior

@baszalmstra
Collaborator

Can you look at CI?

@suhr25
Contributor Author

suhr25 commented Mar 17, 2026

Yes, I found some issues.
I am fixing them right now.
Thanks!

@suhr25 suhr25 closed this Mar 17, 2026
@suhr25 suhr25 force-pushed the fix/should-retry-non-transient-io-errors branch from 1d7a2b8 to ff19e9d Compare March 17, 2026 09:10
@suhr25 suhr25 reopened this Mar 17, 2026
@suhr25 suhr25 force-pushed the fix/should-retry-non-transient-io-errors branch from ff19e9d to 1dcaff1 Compare March 17, 2026 09:24
suhr25 added 2 commits March 17, 2026 09:44
rattler_index 0.27.16 is not yet published to the conda channel.
Downgrade to 0.27.14 so pixi install --locked succeeds.

Signed-off-by: Suhrid Marwah <suhridmarwah07@gmail.com>

PermissionDenied and NotFound are permanent filesystem errors that will
never succeed on retry. Restrict retries to genuinely transient network
error kinds (BrokenPipe, ConnectionReset, UnexpectedEof, etc.) and stop
retrying CouldNotCreateDestination entirely.

Signed-off-by: Suhrid Marwah <suhridmarwah07@gmail.com>
@suhr25 suhr25 force-pushed the fix/should-retry-non-transient-io-errors branch from 1dcaff1 to 1757a5c Compare March 17, 2026 17:05
…r errors

Decompressors and zip readers (e.g. async_zip) surface unexpected EOF from
a truncated network stream as `std::io::Error::other(...)` (kind = Other).
The previous narrowing of should_retry() unintentionally stopped retrying
these errors, breaking the FailAfterBytes path in test_flaky.

Permanent errors like PermissionDenied and NotFound have their own explicit
ErrorKind variants and are never wrapped as Other, so this addition does not
reintroduce the original bug.

Signed-off-by: Suhrid Marwah <suhridmarwah07@gmail.com>
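
To make the reasoning in the last commit message concrete, here is a minimal sketch of how an error built with `std::io::Error::other` reports its kind. The classifier `should_retry_io` is a hypothetical stand-in for the real `should_retry()` logic, and its list of kinds is assumed from the commit messages rather than taken from the diff.

```rust
use std::io::{Error, ErrorKind};

/// Hypothetical classifier: the transient kinds named in this PR, plus
/// `Other`, which the commit message says decompressors use for
/// truncated streams.
fn should_retry_io(err: &Error) -> bool {
    matches!(
        err.kind(),
        ErrorKind::ConnectionReset
            | ErrorKind::TimedOut
            | ErrorKind::BrokenPipe
            | ErrorKind::UnexpectedEof
            | ErrorKind::Other
    )
}
```

An error created as `Error::other("unexpected EOF")` has kind `ErrorKind::Other` and is retried under this sketch, while errors carrying the explicit PermissionDenied or NotFound kinds are not.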
@suhr25
Contributor Author

suhr25 commented Mar 18, 2026

Hi @baszalmstra,
Sorry for the delay; I ran into a few unexpected issues.
Everything is resolved now and all CI checks are passing.
Please take another look at the PR.
Thanks!

@suhr25
Contributor Author

suhr25 commented Mar 20, 2026

Hi @baszalmstra,
Kindly take a look at this PR.
Thanks!
