Introduce resumable downloads with --resume-retries #12991

gmargaritis · 2024-10-04T20:26:53Z

Resolves #4796

Introduced the --resume-retries option to allow resuming incomplete downloads in case of dropped or timed-out connections.

This option additionally uses the values specified for --retries and --timeout for each resume attempt, since they are passed in the session.

Used 0 as the default in order to keep backward compatibility.

This PR is based on #11180

The downloader will make new requests and attempt to resume downloading using a Range header. If the initial response includes an ETag (preferred) or Date header, the downloader will ask the server to resume downloading only when it is safe (i.e., the file hasn't changed since the initial request) using an If-Range header.

If the server responds with a 200 (e.g. if the server doesn't support partial content or can't check if the file has changed), the downloader will restart the download (i.e. start from the very first byte); if the server responds with a 206 Partial Content, the downloader will resume the download from the partially downloaded file.
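The request and response handling described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the names `build_resume_headers` and `next_action` are hypothetical, and only the header and status-code behaviour comes from the description.

```python
from typing import Dict, Optional


def build_resume_headers(
    bytes_received: int,
    etag: Optional[str] = None,
    date: Optional[str] = None,
) -> Dict[str, str]:
    """Build headers asking the server to continue after `bytes_received` bytes."""
    # Ask for everything from the first missing byte to the end of the file.
    headers = {"Range": f"bytes={bytes_received}-"}
    # Prefer the ETag from the initial response; fall back to its Date header.
    validator = etag or date
    if validator:
        # With If-Range, the server only honours the Range when the file is unchanged.
        headers["If-Range"] = validator
    return headers


def next_action(status_code: int) -> str:
    """Decide what to do with the server's answer to a resume request."""
    if status_code == 206:
        return "resume"   # Partial Content: append to the partially downloaded file
    if status_code == 200:
        return "restart"  # full body sent: start again from the very first byte
    return "error"        # anything else is left to the caller in this sketch
```

For example, `build_resume_headers(1024, etag='"abc"')` yields a `Range: bytes=1024-` header plus an `If-Range` header carrying the ETag.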

- Added --resume-retries option to allow resuming incomplete downloads
- Setting --resume-retries=N allows pip to make N attempts to resume downloading, in case of dropped or timed out connections
- Each resume attempt uses the values specified for --retries and --timeout internally

Signed-off-by: gmargaritis <[email protected]>
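To make the "N attempts" semantics concrete, here is a small self-contained sketch of a resume loop. It is an assumption-laden toy, not pip's implementation: `download_with_resume`, `IncompleteDownloadError`, and the `fetch(offset)` callable are all hypothetical names, and the "server" is simulated in memory.

```python
from typing import Callable, Iterator


class IncompleteDownloadError(Exception):
    """Raised when the body is still short after all resume attempts."""


def download_with_resume(
    fetch: Callable[[int], Iterator[bytes]],
    total_length: int,
    resume_retries: int = 0,
) -> bytes:
    """Collect a body of `total_length` bytes, resuming after short reads.

    `fetch(offset)` is a hypothetical callable that yields chunks starting at
    byte `offset` and may stop early or raise ConnectionError.
    """
    received = bytearray()
    attempts_left = resume_retries
    while True:
        try:
            for chunk in fetch(len(received)):
                received.extend(chunk)
        except ConnectionError:
            pass  # keep what arrived before the connection dropped
        if len(received) >= total_length:
            return bytes(received)
        if attempts_left <= 0:
            raise IncompleteDownloadError(
                f"got {len(received)} of {total_length} bytes"
            )
        attempts_left -= 1


DATA = b"0123456789"


def flaky_fetch(offset: int) -> Iterator[bytes]:
    """Toy server: sends at most 4 bytes per connection, then drops."""
    end = min(offset + 4, len(DATA))
    yield DATA[offset:end]
    if end < len(DATA):
        raise ConnectionError("connection dropped")
```

With this toy server, `download_with_resume(flaky_fetch, len(DATA), resume_retries=2)` completes, while the default of 0 raises after the first dropped connection, mirroring the backward-compatible default described above.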

gmargaritis · 2024-10-04T20:49:02Z

I'm guessing the CI fails because of the new linter rules introduced in 102d818

thk686 · 2024-10-04T21:01:04Z

Does this do rsync-style checksums? That would increase reliability.

notatallshaw · 2024-10-04T23:42:21Z

> I'm guessing the CI fails because of the new linter rules introduced in 102d818

This is the CI fix; CI will keep failing until it's merged: #12964

Signed-off-by: gmargaritis <[email protected]>

gmargaritis · 2024-11-18T20:04:56Z

Hey @notatallshaw 👋

Is there anything that I can do to move this one forward?

notatallshaw · 2024-12-11T18:49:49Z

> Is there anything that I can do to move this one forward?

A pip maintainer needs to take up the task of reviewing it. As we're all volunteers, it's a matter of finding time.

I think my main concern would be the behavior when interacting with index servers that behave badly, e.g. give the wrong content length (usually 0). Your description looks good to me, but I haven't had time to look over the code yet.

gmargaritis · 2024-12-11T21:45:47Z

> A pip maintainer needs to take up the task of reviewing it, as we're all volunteers it's a matter of finding time.

Yeah, I know how it goes, so no worries!

If you need any clarifications or would like me to make changes, I'd be happy to help!

art-ignatev · 2025-01-13T08:42:30Z

Any chance that it'll be merged soon?

notatallshaw · 2025-02-01T05:59:02Z

I've had an initial cursory glance at this PR and it appears to be sufficiently high quality.

I've also run the functionality locally (selecting a large wheel to download and then disconnecting my WiFi midway through the download) and it has a good UX.

My main concern, although this is a ship that has probably sailed, is it would be nice for pip not to have to directly handle HTTP intricacies and leave that to a separate library.

I can’t promise a full review or other maintainers will agree, but I am adding it to the 25.1 milestone for it to be tracked.

pfmoore · 2025-02-01T10:39:07Z

The PR looks good, although I’m not a http expert so I can’t comment on details like status and header handling. Like @notatallshaw I wish we could leave this sort of detail to a 3rd party library, but that would be a major refactoring. Add this PR (along with cert handling, parallel downloads, etc) to the list of reasons we should consider such a refactoring, but in the meantime I’m in favour of adding this.

pfmoore · 2025-02-01T10:41:31Z

There isn’t an “approve with conditions” button, but I approve this change on the basis that someone who understands http should check the header and status handling.

ichard26 · 2025-04-01T21:41:59Z

What's the status on this? It seems close to ready, but there are still some design questions being asked.

@pfmoore we've decided that this feature should be opt-in upon release, only to be enabled by default in pip 25.2 or later once we've gotten some feedback. Thus, I'm not too worried about the exact implementation details. Those can change in a future release if needed. I'm also pretty happy with the code as-is.

The sticking point I have is that I'm still not sure of the UI of resumable downloads. --resume-retries is a weird flag.¹ As someone who understands the implementation, it makes sense, but it's likely to be rather obtuse for users. How many resumes should I allow? How does it work differently than --retries? Part of me wants to suggest that we reuse the --retries flag to enable resumable downloads² to keep the UI simpler. OTOH, all of the networking flags (except for --proxy and --timeout) are already "advanced" features so maybe it's fine to expose these fine knobs to the users. If that's the case, I'm also not really happy with the current name. See #12991 (review) for more.

While the rest of the feature can be reworked in future releases, it's likely not feasible to rename a flag once released.

Any thoughts @pfmoore @notatallshaw? I'm rather torn and can't make up my mind.

¹ Does any other tool that accesses the network have a similar flag?
² Although once automatic resuming is the default, reusing the --retries default of 5 does seem a bit high... Also, to handle the opt-in phase, we'd need to switch to using --use-feature=resume-downloads (and set a reasonable internal limit [10?] since it wouldn't be configurable).

ichard26 · 2025-04-01T21:53:25Z

My current feeling is that it's better to stick with a simpler UI for pip 25.1. If we get complaints, then we can consider giving the users more control later (à la --resume-retries or whatever name we end up choosing) before it's enabled by default. It's harder to remove/restrict the UI after the fact (as packaging standardization has shown time after time).

Proposal:

- The --resume-retries flag is removed.
- To opt in to resumable/restartable downloads, one must pass --use-feature=retry-downloads¹. This has the benefit that --use-feature is explicitly meant for experimental features.
- The resume/restart limit is hard-coded to some value (10? - given it's opt-in, it's fine to err on the higher end).
- Once it's made the default, the current proposal would be to link it to the --retries flag, unless feedback received after the pip 25.1 release indicates a separate flag is necessary.

¹ I'm using the word "retry" over "resume" as restarting will be done if range requests aren't supported.

pfmoore · 2025-04-01T22:27:06Z

> Any thoughts @pfmoore @notatallshaw?

I'm going to keep out of the design discussions. I have a bunch of other things on my plate, and not much spare time, so I don't want to add anything else.

I will say, though, that I don't like the idea of changing the UI once it becomes default. That's not (in my mind) what the --use-feature flag is for - it should be an opt-in to using a new feature that's complete, and won't change, to give people a chance to try it out before it's made the default¹.

I'm going to say that I'd rather not have it in 25.1 unless it's complete (so assuming it's gated behind --use-feature, the only change needed to make it the default is to remove the need for --use-feature).

¹ I'll note that as described, this is a weird feature to use --use-feature on. Because the default is no retries, --use-feature=retry-downloads on its own is a no-op. So why have --use-feature at all? But as I said, I don't want to get sucked into design discussions, so I'm not really looking for an explanation here, just pointing out the oddity.

notatallshaw · 2025-04-01T22:29:47Z

@ichard26 I'm not a fan of this proposal for a couple of reasons:

Firstly, I really dislike changing how features are enabled between pip versions; it makes guides outdated quickly and makes it more difficult to write scripts against pip.

Secondly, I mildly dislike overloading a flag with multiple meanings, particularly to simplify reading the help at the cost of user control. What if a user is working against an index where they need to disable resumable retries but enable regular retries? Will that be possible under the new scheme?

In terms of having nuanced options, I think most users should be served well by good defaults, and users who really need something other than the default will learn the names and meanings of those additional options.

gmargaritis · 2025-04-02T17:05:44Z

Thank you all for your input and efforts!

There have been various discussions about the naming convention in the past (#11180, #4796).

> In terms of having nuanced options, I think most users should be served well by good defaults, and users who really need something other than the default will learn the names and meanings of those additional options.

I fully agree, and I suggest keeping the existing implementation of --resume-retries.

Merging the behavior with --retries requires a deeper technical and UX discussion, so keeping them separate for now ensures clarity without limiting future improvements.

ichard26 · 2025-04-03T03:20:56Z

@gmargaritis I didn't have time to follow up on this like I wanted today, so I'll just say that I have no more blocking concerns. I've dropped all design questions having thought about it more. I do think this could benefit from some documentation, but I can handle that later. Right now, what I would appreciate is that you take a look at the three commits I pushed and confirm that I didn't do anything you object to. You can see my review for an explanation of the changes.

Once I have time—ideally tomorrow—I'll follow up properly and clear this for merge. Thanks for your tenacity on this PR!

gmargaritis · 2025-04-03T10:21:55Z

@ichard26 I've reviewed the code once more and it looks great! Thank you for implementing these improvements.

If I can suggest one thing, it would be to update the help text for the --retries and --resume-retries options to eliminate any potential confusion.

Specifically, I recommend changing the help text for --retries from:

  --retries <retries>         Maximum number of retries each connection should attempt (default 5 times).

to:

  --retries <retries>         Maximum attempts to establish a new connection (default: 5).

This change clarifies that --retries applies to new connection attempts, instead of retries within an already established connection.

Similarly, I suggest updating the text for --resume-retries to:

  --resume-retries <resume_retries>         Maximum attempts to resume an incomplete download (default: 0).

I believe this will help our users better understand the purpose of these options and how they differ. Overall, I feel confident that this PR is in great shape and will be a valuable improvement.

Let me know if you need anything, I'd be happy to help!

ichard26

(Note: I'd like to merge this PR myself. I'll wait a little bit to give people a chance to share concerns after reading my update, but otherwise, this is good to land.)

OK, here's my promised follow-up.

First off, good news. I'm approving this PR 🎉. It's amazing to see resumable downloads finally come to fruition. This has been a long-standing point of friction for numerous pip users. Their lives will be easier because of this improvement.

Now, I want to respond to the earlier discussion and also explain what has happened since @pfmoore and @notatallshaw formally reviewed this PR.

- The code has been significantly refactored to be more concise and easier to read.¹ ff2ccd2 and eed205a
- The help text for --retries and --resume-retries has been updated with @gmargaritis's suggestions 616cde5
- The diagnostic error raised for an incomplete download has been reworked to be easier to parse while including more information (see #12991 (comment) for a screenshot) 2808134
- Other user messaging changes, most notably telling the user what attempt # they are on so they can get a better sense of how many retries they may need to configure (see aforementioned screenshot) c146e81

There have been a lot of changes since those reviews, so I wanted to make sure everyone is on the same page to avoid future surprises.

ALSO, there is one more thing I want to call out: this feature will still be experimental in pip 25.1. However, the logic is (or, rather, should be) feature-ready. Once we get feedback and confirm that this feature is working out in the wild, we can flip the default number of resume retries to a non-zero value (exact value TBD). Otherwise, no extra changes are needed. I hope this alleviates your concerns @pfmoore.

> In terms of having nuanced options, I think most users should be served well by good defaults, and users who really need something other than the default will learn the names and meanings of those additional options.

Having thought more about it, I've come around to this conclusion as well. I did state that I wasn't really sure earlier (I could've gone either way), but I did have some concerns. I agree that this is an advanced feature and most users won't need to care about --resume-retries once it's turned on by default. Thus, it should get its own flag and if it's a bit technical, that's fine!

My issue with the naming is that I simply read the --resume-retries flag totally opposite to how seemingly everyone else is reading it. I read it as "The maximum number of retries for resuming a download" which would imply that pip will attempt to resume once and bail if that's not enough by default. I realized at some point that the words should be interpreted in the other order ("The maximum number of resuming (download) retries"). I agree that --resume-retries is probably the best name even if I still don't quite like it.²

> I'm going to keep out of the design discussions.

No worries! Thanks anyway for asking questions about the code and design. Even though they may have been sparse, they let me know where I needed to put in extra consideration and thought.

I'm sorry for dragging this out. I do wish this didn't take as long as it did, but having discussed seemingly everything under the sun, I do feel confident approving this. I hope the rest of you agree with this sentiment, even if this was admittedly a lot of work.

Thank you all for your patience.

¹ I know it's probably not recommended practice for a PR reviewer to directly push changes, but I did so because A) it was easier, and B) I expect code refactoring to be relatively uncontroversial. In addition, I fully expect that whenever this code needs to be updated, it will be me doing that, so it's important that I can (more) easily follow this code. Anyway, I am calling out these changes so if you want to take a look and feel strongly, you can object.
² Naming is hard 🙂

ichard26 · 2025-04-05T00:23:35Z

Aargh, of course it's Black—the very tool I used to maintain—that's causing CI to be red 😅

I just pushed a commit to fix the formatting. Now, this should be good to land.

gmargaritis · 2025-04-05T14:19:35Z

Awesome to hear, @ichard26! Thank you all for your efforts, this is going to be a real improvement for everyone!

pfmoore · 2025-04-11T14:26:29Z

@notatallshaw your comment here suggested that you still have reservations about this. Is that still the case? If so, how do we resolve this? We have a week, maybe 2 at the absolute most, to get this ready to merge if it's to go into 25.1.

@ichard26 unless @notatallshaw still objects, can you do the honours and merge this? I think that if anyone else had any reservations, they would have raised them by now.

notatallshaw · 2025-04-11T15:27:23Z

I'm generally happy with the state of the PR but I found a minor bug.

Running:

pip install --dry-run --no-cache torch --resume-retries 4 --no-deps

And then disconnecting my WiFi a couple of times during the download, I get:

Downloading torch-2.6.0-cp313-cp313-manylinux1_x86_64.whl (766.6 MB)
   ━━━━━━━━━━╺━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 192.9/766.6 MB 35.5 MB/s eta 0:00:17
WARNING: Connection timed out while downloading.
WARNING: Attempting to resume incomplete download (192.9 MB/766.6 MB, attempt 1)
Resuming download torch-2.6.0-cp313-cp313-manylinux1_x86_64.whl (192.9 MB/573.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━╸━━━━━━━━━━━━━━━━ 341.8/573.7 MB 44.1 MB/s eta 0:00:06
WARNING: Connection timed out while downloading.
WARNING: Attempting to resume incomplete download (341.8 MB/766.6 MB, attempt 2)
Resuming download torch-2.6.0-cp313-cp313-manylinux1_x86_64.whl (341.8 MB/424.8 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 766.6/424.8 MB 41.4 MB/s eta 0:00:00

Notice that in the "progress/total" column to the right of the progress bar, the total drops on each try to the amount remaining, while the progress is the cumulative amount downloaded across all tries:

192.9/766.6 MB
341.8/573.7 MB
766.6/424.8 MB

gmargaritis · 2025-04-11T15:52:13Z

@notatallshaw It seems that this issue originates from eed205a. Previously, we were passing down the total_length in _process_response, hence always keeping the original size of the request. In the current implementation, however, we’re using the total_length from each individual request.
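The mismatch can be illustrated with a tiny sketch. It assumes, as the log above suggests, that the length reported by each resumed response covers only the remaining bytes. The helper name `total_for_progress` and its parameters are hypothetical and not taken from pip's code; sizes are in tenths of a MB to keep the arithmetic exact.

```python
from typing import Optional


def total_for_progress(
    original_total: Optional[int],
    resumed_from: int,
    response_length: Optional[int],
) -> Optional[int]:
    """Pick the total to show next to cumulative progress."""
    # Prefer the size learned from the very first response.
    if original_total is not None:
        return original_total
    # Otherwise reconstruct it: bytes already on disk plus bytes still to come.
    if response_length is not None:
        return resumed_from + response_length
    return None  # size unknown: the progress bar cannot show a total


# First resume from the log above: 192.9 MB done, response reports 573.7 MB.
buggy_total = 5737                                   # per-response length used as total
fixed_total = total_for_progress(7666, 1929, 5737)   # original size kept
```

Here `buggy_total` reproduces the shrinking "573.7 MB" total, while `fixed_total` stays at the original 766.6 MB (7666 tenths).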

@ichard26 Since you introduced these changes, do you want to refactor this as well? If you’re short on time, I’d be happy to take a look!

ichard26 · 2025-04-11T23:50:23Z

Nice catch @notatallshaw and thanks for taking a look @gmargaritis!

I've elected to revert eed205a (except for the comment changes) as that seems safer and more expedient.

ichard26 · 2025-04-11T23:53:53Z

I also pushed e6bacb4 to explain why we need to keep track of the original total_length so if someone comes by later, they won't try to refactor this because it looks safe.

Thank you so much @gmargaritis! It feels good to see this land 🎉

gmargaritis · 2025-04-12T00:04:47Z

Great work @ichard26!

Thank you all for your efforts, looking forward to making this the default!

yichi-yang and others added 3 commits September 26, 2024 21:26

Add support to resume incomplete download

0617d7c

Better incomplete download error message

a091ca1

gmargaritis force-pushed the introduce-resuming-downloads branch from 16fb735 to dbc6a64 Compare October 4, 2024 20:29

psf-chronographer bot added the bot:chronographer:provided label Oct 4, 2024

gmargaritis mentioned this pull request Oct 4, 2024

[Improvement] Pip could resume download package at halfway the connection is poor #4796

Closed

gmargaritis added 5 commits October 9, 2024 15:51

Merge branch 'main' into introduce-resuming-downloads

7e9ea50

Merge branch 'main' into introduce-resuming-downloads

889ac6b

Merge branch 'main' into introduce-resuming-downloads

0b86d14

Add initial_progress to _raw_progress_bar

2cfd8fe

Signed-off-by: gmargaritis <[email protected]>

Merge branch 'main' into introduce-resuming-downloads

1a9c23b

Merge branch 'main' into introduce-resuming-downloads

64bd385

gmargaritis added 5 commits December 18, 2024 23:32

Merge branch 'main' into introduce-resuming-downloads

d265d53

Merge branch 'main' into introduce-resuming-downloads

d4e2da2

Merge branch 'main' into introduce-resuming-downloads

0dbb4bd

Merge branch 'main' into introduce-resuming-downloads

9b0bb5d

Merge branch 'main' into introduce-resuming-downloads

68a7b05

gmargaritis added 2 commits January 26, 2025 10:32

Merge branch 'main' into introduce-resuming-downloads

a6576b3

Merge branch 'main' into introduce-resuming-downloads

eb6a8db

notatallshaw added this to the 25.1 milestone Feb 1, 2025

gmargaritis and others added 4 commits April 4, 2025 01:38

Merge branch 'main' into introduce-resuming-downloads

1f65cc4

More refactoring

eed205a

Reword retries flag CLI help

616cde5

Revert "Enforce simultaneous use of 'range_start' and 'if_range' in _…

7b1d0a3

…http_get_download" This reverts commit 66f68ca.

ichard26 approved these changes Apr 5, 2025

Please show me mercy, black

42af9a8

Merge branch 'main' into introduce-resuming-downloads

9cc70e1

notatallshaw approved these changes Apr 11, 2025

ichard26 added 2 commits April 11, 2025 19:21

Partially revert "More refactoring"

9b83908

Add comment to explain the revert

e6bacb4

ichard26 merged commit 4c2e8ea into pypa:main Apr 11, 2025
29 checks passed

ichard26 mentioned this pull request Apr 11, 2025

Resume incomplete download #11180

Closed

ichard26 mentioned this pull request Apr 12, 2025

interrupted download reports as hash failure #11153

Closed

1 task

ichard26 mentioned this pull request Apr 21, 2025

--retries has no effect during streaming downloads #12383

Closed

1 task

github-actions bot locked as resolved and limited conversation to collaborators Apr 27, 2025
