Add test to reproduce issue in impl Stream for Entries causing filename truncation #1

Merged (1 commit) on Feb 9, 2025

RazerM
Contributor

@RazerM RazerM commented Dec 13, 2024

This is from edera-dev/tokio-tar#3


I tracked down this issue in astral-sh/uv#5450 (comment)

https://github.com/edera-dev/tokio-tar/blob/4ee357285b5053e6bfada7f117e530b4da94b74a/src/archive.rs#L317

            if is_recognized_header && entry.header().entry_type().is_pax_local_extensions() {
                if self.pax_extensions.is_some() {
                    return Poll::Ready(Some(Err(other(
                        "two pax extensions entries describing \
                         the same member",
                    ))));
                }
                let mut ef = EntryFields::from(entry);
                let val = ready_err!(Pin::new(&mut ef).poll_read_all(cx));
                self.pax_extensions = Some(val);
                continue;
            }

If `Pin::new(&mut ef).poll_read_all(cx)` is `Poll::Pending`, then `ready_err!` returns it, so the Pax extension is lost. The same would apply to a pending poll that occurs while a longlink or longname is being prepared. When `poll_next` is called again, the next entry header is parsed.
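
To illustrate the failure mode in isolation, here is a minimal std-only sketch. It is not the tokio-tar code: `Flaky`, `poll_byte`, `poll_read_all_lossy`, and `read_lossy` are hypothetical names, and the toy poll functions take no `Context`. It only shows that an accumulator held in a local variable is dropped whenever the function returns `Poll::Pending`.

```rust
use std::task::Poll;

/// Toy source (hypothetical): yields one byte per poll, but reports
/// `Pending` on every second poll.
struct Flaky {
    data: Vec<u8>,
    pos: usize,
    polls: usize,
}

impl Flaky {
    fn poll_byte(&mut self) -> Poll<Option<u8>> {
        self.polls += 1;
        if self.polls % 2 == 0 {
            return Poll::Pending;
        }
        let b = self.data.get(self.pos).copied();
        self.pos += 1;
        Poll::Ready(b)
    }
}

/// Lossy shape: the accumulator is a local, so an early return on
/// `Pending` throws away every byte read so far.
fn poll_read_all_lossy(src: &mut Flaky) -> Poll<Vec<u8>> {
    let mut buf = Vec::new();
    loop {
        match src.poll_byte() {
            Poll::Pending => return Poll::Pending, // `buf` is dropped here
            Poll::Ready(Some(b)) => buf.push(b),
            Poll::Ready(None) => return Poll::Ready(buf),
        }
    }
}

/// Re-polls until the lossy reader reports `Ready`.
fn read_lossy(input: &[u8]) -> Vec<u8> {
    let mut src = Flaky { data: input.to_vec(), pos: 0, polls: 0 };
    loop {
        if let Poll::Ready(v) = poll_read_all_lossy(&mut src) {
            return v;
        }
    }
}
```

With this source, every call consumes one byte and then hits `Pending`, so `read_lossy` finishes without returning any of the bytes it consumed.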

This PR demonstrates the issue by creating an AsyncRead impl which pends every second time it is polled.
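
A reader of that kind could look roughly like the sketch below. It is an assumption-laden stand-in rather than the test's actual code: `PendEveryOther`, `NoopWake`, and `drain` are hypothetical names, and `poll_read` here takes a plain `&mut [u8]` instead of the `ReadBuf` used by tokio's `AsyncRead`, so that the example needs only the standard library.

```rust
use std::io;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical reader: serves bytes from memory, but returns `Pending`
/// on every second poll.
struct PendEveryOther {
    data: Vec<u8>,
    pos: usize,
    pend_next: bool,
}

impl PendEveryOther {
    fn new(data: &[u8]) -> Self {
        Self { data: data.to_vec(), pos: 0, pend_next: false }
    }

    /// Simplified stand-in for `AsyncRead::poll_read`.
    fn poll_read(&mut self, cx: &mut Context<'_>, out: &mut [u8]) -> Poll<io::Result<usize>> {
        let pend = self.pend_next;
        self.pend_next = !pend;
        if pend {
            // Ask to be polled again right away so a real executor would not stall.
            cx.waker().wake_by_ref();
            return Poll::Pending;
        }
        let n = out.len().min(self.data.len() - self.pos);
        out[..n].copy_from_slice(&self.data[self.pos..self.pos + n]);
        self.pos += n;
        Poll::Ready(Ok(n))
    }
}

/// Waker that does nothing; enough for polling by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls the reader to completion. Returns the bytes read and the number
/// of `Pending` results observed along the way.
fn drain(reader: &mut PendEveryOther, chunk: usize) -> (Vec<u8>, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut buf = vec![0u8; chunk];
    let mut out = Vec::new();
    let mut pendings = 0;
    loop {
        match reader.poll_read(&mut cx, &mut buf) {
            Poll::Pending => pendings += 1,
            Poll::Ready(Ok(0)) => return (out, pendings),
            Poll::Ready(Ok(n)) => out.extend_from_slice(&buf[..n]),
            Poll::Ready(Err(_)) => unreachable!("in-memory reads do not fail"),
        }
    }
}
```

A consumer that keeps its state across polls still sees every byte through such a reader; only a consumer that discards in-progress state on `Pending` is affected.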

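Commenting out [this line](https://github.com/RazerM/tokio-tar/blob/15466052f63c47cf47decd4409a9b0e936302773/tests/all.rs#L816) makes the test pass, because the reader doesn't enter a pending state in the "wrong" place.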

It is probably also the cause of dignifiedquire/async-tar#39

@endbr64

endbr64 commented Dec 24, 2024

Please take a look at dignifiedquire/async-tar#55

@charliermarsh
Member

I applied dignifiedquire/async-tar#55 locally, but unfortunately the test is still failing for me.

@charliermarsh
Member

Ok, going to start looking into this more deeply since I've now determined that it's the cause of these spurious failures: astral-sh/uv#2235

@charliermarsh
Member

@endbr64 -- If you see https://github.com/astral-sh/tokio-tar/pull/40/files#diff-bc2127a92faf1e62dd213ca21bbd92e42da37f12597ab58ee95a1abcbdf25710R326, I had to make one additional change to your patch, which is: we need to store the entry that we didn't finish populating. Otherwise, we advance past it on the next iteration.

@charliermarsh charliermarsh merged commit 878399a into astral-sh:edera Feb 9, 2025
@charliermarsh
Member

I merged this into the wrong branch (oops). Will amend.

charliermarsh pushed a commit that referenced this pull request Feb 9, 2025
…name truncation (#1)

charliermarsh added a commit that referenced this pull request Feb 9, 2025
…name truncation (#41)

Re-merging #1

Co-authored-by: Frazer McLean <[email protected]>
charliermarsh added a commit that referenced this pull request Feb 9, 2025
## Summary

Right now, if we hit a pending read while reading an entry, we end up
discarding the data rather than preserving it for the next poll (e.g.,
for a PAX extension). You can also see this reported at
dignifiedquire/async-tar#39.

This PR takes dignifiedquire/async-tar#55, but
applies an additional change as that PR didn't work on its own, in my
testing. Atop dignifiedquire/async-tar#55, we
also store the pending `Entry` to ensure that if we're pending, we don't
advance to the next entry on the next poll.

For more context, see: #1.
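
The idea of keeping the unfinished entry around can be sketched as follows. This is a simplified std-only model, not the merged patch: `Record`, `InProgress`, `Entries`, and `collect` are hypothetical names, `poll_next` takes no `Context`, and the "read" is simulated by copying one byte per poll with a `Pending` on every second poll.

```rust
use std::collections::VecDeque;
use std::task::Poll;

enum Kind {
    PaxExtension,
    File,
}

struct Record {
    kind: Kind,
    body: Vec<u8>,
}

/// An entry whose body has only been partly read.
struct InProgress {
    record: Record,
    read: Vec<u8>,
}

/// (pax extension seen before the file, file body)
type Item = (Option<Vec<u8>>, Vec<u8>);

struct Entries {
    records: VecDeque<Record>,
    pending: Option<InProgress>, // survives across polls
    pax: Option<Vec<u8>>,
    polls: usize,
}

impl Entries {
    fn poll_next(&mut self) -> Poll<Option<Item>> {
        loop {
            // Resume the stashed entry if there is one; only otherwise advance.
            let mut cur = match self.pending.take() {
                Some(e) => e,
                None => match self.records.pop_front() {
                    Some(record) => InProgress { record, read: Vec::new() },
                    None => return Poll::Ready(None),
                },
            };
            while cur.read.len() < cur.record.body.len() {
                self.polls += 1;
                if self.polls % 2 == 0 {
                    // Simulated pending read: stash the entry instead of dropping it.
                    self.pending = Some(cur);
                    return Poll::Pending;
                }
                let b = cur.record.body[cur.read.len()];
                cur.read.push(b);
            }
            match cur.record.kind {
                Kind::PaxExtension => {
                    self.pax = Some(cur.read);
                    continue;
                }
                Kind::File => return Poll::Ready(Some((self.pax.take(), cur.read))),
            }
        }
    }
}

/// Polls the toy stream to completion, ignoring `Pending`.
fn collect(records: Vec<Record>) -> Vec<Item> {
    let mut s = Entries { records: records.into(), pending: None, pax: None, polls: 0 };
    let mut out = Vec::new();
    loop {
        match s.poll_next() {
            Poll::Pending => continue,
            Poll::Ready(Some(item)) => out.push(item),
            Poll::Ready(None) => return out,
        }
    }
}
```

Because the partly read entry lives in `pending` rather than in a local, a `Pending` in the middle of the extension neither loses its bytes nor lets the stream move on to the next record.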