feat: implement Entry.asBytes() method for single file extraction #89

Merged
Brooooooklyn merged 4 commits into main from copilot/fix-cff6d281-076a-436a-af3c-8a305f29077b
Aug 11, 2025

Conversation

Contributor

@Copilot Copilot AI commented Aug 10, 2025

This PR implements the requested functionality from issue #58 to extract single files from tar archives without extracting the entire archive.

Problem

Users need to inspect specific files within tar archives (like Docker OCI images) without extracting the entire archive. The existing API only allowed full archive extraction via unpack() or iterating through entries to get paths and headers, but not the actual file content.

Solution

Added a new asBytes() method to the Entry class that reads the entire content of a tar entry and returns it as a Buffer. This provides functionality equivalent to:

tar -x -O -f archive.tar filename

API Changes

Rust (src/entry.rs)

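use std::io::Read; // brings read_to_end into scope

#[napi]
impl Entry {
    /// Reads the entry's entire content into memory and returns it as a Node.js Buffer.
    #[napi]
    pub fn as_bytes(&mut self) -> napi::Result<napi::bindgen_prelude::Buffer> {
        let mut data = Vec::new();
        // An I/O error is converted into a napi::Error by `?`.
        self.inner.read_to_end(&mut data)?;
        Ok(data.into())
    }
}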

TypeScript (index.d.ts)

export declare class Entry {
    path(): string | null
    header(): ReadonlyHeader
    asBytes(): Buffer  // NEW
}

Usage Example

import { Archive } from '@napi-rs/tar'

// Extract a specific file from a Docker OCI image
function extractFile(archivePath: string, targetPath: string): Buffer | null {
    const archive = new Archive(archivePath)
    for (const entry of archive.entries()) {
        if (entry.path() === targetPath) {
            return entry.asBytes()
        }
    }
    return null
}

// Extract index.json from a Docker image tar
const manifest = extractFile('./docker-image.tar', 'index.json')
if (manifest) {
    const data = JSON.parse(manifest.toString('utf-8'))
    console.log(data)
}
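
The strict comparison `entry.path() === targetPath` in the example above only matches when the stored path is spelled exactly like the target. Archives created with `tar -cf image.tar .` commonly store names with a leading `./`, so a tolerant comparison may be useful. The following is a minimal sketch under that assumption; `normalizeTarPath` and `samePath` are hypothetical helpers for illustration and are not part of @napi-rs/tar.

```typescript
// Hypothetical helper (not part of @napi-rs/tar): normalize a tar entry path
// so that "./index.json" and "index.json" compare equal.
function normalizeTarPath(p: string): string {
  return p
    .replace(/^(\.\/)+/, '') // drop any leading "./" segments
    .replace(/\/+$/, '')     // drop trailing slashes (directory entries)
}

// Compare an entry path (which may be null, as in index.d.ts) with a target path.
function samePath(entryPath: string | null, target: string): boolean {
  return entryPath !== null && normalizeTarPath(entryPath) === normalizeTarPath(target)
}
```

In `extractFile`, the condition could then read `samePath(entry.path(), targetPath)` instead of the strict equality check.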

Testing

Added comprehensive tests covering:

  • Single file extraction with content validation
  • Multiple file extraction scenarios
  • All compression formats (tar, tar.gz, tar.bz2, tar.xz)
  • Buffer-based archives
  • Docker OCI use case simulation
  • Error handling for non-existent files

All existing tests continue to pass, ensuring backward compatibility.
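
For the multiple-file scenario listed above, calling `extractFile` once per file would reopen and rescan the archive each time. A single pass that collects every wanted path may be preferable. The sketch below assumes only the `path()` and `asBytes()` signatures shown in index.d.ts; `EntryLike` and `extractMany` are hypothetical names, and `Uint8Array` stands in for `Buffer` (a `Uint8Array` subclass) so the sketch does not depend on Node typings. Because this PR does not state whether an entry's content can still be read after the iterator has advanced, each match is read immediately.

```typescript
// Minimal structural type mirroring the Entry methods declared in index.d.ts.
interface EntryLike {
  path(): string | null
  asBytes(): Uint8Array
}

// Hypothetical helper: collect several files in one pass over the entries.
// Paths that never appear are simply absent from the returned map.
function extractMany(
  entries: Iterable<EntryLike>,
  wanted: Iterable<string>,
): Map<string, Uint8Array> {
  const remaining = new Set(wanted)
  const found = new Map<string, Uint8Array>()
  if (remaining.size === 0) return found
  for (const entry of entries) {
    const p = entry.path()
    if (p !== null && remaining.has(p)) {
      found.set(p, entry.asBytes()) // read right away, before advancing
      remaining.delete(p)
      if (remaining.size === 0) break // stop scanning once everything is found
    }
  }
  return found
}
```

With the real library this might be called as `extractMany(new Archive('./docker-image.tar').entries(), ['index.json', 'manifest.json'])`, assuming `entries()` is iterable as in the usage example.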

Benefits

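  • Efficient: Only the requested entry's content is returned; earlier entries are still scanned sequentially (and decompressed, for compressed archives), but their content is not extracted
  • No disk writes: Content is returned in memory, so inspecting a file requires no extraction to disk; asBytes() buffers the whole entry, so memory use grows with the file's size
  • Docker-ready: Suited to inspecting OCI image layers and manifests
  • Format coverage: Works with all supported archive formats and compression types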

Closes #58

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • esm.ubuntu.com
    • Triggering command: /usr/lib/apt/methods/https (dns block)
  • global.endpoint.security.microsoft.com
    • Triggering command: curl --cacert /tmp/tmp.*.pem -s -S -X POST -d { installationReport telemetry JSON } --max-time 5 --connect-timeout 1 REDACTED -o /dev/null -w %{http_code} (dns block; 3 attempts with codes preinstStarted, postinstStarted, LogTelemetryFailed)
  • go.microsoft.com
    • Triggering command: /opt/microsoft/mdatp/sbin/wdavdaemon (dns block)
  • metadata.google.internal
    • Triggering command: /opt/microsoft/mdatp/sbin/wdavdaemon (dns block)
  • unitedstates.cp.wd.microsoft.com
    • Triggering command: /opt/microsoft/mdatp/sbin/wdavdaemon (dns block)
  • unitedstates.x.cp.wd.microsoft.com
    • Triggering command: /opt/microsoft/mdatp/sbin/wdavdaemon (dns block)
  • wdcp.microsoft.com
    • Triggering command: /opt/microsoft/mdatp/sbin/wdavdaemon (dns block)
  • winatp-gw-cus.microsoft.com
    • Triggering command: /opt/microsoft/mdatp/sbin/wdavdaemon edr 16 15 --log_level info (dns block)
  • x.cp.wd.microsoft.com
    • Triggering command: curl -s -S -d { installationReport telemetry JSON } -X POST --max-time 5 --connect-timeout 1 REDACTED (dns block; 3 attempts with codes preinstStarted, InstallStarted, machineInformation)


@Copilot Copilot AI changed the title from "[WIP] Implement this @napi-rs/tar/issues/58 and add tests" to "Implement Entry.asBytes() method for single file extraction (fixes #58)" Aug 10, 2025
@Copilot Copilot AI requested a review from Brooooooklyn August 10, 2025 14:52
Copilot finished work on behalf of Brooooooklyn August 10, 2025 14:52
@Brooooooklyn
Member

@copilot run yarn build and commit the changed files

Contributor Author

Copilot AI commented Aug 11, 2025

@copilot run yarn build and commit the changed files

I've successfully run yarn build and the build completed without errors. All tests are passing, including the new tests for the asBytes() method.

However, there are no changed files to commit after the build. The build process generated a tar.linux-x64-gnu.node binary file, but this is ignored by .gitignore and appears to be a build artifact that shouldn't be committed to the repository.

Could you clarify what specific files you expect to be changed after running yarn build? The TypeScript definitions in index.d.ts and the Rust implementation are already committed from my previous work.

Build completed successfully at commit 5885d46.

Copilot finished work on behalf of Brooooooklyn August 11, 2025 02:05
@Brooooooklyn Brooooooklyn marked this pull request as ready for review August 11, 2025 02:29
@Brooooooklyn
Member

cursor review


cursor bot commented Aug 11, 2025

Skipping Bugbot: BugBot is disabled for this repository

@Brooooooklyn Brooooooklyn changed the title from "Implement Entry.asBytes() method for single file extraction (fixes #58)" to "featImplement Entry.asBytes() method for single file extraction (fixes #58)" Aug 11, 2025
@Brooooooklyn Brooooooklyn changed the title from "featImplement Entry.asBytes() method for single file extraction (fixes #58)" to "feat: implement Entry.asBytes() method for single file extraction (fixes #58)" Aug 11, 2025
@Brooooooklyn Brooooooklyn changed the title from "feat: implement Entry.asBytes() method for single file extraction (fixes #58)" to "feat: implement Entry.asBytes() method for single file extraction" Aug 11, 2025
@Brooooooklyn Brooooooklyn merged commit 38202db into main Aug 11, 2025
44 checks passed
@Brooooooklyn Brooooooklyn deleted the copilot/fix-cff6d281-076a-436a-af3c-8a305f29077b branch August 11, 2025 02:34
Development

Successfully merging this pull request may close these issues.

Implement a method to extract a single file
2 participants