Merge zip archives #106
-
As I'm uploading a large number of small files, the number of workers created crashes the page. Is there an easy way to remove this header or to merge zip archives?
Am I headed in the right direction? Or would there be a smarter solution, or a better way to merge the zipped chunks?
Replies: 3 comments 4 replies
-
First of all, it's very difficult to merge ZIP files efficiently. Unfortunately, every ZIP file ends with a footer that is both required and impossible to merge with another footer, so what you're doing won't work. Even if it were possible, please note that the following line is probably going to cause a memory issue:

tmpZippedData = new Uint8Array([...tmpZippedData, ...data]);

The spread operator is efficient for copying arrays but is horrendous for this situation because it creates a standard JS array before converting to Uint8Array; since JS numbers take 32 bits at minimum in V8, you more than quadruple memory usage every time you do this, and it's slow. If you want to concatenate Uint8Arrays:

const newZippedData = new Uint8Array(tmpZippedData.length + data.length);
newZippedData.set(tmpZippedData);
newZippedData.set(data, tmpZippedData.length);
tmpZippedData = newZippedData;

The above still won't work because, again, you can't concatenate ZIP files. If the files are all tiny, just use the synchronous API to avoid creating so many workers. This shouldn't freeze the browser with sufficiently small files.

// zipSync is fflate's synchronous API; `files` is the list of File objects being uploaded
import { zipSync } from 'fflate';

const toZip = {};
await Promise.all(files.map(async file => {
  toZip[file.name] = new Uint8Array(await file.arrayBuffer());
}));
const zipped = zipSync(toZip);

Small files also benefit less from compression, so you may consider using … If you experience freezing in browsers, let me know and I'll give a more robust solution.
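As a side note on the concatenation snippet above: when there are many chunks, appending them one at a time still re-copies everything accumulated so far on every step. A minimal sketch of joining them with a single allocation might look like the following. `concatChunks` is a hypothetical helper name, not part of fflate, and it only joins raw bytes, so it still won't produce a valid merged ZIP.

```javascript
// Hypothetical helper (not part of fflate): joins any number of Uint8Array
// chunks with one allocation instead of re-copying on every append.
function concatChunks(chunks) {
  // First pass: compute the total size so the result is allocated once.
  let total = 0;
  for (const chunk of chunks) total += chunk.length;

  // Second pass: copy each chunk to its offset in the result.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}

const joined = concatChunks([
  new Uint8Array([1, 2]),
  new Uint8Array([3]),
  new Uint8Array([4, 5, 6]),
]);
console.log(Array.from(joined)); // the six bytes 1 through 6, in order
```

Each byte is copied once, and no intermediate standard JS array is created along the way.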
-
I also need to merge zip files on the browser side. I made a POC on the Node.js side with the help of an additional library called adm-zip:

import { writeFile } from 'node:fs/promises';
import * as fflate from 'fflate';
import AdmZip from 'adm-zip';

function mergeZips(zipsToMerge: Uint8Array[]) {
  let zip: fflate.Zip;
  // Collect the chunks emitted by fflate and resolve with the full archive
  // once the final chunk arrives.
  const toBufferPromise = new Promise<Buffer>((resolve, reject) => {
    const chunks: Uint8Array[] = [];
    zip = new fflate.Zip((err, chunk, final) => {
      if (err) return reject(err);
      chunks.push(chunk);
      if (final) resolve(Buffer.concat(chunks));
    });
  });
  for (const zipToMerge of zipsToMerge) {
    const admZip = new AdmZip(Buffer.from(zipToMerge));
    const entries = admZip.getEntries();
    for (const entry of entries) {
      // Reuse each entry's already-compressed bytes and header metadata
      // instead of decompressing and recompressing it.
      const file = new fflate.ZipPassThrough(entry.name);
      Object.assign(file, {
        crc: entry.header.crc,
        size: entry.header.size,
        compression: entry.header.method,
        flag: entry.header.flags,
        mtime: entry.header.time,
      });
      zip!.add(file);
      file.ondata(null, entry.getCompressedData(), true);
    }
  }
  zip!.end();
  return toBufferPromise;
}

const zip1 = fflate.zipSync({
  'file-1.json': fflate.strToU8(JSON.stringify({ some: 'content 1' })),
  'file-2.json': fflate.strToU8(JSON.stringify({ some: 'content 2' })),
});
const zip2 = fflate.zipSync({
  'file-3.txt': fflate.strToU8('content 3'),
});
const merged = await mergeZips([ zip1, zip2 ]);
await writeFile('./merged.zip', merged);

The code above is pretty fast - I was able to merge a dozen .zip files into one single .zip file in less than 6 seconds. @101arrowz what do you think? Is there a chance that …
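For anyone wondering why the entries have to be re-added to a new archive rather than the bytes simply appended, here is a small sketch of the footer problem mentioned earlier in the thread. It relies on the fact that an archive with no entries is just the 22-byte end-of-central-directory (EOCD) record. `emptyZip` and `findEocdOffsets` are hypothetical helper names made up for this illustration, not functions from fflate or adm-zip.

```javascript
// Signature of the end-of-central-directory (EOCD) record: bytes "PK\x05\x06".
const EOCD_SIGNATURE = [0x50, 0x4b, 0x05, 0x06];

// An archive with no entries is just a 22-byte EOCD record: the 4-byte
// signature followed by 18 zero bytes (counts, sizes, offset, comment length).
function emptyZip() {
  const zip = new Uint8Array(22);
  zip.set(EOCD_SIGNATURE, 0);
  return zip;
}

// Returns every offset at which the EOCD signature occurs in the buffer.
function findEocdOffsets(bytes) {
  const offsets = [];
  for (let i = 0; i + 4 <= bytes.length; i++) {
    if (EOCD_SIGNATURE.every((b, j) => bytes[i + j] === b)) offsets.push(i);
  }
  return offsets;
}

// Naively append one archive to another.
const a = emptyZip();
const b = emptyZip();
const glued = new Uint8Array(a.length + b.length);
glued.set(a);
glued.set(b, a.length);

const eocdOffsets = findEocdOffsets(glued);
console.log(eocdOffsets); // two EOCD records, at offsets 0 and 22
```

The glued buffer carries two footers, each describing only its own archive. Readers generally locate the directory through the last EOCD record, so a naive concatenation tends to expose only the final archive's entries. Writing every entry into one new archive, as the code above does, yields a single footer that covers all of them.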
-
Python Implementation