Thanks for the pull request, @kingoftech-v01!

Once you've gone through the following steps, feel free to tag the repository's maintainers in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval
If you haven't already, check this list to see if your contribution needs to go through the product review process.

🔘 Provide context
To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much relevant information to the PR description as you can.

🔘 Submit a signed contributor agreement (CLA)
If you've signed an agreement in the past, you may need to re-sign. Once you've signed the CLA, please allow 1 business day for it to be processed.

🔘 Get a green build
If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

Where can I find more information?
More details on all aspects of the review process for open source pull requests (OSPRs) are available in the resources linked from this comment.

When can I expect my changes to be merged?
Our goal is to get community contributions seen and reviewed as efficiently as possible. However, the amount of time that it takes to review and merge a PR can vary significantly depending on a number of factors.

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.
safe_extractall at openedx/core/lib/extract_archive.py enforces symlink, path-traversal, and dev-file guards but never checks the decompressed size, member count, or compression ratio of the archive before calling archive.extractall. A small, highly-compressible zip or tar.gz can declare gigabytes of members -- the canonical "zip bomb" pattern -- and exhaust disk / memory on the CMS worker during course import. Course authors can trigger this via the Studio course import UI, so the trust boundary is relevant (CWE-409).

Add _check_archive_bomb helper that iterates the ZipInfo / TarInfo list once and enforces three operator-overridable thresholds before extraction begins:

* COURSE_IMPORT_MAX_EXTRACTED_SIZE default 2 GB
* COURSE_IMPORT_MAX_EXTRACTED_FILES default 50 000
* COURSE_IMPORT_MAX_COMPRESSION_RATIO default 200x

The helper short-circuits on the first violation and raises SuspiciousOperation, matching the existing failure mode of _checkmembers. Because the guard runs *before* extractall writes anything, a pathological archive never materializes a byte on disk.

The thresholds are chosen to accept every realistic edX course export (real exports observed ~100 MB / ~5 000 members / ratio <~10x for mixed XML+media) with a ~10x headroom while blocking 42.zip-class bombs whose ratio is upwards of 10^6.

Add unit tests for the helper covering size, count, ratio, and the happy path, plus an integration test that builds a real on-disk zip and confirms safe_extractall rejects the bomb without writing any output files.
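For readers who want to see the shape of such a guard, here is a minimal sketch of a pre-extraction check along the lines described above. It is not the code from this pull request. The function name check_archive_bomb, the module-level constants, the build_zip helper, and the local SuspiciousOperation class (a stand-in for django.core.exceptions.SuspiciousOperation so the sketch runs without Django) are all hypothetical. The sketch also assumes that "2 GB" means 2 * 1024**3 bytes and that the compression ratio is the total declared uncompressed size divided by the archive's compressed size, which the caller passes in. The description above does not say how the actual helper computes the ratio.

```python
import io
import tarfile
import zipfile


class SuspiciousOperation(Exception):
    """Stand-in for django.core.exceptions.SuspiciousOperation (assumption)."""


# Defaults quoted in the PR description; names shortened for this sketch.
MAX_EXTRACTED_SIZE = 2 * 1024 ** 3   # "2 GB", assumed to mean binary gigabytes
MAX_EXTRACTED_FILES = 50_000
MAX_COMPRESSION_RATIO = 200


def check_archive_bomb(archive, compressed_size,
                       max_size=MAX_EXTRACTED_SIZE,
                       max_files=MAX_EXTRACTED_FILES,
                       max_ratio=MAX_COMPRESSION_RATIO):
    """Walk the member list once and raise on the first exceeded threshold.

    Only archive metadata is read; nothing is extracted to disk.
    """
    if isinstance(archive, zipfile.ZipFile):
        sizes = (info.file_size for info in archive.infolist())
    elif isinstance(archive, tarfile.TarFile):
        sizes = (member.size for member in archive.getmembers())
    else:
        raise TypeError(f"unsupported archive type: {type(archive).__name__}")

    total = 0
    for count, size in enumerate(sizes, start=1):
        total += size
        if count > max_files:
            raise SuspiciousOperation(f"archive has more than {max_files} members")
        if total > max_size:
            raise SuspiciousOperation(f"archive expands to more than {max_size} bytes")
        # The running total only grows, so exceeding the ratio here means
        # the final ratio would exceed it too: safe to short-circuit.
        if compressed_size > 0 and total / compressed_size > max_ratio:
            raise SuspiciousOperation(f"compression ratio exceeds {max_ratio}x")


def build_zip(members, compression=zipfile.ZIP_STORED):
    """Hypothetical helper: build an in-memory zip, return (buffer, size in bytes)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression) as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return buf, buf.getbuffer().nbytes


if __name__ == "__main__":
    # 10 MB of zeros deflates to a few kilobytes, far above a 200x ratio.
    buf, size = build_zip({"zeros.bin": b"\0" * 10_000_000}, zipfile.ZIP_DEFLATED)
    with zipfile.ZipFile(buf) as zf:
        try:
            check_archive_bomb(zf, size)
        except SuspiciousOperation as exc:
            print("rejected:", exc)
```

Taking the thresholds as keyword arguments keeps the sketch easy to exercise with small archives. In a Django codebase the defaults would more plausibly be read from settings so that operators can override them, as the description states.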
Closed.