Uploading a non-ZIP file to /scan causes an unhandled 500 instead of a validation error #6

@ionfwsrijan

Description

The /scan endpoint accepts any file upload without validating the MIME type or extension. If a user accidentally uploads a .pdf, a .docx, or a corrupted file, the server tries to extract it as a ZIP and fails with a Python exception deep in the extraction logic. It then returns a raw 500 with a stack trace, which leaks internal path information.

Steps to reproduce

  1. Upload any non-ZIP file (e.g. a .pdf) via the frontend scan input
  2. Observe the response

Expected behaviour

A 400 response: "Invalid file type. Please upload a ZIP archive of your codebase."

What to implement

  • At the top of the /scan handler, check file.content_type and the filename extension before attempting extraction
  • Accepted: application/zip, application/x-zip-compressed, .zip extension
  • Return 400 with a clear message for anything else
  • Catch zipfile.BadZipFile as a fallback and return 400 instead of 500
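
The checks above could be sketched as a small framework-agnostic helper that the /scan handler calls before extraction. This is only a sketch under assumptions: the function name `validate_zip_upload`, its `(status_code, message)` return shape, and the choice to require both an accepted content type and a .zip extension are illustrative and not taken from the codebase. The two messages are the ones quoted in this issue.

```python
import io
import os
import zipfile

# Content types listed as accepted in this issue.
ALLOWED_CONTENT_TYPES = {"application/zip", "application/x-zip-compressed"}

INVALID_TYPE_MSG = "Invalid file type. Please upload a ZIP archive of your codebase."
CORRUPTED_MSG = "File appears to be corrupted or is not a valid ZIP archive"


def validate_zip_upload(filename, content_type, data):
    """Hypothetical helper: return (status_code, message).

    message is None when the upload looks like a usable ZIP archive.
    Assumption: BOTH the content type and the extension must match.
    """
    extension = os.path.splitext(filename or "")[1].lower()
    if extension != ".zip" or content_type not in ALLOWED_CONTENT_TYPES:
        return 400, INVALID_TYPE_MSG

    # Fallback: the declared type can be wrong, so try to open the archive.
    try:
        with zipfile.ZipFile(io.BytesIO(data)):
            pass
    except zipfile.BadZipFile:
        return 400, CORRUPTED_MSG

    return 200, None
```

In the handler, a non-200 result would be turned into the framework's 400 response with `message` as the body. If some browsers turn out to send a generic content type for ZIP files, the first check may need to be loosened to accept either signal rather than both.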

Acceptance criteria

  • Non-ZIP upload returns 400 with a human-readable message
  • Corrupted ZIP returns 400 with "File appears to be corrupted or is not a valid ZIP archive"
  • No stack trace or internal path is ever exposed in the response
  • Frontend shows the error message to the user
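
For the "no stack trace or internal path" criterion, one possible approach is to log the full exception on the server and send the client only a generic body. The sketch below assumes a hypothetical helper `safe_error_body`, a logger named "scan", and an `error_id` field for correlating client reports with server logs; none of these come from the existing code.

```python
import logging
import uuid

# Hypothetical logger name; the project may already configure its own.
logger = logging.getLogger("scan")


def safe_error_body(exc):
    """Log the exception with its traceback server-side and return a
    response body that contains neither the traceback nor any path."""
    error_id = uuid.uuid4().hex
    # exc_info accepts an exception instance, so the traceback goes to the log only.
    logger.error("scan failed (error_id=%s)", error_id, exc_info=exc)
    return {
        "detail": "Internal error while processing the upload.",
        "error_id": error_id,
    }
```

A catch-all `except Exception` around the extraction step could return this body, so that unexpected failures still avoid exposing internals to the frontend.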

Metadata

Labels

  • SSoC26
  • backend (Backend issues)
  • bug (Something isn't working)
  • easy (Easy difficulty)
