A practical, production-focused project to automatically crop scanned images using classical computer vision.
This project focuses on reliability, determinism, and scalability, inspired by professional archival tools.
## Contents

- Project Goal
- Quick Start (Web UI)
- Quick Start (Command Line)
- How the Crop Engine Works
- Project Structure
- Safety and Reliability
- Why Not FFmpeg
- AI Upscaling (Optional)
- Credits
- License
## Project Goal

To build a robust batch image cropping system that:
- Works on scanned photos and documents
- Handles uneven lighting and borders
- Avoids cutting faces or important content
- Works without GPU or heavy AI models
- Can be safely used in automation pipelines
This is a deterministic system relying on proven computer vision techniques. No AI guessing, no vision LLMs, no black-box magic, no heavy GPU usage.
## Quick Start (Web UI)

The app runs locally in your browser: a dark-themed, drag-and-drop interface for uploading scanned images or pointing at a local folder. This is the easiest way to use the tool.
You need Python 3.10 or newer installed on your system. You can check by running:
python --version
If you do not have Python, download it from python.org.
Open a terminal (Command Prompt, PowerShell, or any terminal) and navigate to the app folder inside this project:
cd "path/to/this/project/app"
Then install the required Python packages:
pip install -r requirements.txt
This only needs to be done once.
From the same app folder, run:
python run.py
You should see output like this:
[Scan Auto-Crop] Starting server at http://127.0.0.1:8000
[Scan Auto-Crop] Opening browser...
Your default browser will open automatically to the app.
You have two options inside the web app:
Option A: Drag and Drop
- Drag your scanned image files from a folder and drop them onto the upload area in the browser.
- The app uploads and processes them using the 5-tier crop engine.
- You will see a before/after comparison for every image.
- Click "Download" on any individual image to save the cropped version.
Option B: Paste a Folder Path
- At the bottom of the page, there is a text field labeled "Crop Folder".
- Paste the full path to your folder of scanned images. For example:
  C:\Users\YourName\Desktop\My Scans\Purple Flowers
- Click the "Crop Folder" button.
- The app processes every image in that folder and saves the cropped versions into a new folder next to the original, named {original folder} - Cropped.
- You will see statistics showing how many images were cropped successfully and which strategy was used for each.
- A crop-report.txt file is automatically saved inside the Cropped folder alongside the images.
Understanding the Strategy Badges
After processing, you will see colored badges like "Pro Contour", "Canny Edge", etc. Hover over any badge to see a tooltip explaining what that strategy does and when it is used.
Exporting the Report
Click the green "Export Report" button in the results view to download a .txt report file to your Downloads folder. This report contains:
- Summary statistics (total, cropped, unchanged, success rate)
- Strategy breakdown with percentages
- Full glossary explaining what each strategy means
- Per-image table with filenames, strategy used, and original vs cropped dimensions
If you used the "Crop Folder" option, the report is also automatically saved as crop-report.txt inside the Cropped output folder.
To stop the server, go back to the terminal where it is running and press Ctrl + C.
To restart the server (for example, after an update), press Ctrl + C to stop it, then run the start command again:
python run.py
The server must be restarted whenever you change any of the Python files (server.py, cropper.py). Changes to the frontend files (index.html, style.css, app.js) take effect on a browser refresh without needing a server restart.
## Quick Start (Command Line)

If you prefer using the terminal without a browser, you can run the crop engine directly from the command line.
Same as above. Navigate to app/ and run:
pip install -r requirements.txt
From the project root (not the app folder), run:
python app/core/cropper.py "path/to/your/scanned/images"
For example:
python app/core/cropper.py "scan project SRM/part2/Purple Flowers"
This will:
- Read every image file (JPG, PNG, BMP, TIFF) in that folder.
- Crop each one using the 5-tier fallback engine.
- Save all cropped images into a new folder called {folder name} - Cropped right next to the original.
- Print a summary of results showing which strategy was used for each image.
If you want the cropped images saved to a specific location, add a second argument:
python app/core/cropper.py "path/to/input/folder" "path/to/output/folder"
Example output:

Processing: scan project SRM\part2\Purple Flowers
==================================================
BATCH RESULTS: 265 images
pro_contour : 221 (83.4%)
canny_edge : 10 ( 3.8%)
pro_rect : 5 ( 1.9%)
original : 29 (10.9%)
==================================================
## How the Crop Engine Works

The core insight is simple:
Do not detect the photo. Detect the background.
Instead of trying to find the photo content (which varies wildly between images), we detect the flat, uniform scanner background and crop it away. This is the standard principle used in professional document digitization.
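As a rough illustration of the "detect the background" idea, the following sketch checks whether the outer border strips of a grayscale scan are flat (near-zero variance), which is what a uniform scanner background looks like. The function name and thresholds here are hypothetical, not part of the project's API:

```python
import numpy as np

def border_is_flat(gray: np.ndarray, strip: int = 20, tol: float = 3.0) -> bool:
    """Illustrative check: a uniform scanner background should show
    near-zero standard deviation in the outer border strips."""
    strips = [gray[:strip, :], gray[-strip:, :],
              gray[:, :strip], gray[:, -strip:]]
    return all(float(s.std()) < tol for s in strips)
```

A scan whose borders fail this test is exactly the kind of image where the later fallback tiers earn their keep.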
The engine tries five different strategies in order. If one fails, it falls back to the next. If all five fail, the original image is kept untouched (zero data loss).
| Tier | Strategy | How It Works | When It Helps |
|---|---|---|---|
| 1 | Otsu + RETR_EXTERNAL | Binarizes the image using Otsu's automatic threshold, cleans up with morphological closing, then finds the outermost contour only. | Works for most scanned photos with clear borders. This is the primary strategy. |
| 2 | Canny Edge Detection | Detects physical edges regardless of fill color using gradient-based edge detection. | Catches "snow photos" where the photo content is white-on-white. |
| 3 | Variance-based | Calculates local pixel variance. Scanner background has near-zero variance, while real photos have texture. | Handles stubborn cases where both Otsu and Canny fail. |
| 4 | Saturation-based | Scanner backgrounds are pure neutral gray (zero color saturation). Real photos, even snow scenes, have slight color casts. | Works when the image is almost entirely white but has subtle color. |
| 5 | Gradient Line Scan | Scans from each edge inward looking for the first row or column with significant gradient changes. | Last resort for finding a physical photo border. |
Before the strategies run, every image goes through:
- Grayscale conversion - Simplifies the pixel data.
- Gaussian blur - Reduces scanner noise that could create false edges.
- Edge clearing - Zeros out the outermost 10 pixels of the mask to break scanner noise that "tethers" the photo to the edge.
## Project Structure

project root/
|
|-- app/ (the web app - everything you need)
| |-- core/
| | |-- __init__.py
| | |-- cropper.py (the crop engine, also works as CLI)
| |
| |-- frontend/
| | |-- index.html (web UI)
| | |-- style.css (green + black theme)
| | |-- app.js (drag and drop logic)
| |
| |-- server.py (FastAPI backend)
| |-- run.py (starts server + opens browser)
| |-- requirements.txt (Python dependencies)
|
|-- batch_crop_pro.py (original standalone crop script)
|-- batch_crop_opencv.py (legacy - multi-strategy approach)
|-- batch_crop_safe.py (legacy - conservative border trim)
|-- batch_crop_final.py (legacy - PIL-based approach)
|-- batch_crop_aggressive.py (legacy - aggressive mean threshold)
|-- batch_processor.py (legacy - ImageChops fuzz method)
|-- batch_processor_fixed.py (legacy - ImageMagick wrapper)
|-- analyze_crops.py (compare original vs cropped sizes)
|-- list_uncropped.py (find images that were not cropped)
|-- enhance_images.py (AI upscaling helper)
|-- upscale_scans.py (AI upscaling with Upscayl)
|-- setup_upscayl.py (extract Upscayl binary from zip)
|
|-- tools/ (external binaries, gitignored)
|-- README.md
|-- .gitignore
Note: The app/ folder is the main working directory for the web app. The Python scripts at the root level are the original standalone batch processors from earlier development. They still work, but app/core/cropper.py is the refined version of the same engine.
## Safety and Reliability

This pipeline uses multi-level fallback logic:
- Primary: Try external contour crop (exact shape).
- Fallback: If the contour is irregular, switch to bounding-box crop (rectangular safety).
- Fail-safe: If still uncertain, keep the original image unchanged.
This guarantees zero accidental data loss. An image is never destroyed or over-cropped. The worst case is that it stays untouched and you crop it manually.
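The cascade above reduces to a simple loop. This is a deliberately simplified sketch (the names and the strategy signature are hypothetical, not the real engine's), but it shows the fail-safe property: if every strategy declines, the caller gets the original back:

```python
from typing import Callable, List, Optional, Tuple

# A strategy returns a crop box (x, y, w, h), or None when it is not
# confident enough to crop. Returning None is never an error here.
Box = Tuple[int, int, int, int]
Strategy = Callable[[object], Optional[Box]]

def crop_with_fallback(image, strategies: List[Tuple[str, Strategy]]):
    """Try each named strategy in order; fall back to 'original'."""
    for name, strategy in strategies:
        box = strategy(image)
        if box is not None:
            return name, box
    # Fail-safe: no strategy was confident, keep the image untouched.
    return "original", None
```

This structure is also what makes the per-image strategy badges in the UI possible: the winning tier's name travels with the result.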
## Why Not FFmpeg

Contributors might ask: "Why build a custom Python script instead of using FFmpeg's cropdetect filter?"
We analyzed FFmpeg, and while it is excellent for video, it is unsafe for scanned photo archives:
- Simple thresholding failure: FFmpeg relies on simple color difference. It fails on snow photos (white content on white background) or tethered edges (scanner noise), leading to aggressive over-cropping.
- No edge clearing: FFmpeg cannot distinguish between the actual photo edge and scanner dust or artifacts.
- Risk of data loss: Our custom pipeline uses a multi-tier fallback and refuses to crop if uncertain. FFmpeg would simply chop the image, potentially destroying original data.
Verdict: Our custom OpenCV pipeline achieves approximately 89% automated accuracy with 100% safety (zero data loss), whereas FFmpeg poses a high risk of data destruction for this specific dataset.
## AI Upscaling (Optional)

We included a helper script upscale_scans.py for those who want to enhance the original uncropped scans using AI upscaling via Upscayl.
The upscaling tools are not included in this repo (too large). You must download the upscayl-bin executable and models separately.
- Download Upscayl binary and models.
- Place them in tools/ext/ so the structure looks like:
tools/ext/
|-- upscayl-bin.exe
|-- models/
|-- upscayl-lite-4x/
|-- upscayl-standard-4x/
|-- ...
The script is currently configured for low VRAM systems (such as integrated graphics):
- Model: Defaults to upscayl-lite-4x (fast, low memory).
- Tiling: Uses -t 200 to prevent memory crashes on large scans.
To improve quality (if you have a dedicated GPU):
- Open upscale_scans.py.
- Change source_model_name to "upscayl-standard-4x" or "ultrasharp-4x".
- Reduce tiling (for example, -t 32, or remove -t entirely) if you have more than 4GB of VRAM.
## Credits

This project relies on standard open-source libraries:
- OpenCV: All image processing and computer vision tasks.
- NumPy: High-performance matrix and array operations.
- Pillow: Image file I/O and thumbnail generation.
- FastAPI: Web API framework for the server.
- Uvicorn: ASGI server to run FastAPI.
Logic and methods used (Otsu's Thresholding, Canny Edge Detection, Morphological Operations) are standard algorithms in the field of computer vision.
## License

MIT -- free to use, modify, and improve.
