OltekOCR Desktop is an offline-first logistics document processing application built for desktop workflows. It ingests files, runs OCR/extraction locally, supports operator review, and exports approved results to Excel/CSV/JSON.
- Runs fully locally on your machine (no cloud dependency for core processing).
- Supports multiple extraction modes: OCR, table/field extraction, and PDF contract extraction.
- Processes files through a FIFO queue with pause/resume/cancel.
- Streams live processing status to the UI through WebSocket updates.
- Provides review workflows (approve/reject/reprocess) before export.
- Stores runtime data in local SQLite via Prisma.
| Layer | Technology |
|---|---|
| Desktop shell | Electron 28 |
| Frontend | React 18 + TypeScript + React Router + Vite |
| UI | TailwindCSS v3 + Radix UI + shadcn patterns |
| Backend (in-process) | NestJS v10 (embedded in Electron main process) |
| Realtime | WebSocket (ws) via @nestjs/platform-ws |
| Database | SQLite + Prisma ORM |
| OCR sidecar | Python + RapidOCR (ocr_rapidocr.py) |
| PDF extraction sidecar | Python dispatcher (pdf_extract.py) with docling / pdfplumber / pymupdf / unstructured |
| Contract extraction sidecar | Python (pdf_contract_extract_dynamic.py / pdf_contract_extract.py) |
| TABLE_EXTRACT QA sidecar | Python + Ollama (qa_ollama.py) |
Electron Main Process
|- NestJS API server: http://localhost:3847/api
|- WebSocket gateway: ws://localhost:3847/ws
|- Prisma -> SQLite (./data/oltekocr.db)
|- Python sidecars spawned per processing task
Electron Renderer (React)
|- Calls REST API on localhost:3847/api
|- Subscribes to WebSocket events for queue/doc updates
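
The "sidecars spawned per processing task" pattern above can be illustrated with a small sketch. It is not the application's actual spawn code (which lives in the Electron main process); it only shows the general shape of running one short-lived process per task with a timeout. The `run_sidecar` helper and the JSON-on-stdout protocol are assumptions made for illustration, and the stand-in command replaces a real sidecar script.

```python
import json
import subprocess
import sys


def run_sidecar(command: list[str], timeout_s: float = 120.0) -> dict:
    """Run one sidecar process to completion and parse its stdout as JSON.

    Assumption (not confirmed by this README): the sidecar prints a single
    JSON object to stdout and signals failure with a non-zero exit code.
    """
    completed = subprocess.run(
        command,
        capture_output=True,
        text=True,
        timeout=timeout_s,  # raises subprocess.TimeoutExpired when exceeded
    )
    if completed.returncode != 0:
        raise RuntimeError(f"sidecar failed: {completed.stderr.strip()}")
    return json.loads(completed.stdout)


# Stand-in for a real sidecar script so the sketch runs anywhere.
fake_sidecar = [
    sys.executable,
    "-c",
    'import json; print(json.dumps({"text": "hello", "confidence": 97}))',
]
result = run_sidecar(fake_sidecar, timeout_s=30)
```

Spawning a fresh process per task keeps a crashed or hung extraction from taking the main process down with it; the timeout plays the same role as the `ocr.timeout` setting.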
| Module | Responsibility |
|---|---|
| sessions | Session CRUD, ingestion, mode/schema management |
| documents | Document CRUD, serving source files/thumbnails, status ops |
| queue | FIFO processing orchestration, pause/resume/cancel |
| ocr | OCR routing and sidecar orchestration |
| contract-extraction | PDF contract extraction pipeline |
| extraction | TABLE_EXTRACT field answering via Ollama sidecar |
| scanner | Folder watching / scanner stub endpoints |
| export | Export selected or all approved docs to Excel/CSV/JSON |
| settings | Persistent app settings in data/settings.json |
| models | PDF/Ollama model status + install helpers |
flowchart TD
A[Create Session] --> B[Ingest Files or Folder]
B --> C[Documents -> QUEUED]
C --> D[QueueService Dequeues FIFO]
D --> E[SCANNING / type detection]
E --> F[PROCESSING in sidecar]
F --> G[REVIEW]
G --> H[APPROVED]
G --> I[REJECTED]
H --> J[EXPORTED]
F --> K[ERROR]
- QUEUED: waiting in queue
- SCANNING: pre-processing / extraction type detection
- PROCESSING: OCR or extraction in progress
- CANCELLING: transitional state while cancel request is handled
- REVIEW: processed, waiting for operator decision
- APPROVED: accepted for export
- REJECTED: rejected by reviewer
- EXPORTED: exported to output file
- ERROR: pipeline failed
On startup, transient statuses (CANCELLING, SCANNING, PROCESSING) are recovered back to QUEUED.
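
The lifecycle and the startup recovery rule can be sketched as a small state table. This is an illustration only, not the application's code: the `Status`, `ALLOWED`, `can_transition` and `recover_on_startup` names are hypothetical, and the table includes only the edges drawn in the flowchart (cancel and reprocess paths are left out because their exact transitions are not described here).

```python
from enum import Enum


class Status(str, Enum):
    QUEUED = "QUEUED"
    SCANNING = "SCANNING"
    PROCESSING = "PROCESSING"
    CANCELLING = "CANCELLING"
    REVIEW = "REVIEW"
    APPROVED = "APPROVED"
    REJECTED = "REJECTED"
    EXPORTED = "EXPORTED"
    ERROR = "ERROR"


# Only the edges shown in the flowchart above.
ALLOWED = {
    Status.QUEUED: {Status.SCANNING},
    Status.SCANNING: {Status.PROCESSING},
    Status.PROCESSING: {Status.REVIEW, Status.ERROR},
    Status.REVIEW: {Status.APPROVED, Status.REJECTED},
    Status.APPROVED: {Status.EXPORTED},
}

# Statuses that cannot survive a restart because no worker owns them anymore.
TRANSIENT = {Status.CANCELLING, Status.SCANNING, Status.PROCESSING}


def can_transition(current: Status, target: Status) -> bool:
    """Return True if the flowchart draws an edge from current to target."""
    return target in ALLOWED.get(current, set())


def recover_on_startup(statuses: list[Status]) -> list[Status]:
    """Reset transient statuses to QUEUED; leave everything else untouched."""
    return [Status.QUEUED if s in TRANSIENT else s for s in statuses]


recovered = recover_on_startup(
    [Status.PROCESSING, Status.REVIEW, Status.CANCELLING]
)
```

Resetting interrupted documents to QUEUED means a crash or forced quit never leaves a document stuck in a state that no process is working on.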
| Mode | Purpose | Main Processor |
|---|---|---|
| OCR_EXTRACT | OCR text extraction for images/scanned PDFs | ocr_rapidocr.py or routed PDF extractor |
| TABLE_EXTRACT | Answer user-defined field questions from OCR text | OCR pipeline + qa_ollama.py |
| PDF_EXTRACT | Structured logistics PDF contract extraction | contract-extraction service + Python contract sidecar |
| JSON_EXTRACT | Reserved mode (currently mocked in new-session flow) | Not production-wired yet |
| Route | View |
|---|---|
| / | Sessions home (PDF extract focus) |
| /ocr-extract | Sessions home for OCR mode |
| /keyword-extract | Sessions home for TABLE mode |
| /pdf-sessions/:id | PDF session detail |
| /sessions/:id | General session detail |
- Windows 10/11 recommended (project is Windows-first).
- Node.js 18+.
- npm 9+.
- Python 3.10 to 3.13.
- Ollama running locally for TABLE_EXTRACT.
- Additional Python packages depending on extraction model.
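
A quick way to confirm that an interpreter falls inside the supported Python range is a small check like the one below. It is a convenience sketch, not part of the project; `is_supported_python` is a hypothetical helper name.

```python
import sys

MIN_VERSION = (3, 10)  # lowest supported major.minor
MAX_VERSION = (3, 13)  # highest supported major.minor


def is_supported_python(version: tuple) -> bool:
    """Compare only major.minor against the 3.10 to 3.13 range."""
    return MIN_VERSION <= tuple(version[:2]) <= MAX_VERSION


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    verdict = "supported" if is_supported_python(sys.version_info) else "unsupported"
    print(f"Python {major}.{minor} is {verdict}")
```

Run it with the same interpreter you plan to set as the OCR Python path, since that is the one the sidecars will use.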
git clone https://github.com/One-Team-One-Goal/oltekocr-desktop.git
cd oltekocr-desktop
npm install

The postinstall script runs electron-builder install-app-deps automatically.
Windows (PowerShell):

python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip setuptools wheel

macOS/Linux:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip setuptools wheel

OCR sidecar packages:

pip install rapidocr-onnxruntime pymupdf

PDF extraction sidecar packages:

pip install pdfplumber pymupdf

Optional model backends:

pip install docling unstructured

qa_ollama.py uses the local Ollama HTTP API. Install and run Ollama, then pull at least one model:

ollama serve
ollama pull qwen2.5:7b

You can select the model/version from the app settings/models UI.
Current behavior note: TABLE_EXTRACT execution is handled through the Ollama sidecar (qa_ollama.py).
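
Before running TABLE_EXTRACT, it can help to confirm that Ollama is reachable and that the model you selected is installed. The sketch below uses Ollama's `GET /api/tags` endpoint on its default port 11434, which lists locally available models. The helper names and the sample response are illustrative; this is not how `qa_ollama.py` itself is implemented.

```python
import json
import urllib.request

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # Ollama's default port


def fetch_tags(url: str = OLLAMA_TAGS_URL, timeout_s: float = 5.0) -> dict:
    """Fetch the list of local models; raises URLError if Ollama is not running."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return json.load(resp)


def installed_models(tags_response: dict) -> set:
    """Extract model names from an /api/tags response body."""
    return {model["name"] for model in tags_response.get("models", [])}


def has_model(tags_response: dict, model: str) -> bool:
    """Return True if the exact model tag is installed locally."""
    return model in installed_models(tags_response)


# Trimmed, hand-written sample of the response shape, used here so the
# sketch runs without a live Ollama instance.
sample = {"models": [{"name": "qwen2.5:7b"}, {"name": "llama3.2:latest"}]}
ready = has_model(sample, "qwen2.5:7b")
```

Against a live instance, `has_model(fetch_tags(), "qwen2.5:7b")` performs the same check using the real response.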
npx prisma generate
npx prisma db push

This creates data/oltekocr.db if it does not exist.

npm run dev

This starts Electron and the embedded NestJS server on port 3847.

npm run build

Packaged artifacts are written under dist/.
Created automatically under data/:
data/
oltekocr.db
settings.json
scans/
incoming/
thumbnails/
exports/
Base URL: http://localhost:3847/api
Sessions:

- POST /sessions
- GET /sessions
- GET /sessions/:id
- POST /sessions/:id/duplicate
- PATCH /sessions/:id/columns
- PATCH /sessions/:id/rename
- PATCH /sessions/:id/extraction-model
- DELETE /sessions/:id
- POST /sessions/:id/ingest/files
- POST /sessions/:id/ingest/folder
- GET /sessions/:id/documents
- GET /sessions/:id/stats
- GET /sessions/schema-presets
- POST /sessions/schema-presets

Documents:

- GET /documents
- GET /documents/stats
- GET /documents/:id
- GET /documents/:id/image
- GET /documents/:id/thumbnail
- POST /documents/load
- POST /documents/load-folder
- POST /documents/analyze-pdf-content
- POST /documents/extract-pdf-text
- PATCH /documents/:id
- PATCH /documents/:id/approve
- PATCH /documents/:id/reject
- PATCH /documents/:id/reprocess
- DELETE /documents/:id

Queue:

- GET /queue/status
- POST /queue/add
- POST /queue/pause
- POST /queue/resume
- POST /queue/cancel
- DELETE /queue

Export:

- POST /export
- POST /export/all-approved
- GET /export/history

Settings and scanner:

- GET /settings
- GET /settings/defaults
- PATCH /settings
- GET /scanner/list
- POST /scanner/scan
- POST /scanner/watch/start
- POST /scanner/watch/stop
- GET /scanner/watch/status
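
As a rough illustration of calling these endpoints from a script, the sketch below builds requests with the standard library. The `build_request` and `send` helpers and the document id `doc-123` are hypothetical, and request/response bodies are not documented in this README, so check the Swagger docs for the real schemas before sending payloads.

```python
import json
import urllib.request
from typing import Any, Optional

BASE_URL = "http://localhost:3847/api"


def build_request(
    method: str, path: str, body: Optional[dict] = None
) -> urllib.request.Request:
    """Build a request against the local API; JSON-encode the body if given."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(
        f"{BASE_URL}{path}", data=data, headers=headers, method=method
    )


def send(request: urllib.request.Request, timeout_s: float = 10.0) -> Any:
    """Send the request and decode a JSON response (requires the app to be running)."""
    with urllib.request.urlopen(request, timeout=timeout_s) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else None


# Requests are only constructed here, not sent, so the sketch runs offline.
status_req = build_request("GET", "/queue/status")
approve_req = build_request("PATCH", "/documents/doc-123/approve")  # hypothetical id
```

With the app running, `send(status_req)` would return whatever JSON the queue status endpoint produces.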
- Endpoint: ws://localhost:3847/ws
- Events:
  - queue:update
  - document:status
  - processing:progress
  - processing:log
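
A client consuming these events typically routes each incoming message to a handler by event name. The sketch below shows that routing in isolation, without a network connection. It assumes each message is a JSON object with `event` and `data` keys; that envelope and the `percent` payload field are assumptions for illustration, not a documented contract of this gateway.

```python
import json
from typing import Any, Callable, Dict, List

Handler = Callable[[Any], None]


class EventDispatcher:
    """Route raw WebSocket text frames to handlers keyed by event name."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def on(self, event: str, handler: Handler) -> None:
        """Register a handler for one event name."""
        self._handlers.setdefault(event, []).append(handler)

    def dispatch(self, raw_message: str) -> bool:
        """Parse one message and call its handlers; return False if none matched."""
        message = json.loads(raw_message)
        handlers = self._handlers.get(message.get("event"), [])
        for handler in handlers:
            handler(message.get("data"))
        return bool(handlers)


seen = []
dispatcher = EventDispatcher()
dispatcher.on("queue:update", lambda data: seen.append(("queue", data)))
dispatcher.on("processing:progress", lambda data: seen.append(("progress", data)))

# Hypothetical payload; the real field names may differ.
handled = dispatcher.dispatch(
    '{"event": "processing:progress", "data": {"percent": 40}}'
)
```

In a real client, `dispatch` would be called from the message callback of whichever WebSocket library you use.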
Swagger/OpenAPI docs are available at http://localhost:3847/api/docs while the app is running.
{
"scanner": {
"watchFolder": "./data/scans/incoming"
},
"ocr": {
"engine": "rapidocr",
"pdfModel": "pdfplumber",
"confidenceThreshold": 85,
"timeout": 120,
"pythonPath": "python"
},
"export": {
"defaultFormat": "excel"
},
"llm": {
"provider": "groq",
"defaultModel": "qwen3:30b"
}
}

npm run dev
npm run build
npm run preview
npm run typecheck:node
npm run typecheck:web
npm run prisma:generate
npm run prisma:push
npm run prisma:studio

- Python not found or OCR sidecar fails:
  - Activate the virtual environment.
  - Set the OCR Python path in settings to the exact interpreter path.
- TABLE_EXTRACT returns errors:
  - Ensure `ollama serve` is running and the selected model is installed.
- pdf_extract.py model import errors:
  - Install the package for that model (pdfplumber, pymupdf, docling, unstructured).
- Prisma/client mismatch:
  - Run `npx prisma generate`, then `npx prisma db push`.
- Port conflict on 3847:
  - Stop the existing process using the port and restart the app.
- The application is designed for local desktop execution.
- Queue processing is intentionally sequential for predictable resource usage.
- JSON_EXTRACT mode is present in types/UI but is not yet a production-complete pipeline.
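
The sequential queue behavior noted above can be sketched as a minimal FIFO with pause, resume and cancel. This is an illustration of the idea only, not the application's QueueService: the `SequentialQueue` class is hypothetical, and it only cancels items that are still waiting, whereas the real service also handles an in-flight document through the CANCELLING status.

```python
from collections import deque
from typing import Callable, Optional


class SequentialQueue:
    """Process one document at a time, in arrival order."""

    def __init__(self, process: Callable[[str], None]) -> None:
        self._pending = deque()
        self._process = process
        self.paused = False

    def add(self, doc_id: str) -> None:
        self._pending.append(doc_id)

    def pause(self) -> None:
        self.paused = True

    def resume(self) -> None:
        self.paused = False

    def cancel(self, doc_id: str) -> bool:
        """Remove a waiting document; return False if it was not queued."""
        try:
            self._pending.remove(doc_id)
            return True
        except ValueError:
            return False

    def run_next(self) -> Optional[str]:
        """Process the oldest waiting document unless paused or empty."""
        if self.paused or not self._pending:
            return None
        doc_id = self._pending.popleft()
        self._process(doc_id)
        return doc_id


processed = []
queue = SequentialQueue(processed.append)
for doc in ("a", "b", "c"):
    queue.add(doc)

queue.run_next()            # processes "a"
queue.pause()
skipped = queue.run_next()  # paused, so nothing runs
queue.cancel("b")           # "b" never gets processed
queue.resume()
queue.run_next()            # processes "c"
```

Handling one document at a time keeps memory and CPU usage bounded by a single sidecar run, which is the trade-off the sequential design makes against throughput.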
MIT