feat: iOS OCR Server with hOCR support + preserve document owner, permissions and document_type by vistalba · Pull Request #973 · icereed/paperless-gpt

vistalba · 2026-05-18T10:47:46Z

Summary

Two features plus a bug fix:

iOS OCR Server (Apple Vision Framework) — new ios_ocr provider with
hOCR support for searchable PDF generation. Image-mode only.
Owner/permissions preservation — new PDF_PRESERVE_OWNER_PERMISSIONS
env var restores the original document's owner and permissions on uploaded
PDFs via an async background queue.
Bug fix — variable shadowing in ocr.go:239 that silently skipped PDF
generation for all hOCR-capable providers in image mode.

Files Changed (14 files, +1076 / -41)

File	Description
`ocr/iosocr_provider.go`	New — iOS OCR provider with multipart upload, hOCR support
`ocr/iosocr_provider_test.go`	New — 9 test cases
`ocr/provider.go`	Config struct + factory case
`permissions.go`	New — async in-memory restore queue
`paperless.go`	`PatchDocument`, `GetTaskStatus` array fix, `full_perms=true`
`types.go`	Permission types, `PendingPermissionRestore` struct
`ocr.go`	Validation, async enqueue, helpers, variable shadowing fix
`main.go`	Env vars, App fields, mode compatibility
`background.go` / `jobs.go`	Hook into background loop
`README.md`	Provider docs, env vars table, trade-off note
Test files	Mocks + test cases

Design Decision & Trade-off

The permission restore queue is in-memory. If paperless-gpt crashes between
upload and restore, the new document keeps default permissions. Acceptable
because the vulnerability window is small, the API key holder already has full
access, and the rest of the OCR pipeline is also in-memory.

Configuration

Variable	Default
`IOS_OCR_SERVER_URL`	(required for `ios_ocr`)
`IOS_OCR_SERVER_TIMEOUT`	60
`PDF_PRESERVE_OWNER_PERMISSIONS`	false

Summary by CodeRabbit

New Features
- Added iOS OCR Server as a new OCR provider (image-mode only).
- Option to preserve and asynchronously restore PDF owner/permissions after processed-PDF upload.
Documentation
- Updated OCR docs, provider compatibility, and env var docs; hOCR/PDF text-layer support noted for both Google Document AI and iOS OCR Server.
Tests
- Added unit tests for the iOS OCR provider, HOCR generation, and permission-restore flow.
Notes
- Permission-restore uses an in-memory retry queue and can be lost on crashes/restarts.

…adProcessedPDF

Add a new OCR provider (ios_ocr) that sends images to the iOS OCR Server app running on iPhone via its /upload API and extracts the recognized text. - New environment variable: IOS_OCR_SERVER_URL (required) and IOS_OCR_SERVER_TIMEOUT (optional, default 60s) - Supports image processing mode only - Returns plain OCR text, mirroring all existing provider behavior - Full test coverage following existing patterns

Bump musl-dev to 1.2.5-r11 and mupdf/mupdf-dev to 1.24.10-r1 to match current Alpine 3.21 repository.

Removes pinned package versions from Dockerfile to avoid build failures when Alpine repos update.

This reverts commit bc6cbf8.

…POST retries Also adds docstring to newIosOcrProvider to satisfy docstring coverage threshold. The background loop in background.go already handles document-level retry via exponential backoff, so HTTP-level retry is redundant and harmful for non-idempotent POST /upload.

Implements the HOCRCapable interface on IosOcrProvider to translate the iOS OCR Server's ocr_boxes array into hOCR Page structures. - Adds IosOcrBox struct for typed box parsing - Adds buildHOCRPage helper: sorts words top-to-bottom, groups into lines by Y-proximity, builds hOCR hierarchy (Page → Lines → Words) - Adds parseOcrBoxes helper for safe interface{} parsing via marshal/unmarshal round-trip - Implements IsHOCREnabled, GetHOCRPages, GetHOCRDocument, ResetHOCR for the HOCRCapable interface - hOCR page creation is gated by config.EnableHOCR (ENABLE_HOCR env var) - Includes unit tests for hOCR page building, box parsing, and end-to-end ProcessImage hOCR flow Follow-up to the base ios_ocr provider PR.

The := at line 239 created a local imagePaths variable that shadowed the outer declaration (line 137), causing processed_pages=0 at line 366. This prevented PDF generation in image mode for all hOCR-capable providers. Changed to pre-declare imgPageCount and err, then use = for assignment, so the outer imagePaths is used consistently. This is a behavior change: PDF generation in image mode now works when it was previously silently broken.

…pport

… parseOcrBoxes, README contradictions

… related_document

… owner and permissions on upload - New env var PDF_PRESERVE_OWNER_PERMISSIONS (bool, default false) - Poll task + PATCH owner/permissions independently of ReplaceOriginal - Deletion still gated by ReplaceOriginal only - Added validation: PreserveOwnerPermissions requires UploadPDF=true - Updated all OCROptions construction sites and tests

- Replace synchronous 60s timeout polling with async background queue - New file permissions.go with enqueuePermissionRestore and processPendingPermissionRestores - uploadProcessedPDF enqueues restore task when PreserveOwnerPermissions=true - Background loop processes queue every ~10s, retries indefinitely with 24h expiry - ReplaceOriginal block unchanged (sync poll + PATCH + delete for crash-safety) - extractDocIDFromTask / buildPatchFields / patchNewDocumentPermissions helpers added - PR note: queue is in-memory, lost on pod restart during the upload-to-PATCH window

… and helpers - PendingPermissionRestore.Permissions was *PermissionSet, should be *Permissions - Added missing 'time' import to types.go - Fixed enqueuePermissionRestore, buildPatchFields, patchNewDocumentPermissions signatures

- New env var table entry - Trade-off warning in Metadata Copying Limitations section - Added success log in processPendingPermissionRestores

coderabbitai · 2026-05-18T10:48:09Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds an iOS OCR Server provider with optional hOCR, threads owner/permissions through types and the Paperless client, enqueues in-memory async permission-restore jobs after processed-PDF uploads, patches permissions on ReplaceOriginal success, and wires config, background processing, tests, and README updates.

Changes

iOS OCR Server & Permission Restoration

Layer / File(s)	Summary
Documentation `README.md`	Documents `ios_ocr` provider, adds TOC entry, provider compatibility row, hOCR/enhanced-PDF notes, and `PDF_PRESERVE_OWNER_PERMISSIONS` trade-off and env vars.
Data Model & Contracts `types.go`	Adds `PermissionSet`/`Permissions`, `Owner`/`Permissions` to API/document structs, `PendingPermissionRestore`, `OCROptions.PreserveOwnerPermissions`, and `ClientInterface.PatchDocument`.
Paperless API Client Integration `paperless.go`	`GetDocument` requests `?full_perms=true`; new `PatchDocument` method; `GetTaskStatus` parses array responses and returns first element.
iOS OCR Provider Implementation `ocr/iosocr_provider.go`, `ocr/provider.go`	New `IosOcrProvider` with configurable timeout, multipart upload, response validation, OCRResult, optional hOCR page generation, and provider wiring.
iOS OCR Provider Tests `ocr/iosocr_provider_test.go`	Tests multipart uploads, success/failure/JSON/error handling, conditional hOCR creation, `buildHOCRPage` and `parseOcrBoxes`.
OCR Processing Pipeline `ocr.go`, `ocr_test.go`	Validates `PreserveOwnerPermissions` requires `UploadPDF`; include `document_type` in upload metadata; enqueue permission-restore after upload; immediate patch on ReplaceOriginal SUCCESS; helper functions added; tests updated for task-array and new OCROptions cases.
Background Permission Restoration Queue `permissions.go`, `background.go`	`enqueuePermissionRestore` records restores; `processPendingPermissionRestores` drains queue, expires >24h, checks task status, patches on SUCCESS, logs on FAILURE, re-queues pending entries; integrated into StartBackgroundTasks cycle.
Application Config & Initialization `main.go`	Parses `IOS_OCR_SERVER_URL`/`IOS_OCR_SERVER_TIMEOUT` and `PDF_PRESERVE_OWNER_PERMISSIONS`; adds `App` fields for pending restores; enforces `ios_ocr` image-only mode and validates required env var.
Test Infrastructure & Mocks `app_llm_test.go`, `background_test.go`	Adds `PatchDocument` mocks; updates task-response fixtures and OCROptions validation tests to cover PreserveOwnerPermissions scenarios.
Code Formatting `jobs.go`, `background.go`	Reformats `OCROptions` struct literal alignment without logic changes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

icereed/paperless-gpt#357: Also extends OCR provider wiring and adds a new provider variant (docling), related at provider-factory level.
icereed/paperless-gpt#321: Background task orchestration changes touching StartBackgroundTasks and related tests.
icereed/paperless-gpt#348: Related hOCR pipeline work and multi-page OCR provider integration.

Suggested labels

safe-to-test

Suggested reviewers

chetan

"I hop, I parse, I send a bite,
Images whisper letters bright.
Queued restores keep owners near,
Pages stitched and permissions clear.
A rabbit cheers—OCR takes flight!"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and specifically summarizes the main changes: iOS OCR Server support with hOCR capability, plus the ability to preserve document owner and permissions.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

types.go (1)

79-79: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Make document_type nullable in these structs.

These fields are decoded directly from the Paperless document payload, and this file already hints that document_type may be absent/null. If that happens, unmarshalling into int will fail on Line 79, and Line 102 also loses the distinction between “unset” and a real ID. Using *int here is safer for the new preserve-document-type flow.

💡 Suggested change

 type GetDocumentApiResponse struct {
 	ID               int                   `json:"id"`
 	Correspondent    int                   `json:"correspondent"`
-	DocumentType     int                   `json:"document_type"`
+	DocumentType     *int                  `json:"document_type"`
 	Title            string                `json:"title"`
 	Content          string                `json:"content"`
 	Tags             []int                 `json:"tags"`
 	CreatedDate      string                `json:"created_date"`
 	OriginalFileName string                `json:"original_file_name"`
 	Notes            []interface{}         `json:"notes"`
 	CustomFields     []CustomFieldResponse `json:"custom_fields"`
 	Owner            *int                  `json:"owner"`
 	Permissions      *Permissions          `json:"permissions,omitempty"`
 }

 type Document struct {
 	ID               int                   `json:"id"`
 	Title            string                `json:"title"`
 	Content          string                `json:"content"`
 	Tags             []string              `json:"tags"`
 	Correspondent    string                `json:"correspondent"`
 	CreatedDate      string                `json:"created_date"`
 	OriginalFileName string                `json:"original_file_name"`
 	DocumentTypeName string                `json:"document_type_name"`
-	DocumentType     int                   `json:"document_type"`
+	DocumentType     *int                  `json:"document_type"`
 	CustomFields     []CustomFieldResponse `json:"custom_fields"`
 	Owner            *int                  `json:"owner"`
 	Permissions      *Permissions          `json:"permissions,omitempty"`
 }

Also applies to: 102-102

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@types.go` at line 79, Change the DocumentType fields from int to *int in the
affected structs so JSON null/absent values decode without error: locate the
DocumentType field declarations (currently `DocumentType int
\`json:"document_type"\``) and change their type to `*int`; update any places
that read/write those fields (e.g., the preserve-document-type flow and any code
that compares or assigns DocumentType) to handle nil vs non-nil (use nil-checks
before dereferencing or treat nil as “unset”) and keep the json tag intact.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ocr_test.go`:
- Line 92: The for-loop range uses the undefined identifier `tc`; change both
occurrences of `for _, tc := range tc {` in ocr_test.go to iterate over the
actual slice `testCases` (i.e., `for _, tc := range testCases {`) so the loop
variable `tc` is defined; update both instances (around the loops that iterate
testCases in the test functions) to fix the compile error.

In `@ocr.go`:
- Around line 633-643: The function extractDocIDFromTask currently falls back to
reading taskStatus["id"] (the task's internal ID) which is incorrect; change it
to only parse related_document and support both string and numeric forms: remove
the fallback branch that checks taskStatus["id"], keep and expand the existing
related_document handling in extractDocIDFromTask to accept numeric types (e.g.,
float64/int) as well as strings, converting them to int and returning (id,
true); if related_document is absent or cannot be parsed return (0, false).

In `@ocr/iosocr_provider.go`:
- Around line 234-252: GetHOCRDocument currently uses p.hocrPages in append
order which can be out-of-order; after copying p.hocrPages into the local pages
slice (inside GetHOCRDocument) sort the pages by the PageNumber field (e.g.,
using sort.Slice or sort.SliceStable on pages comparing pages[i].PageNumber <
pages[j].PageNumber) before constructing the hocr.HOCR struct so the resulting
document pages are in numeric order; ensure you add the sort import if missing
and reference p.hocrPages and the local pages slice in your change.

In `@permissions.go`:
- Around line 71-78: The loop currently calls
app.patchNewDocumentPermissions(ctx, taskStatus, entry.Owner, entry.Permissions,
logger) but always marks the queue entry as processed (processed++ and
logger.Info) even if patching failed; change patchNewDocumentPermissions to
return an error (or success bool) instead of swallowing failures, then check
that return value in the SUCCESS case before calling logger.Info and
incrementing processed; if the call returns an error, log the error via the
existing logger and do not increment processed so the entry remains for retry.
Ensure you update callers and the patchNewDocumentPermissions signature
accordingly.

In `@README.md`:
- Around line 388-390: The README now documents a fifth provider ("### 5. iOS
OCR Server") but the earlier description still says the project supports "four
different OCR providers"; update that sentence to either increment the count to
five or, better, rephrase it to avoid hardcoding a number (e.g., "multiple OCR
providers" or "several OCR providers") so the text stays correct when providers
are added; look for the sentence that contains "four different OCR providers"
and change it accordingly.
- Around line 397-407: The README update about iOS hOCR/searchable PDF is
missing the prerequisite flag; update the relevant sections (around the iOS OCR
configuration block and the later Enhanced OCR Features paragraph) to explicitly
state that ENABLE_HOCR must be set to true (e.g., "ENABLE_HOCR=true") for hOCR
output to be produced by the iOS OCR provider, and add it as an item in the
Configuration block alongside OCR_PROVIDER/IOS_OCR_SERVER_URL and in the
explanation near the Enhanced OCR Features reference so users know the gate is
required.

---

Outside diff comments:
In `@types.go`:
- Line 79: Change the DocumentType fields from int to *int in the affected
structs so JSON null/absent values decode without error: locate the DocumentType
field declarations (currently `DocumentType int \`json:"document_type"\``) and
change their type to `*int`; update any places that read/write those fields
(e.g., the preserve-document-type flow and any code that compares or assigns
DocumentType) to handle nil vs non-nil (use nil-checks before dereferencing or
treat nil as “unset”) and keep the json tag intact.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3b3b10c4-3f0f-44be-8d02-aec1aca5a167

📥 Commits

Reviewing files that changed from the base of the PR and between 94c8428 and 8d1f665.

📒 Files selected for processing (14)

README.md
app_llm_test.go
background.go
background_test.go
jobs.go
main.go
ocr.go
ocr/iosocr_provider.go
ocr/iosocr_provider_test.go
ocr/provider.go
ocr_test.go
paperless.go
permissions.go
types.go

- Rename tc to testCases in ocr_test.go for clarity - Remove taskStatus[id] fallback in extractDocIDFromTask - Make patchNewDocumentPermissions return error for retry logic - Sort hocrPages by PageNumber in GetHOCRDocument - Fix 'four different OCR providers' in README

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ocr_test.go`:
- Around line 200-206: The mock task currently sets "id" equal to documentID so
the test won't catch regressions in extractDocIDFromTask; change the "id" field
in the json.NewEncoder(w).Encode payload to use the distinct taskID (e.g. "id":
fmt.Sprintf("%d", taskID) or the taskID variable) while keeping
"related_document" as fmt.Sprintf("%d", documentID) so id and related_document
differ and the test actually verifies extractDocIDFromTask's behavior.

In `@ocr.go`:
- Around line 607-609: The replacement flow always calls
patchNewDocumentPermissions causing owner/permission restoration even when
PreserveOwnerPermissions is false; guard that call by checking the
PreserveOwnerPermissions flag before invoking app.patchNewDocumentPermissions
(use the same ctx, taskStatus, originalDoc.Owner, originalDoc.Permissions,
logger parameters), so patchNewDocumentPermissions runs only when
PreserveOwnerPermissions is true and otherwise is skipped.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dde6162c-1c36-49ed-aa10-770bd800ad7a

📥 Commits

Reviewing files that changed from the base of the PR and between 25e5689 and 1fa994d.

📒 Files selected for processing (5)

README.md
ocr.go
ocr/iosocr_provider.go
ocr_test.go
permissions.go

✅ Files skipped from review due to trivial changes (1)

README.md

🚧 Files skipped from review as they are similar to previous changes (2)

permissions.go
ocr/iosocr_provider.go

coderabbitai · 2026-05-18T11:21:50Z

+		json.NewEncoder(w).Encode([]map[string]interface{}{
+			{
+				"id":               documentID,
+				"status":           "SUCCESS",
+				"task_id":          taskID,
+				"related_document": fmt.Sprintf("%d", documentID),
+				"result":           fmt.Sprintf("Success. New document id %d created", documentID),


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make the mocked task id differ from related_document.

This fixture would still pass if extractDocIDFromTask regressed to using id, because both fields resolve to the same value here. Give id a distinct task-record value so the test actually guards the bug this PR fixed.

Suggested fix

json.NewEncoder(w).Encode([]map[string]interface{}{ { - "id": documentID, + "id": 9999, "status": "SUCCESS", "task_id": taskID, "related_document": fmt.Sprintf("%d", documentID), "result": fmt.Sprintf("Success. New document id %d created", documentID), },

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

json.NewEncoder(w).Encode([]map[string]interface{}{

{

"id": documentID,

"status": "SUCCESS",

"task_id": taskID,

"related_document": fmt.Sprintf("%d", documentID),

"result": fmt.Sprintf("Success. New document id %d created", documentID),

json.NewEncoder(w).Encode([]map[string]interface{}{

{

"id": 9999,

"status": "SUCCESS",

"task_id": taskID,

"related_document": fmt.Sprintf("%d", documentID),

"result": fmt.Sprintf("Success. New document id %d created", documentID),

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ocr_test.go` around lines 200 - 206, The mock task currently sets "id" equal to documentID so the test won't catch regressions in extractDocIDFromTask; change the "id" field in the json.NewEncoder(w).Encode payload to use the distinct taskID (e.g. "id": fmt.Sprintf("%d", taskID) or the taskID variable) while keeping "related_document" as fmt.Sprintf("%d", documentID) so id and related_document differ and the test actually verifies extractDocIDFromTask's behavior.

✅ Addressed in commit 957f65d

coderabbitai · 2026-05-18T11:21:51Z

+				// Restore owner and permissions on the new document
+				// Error is logged by patchNewDocumentPermissions — still proceed with deletion
+				_ = app.patchNewDocumentPermissions(ctx, taskStatus, originalDoc.Owner, originalDoc.Permissions, logger)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Gate the replacement-time permission PATCH behind PreserveOwnerPermissions.

This runs on every successful replacement, even when PreserveOwnerPermissions is false. If the source document has owner/permission fields, the new option becomes ineffective and replacement silently preserves them anyway.

Suggested fix

- // Restore owner and permissions on the new document - // Error is logged by patchNewDocumentPermissions — still proceed with deletion - _ = app.patchNewDocumentPermissions(ctx, taskStatus, originalDoc.Owner, originalDoc.Permissions, logger) + if options.PreserveOwnerPermissions { + // Restore owner and permissions on the new document + // Error is logged by patchNewDocumentPermissions — still proceed with deletion + _ = app.patchNewDocumentPermissions(ctx, taskStatus, originalDoc.Owner, originalDoc.Permissions, logger) + }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// Restore owner and permissions on the new document

// Error is logged by patchNewDocumentPermissions — still proceed with deletion

_ = app.patchNewDocumentPermissions(ctx, taskStatus, originalDoc.Owner, originalDoc.Permissions, logger)

if options.PreserveOwnerPermissions {

// Restore owner and permissions on the new document

// Error is logged by patchNewDocumentPermissions — still proceed with deletion

_ = app.patchNewDocumentPermissions(ctx, taskStatus, originalDoc.Owner, originalDoc.Permissions, logger)

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ocr.go` around lines 607 - 609, The replacement flow always calls patchNewDocumentPermissions causing owner/permission restoration even when PreserveOwnerPermissions is false; guard that call by checking the PreserveOwnerPermissions flag before invoking app.patchNewDocumentPermissions (use the same ctx, taskStatus, originalDoc.Owner, originalDoc.Permissions, logger parameters), so patchNewDocumentPermissions runs only when PreserveOwnerPermissions is true and otherwise is skipped.

✅ Addressed in commit 957f65d

…ssions flag; distinguish mock task id from document id in test

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

ocr.go (1)

583-625: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Only delete the source document after a confirmed SUCCESS.

Line 624 still deletes the original when polling times out or when status couldn't be confirmed. UploadDocument only means Paperless accepted the task; it does not guarantee the new document was created successfully. In a slow or delayed-processing case, this can drop the only copy of the document.

Suggested fix

-		for i := 0; i < maxRetries; i++ {
+		deleteOriginal := false
+		for i := 0; i < maxRetries; i++ {
 			taskStatus, err := app.Client.GetTaskStatus(ctx, taskID)
 			if err != nil {
-				logger.WithError(err).Warn("Failed to check task status, proceeding with deletion anyway")
+				logger.WithError(err).Warn("Failed to check task status, keeping original document")
 				break
 			}
@@
 			if status == "SUCCESS" {
 				logger.Info("Document processing completed successfully")
+				deleteOriginal = true
 
 				if options.PreserveOwnerPermissions {
 					_ = app.patchNewDocumentPermissions(ctx, taskStatus, originalDoc.Owner, originalDoc.Permissions, logger)
 				}
 
 				break
 			}
@@
-		// Delete original document (even if poll timed out — upload was successful)
-		if err := app.Client.DeleteDocument(ctx, documentID); err != nil {
+		if !deleteOriginal {
+			return fmt.Errorf("document processing did not reach SUCCESS, not deleting original document")
+		}
+		if err := app.Client.DeleteDocument(ctx, documentID); err != nil {
 			return fmt.Errorf("error deleting original document: %w", err)
 		}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ocr.go` around lines 583 - 625, Current logic may call
app.Client.DeleteDocument even when polling timed out or task status is unknown;
change the flow so DeleteDocument is only invoked after a confirmed "SUCCESS"
status from app.Client.GetTaskStatus. Specifically, in the ReplaceOriginal
branch (where options.ReplaceOriginal is true) update the polling loop around
app.Client.GetTaskStatus/taskStatus so that you set a local flag (e.g.,
processingSucceeded) when status == "SUCCESS" (and run
app.patchNewDocumentPermissions there if PreserveOwnerPermissions), return an
error immediately if status == "FAILURE", and after the loop only call
app.Client.DeleteDocument(ctx, documentID) if processingSucceeded is true;
otherwise skip deletion and return a non-destructive error indicating the
document was not confirmed successful.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@ocr.go`:
- Around line 583-625: Current logic may call app.Client.DeleteDocument even
when polling timed out or task status is unknown; change the flow so
DeleteDocument is only invoked after a confirmed "SUCCESS" status from
app.Client.GetTaskStatus. Specifically, in the ReplaceOriginal branch (where
options.ReplaceOriginal is true) update the polling loop around
app.Client.GetTaskStatus/taskStatus so that you set a local flag (e.g.,
processingSucceeded) when status == "SUCCESS" (and run
app.patchNewDocumentPermissions there if PreserveOwnerPermissions), return an
error immediately if status == "FAILURE", and after the loop only call
app.Client.DeleteDocument(ctx, documentID) if processingSucceeded is true;
otherwise skip deletion and return a non-destructive error indicating the
document was not confirmed successful.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4ca7cc2d-4aba-499e-b5b9-c8c61703ffa7

📥 Commits

Reviewing files that changed from the base of the PR and between 1fa994d and 957f65d.

📒 Files selected for processing (2)

ocr.go
ocr_test.go

🚧 Files skipped from review as they are similar to previous changes (1)

ocr_test.go

vistalba · 2026-05-18T15:37:58Z

ocr.go (1)> 583-625: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Only delete the source document after a confirmed SUCCESS.
Line 624 still deletes the original when polling times out or when status couldn't be confirmed. UploadDocument only means Paperless accepted the task; it does not guarantee the new document was created successfully. In a slow or delayed-processing case, this can drop the only copy of the document.

NOTE (not addressed in this PR):
The unconditional DeleteDocument on polling timeout/error was pre-existing behavior. Fixing it would change the ReplaceOriginal contract for all providers and is outside the scope of this PR (iOS OCR + permission preservation).

Raise maxRetries from 12 to 720 (5s intervals) so that document processing has up to ~1 hour to complete before giving up on deleting the original. This accommodates slower paperless-ngx processing common with mobile uploads, AI-based classification, or large documents.

Previously, ReplaceOriginal would delete the original document even if the processing task timed out or returned an unknown status. This could cause data loss in edge cases where the new document was not fully processed. Now a deleteOriginal flag is set only on confirmed SUCCESS. The original document is preserved and an error is returned if processing did not reach SUCCESS.

vistalba added 21 commits May 17, 2026 11:38

fix: preserve owner and document_type when replacing document in uplo…

d3fa493

…adProcessedPDF

fix: preserve full owner and permissions via PATCH after document upload

d44450d

fix: update Alpine package versions in Dockerfile

3d14a8c

Bump musl-dev to 1.2.5-r11 and mupdf/mupdf-dev to 1.24.10-r1 to match current Alpine 3.21 repository.

fix: remove Alpine version pins to fix build

8c0862a

Removes pinned package versions from Dockerfile to avoid build failures when Alpine repos update.

docs: add iOS OCR Server to README

f7c2425

Revert "fix: remove Alpine version pins to fix build"

d23634f

This reverts commit bc6cbf8.

chore: restore original Dockerfile from upstream

9912d85

docs: add App Store link for iOS OCR Server

50e16af

fix: validate timeout > 0 and use bounded reader for iOS OCR responses

48306ba

docs: update Enhanced OCR Features section for iOS OCR Server hOCR su…

65102b7

…pport

fix: address code review feedback - brittle test string, nil guard in…

04dca66

… parseOcrBoxes, README contradictions

fix: add PatchDocument to ClientInterface and mock implementations

8248d97

fix: decode GetTaskStatus response as array, extract document ID from…

2b57f0a

… related_document

docs: add PDF_PRESERVE_OWNER_PERMISSIONS to README with trade-off note

8d1f665

- New env var table entry - Trade-off warning in Metadata Copying Limitations section - Added success log in processPendingPermissionRestores

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

Comment thread ocr_test.go Outdated

Comment thread ocr.go

Comment thread ocr/iosocr_provider.go

Comment thread permissions.go Outdated

Comment thread README.md

Comment thread README.md

vistalba added 2 commits May 18, 2026 12:55

style: gofmt

25e5689

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

fix: guard ReplaceOriginal permission restore with PreserveOwnerPermi…

957f65d

…ssions flag; distinguish mock task id from document id in test

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

vistalba mentioned this pull request May 18, 2026

feat: add iOS OCR Server provider #970

Closed

vistalba mentioned this pull request May 18, 2026

fix: only delete original document on confirmed SUCCESS when ReplaceOriginal is true #974

Closed

vistalba added 2 commits May 18, 2026 19:28

vistalba force-pushed the feat/ios-ocr-server-hocr branch from 17fcfc6 to d486e45 Compare May 19, 2026 06:06

Uh oh!

Conversation

vistalba commented May 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Files Changed (14 files, +1076 / -41)

Design Decision & Trade-off

Configuration

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

vistalba commented May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

vistalba commented May 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 18, 2026 •

edited

Loading

coderabbitai Bot May 18, 2026 •

edited

Loading

coderabbitai Bot May 18, 2026 •

edited

Loading