Skip to content

feat: iOS OCR Server with hOCR support + preserve document owner, permissions and document_type#973

Open
vistalba wants to merge 26 commits into
icereed:mainfrom
vistalba:feat/ios-ocr-server-hocr
Open

feat: iOS OCR Server with hOCR support + preserve document owner, permissions and document_type#973
vistalba wants to merge 26 commits into
icereed:mainfrom
vistalba:feat/ios-ocr-server-hocr

Conversation

@vistalba

@vistalba vistalba commented May 18, 2026

Copy link
Copy Markdown

Summary

Two features plus a bug fix:

  1. iOS OCR Server (Apple Vision Framework) — new ios_ocr provider with
    hOCR support for searchable PDF generation. Image-mode only.
  2. Owner/permissions preservation — new PDF_PRESERVE_OWNER_PERMISSIONS
    env var restores the original document's owner and permissions on uploaded
    PDFs via an async background queue.
  3. Bug fix — variable shadowing in ocr.go:239 that silently skipped PDF
    generation for all hOCR-capable providers in image mode.

Files Changed (14 files, +1076 / -41)

File Description
ocr/iosocr_provider.go New — iOS OCR provider with multipart upload, hOCR support
ocr/iosocr_provider_test.go New — 9 test cases
ocr/provider.go Config struct + factory case
permissions.go New — async in-memory restore queue
paperless.go PatchDocument, GetTaskStatus array fix, full_perms=true
types.go Permission types, PendingPermissionRestore struct
ocr.go Validation, async enqueue, helpers, variable shadowing fix
main.go Env vars, App fields, mode compatibility
background.go / jobs.go Hook into background loop
README.md Provider docs, env vars table, trade-off note
Test files Mocks + test cases

Design Decision & Trade-off

The permission restore queue is in-memory. If paperless-gpt crashes between
upload and restore, the new document keeps default permissions. Acceptable
because the vulnerability window is small, the API key holder already has full
access, and the rest of the OCR pipeline is also in-memory.

Configuration

Variable Default
IOS_OCR_SERVER_URL (required for ios_ocr)
IOS_OCR_SERVER_TIMEOUT 60
PDF_PRESERVE_OWNER_PERMISSIONS false

Summary by CodeRabbit

  • New Features

    • Added iOS OCR Server as a new OCR provider (image-mode only).
    • Option to preserve and asynchronously restore PDF owner/permissions after processed-PDF upload.
  • Documentation

    • Updated OCR docs, provider compatibility, and env var docs; hOCR/PDF text-layer support noted for both Google Document AI and iOS OCR Server.
  • Tests

    • Added unit tests for the iOS OCR provider, HOCR generation, and permission-restore flow.
  • Notes

    • Permission-restore uses an in-memory retry queue and can be lost on crashes/restarts.

Review Change Stack

vistalba added 21 commits May 17, 2026 11:38
Add a new OCR provider (ios_ocr) that sends images to the iOS OCR Server app running on iPhone via its /upload API and extracts the recognized text.

- New environment variable: IOS_OCR_SERVER_URL (required) and IOS_OCR_SERVER_TIMEOUT (optional, default 60s)
- Supports image processing mode only
- Returns plain OCR text, mirroring all existing provider behavior
- Full test coverage following existing patterns
Bump musl-dev to 1.2.5-r11 and mupdf/mupdf-dev to 1.24.10-r1 to match current Alpine 3.21 repository.
Removes pinned package versions from Dockerfile to avoid build failures when Alpine repos update.
…POST retries

Also adds docstring to newIosOcrProvider to satisfy docstring coverage threshold.
The background loop in background.go already handles document-level retry via exponential backoff, so HTTP-level retry is redundant and harmful for non-idempotent POST /upload.
Implements the HOCRCapable interface on IosOcrProvider to translate the iOS OCR Server's ocr_boxes array into hOCR Page structures.

- Adds IosOcrBox struct for typed box parsing
- Adds buildHOCRPage helper: sorts words top-to-bottom, groups into lines by Y-proximity, builds hOCR hierarchy (Page → Lines → Words)
- Adds parseOcrBoxes helper for safe interface{} parsing via marshal/unmarshal round-trip
- Implements IsHOCREnabled, GetHOCRPages, GetHOCRDocument, ResetHOCR for the HOCRCapable interface
- hOCR page creation is gated by config.EnableHOCR (ENABLE_HOCR env var)
- Includes unit tests for hOCR page building, box parsing, and end-to-end ProcessImage hOCR flow

Follow-up to the base ios_ocr provider PR.
The := at line 239 created a local imagePaths variable that shadowed the
outer declaration (line 137), causing processed_pages=0 at line 366.
This prevented PDF generation in image mode for all hOCR-capable providers.

Changed to pre-declare imgPageCount and err, then use = for assignment,
so the outer imagePaths is used consistently.

This is a behavior change: PDF generation in image mode now works when
it was previously silently broken.
… owner and permissions on upload

- New env var PDF_PRESERVE_OWNER_PERMISSIONS (bool, default false)
- Poll task + PATCH owner/permissions independently of ReplaceOriginal
- Deletion still gated by ReplaceOriginal only
- Added validation: PreserveOwnerPermissions requires UploadPDF=true
- Updated all OCROptions construction sites and tests
- Replace synchronous 60s timeout polling with async background queue
- New file permissions.go with enqueuePermissionRestore and processPendingPermissionRestores
- uploadProcessedPDF enqueues restore task when PreserveOwnerPermissions=true
- Background loop processes queue every ~10s, retries indefinitely with 24h expiry
- ReplaceOriginal block unchanged (sync poll + PATCH + delete for crash-safety)
- extractDocIDFromTask / buildPatchFields / patchNewDocumentPermissions helpers added
- PR note: queue is in-memory, lost on pod restart during the upload-to-PATCH window
… and helpers

- PendingPermissionRestore.Permissions was *PermissionSet, should be *Permissions
- Added missing 'time' import to types.go
- Fixed enqueuePermissionRestore, buildPatchFields, patchNewDocumentPermissions signatures
- New env var table entry
- Trade-off warning in Metadata Copying Limitations section
- Added success log in processPendingPermissionRestores
@coderabbitai

coderabbitai Bot commented May 18, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds an iOS OCR Server provider with optional hOCR, threads owner/permissions through types and the Paperless client, enqueues in-memory async permission-restore jobs after processed-PDF uploads, patches permissions on ReplaceOriginal success, and wires config, background processing, tests, and README updates.

Changes

iOS OCR Server & Permission Restoration

Layer / File(s) Summary
Documentation
README.md
Documents ios_ocr provider, adds TOC entry, provider compatibility row, hOCR/enhanced-PDF notes, and PDF_PRESERVE_OWNER_PERMISSIONS trade-off and env vars.
Data Model & Contracts
types.go
Adds PermissionSet/Permissions, Owner/Permissions to API/document structs, PendingPermissionRestore, OCROptions.PreserveOwnerPermissions, and ClientInterface.PatchDocument.
Paperless API Client Integration
paperless.go
GetDocument requests ?full_perms=true; new PatchDocument method; GetTaskStatus parses array responses and returns first element.
iOS OCR Provider Implementation
ocr/iosocr_provider.go, ocr/provider.go
New IosOcrProvider with configurable timeout, multipart upload, response validation, OCRResult, optional hOCR page generation, and provider wiring.
iOS OCR Provider Tests
ocr/iosocr_provider_test.go
Tests multipart uploads, success/failure/JSON/error handling, conditional hOCR creation, buildHOCRPage and parseOcrBoxes.
OCR Processing Pipeline
ocr.go, ocr_test.go
Validates PreserveOwnerPermissions requires UploadPDF; include document_type in upload metadata; enqueue permission-restore after upload; immediate patch on ReplaceOriginal SUCCESS; helper functions added; tests updated for task-array and new OCROptions cases.
Background Permission Restoration Queue
permissions.go, background.go
enqueuePermissionRestore records restores; processPendingPermissionRestores drains queue, expires >24h, checks task status, patches on SUCCESS, logs on FAILURE, re-queues pending entries; integrated into StartBackgroundTasks cycle.
Application Config & Initialization
main.go
Parses IOS_OCR_SERVER_URL/IOS_OCR_SERVER_TIMEOUT and PDF_PRESERVE_OWNER_PERMISSIONS; adds App fields for pending restores; enforces ios_ocr image-only mode and validates required env var.
Test Infrastructure & Mocks
app_llm_test.go, background_test.go
Adds PatchDocument mocks; updates task-response fixtures and OCROptions validation tests to cover PreserveOwnerPermissions scenarios.
Code Formatting
jobs.go, background.go
Reformats OCROptions struct literal alignment without logic changes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

safe-to-test

Suggested reviewers

  • chetan

"I hop, I parse, I send a bite,
Images whisper letters bright.
Queued restores keep owners near,
Pages stitched and permissions clear.
A rabbit cheers—OCR takes flight!"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and specifically summarizes the main changes: iOS OCR Server support with hOCR capability, plus the ability to preserve document owner and permissions.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
types.go (1)

79-79: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Make document_type nullable in these structs.

These fields are decoded directly from the Paperless document payload, and this file already hints that document_type may be absent/null. If that happens, unmarshalling into int will fail on Line 79, and Line 102 also loses the distinction between “unset” and a real ID. Using *int here is safer for the new preserve-document-type flow.

💡 Suggested change
 type GetDocumentApiResponse struct {
 	ID               int                   `json:"id"`
 	Correspondent    int                   `json:"correspondent"`
-	DocumentType     int                   `json:"document_type"`
+	DocumentType     *int                  `json:"document_type"`
 	Title            string                `json:"title"`
 	Content          string                `json:"content"`
 	Tags             []int                 `json:"tags"`
 	CreatedDate      string                `json:"created_date"`
 	OriginalFileName string                `json:"original_file_name"`
 	Notes            []interface{}         `json:"notes"`
 	CustomFields     []CustomFieldResponse `json:"custom_fields"`
 	Owner            *int                  `json:"owner"`
 	Permissions      *Permissions          `json:"permissions,omitempty"`
 }

 type Document struct {
 	ID               int                   `json:"id"`
 	Title            string                `json:"title"`
 	Content          string                `json:"content"`
 	Tags             []string              `json:"tags"`
 	Correspondent    string                `json:"correspondent"`
 	CreatedDate      string                `json:"created_date"`
 	OriginalFileName string                `json:"original_file_name"`
 	DocumentTypeName string                `json:"document_type_name"`
-	DocumentType     int                   `json:"document_type"`
+	DocumentType     *int                  `json:"document_type"`
 	CustomFields     []CustomFieldResponse `json:"custom_fields"`
 	Owner            *int                  `json:"owner"`
 	Permissions      *Permissions          `json:"permissions,omitempty"`
 }

Also applies to: 102-102

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@types.go` at line 79, Change the DocumentType fields from int to *int in the
affected structs so JSON null/absent values decode without error: locate the
DocumentType field declarations (currently `DocumentType int
\`json:"document_type"\``) and change their type to `*int`; update any places
that read/write those fields (e.g., the preserve-document-type flow and any code
that compares or assigns DocumentType) to handle nil vs non-nil (use nil-checks
before dereferencing or treat nil as “unset”) and keep the json tag intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ocr_test.go`:
- Line 92: The for-loop range uses the undefined identifier `tc`; change both
occurrences of `for _, tc := range tc {` in ocr_test.go to iterate over the
actual slice `testCases` (i.e., `for _, tc := range testCases {`) so the loop
variable `tc` is defined; update both instances (around the loops that iterate
testCases in the test functions) to fix the compile error.

In `@ocr.go`:
- Around line 633-643: The function extractDocIDFromTask currently falls back to
reading taskStatus["id"] (the task's internal ID) which is incorrect; change it
to only parse related_document and support both string and numeric forms: remove
the fallback branch that checks taskStatus["id"], keep and expand the existing
related_document handling in extractDocIDFromTask to accept numeric types (e.g.,
float64/int) as well as strings, converting them to int and returning (id,
true); if related_document is absent or cannot be parsed return (0, false).

In `@ocr/iosocr_provider.go`:
- Around line 234-252: GetHOCRDocument currently uses p.hocrPages in append
order which can be out-of-order; after copying p.hocrPages into the local pages
slice (inside GetHOCRDocument) sort the pages by the PageNumber field (e.g.,
using sort.Slice or sort.SliceStable on pages comparing pages[i].PageNumber <
pages[j].PageNumber) before constructing the hocr.HOCR struct so the resulting
document pages are in numeric order; ensure you add the sort import if missing
and reference p.hocrPages and the local pages slice in your change.

In `@permissions.go`:
- Around line 71-78: The loop currently calls
app.patchNewDocumentPermissions(ctx, taskStatus, entry.Owner, entry.Permissions,
logger) but always marks the queue entry as processed (processed++ and
logger.Info) even if patching failed; change patchNewDocumentPermissions to
return an error (or success bool) instead of swallowing failures, then check
that return value in the SUCCESS case before calling logger.Info and
incrementing processed; if the call returns an error, log the error via the
existing logger and do not increment processed so the entry remains for retry.
Ensure you update callers and the patchNewDocumentPermissions signature
accordingly.

In `@README.md`:
- Around line 388-390: The README now documents a fifth provider ("### 5. iOS
OCR Server") but the earlier description still says the project supports "four
different OCR providers"; update that sentence to either increment the count to
five or, better, rephrase it to avoid hardcoding a number (e.g., "multiple OCR
providers" or "several OCR providers") so the text stays correct when providers
are added; look for the sentence that contains "four different OCR providers"
and change it accordingly.
- Around line 397-407: The README update about iOS hOCR/searchable PDF is
missing the prerequisite flag; update the relevant sections (around the iOS OCR
configuration block and the later Enhanced OCR Features paragraph) to explicitly
state that ENABLE_HOCR must be set to true (e.g., "ENABLE_HOCR=true") for hOCR
output to be produced by the iOS OCR provider, and add it as an item in the
Configuration block alongside OCR_PROVIDER/IOS_OCR_SERVER_URL and in the
explanation near the Enhanced OCR Features reference so users know the gate is
required.

---

Outside diff comments:
In `@types.go`:
- Line 79: Change the DocumentType fields from int to *int in the affected
structs so JSON null/absent values decode without error: locate the DocumentType
field declarations (currently `DocumentType int \`json:"document_type"\``) and
change their type to `*int`; update any places that read/write those fields
(e.g., the preserve-document-type flow and any code that compares or assigns
DocumentType) to handle nil vs non-nil (use nil-checks before dereferencing or
treat nil as “unset”) and keep the json tag intact.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3b3b10c4-3f0f-44be-8d02-aec1aca5a167

📥 Commits

Reviewing files that changed from the base of the PR and between 94c8428 and 8d1f665.

📒 Files selected for processing (14)
  • README.md
  • app_llm_test.go
  • background.go
  • background_test.go
  • jobs.go
  • main.go
  • ocr.go
  • ocr/iosocr_provider.go
  • ocr/iosocr_provider_test.go
  • ocr/provider.go
  • ocr_test.go
  • paperless.go
  • permissions.go
  • types.go

Comment thread ocr_test.go Outdated
Comment thread ocr.go
Comment thread ocr/iosocr_provider.go
Comment thread permissions.go Outdated
Comment thread README.md
Comment thread README.md
vistalba added 2 commits May 18, 2026 12:55
- Rename tc to testCases in ocr_test.go for clarity
- Remove taskStatus[id] fallback in extractDocIDFromTask
- Make patchNewDocumentPermissions return error for retry logic
- Sort hocrPages by PageNumber in GetHOCRDocument
- Fix 'four different OCR providers' in README

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ocr_test.go`:
- Around line 200-206: The mock task currently sets "id" equal to documentID so
the test won't catch regressions in extractDocIDFromTask; change the "id" field
in the json.NewEncoder(w).Encode payload to use the distinct taskID (e.g. "id":
fmt.Sprintf("%d", taskID) or the taskID variable) while keeping
"related_document" as fmt.Sprintf("%d", documentID) so id and related_document
differ and the test actually verifies extractDocIDFromTask's behavior.

In `@ocr.go`:
- Around line 607-609: The replacement flow always calls
patchNewDocumentPermissions causing owner/permission restoration even when
PreserveOwnerPermissions is false; guard that call by checking the
PreserveOwnerPermissions flag before invoking app.patchNewDocumentPermissions
(use the same ctx, taskStatus, originalDoc.Owner, originalDoc.Permissions,
logger parameters), so patchNewDocumentPermissions runs only when
PreserveOwnerPermissions is true and otherwise is skipped.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dde6162c-1c36-49ed-aa10-770bd800ad7a

📥 Commits

Reviewing files that changed from the base of the PR and between 25e5689 and 1fa994d.

📒 Files selected for processing (5)
  • README.md
  • ocr.go
  • ocr/iosocr_provider.go
  • ocr_test.go
  • permissions.go
✅ Files skipped from review due to trivial changes (1)
  • README.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • permissions.go
  • ocr/iosocr_provider.go

Comment thread ocr_test.go
Comment on lines +200 to +206
json.NewEncoder(w).Encode([]map[string]interface{}{
{
"id": documentID,
"status": "SUCCESS",
"task_id": taskID,
"related_document": fmt.Sprintf("%d", documentID),
"result": fmt.Sprintf("Success. New document id %d created", documentID),

@coderabbitai coderabbitai Bot May 18, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make the mocked task id differ from related_document.

This fixture would still pass if extractDocIDFromTask regressed to using id, because both fields resolve to the same value here. Give id a distinct task-record value so the test actually guards the bug this PR fixed.

Suggested fix
 		json.NewEncoder(w).Encode([]map[string]interface{}{
 			{
-				"id":               documentID,
+				"id":               9999,
 				"status":           "SUCCESS",
 				"task_id":          taskID,
 				"related_document": fmt.Sprintf("%d", documentID),
 				"result":           fmt.Sprintf("Success. New document id %d created", documentID),
 			},
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
json.NewEncoder(w).Encode([]map[string]interface{}{
{
"id": documentID,
"status": "SUCCESS",
"task_id": taskID,
"related_document": fmt.Sprintf("%d", documentID),
"result": fmt.Sprintf("Success. New document id %d created", documentID),
json.NewEncoder(w).Encode([]map[string]interface{}{
{
"id": 9999,
"status": "SUCCESS",
"task_id": taskID,
"related_document": fmt.Sprintf("%d", documentID),
"result": fmt.Sprintf("Success. New document id %d created", documentID),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ocr_test.go` around lines 200 - 206, The mock task currently sets "id" equal
to documentID so the test won't catch regressions in extractDocIDFromTask;
change the "id" field in the json.NewEncoder(w).Encode payload to use the
distinct taskID (e.g. "id": fmt.Sprintf("%d", taskID) or the taskID variable)
while keeping "related_document" as fmt.Sprintf("%d", documentID) so id and
related_document differ and the test actually verifies extractDocIDFromTask's
behavior.

✅ Addressed in commit 957f65d

Comment thread ocr.go Outdated
Comment on lines +607 to +609
// Restore owner and permissions on the new document
// Error is logged by patchNewDocumentPermissions — still proceed with deletion
_ = app.patchNewDocumentPermissions(ctx, taskStatus, originalDoc.Owner, originalDoc.Permissions, logger)

@coderabbitai coderabbitai Bot May 18, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Gate the replacement-time permission PATCH behind PreserveOwnerPermissions.

This runs on every successful replacement, even when PreserveOwnerPermissions is false. If the source document has owner/permission fields, the new option becomes ineffective and replacement silently preserves them anyway.

Suggested fix
-				// Restore owner and permissions on the new document
-				// Error is logged by patchNewDocumentPermissions — still proceed with deletion
-				_ = app.patchNewDocumentPermissions(ctx, taskStatus, originalDoc.Owner, originalDoc.Permissions, logger)
+				if options.PreserveOwnerPermissions {
+					// Restore owner and permissions on the new document
+					// Error is logged by patchNewDocumentPermissions — still proceed with deletion
+					_ = app.patchNewDocumentPermissions(ctx, taskStatus, originalDoc.Owner, originalDoc.Permissions, logger)
+				}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Restore owner and permissions on the new document
// Error is logged by patchNewDocumentPermissions — still proceed with deletion
_ = app.patchNewDocumentPermissions(ctx, taskStatus, originalDoc.Owner, originalDoc.Permissions, logger)
if options.PreserveOwnerPermissions {
// Restore owner and permissions on the new document
// Error is logged by patchNewDocumentPermissions — still proceed with deletion
_ = app.patchNewDocumentPermissions(ctx, taskStatus, originalDoc.Owner, originalDoc.Permissions, logger)
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ocr.go` around lines 607 - 609, The replacement flow always calls
patchNewDocumentPermissions causing owner/permission restoration even when
PreserveOwnerPermissions is false; guard that call by checking the
PreserveOwnerPermissions flag before invoking app.patchNewDocumentPermissions
(use the same ctx, taskStatus, originalDoc.Owner, originalDoc.Permissions,
logger parameters), so patchNewDocumentPermissions runs only when
PreserveOwnerPermissions is true and otherwise is skipped.

✅ Addressed in commit 957f65d

…ssions flag; distinguish mock task id from document id in test

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ocr.go (1)

583-625: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Only delete the source document after a confirmed SUCCESS.

Line 624 still deletes the original when polling times out or when status couldn't be confirmed. UploadDocument only means Paperless accepted the task; it does not guarantee the new document was created successfully. In a slow or delayed-processing case, this can drop the only copy of the document.

Suggested fix
-		for i := 0; i < maxRetries; i++ {
+		deleteOriginal := false
+		for i := 0; i < maxRetries; i++ {
 			taskStatus, err := app.Client.GetTaskStatus(ctx, taskID)
 			if err != nil {
-				logger.WithError(err).Warn("Failed to check task status, proceeding with deletion anyway")
+				logger.WithError(err).Warn("Failed to check task status, keeping original document")
 				break
 			}
@@
 			if status == "SUCCESS" {
 				logger.Info("Document processing completed successfully")
+				deleteOriginal = true
 
 				if options.PreserveOwnerPermissions {
 					_ = app.patchNewDocumentPermissions(ctx, taskStatus, originalDoc.Owner, originalDoc.Permissions, logger)
 				}
 
 				break
 			}
@@
-		// Delete original document (even if poll timed out — upload was successful)
-		if err := app.Client.DeleteDocument(ctx, documentID); err != nil {
+		if !deleteOriginal {
+			return fmt.Errorf("document processing did not reach SUCCESS, not deleting original document")
+		}
+		if err := app.Client.DeleteDocument(ctx, documentID); err != nil {
 			return fmt.Errorf("error deleting original document: %w", err)
 		}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ocr.go` around lines 583 - 625, Current logic may call
app.Client.DeleteDocument even when polling timed out or task status is unknown;
change the flow so DeleteDocument is only invoked after a confirmed "SUCCESS"
status from app.Client.GetTaskStatus. Specifically, in the ReplaceOriginal
branch (where options.ReplaceOriginal is true) update the polling loop around
app.Client.GetTaskStatus/taskStatus so that you set a local flag (e.g.,
processingSucceeded) when status == "SUCCESS" (and run
app.patchNewDocumentPermissions there if PreserveOwnerPermissions), return an
error immediately if status == "FAILURE", and after the loop only call
app.Client.DeleteDocument(ctx, documentID) if processingSucceeded is true;
otherwise skip deletion and return a non-destructive error indicating the
document was not confirmed successful.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@ocr.go`:
- Around line 583-625: Current logic may call app.Client.DeleteDocument even
when polling timed out or task status is unknown; change the flow so
DeleteDocument is only invoked after a confirmed "SUCCESS" status from
app.Client.GetTaskStatus. Specifically, in the ReplaceOriginal branch (where
options.ReplaceOriginal is true) update the polling loop around
app.Client.GetTaskStatus/taskStatus so that you set a local flag (e.g.,
processingSucceeded) when status == "SUCCESS" (and run
app.patchNewDocumentPermissions there if PreserveOwnerPermissions), return an
error immediately if status == "FAILURE", and after the loop only call
app.Client.DeleteDocument(ctx, documentID) if processingSucceeded is true;
otherwise skip deletion and return a non-destructive error indicating the
document was not confirmed successful.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4ca7cc2d-4aba-499e-b5b9-c8c61703ffa7

📥 Commits

Reviewing files that changed from the base of the PR and between 1fa994d and 957f65d.

📒 Files selected for processing (2)
  • ocr.go
  • ocr_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • ocr_test.go

@vistalba

Copy link
Copy Markdown
Author

ocr.go (1)> 583-625: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Only delete the source document after a confirmed SUCCESS.
Line 624 still deletes the original when polling times out or when status couldn't be confirmed. UploadDocument only means Paperless accepted the task; it does not guarantee the new document was created successfully. In a slow or delayed-processing case, this can drop the only copy of the document.

NOTE (not addressed in this PR):
The unconditional DeleteDocument on polling timeout/error was pre-existing behavior. Fixing it would change the ReplaceOriginal contract for all providers and is outside the scope of this PR (iOS OCR + permission preservation).

vistalba added 2 commits May 18, 2026 19:28
Raise maxRetries from 12 to 720 (5s intervals) so that document
processing has up to ~1 hour to complete before giving up on
deleting the original. This accommodates slower paperless-ngx
processing common with mobile uploads, AI-based classification,
or large documents.
Previously, ReplaceOriginal would delete the original document even if
the processing task timed out or returned an unknown status. This could
cause data loss in edge cases where the new document was not fully
processed.

Now a deleteOriginal flag is set only on confirmed SUCCESS. The original
document is preserved and an error is returned if processing did not
reach SUCCESS.
@vistalba vistalba force-pushed the feat/ios-ocr-server-hocr branch from 17fcfc6 to d486e45 Compare May 19, 2026 06:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant