Dev by jairomelo · Pull Request #12 · UCSB-AMPLab/digitization-toolkit-backend

jairomelo · 2026-02-21T08:21:46Z

This pull request introduces a major refactor to the backend data model, focusing on renaming and restructuring the "document" concept to "record," separating descriptive metadata from image data, and introducing collections as a new organizational entity. It also updates authentication logic and camera API endpoints to support these changes.

Data model refactor and migration:

Renamed the document_images table to record_images, updated all related foreign keys, indexes, and references in associated tables (camera_settings, exif_data) to use record_image_id instead of document_image_id.
Introduced a new records table to separate descriptive metadata from images, migrated existing data by creating one Record per RecordImage, and updated record_images to link to records. Old metadata columns were removed from record_images.
Added a collections table to organize records, with support for hierarchical collections and linking records to collections or projects. Updated record_images to support collection_id and enforced single-parent constraints.
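
The one-Record-per-RecordImage backfill described above can be sketched with the standard library's sqlite3 module. This is only an illustration under simplified assumptions: the table layout is reduced to a few columns, `filename` is a hypothetical column, and `backfill_records` is a made-up helper, not the code in the actual Alembic migration.

```python
import sqlite3


def backfill_records(conn: sqlite3.Connection) -> int:
    """Create one records row per unlinked record_images row and link them."""
    rows = conn.execute(
        "SELECT id, material FROM record_images "
        "WHERE record_id IS NULL ORDER BY id"
    ).fetchall()
    for image_id, material in rows:
        # Move the descriptive value onto a new record ...
        cur = conn.execute(
            "INSERT INTO records (material) VALUES (?)", (material,)
        )
        # ... and point the image at that record.
        conn.execute(
            "UPDATE record_images SET record_id = ? WHERE id = ?",
            (cur.lastrowid, image_id),
        )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE records (id INTEGER PRIMARY KEY, material TEXT);
    CREATE TABLE record_images (
        id INTEGER PRIMARY KEY,
        filename TEXT NOT NULL,          -- hypothetical column
        material TEXT,                   -- old metadata column, to be dropped later
        record_id INTEGER REFERENCES records(id)
    );
    INSERT INTO record_images (filename, material) VALUES
        ('a.jpg', 'paper'), ('b.jpg', 'parchment');
    """
)
migrated = backfill_records(conn)
```

Only after a backfill of this kind has run is it safe to drop the old metadata columns from record_images.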

API and authentication updates:

Modified authentication to allow tokens via query parameters as well as Authorization headers, improving support for browser image requests.
Changed the login endpoint to use a new UserLogin schema for clarity and security.

Camera API enhancements:

Updated camera API endpoints and models to reference Record and RecordImage, allowing capture requests to specify or create records, and to return record/image IDs in responses.
Refactored image capture logic to create or link records, associate captured images with records, and extract additional metadata such as file size and EXIF data.
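
The "specify or create" behaviour can be illustrated with a small in-memory sketch. Everything here is hypothetical (`InMemoryStore`, `attach_capture`, the field names); the real endpoints work against the database models, which are not shown in this description.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional


@dataclass
class Record:
    id: int
    title: str
    image_ids: List[int] = field(default_factory=list)


class InMemoryStore:
    """Stand-in for a database session, for illustration only."""

    def __init__(self) -> None:
        self.records: Dict[int, Record] = {}
        self._record_ids = count(1)
        self._image_ids = count(1)

    def resolve_record(self, record_id: Optional[int], title: str) -> Record:
        # Link to an existing record when an ID is given ...
        if record_id is not None:
            if record_id not in self.records:
                raise LookupError(f"record {record_id} not found")
            return self.records[record_id]
        # ... otherwise create a new record for this capture.
        record = Record(id=next(self._record_ids), title=title)
        self.records[record.id] = record
        return record

    def attach_capture(
        self, record_id: Optional[int] = None, title: str = "Untitled"
    ) -> Dict[str, int]:
        record = self.resolve_record(record_id, title)
        image_id = next(self._image_ids)
        record.image_ids.append(image_id)
        # Both IDs go back to the caller, mirroring the response shape above.
        return {"record_id": record.id, "record_image_id": image_id}


store = InMemoryStore()
first = store.attach_capture(title="Folder 3, item 1")
second = store.attach_capture(record_id=first["record_id"])
```

Here the second capture reuses the record created by the first, so one record ends up with two images.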

Miscellaneous:

Updated Alembic environment to import new models (record, collection) for migration registration.
Added thumbnail generation utility import for future use in camera endpoints.

These changes lay the groundwork for more robust metadata management, collection organization, and improved API usability.

- Using lifespan context manager to replace deprecated startup event
- Import pydantic_settings to resolve wrong pydantic import
- Add required dependencies to requirements.txt
- Mount only app/ subdirectory, remove reinstall to correct volume overwrite + race condition
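
For readers unfamiliar with the lifespan pattern mentioned in this commit, here is a minimal standard-library sketch of its shape. In FastAPI the function would be passed as `FastAPI(lifespan=lifespan)`; the `events` list and the `main` driver below are illustrative assumptions that simulate what the framework does, not code from this repository.

```python
import asyncio
from contextlib import asynccontextmanager

events = []


@asynccontextmanager
async def lifespan(app):
    # Code before `yield` runs once at startup (e.g. initialise the database).
    events.append("startup")
    try:
        yield
    finally:
        # Code after `yield` runs once at shutdown (e.g. close connections).
        events.append("shutdown")


async def main() -> None:
    # The framework normally enters and exits this context manager itself.
    async with lifespan(app=None):
        events.append("serving")


asyncio.run(main())
```

Keeping startup and shutdown in one function makes it easy to share resources between the two phases.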

- Introduces user authentication endpoints and models, project management endpoints and models, document and camera APIs with user and project associations.
- Adds password hashing and token-based security utilities, updates Pydantic schemas for new fields.
- Includes a test script for API validation.
- Includes API documentation and backend implementation details (located inside `digitization-toolkit-software/docs/developers`) for maintainability and clearer frontend integration.
- Updates requirements and environment variables for new features.

Copilot

Pull request overview

This pull request implements a comprehensive refactoring of the backend data model, transitioning from a flat "document" structure to a hierarchical "record/record_image" architecture with collection support. The changes enable better organization of digitized archival materials by separating descriptive metadata (records) from physical image files (record images), and introducing collections for hierarchical archival organization.

Changes:

Renamed and restructured the data model from DocumentImage to separate Record and RecordImage entities, with Records representing conceptual archival objects and RecordImages representing individual captured files
Introduced a Collection model for hierarchical organization with support for nested subcollections (e.g., fonds → series → box → folder)
Enhanced camera API to create or link captures to records, track manifest IDs (capture_id, pair_id), and return record/image IDs in responses
Modified authentication to accept tokens via query parameters in addition to Authorization headers for browser image requests
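
The nested hierarchy (fonds → series → box → folder) amounts to a self-referential parent link. The sketch below assumes a `parent_id` field and a hypothetical `breadcrumb` helper to show how a path through such a tree could be resolved; the actual Collection model may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Collection:
    id: int
    name: str
    parent_id: Optional[int] = None  # None marks a top-level collection


def breadcrumb(collections: Dict[int, Collection], collection_id: int) -> List[str]:
    """Return collection names from the root down to the given collection."""
    names: List[str] = []
    seen = set()
    current: Optional[int] = collection_id
    while current is not None:
        if current in seen:
            # Guard against a collection being its own ancestor.
            raise ValueError("cycle in collection hierarchy")
        seen.add(current)
        node = collections[current]
        names.append(node.name)
        current = node.parent_id
    return names[::-1]


tree = {
    1: Collection(1, "Fonds A"),
    2: Collection(2, "Series 1", parent_id=1),
    3: Collection(3, "Box 2", parent_id=2),
    4: Collection(4, "Folder 7", parent_id=3),
}
path = breadcrumb(tree, 4)
```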

Reviewed changes

Copilot reviewed 25 out of 27 changed files in this pull request and generated 3 comments.

File	Description
scripts/setup_camera_backends.sh	New setup script for Raspberry Pi camera dependencies
pixi.toml	New Pixi package manager configuration for the project
app/models/record.py	New model separating Record (metadata) from RecordImage (files)
app/models/collection.py	New hierarchical collection model for archival organization
app/models/camera.py	Updated to reference record_image_id instead of document_image_id
app/models/project.py	Updated relationships to use records and collections
app/models/document.py	Deleted - replaced by record.py
app/schemas/record.py	Refactored schemas for Record, RecordImage, ExifData, and CameraSettings
app/schemas/collection.py	New schemas for collection CRUD operations
app/schemas/camera.py	Updated to use record_image_id
app/schemas/user.py	Added UserLogin schema for login endpoint
app/api/records.py	New endpoint replacing documents.py with record-centric operations
app/api/collections.py	New endpoint for collection hierarchy management
app/api/cameras.py	Updated to create/link records and return record/image IDs
app/api/projects.py	Updated references from documents to records
app/api/auth.py	Added query parameter token support for image requests
app/api/system.py	New system monitoring endpoint for CPU temperature
app/api/documents.py	Deleted - replaced by records.py
app/main.py	Updated router imports and CORS origins
app/core/db.py	Updated model imports
app/core/thumbnail.py	Updated comments to reference records
alembic/env.py	Updated model imports for migrations
alembic/versions/c3d4e5f6a7b8_*.py	Migration to rename document_images to record_images
alembic/versions/48189f9482e3_*.py	Migration to add collections table
alembic/versions/19e2aefe5b17_*.py	Migration to separate Record from RecordImage
capture/service.py	Updated return values to include capture_id and pair_id
.gitignore	Added Pixi-related files

Copilot · 2026-02-21T08:28:01Z

app/api/auth.py

+    # Accept token from Authorization header OR ?token= query param (needed for <img src>)
+    raw_token = credentials.credentials if credentials else token
+    if not raw_token:
+        raise HTTPException(status_code=401, detail="Not authenticated")
+    payload = verify_access_token(raw_token)


Accepting authentication tokens via query parameters poses a security risk. Query parameters are often logged in server logs, browser history, and referrer headers, potentially exposing sensitive tokens. While this may be necessary for browser image src attributes, consider implementing short-lived temporary tokens specifically for image access, or use a separate endpoint that validates the token and redirects to a signed URL.
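
One way to act on this suggestion is a short-lived, image-scoped token signed with HMAC, so that a leaked URL expires quickly and is useless for other endpoints. The sketch below uses only the standard library; `SECRET`, `issue_image_token`, and `verify_image_token` are hypothetical names, and the token format is an assumption rather than anything in the PR.

```python
import base64
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"replace-with-a-real-secret"  # hypothetical; load from configuration


def _sign(payload: bytes) -> str:
    digest = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode().rstrip("=")


def issue_image_token(image_id: int, ttl: int = 60, now: Optional[float] = None) -> str:
    """Return '<image_id>.<expiry>.<signature>' valid for `ttl` seconds."""
    expires = int((time.time() if now is None else now) + ttl)
    payload = f"{image_id}.{expires}"
    return f"{payload}.{_sign(payload.encode())}"


def verify_image_token(token: str, image_id: int, now: Optional[float] = None) -> bool:
    try:
        token_image_id, expires, signature = token.split(".")
        expires_at = int(expires)
    except ValueError:
        return False  # wrong number of parts or non-numeric expiry
    expected = _sign(f"{token_image_id}.{expires}".encode())
    # Constant-time comparison to avoid leaking signature bytes.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    if token_image_id != str(image_id):
        return False  # token was issued for a different image
    current = time.time() if now is None else now
    return current < expires_at


token = issue_image_token(42, ttl=60, now=1000)
```

A frontend could request such a token with its normal Authorization header and then use it in the `<img src>` query string, keeping the long-lived access token out of URLs.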

Copilot · 2026-02-21T08:28:02Z

alembic/versions/19e2aefe5b17_refactor_separate_record_from_.py

+    # Step 6: Drop old columns from record_images
+    op.drop_column('record_images', 'modified_at')
+    op.drop_column('record_images', 'material')
+    op.drop_column('record_images', 'collection_id')
+    op.drop_column('record_images', 'project_id')


The migration drops columns project_id and collection_id from record_images (lines 111-112) without first dropping the check_record_single_parent constraint that was created in the previous migration (48189f9482e3) which references these columns. This could cause the migration to fail. The constraint should be explicitly dropped before dropping the columns it references.
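
The ordering the reviewer asks for can be sketched as follows. `op.drop_constraint` and `op.drop_column` are real Alembic operations, but this example passes in a recording stand-in so it runs without a database; the assumption that the constraint lives on `record_images` is inferred from the comment, and `drop_old_parent_columns` is a hypothetical helper.

```python
class RecordingOp:
    """Test double that records calls instead of touching a database."""

    def __init__(self):
        self.calls = []

    def drop_constraint(self, constraint_name, table_name, type_=None):
        self.calls.append(("drop_constraint", constraint_name, table_name, type_))

    def drop_column(self, table_name, column_name):
        self.calls.append(("drop_column", table_name, column_name))


def drop_old_parent_columns(op):
    # Drop the CHECK constraint first, because it references
    # collection_id and project_id (assumed to be on record_images).
    op.drop_constraint("check_record_single_parent", "record_images", type_="check")
    # Only then remove the columns themselves.
    for column in ("modified_at", "material", "collection_id", "project_id"):
        op.drop_column("record_images", column)


recorder = RecordingOp()
drop_old_parent_columns(recorder)
```

In a real migration the function body would sit inside `upgrade()` and use `from alembic import op`. On SQLite, which cannot drop constraints in place, the same steps would likely need Alembic's batch mode (`op.batch_alter_table`).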

Copilot · 2026-02-21T08:28:02Z

app/api/collections.py

+    record_count = db.query(func.count(RecordImage.id)).filter(
+        RecordImage.collection_id == collection_id


The record count query is incorrect. It queries RecordImage.collection_id but RecordImage doesn't have a collection_id field according to the new data model. Records are linked to collections via Record.collection_id, and RecordImages are linked to Records via RecordImage.record_id. The query should be: db.query(func.count(Record.id)).filter(Record.collection_id == collection_id).scalar()

Suggested change

-    record_count = db.query(func.count(RecordImage.id)).filter(
-        RecordImage.collection_id == collection_id
+    record_count = db.query(func.count(Record.id)).filter(
+        Record.collection_id == collection_id
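
To make the difference between counting records and counting images concrete, here is a small sqlite3 sketch with simplified, assumed table layouts (only the linking columns are modelled; `count_records` and `count_images` are hypothetical helpers).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE records (id INTEGER PRIMARY KEY, collection_id INTEGER);
    CREATE TABLE record_images (
        id INTEGER PRIMARY KEY,
        record_id INTEGER NOT NULL REFERENCES records(id)
    );
    INSERT INTO records (id, collection_id) VALUES (1, 10), (2, 10), (3, 20);
    INSERT INTO record_images (record_id) VALUES (1), (1), (2), (3);
    """
)


def count_records(conn: sqlite3.Connection, collection_id: int) -> int:
    """Records are linked to collections directly."""
    row = conn.execute(
        "SELECT COUNT(id) FROM records WHERE collection_id = ?", (collection_id,)
    ).fetchone()
    return row[0]


def count_images(conn: sqlite3.Connection, collection_id: int) -> int:
    """Images reach a collection only through their record."""
    row = conn.execute(
        "SELECT COUNT(i.id) FROM record_images i "
        "JOIN records r ON r.id = i.record_id WHERE r.collection_id = ?",
        (collection_id,),
    ).fetchone()
    return row[0]


records_in_10 = count_records(conn, 10)
images_in_10 = count_images(conn, 10)
```

With this sample data, collection 10 holds two records but three images, which is why a record count has to query the records table.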

jairomelo and others added 30 commits January 17, 2026 12:17

fix: replacing deprecated method for utc time

1ad4ec4

fix: including validation for pagination parameters

4683862

fix: ensure model imports are mandatory for database initialization

9c3f5d3

fix: update DocumentRead config to use form_attributes instead of deprecated orm_mode

d793f6a

missing foreign key constraint on document_image_id for ExifData model.

c4cec28

fix: change database connection pool from NullPool to QueuePool to prevent performance issues

68d85f0

fix: update list_camera_settings to use Query parameters for pagination

8e85aae

fix: handle IntegrityError during document creation to prevent crashes

5e1e371

fix: add document existence check and handle IntegrityError in create_camera_settings

adfde6b

fix: correct module import

f85e0e5

fix: update Config class to use form_attributes instead of orm_mode in CameraSettingsRead and ExifDataRead

fb15db2

fix: remove unused import of os in db.py

bdf24d8

fix: add .python-version file to manage versioning with pyenv

68cec85

feat: add Alembic configuration and initial migration setup

1ad61a1

added missing fields in class settings of app/core/config.py (UVCORN_HOST, PORT, LOG)

7cda45c

feat: add configuration settings and initial service implementation for image capture

9e34eb9

feat: add dual shoot script for simultaneous image capture with logging

143cdf2

refactor: remove test project creation and subprocess call from service.py

cf96694

feat: update .gitignore to include _data/, _logs/, and *.pdf

d8876a7

feat: implement CameraConfig class and update capture functions for independent camera settings

91c808e

feat: enhance configuration settings and improve logging in capture service

fa0c821

fix: import Path

aa9223c

feat: add encoding and raw capture options to CameraConfig and update capture_image function

6cd3182

feat: add image encoding parameter to filename generation and capture functions

a91aaed

feat: refactor CameraConfig as module for better readability and maintenance

7c81d87

feat: add app_version field to Settings for version tracking

1f4eb61

feat: add CaptureFile, CaptureCamera, and CaptureRecord dataclasses for capture metadata

fadac10

feat: implement generate_manifest_record and append_manifest_record functions for capture metadata management

ab3e15e

jairomelo and others added 25 commits January 17, 2026 12:24

refactor: update DATABASE_URL construction and .env file loading

2366c21

chore: remove obsolete .env and .env-dist files

8614051

resolved merge conflict

8cec4f6

chore: update .gitignore and add pixi.toml for project configuration

553b2bc

fix: update Python version constraint and correct comment in pixi.toml

f102d99

feat: move setup script for DTK capture service dependencies

2c5c6f5

feat: add UserLogin schema and update login endpoint to use it

b26914e

feat: rename documents to records and update related models and endpoints

a387eee

feat: add collections feature with CRUD operations and update related models

3cab342

feat: add filtering options for listing records by project and collection IDs

328c057

feat: refactor Record and RecordImage models, update related schemas and endpoints

38a5b5b

feat: enhance capture functionality to link or create records and update image metadata handling

3ad4181

feat: update .gitignore to include backup files

26d749b

feat: implement on-demand thumbnail generation for images and update capture functionality

d52bfe0

feat: enhance authentication to support token retrieval from query parameters

8555e23

feat: update thumbnail generation paths to store alongside source images

475e4fc

feat: add system router for temperature endpoint and integrate into main application

3aa23d7

feat: update record handling in project API to use Record model instead of RecordImage

3ecf756

feat: update record handling in collection deletion to use Record model

7909c22

feat: add orphaned option to list records

8a68499

feat: update CORS middleware to allow production Node server

e5d26f2

feat: remove redundant FastAPI initialization

06a38c8

feat: add operational status check for camera devices

34990a7

feat: enhance camera operational check using subprocess for reliability

7658691

feat: remove camera operational probing logic and related imports

e626204

jairomelo requested review from Asemica-me and Copilot February 21, 2026 08:21

Copilot started reviewing on behalf of jairomelo February 21, 2026 08:22

Copilot AI reviewed Feb 21, 2026

feat: update .gitignore to include .DS_Store and retain *.backup

a172e0e

jairomelo wants to merge 94 commits into main from dev

3 participants
