- Status: Draft
- Date: 2025-12-23
- Context: The scientific community lacks a standardized, scalable way to cite thousands of datasets/software in a single journal article. We need a tool to generate OAI-ORE compliant "Complex Citation Objects" (CCOs) that is portable, modular, and user-friendly.
We will implement a "Core Library + Service Wrapper" architecture.
cco-core: A pure Python library containing the domain logic (parsing, validation, OAI-ORE graph generation). It has no web dependencies and can be used in offline scripts.cco-service: A FastAPI web application that wraps the core library to provide a user interface, session management, and asynchronous metadata enrichment.
This separation ensures the tool is portable (can be hosted by Zenodo, a Journal, or a University) and composable (the core logic can be reused in other pipelines).
The system acts as a middleware between raw user inputs (BibTeX, Lists) and permanent repositories (Zenodo, etc.).
The application is containerized using Docker to ensure easy deployment by third parties.
- Frontend (React): A Single Page Application (SPA) for manual editing and visualization.
- Backend (FastAPI): Handles API requests, orchestration, and auth flows.
- Worker (Background Tasks): Handles high-latency tasks (fetching metadata for 1000+ IDs) without blocking the UI.
- Session Store (Redis): Stores "Work-in-Progress" CCOs. Decision: We do not use a permanent database; state is transient.
We use Pydantic to enforce the OAI-ORE specification.
Aggregation: The central collection object.AggregatedResource: The individual Linked Digital Object (LDO).ResourceMap: The top-level document wrapper (serialized to JSON-LD).
To handle diverse inputs without tight coupling, we define a common BaseIngestor interface.
BibTexIngestor: Parses rich metadata from.bibfiles.IdentifierListIngestor: Parses raw lists of DOIs. Creates "hollow" objects that require enrichment.OaiOreIngestor: Parses existing JSON-LD for the "Edit" workflow.
To allow publishing to different repositories, we define a RepositoryAdapter interface.
ZenodoAdapter: Implements the "Reserve DOI -> Update CCO -> Publish" cycle specific to Zenodo.- Future: Can add
DryadAdapter,FigshareAdapterwithout changing core logic.
- Containerization: The system is defined via
docker-compose.ymlorchestrating Backend, Frontend, and Redis. - Rebrandable: The Frontend reads configuration (branding, active adapters) from environment variables to allow hosting institutions to customize the look and feel.
-
Pros:
- High Portability: Can be deployed anywhere via Docker.
- Scalability: Async workers handle large citation lists preventing timeouts.
- Reusability:
cco-corecan be published to PyPI separately.
-
Cons:
- State Complexity: Managing state in Redis (vs a standard DB) requires careful expiration handling.
- Async UX: The Frontend must handle "loading" states gracefully while metadata populates in the background.