anndata2cas add accession_columns parameter #165

Merged: hkir-dev merged 2 commits into main from anndata2cas_accessions on Jun 4, 2025
Conversation

hkir-dev (Collaborator) commented Jun 2, 2025

anndata2cas automatically generates accession IDs from a hash of the cells belonging to each cell set. However, in some cases the dataset (anndata or spreadsheet) already contains pre-calculated accession IDs. This update lets users identify the accession columns and use those values instead of generating new accessions.

Also, parent_cell_set_name doesn't exist in the CAS schema, so I removed it from the JSON output.
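
To illustrate the accession behaviour described above, here is a minimal sketch of the idea, not the code in this PR. The function names `hash_accession` and `resolve_accession`, the SHA-256 choice, and the 10-character truncation are all assumptions made for the example; the actual hashing scheme used by anndata2cas may differ.

```python
import hashlib
from typing import Dict, Iterable, Optional


def hash_accession(cell_ids: Iterable[str], length: int = 10) -> str:
    """Derive a deterministic ID from the cells in a cell set (hypothetical scheme)."""
    # Sort first so the ID does not depend on the order of the cells.
    joined = ",".join(sorted(cell_ids))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:length]


def resolve_accession(
    cellset_name: str,
    cell_ids: Iterable[str],
    precomputed: Optional[Dict[str, str]] = None,
) -> str:
    """Prefer a pre-calculated accession; fall back to the hash otherwise."""
    if precomputed and cellset_name in precomputed:
        return precomputed[cellset_name]
    return hash_accession(cell_ids)


# Usage: the first cell set has a pre-calculated accession, the second does not.
known = {"T cell": "CS0001"}  # hypothetical accession value
print(resolve_accession("T cell", ["c1", "c2"], known))
print(resolve_accession("B cell", ["c3", "c4"], known))
```

The fallback keeps the existing behaviour for datasets without accession columns, so supplying a mapping is purely opt-in in this sketch.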

@hkir-dev hkir-dev requested a review from Copilot June 2, 2025 09:01

Copilot AI left a comment

Pull Request Overview

This PR updates anndata2cas to support using pre-calculated accession IDs by introducing an accession_columns parameter while removing the now-unsupported parent_cell_set_name field. Key changes include updating tests to reflect schema changes, modifying conversion utility functions to accept an accession mapping, and adding a new MappedAccessionManager to handle predefined accession IDs.

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/test/spreadsheet_to_cas_test.py | Updated expected annotation length in the test to match the removal of parent_cell_set_name. |
| src/test/conversion_utils_test.py | Commented out parent_cell_set_name test case as it is no longer part of the schema. |
| src/cas/utils/conversion_utils.py | Added an accession mapping parameter to generate_parent_cell_lookup and created create_accession_mapping. |
| src/cas/anndata_to_cas.py | Updated function signature and logic to pass accession_columns and use the new accession mapping. |
| src/cas/accession/*.py | Updated accession manager implementations to accept an extra cellset_name parameter. |
| src/cas/main.py & docs/cli.md | Updated CLI options and documentation for the new accession_columns parameter. |
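
As a rough illustration of what an accession mapping built from accession columns could look like, the sketch below pairs each label column with an accession column and collects a cell set name to accession lookup. This is not the `create_accession_mapping` from this PR, whose signature is not shown here; `build_accession_mapping`, the positional pairing of columns, and the sample column names and values are assumptions for the example.

```python
from typing import Dict, List, Sequence


def build_accession_mapping(
    rows: List[Dict[str, str]],
    label_columns: Sequence[str],
    accession_columns: Sequence[str],
) -> Dict[str, str]:
    """Map each cell set label to its pre-calculated accession (hypothetical helper)."""
    # Assumption: the i-th accession column belongs to the i-th label column.
    if len(label_columns) != len(accession_columns):
        raise ValueError("label_columns and accession_columns must have equal length")

    mapping: Dict[str, str] = {}
    for row in rows:
        for label_col, acc_col in zip(label_columns, accession_columns):
            label = row.get(label_col)
            accession = row.get(acc_col)
            if not label or not accession:
                continue  # nothing to record for this cell in this column pair
            # Keep the first accession seen and reject conflicting values.
            previous = mapping.setdefault(label, accession)
            if previous != accession:
                raise ValueError(
                    f"Conflicting accessions for {label!r}: {previous} vs {accession}"
                )
    return mapping


# Usage with made-up per-cell rows (one dict per cell).
cells = [
    {"cluster": "T cell", "cluster_acc": "CS1", "class": "Immune", "class_acc": "CS9"},
    {"cluster": "B cell", "cluster_acc": "CS2", "class": "Immune", "class_acc": "CS9"},
]
print(build_accession_mapping(cells, ["cluster", "class"], ["cluster_acc", "class_acc"]))
```

Raising on conflicting accessions is a design choice of this sketch: a cell set that maps to two different IDs would otherwise be resolved silently by row order.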

Comment thread src/cas/utils/conversion_utils.py Outdated
@hkir-dev hkir-dev requested a review from ubyndr June 2, 2025 09:08
@hkir-dev hkir-dev merged commit 61a9f4a into main Jun 4, 2025
1 check passed
@hkir-dev hkir-dev deleted the anndata2cas_accessions branch June 4, 2025 14:05

3 participants