Skip to content

Release 8 fixes#167

Closed
aditigopalan wants to merge 11 commits into
mainfrom
release-8-fixes
Closed

Release 8 fixes#167
aditigopalan wants to merge 11 commits into
mainfrom
release-8-fixes

Conversation

@aditigopalan
Copy link
Copy Markdown
Collaborator

Summary
Fixes #152 / #153 — Required enum fields missing sentinel values: adds AJCC_TUMOR_STAGING_AVAILABLE (Yes/No/Unknown) as a new required field to Diagnosis, makes AJCC_STAGING_SYSTEM_EDITION optional (conditional on the new flag), and adds "Not Reported" to ECOGPerformanceStatusEnum

Fixes #165 — PHARMACOTHERAPY_TYPE and THERAPEUTIC_AGENTS were unconditionally required in Therapy; both are now set to required: false in the class body and slot_usage, and moved into the pharmacotherapy conditional rule

Fixes #166 — LinkML generates {"const": "Pharmacotherapy"} for equals_string checks on multivalued slots, which never matches an array value; adds a fix_multivalued_conditions() post-processor in linkml_to_flat_synapse_jsonschema.py that rewrites these to {"contains": {"const": "..."}} at build time

Test plan

  • poetry run pytest modules/Clinical/tests/ passes (11/11)
  • make modules-gen completes without new errors
  • Verify a Diagnosis record with AJCC_TUMOR_STAGING_AVAILABLE: "No" and no AJCC_STAGING_SYSTEM_EDITION loads without error
  • Verify a Therapy record with TREATMENT_TYPE: ["Surgical Procedure"] (no pharmacotherapy) no longer requires THERAPEUTIC_AGENTS or PHARMACOTHERAPY_TYPE
  • Verify a Therapy record with TREATMENT_TYPE: ["Pharmacotherapy"] still requires both fields

aditigopalan and others added 3 commits April 9, 2026 11:04
…ional on TREATMENT_TYPE

- Set PHARMACOTHERAPY_TYPE and THERAPEUTIC_AGENTS to required: false in
  the Therapy class body and slot_usage so they are absent from the
  top-level JSON Schema required array
- Add both to the pharmacotherapy rule postconditions so they become
  required only when TREATMENT_TYPE includes "Pharmacotherapy"
- Fix linkml_to_flat_synapse_jsonschema.py: add fix_multivalued_conditions()
  post-processor that rewrites const/pattern checks on array-typed
  properties to use JSON Schema `contains`, so conditionals on multivalued
  slots (TREATMENT_TYPE) fire correctly against array values
- Broaden array_props detection to include $defs so the fix works when
  generating for a parent class (e.g. ClinicalData)
- Add YAML comment documenting the LinkML equals_string/multivalued
  limitation and the downstream workaround

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…enum (fixes #152)

- Add AJCCTumorStagingAvailableEnum (Yes/No/Unknown) and required AJCC_TUMOR_STAGING_AVAILABLE slot to Diagnosis
- Make AJCC_STAGING_SYSTEM_EDITION optional (conditional on AJCC_TUMOR_STAGING_AVAILABLE = Yes)
- Add "Not Reported" to ECOGPerformanceStatusEnum
- Update test data to include new required field

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
github-actions[bot]
github-actions Bot previously approved these changes Apr 9, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTAN Schema Review — ✅ Approved

Automated review by htan-claude · commit d68159a92c31f7df6154a31ce1875d0edf3d8566

Solid work on this PR — there are a few meaningful issues to address before merge, including one blocking concern about the equals_string predicate on a multivalued slot that leaves the source YAML semantically incorrect for native LinkML validators.

Caution

1 blocking issue must be resolved before merge

Files Changed

  • modules/Clinical/domains/diagnosis.yaml
  • modules/Clinical/domains/followup.yaml
  • modules/Clinical/domains/therapy.yaml
  • scripts/linkml_to_flat_synapse_jsonschema.py
  • modules/Clinical/tests/test_conditional_requirements.yaml
  • modules/Clinical/tests/test_clinical_data.yaml

Checklist Results

Check Result Notes
Inheritance correctness PASS No inheritance changes introduced
Inlining of nested objects N/A No new nested objects added
Slot completeness (range, title, description) PASS New slot AJCC_TUMOR_STAGING_AVAILABLE has all required fields
Enum integrity (alphabetical, descriptions) PASS All new/modified enum values are correctly ordered and have descriptions
Generated artifacts N/A Generated files excluded from diff per scope rules

Findings

Blocking

  • modules/Clinical/domains/therapy.yamlequals_string predicate on multivalued TREATMENT_TYPE is semantically incorrect for native LinkML validation (biological)
    The pharmacotherapy conditional rule uses preconditions: slot_conditions: TREATMENT_TYPE: equals_string: "Pharmacotherapy", but TREATMENT_TYPE is declared multivalued: true. LinkML's equals_string is defined for scalar values only and will never fire correctly on an array-valued slot in any validator consuming the schema directly (e.g., linkml-validate). The fix_multivalued_conditions() post-processor patches this for Synapse JSON Schema output, but the source YAML is left semantically broken for native LinkML tooling. The correct fix is to use any_of with pattern (consistent with the existing surgical/radiation precondition already in the file), e.g., any_of: - pattern: ".*Pharmacotherapy.*", which works in both LinkML-native and JSON Schema contexts. The post-processor workaround can remain for Synapse compatibility, but the source YAML should not rely exclusively on it.

Warnings

  • modules/Clinical/domains/diagnosis.yamlAJCC_TUMOR_STAGING_AVAILABLE marked required: true may be too strict for many cancer types (biological)
    AJCC staging is definitionally inapplicable for many cancer types (e.g., haematologic malignancies, CNS tumours), meaning this field would routinely be absent in valid datasets. The PR frames this slot as a sentinel to enable conditionality on AJCC_STAGING_SYSTEM_EDITION, so making the sentinel itself unconditionally required may force incorrect data entry. Consider either setting required: false or adding a "Not Applicable" permissible value to AJCCTumorStagingAvailableEnum (ordered alphabetically between "No" and "Unknown") and confirming HTAN mandates this field universally via a data-dictionary reference.

  • modules/Clinical/domains/diagnosis.yamlAJCCTumorStagingAvailableEnum missing "Not Applicable" permissible value (biological)
    The enum covers Yes / No / Unknown but provides no way to express that AJCC staging is definitionally inapplicable for a given cancer type. Neither "No" (staging was possible but not done) nor "Unknown" (unknown whether it was done) accurately captures cases like leukaemias or certain CNS tumours. Analogous HTAN enums consistently include "Not Applicable"; its absence here will force submitters to misuse the existing values. Add "Not Applicable": Determination of a value is not relevant in the current context. ordered alphabetically between "No" and "Unknown".

  • modules/Clinical/domains/therapy.yaml — Missing newline at end of file (structural)
    The diff shows \ No newline at end of file after the final line of therapy.yaml. While most YAML parsers tolerate this, it is non-conformant with POSIX text file standards and can cause subtle issues with some tooling (diffs, concatenation, YAML streaming parsers). Add a trailing newline after the last line.

  • modules/Clinical/domains/diagnosis.yaml — No invalid-value test for AJCCTumorStagingAvailableEnum, and no missing-required-field test for AJCC_TUMOR_STAGING_AVAILABLE (coverage)
    Two test gaps exist for the new enum and slot: (1) no test asserts that an invalid value (e.g., "Maybe") raises a ValueError; (2) no test verifies that a Diagnosis record missing AJCC_TUMOR_STAGING_AVAILABLE fails validation (or asserts required == True via schema-view). Valid-value coverage exists in the updated test data, but the negative paths are unexercised. Add both assertion types to the Clinical test suite.

  • modules/Clinical/domains/followup.yaml — No test for newly added "Not Reported" value in ECOGPerformanceStatusEnum (coverage)
    "Not Reported" was appended to ECOGPerformanceStatusEnum but no test data or assertion covering ECOG_PERFORMANCE_STATUS: "Not Reported" is visible in the diff. Add a valid-value test confirming this value is accepted by the schema.

  • modules/Clinical/domains/therapy.yaml — Non-pharmacotherapy Therapy record negative path not exercised in tests (coverage)
    Both PHARMACOTHERAPY_TYPE and THERAPEUTIC_AGENTS were changed from required: true to required: false, but the test data for the surgical procedure case in test_conditional_requirements.yaml still includes both fields. The key regression path — a Therapy record with TREATMENT_TYPE: ["Surgical Procedure"] and neither PHARMACOTHERAPY_TYPE nor THERAPEUTIC_AGENTS present — is not exercised. Add a test case omitting both fields for a non-pharmacotherapy treatment type to confirm the fix holds.

Informational

  • modules/Clinical/domains/therapy.yaml — Redundant required: false in both attributes and slot_usage for PHARMACOTHERAPY_TYPE / THERAPEUTIC_AGENTS (biological)
    Both slots have required: false set in the class attributes block and again in slot_usage. The slot_usage entries are still valuable for carrying contextual description strings, but explicitly repeating required: false there implies the override is load-bearing when it isn't. Consider removing required: false from the slot_usage entries and retaining only the descriptions, to make the class-body setting the unambiguous authoritative source. No action required, but worth cleaning up for future editor clarity.

  • pyproject.toml — LinkML version pin not verifiable from this diff (coverage)
    The diff does not include pyproject.toml, so the LinkML version pin cannot be confirmed. The README documents linkml ^1.3.5, which is more than 2 minor versions behind the current stable 1.8.x line. No action required from the author on this PR, but a dedicated dependency-update PR is worth tracking.

  • modules/Clinical/domains/diagnosis.yaml — Schema version bump may be warranted (coverage)
    This PR adds a new required slot, makes an existing slot optional, and introduces a new enum — collectively breaking changes for existing Diagnosis records. If diagnosis.yaml carries a version: header field, consider a minor or major version bump to signal the compatibility boundary. Awareness only; conventions may vary by repo policy.

Verdict

REQUEST_CHANGES


Rules defined in CLAUDE.md · To update review rules, edit CLAUDE.md and open a PR to main

Auto-generated Python classes from LinkML schema updates.

**Auto-generated by GitHub Actions workflow**
[skip ci]
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 9, 2026

Python classes auto-updated!

The Python classes have been automatically regenerated and committed to this PR branch.

**Updated files:**
```
modules/Biospecimen/src/htan_biospecimen/datamodel/biospecimen.py

modules/Clinical/src/htan_clinical/datamodel/clinical.py
modules/DigitalPathology/src/htan_digitalpathology/datamodel/digital_pathology.py
modules/Imaging/src/htan_imaging/datamodel/imaging.py
modules/MultiplexMicroscopy/src/htan_multiplexmicroscopy/datamodel/multiplex_microscopy.py
modules/Sequencing/src/htan_sequencing/datamodel/sequencing.py
modules/SpatialOmics/src/htan_spatial/datamodel/spatial.py
modules/WES/src/htan_wes/datamodel/wes.py
modules/scRNA-seq/src/htan_scrna_seq/datamodel/scrna_seq.py
```

The generated classes are now up to date with the schema changes.

aditigopalan and others added 2 commits April 9, 2026 12:21
…multivalued slot

equals_string on a multivalued slot is semantically undefined in LinkML and
never fires in native validators (e.g. linkml-validate). Switch to any_of with
pattern, consistent with the existing surgical/radiation precondition.
fix_multivalued_conditions() in the Synapse post-processor already handles
pattern rewrites to contains, so Synapse JSON Schema output is unaffected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions github-actions Bot dismissed their stale review April 9, 2026 16:23

Superseded by updated review on commit e58fa18

github-actions[bot]
github-actions Bot previously approved these changes Apr 9, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTAN Schema Review — ✅ Approved

Automated review by htan-claude · commit e58fa183e2c4f2526d4f10b25b10b9332600ba22

Solid set of fixes for Release 8 — a few warnings to address (notably test coverage gaps and a potential "Not Applicable" gap in the new enum), but nothing that's outright broken.

Warning

6 warnings to address (no blocking issues)

Files Changed

  • modules/Clinical/domains/diagnosis.yaml
  • modules/Clinical/domains/followup.yaml
  • modules/Clinical/domains/therapy.yaml
  • modules/Clinical/tests/test_clinical_data.yaml
  • modules/Clinical/tests/test_conditional_requirements.yaml
  • scripts/linkml_to_flat_synapse_jsonschema.py

Checklist Results

Check Result Notes
Inheritance correctness PASS No inheritance changes in scope
Inlining of nested objects PASS No inlining changes introduced
Slot completeness (range, title, description) PASS New slot AJCC_TUMOR_STAGING_AVAILABLE has all required fields; slot_usage overrides correctly omit range/title per established pattern
Enum integrity (alphabetical, descriptions) PASS AJCCTumorStagingAvailableEnum (N < U < Y) and ECOGPerformanceStatusEnum (digits before strings) both sort correctly; all new values have descriptions
Generated artifacts N/A Generated .py and .json files excluded from diff per scope rules

Findings

Warnings

  • modules/Clinical/domains/diagnosis.yamlAJCCTumorStagingAvailableEnum missing "Not Applicable" permissible value (biological)
    The new enum provides "Yes", "No", and "Unknown", but omits "Not Applicable" — a clinically meaningful sentinel for tumor types where AJCC staging is formally inapplicable (e.g., brain tumors, hematologic malignancies, many pediatric cancers). Without it, curators are forced to record "Unknown" (implies undetermined) or "No" (implies determined-to-be-unavailable), both of which are semantically incorrect for an inapplicable scenario. This also creates inconsistency with analogous sentinel enums elsewhere in the Clinical module (e.g., ProgressionOrRecurrenceEnum), which do include "Not Applicable". Consider adding "Not Applicable" with an appropriate description before merge.

  • modules/Clinical/domains/therapy.yaml — Missing trailing newline at end of file (structural)
    The diff shows \ No newline at end of file after the final required: true line added by this PR. While YAML parsers typically handle this gracefully, it is non-conformant and can produce spurious diffs in future PRs. Since this PR modifies the last line of the file, it's in scope to fix: add a single trailing newline after the final required: true.

  • modules/Clinical/domains/therapy.yaml — Pharmacotherapy conditional regex may produce false positives (biological)
    The precondition pattern: ".*Pharmacotherapy.*" will substring-match any future TreatmentTypeEnum value containing "Pharmacotherapy" (e.g., a hypothetical "Non-Pharmacotherapy" or "Anti-Pharmacotherapy"). The current enum has only one matching value so there's no immediate failure, but a tighter pattern — pattern: "^Pharmacotherapy$" — would be more defensive and express intent more clearly. The fix_multivalued_conditions() post-processor correctly rewrites the contains wrapping at build time, but the regex permissiveness remains a latent schema-level risk.

  • modules/Clinical/domains/diagnosis.yaml + modules/Clinical/domains/followup.yaml — Missing automated test coverage for new slot, changed optionality, and new enum value (coverage)
    Three behavioral changes introduced by this PR lack dedicated pytest assertions: (a) AJCC_TUMOR_STAGING_AVAILABLE is required: true — no test verifies a Diagnosis record missing this field fails validation or that an invalid enum value like "Maybe" raises an error; (b) AJCC_STAGING_SYSTEM_EDITION is now optional — no test asserts a Diagnosis with AJCC_TUMOR_STAGING_AVAILABLE: "No" and no AJCC_STAGING_SYSTEM_EDITION loads without error; (c) "Not Reported" in ECOGPerformanceStatusEnum — no test validates acceptance of that value while rejecting out-of-range inputs. The PR test plan calls these out as manual verification steps; they should be encoded as automated pytest cases in modules/Clinical/tests/.

  • modules/Clinical/domains/therapy.yaml — Missing automated test coverage for pharmacotherapy conditional rule (coverage + biological)
    The core fix in this PR (moving PHARMACOTHERAPY_TYPE and THERAPEUTIC_AGENTS into the pharmacotherapy conditional rule) has no invalid-instance test coverage for either path: (a) a Therapy record with TREATMENT_TYPE: ["Pharmacotherapy"] but missing PHARMACOTHERAPY_TYPE or THERAPEUTIC_AGENTS should fail validation; (b) a Therapy record with TREATMENT_TYPE: ["Surgical Procedure"] and neither field present should load without error. The existing test fixture at test_conditional_requirements.yaml includes pharmacotherapy fields even for a "Surgical Procedure" treatment type, which means it doesn't exercise the negative path the PR is specifically fixing. Add pytest cases for both paths.

  • modules/Clinical/domains/diagnosis.yamlAJCC_TUMOR_STAGING_AVAILABLE: required: true may break validation for legacy records (biological + coverage)
    Making the new sentinel field unconditionally required: true means any existing Diagnosis record submitted before this PR (which cannot have AJCC_TUMOR_STAGING_AVAILABLE) will fail validation on re-submission or schema re-validation. This is expected for a breaking change, but should be explicitly acknowledged — and if a schema version: field exists on the Clinical module, a minor or major version bump is warranted to signal the break to downstream consumers. Confirm this is intentional and document it in the PR or changelog.

Informational

  • modules/Clinical/domains/followup.yamlECOGPerformanceStatusEnum mixes numeric and string values (biological)
    The addition of "Not Reported" after "5" is alphabetically correct (digits sort before letters in lexicographic order) and consistent with how other enums in this module handle the same sentinel. Downstream consumers sorting the enum numerically may place "Not Reported" unexpectedly, but this is a pre-existing pattern in the module — no action required.

  • modules/Clinical/tests/test_conditional_requirements.yaml — Surgical-procedure therapy record includes pharmacotherapy fields (biological + coverage)
    The test fixture specifies TREATMENT_TYPE: "Surgical Procedure" while also populating PHARMACOTHERAPY_TYPE and THERAPEUTIC_AGENTS. This doesn't cause a validation failure (optional fields may be present), but it doesn't exercise the key negative case the PR intends to fix. Addressed as a warning above (add an explicit negative-path test); flagged here for awareness that the existing fixture alone is insufficient documentation of the fixed behavior.

Verdict

REQUEST_CHANGES


Rules defined in CLAUDE.md · To update review rules, edit CLAUDE.md and open a PR to main

Auto-generated Python classes from LinkML schema updates.

**Auto-generated by GitHub Actions workflow**
[skip ci]
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 9, 2026

Python classes auto-updated!

The Python classes have been automatically regenerated and committed to this PR branch.

**Updated files:**
```
modules/Biospecimen/src/htan_biospecimen/datamodel/biospecimen.py

modules/Clinical/src/htan_clinical/datamodel/clinical.py
modules/DigitalPathology/src/htan_digitalpathology/datamodel/digital_pathology.py
modules/Imaging/src/htan_imaging/datamodel/imaging.py
modules/MultiplexMicroscopy/src/htan_multiplexmicroscopy/datamodel/multiplex_microscopy.py
modules/Sequencing/src/htan_sequencing/datamodel/sequencing.py
modules/SpatialOmics/src/htan_spatial/datamodel/spatial.py
modules/WES/src/htan_wes/datamodel/wes.py
modules/scRNA-seq/src/htan_scrna_seq/datamodel/scrna_seq.py
```

The generated classes are now up to date with the schema changes.

aditigopalan and others added 3 commits April 9, 2026 14:44
#159)

Sphinx slugifies heading IDs by lowercasing and replacing non-alphanumeric
characters with dashes. The link generator was using .lower() only, so
enum names with underscores (e.g. tissue_or_organ_of_origin_uberon_enum)
generated anchors that didn't match the rendered heading ID, causing the
broken link reported for SITE_OF_RESECTION_OR_BIOPSY.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add ChannelMetadata to the json-schema-synapse.yml workflow so
HTAN.ChannelMetadata-v*.json is produced at each release alongside
the Level 2/3/4 schemas. Also update the MultiplexMicroscopy Makefile
so local gen-synapse-schema generates both schemas.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions github-actions Bot dismissed their stale review April 9, 2026 18:46

Superseded by updated review on commit 76d6fde

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTAN Schema Review — ✅ Approved

Automated review by htan-claude · commit 76d6fdec57e7c8633a7b747dbe32dd5837015851

Solid PR with a clear problem statement — found one blocking issue with an invalid LinkML precondition expression, several test coverage gaps, and a few informational notes worth tracking.

Caution

1 blocking issue must be resolved before merge

Files Changed

  • modules/Clinical/domains/diagnosis.yaml
  • modules/Clinical/domains/therapy.yaml
  • modules/Clinical/domains/followup.yaml
  • modules/Clinical/linkml_to_flat_synapse_jsonschema.py
  • modules/Clinical/tests/test_clinical_data.yaml
  • modules/Clinical/tests/test_conditional_requirements.yaml

Checklist Results

Check Result Notes
Inheritance correctness PASS No inheritance changes in this PR
Inlining of nested objects N/A No nested object changes
Slot completeness (range, title, description) PASS New slot AJCC_TUMOR_STAGING_AVAILABLE and modified slots are structurally complete
Enum integrity (alphabetical, descriptions) PASS PharmacotherapyTypeEnum and ECOGPerformanceStatusEnum ordering correct; new values have descriptions
Generated artifacts N/A Generated files excluded from diff per scope rules

Findings

Blocking

  • therapy.yaml — Invalid precondition expression for AGE_IN_DAYS_AT_TREATMENT_END (structural + biological)
    The rule that makes OFF_TREATMENT_REASON and RESPONSE conditionally required uses required: true inside preconditions.slot_conditions.AGE_IN_DAYS_AT_TREATMENT_END. In LinkML, required is a slot definition property — it is not a valid runtime value comparator inside a rule precondition. LinkML preconditions check slot values (via equals_string, minimum_value, value_presence, etc.), not schema declarations. As written, this rule will never fire, meaning OFF_TREATMENT_REASON and RESPONSE will remain unconditionally optional even when a treatment end date is supplied. Fix by replacing required: true with value_presence: PRESENT in the precondition block.

Warnings

  • therapy.yaml — Multivalued slot pattern matching in rule preconditions may not fire correctly (structural + biological)
    The pharmacotherapy and surgical/radiation conditional rules use any_of: [{pattern: ".*Pharmacotherapy.*"}] (and analogous patterns) as preconditions on TREATMENT_TYPE, which is multivalued: true. LinkML's slot_conditions with pattern evaluates against slot values in an implementation-dependent way for multivalued slots — the PR itself notes a related issue where equals_string on multivalued slots generates broken {"const": "..."} JSON Schema (fixed via fix_multivalued_conditions()). It is not confirmed that the same post-processor handles pattern-based preconditions in conditional rules. If these preconditions silently fail, PHARMACOTHERAPY_TYPE and THERAPEUTIC_AGENTS will never be enforced even when TREATMENT_TYPE contains "Pharmacotherapy". Verify during make modules-gen and confirm whether fix_multivalued_conditions() covers this case, or document the limitation explicitly.

  • diagnosis.yamlAJCC_TUMOR_STAGING_AVAILABLE required cardinality may break retrospective data re-ingestion (biological)
    The new AJCC_TUMOR_STAGING_AVAILABLE slot is added as required: true. Retrospective HTAN cohorts frequently lack documentation of whether AJCC staging was even attempted (distinct from staging results being absent), so a hard required constraint may cause existing valid Diagnosis records to fail validation on re-ingestion. Consider whether required: false with a default of "Unknown" better reflects real-world data completeness, or explicitly document why retroactive enforcement is acceptable for this release.

  • modules/Clinical/tests/ — Missing test coverage for AJCC_TUMOR_STAGING_AVAILABLE conditionality and the "No" path (structural + coverage)
    The updated test data covers only the case where AJCC_TUMOR_STAGING_AVAILABLE: "Yes" and AJCC_STAGING_SYSTEM_EDITION: "8th" are both present. Missing coverage includes: (a) a Diagnosis record with AJCC_TUMOR_STAGING_AVAILABLE: "No" and no AJCC_STAGING_SYSTEM_EDITION loading without error, (b) a record missing AJCC_TUMOR_STAGING_AVAILABLE entirely raising a ValueError, and (c) confirmation that AJCC_STAGING_SYSTEM_EDITION is no longer unconditionally required at the class level. The PR description lists these as manual verification steps — they should be automated.

  • modules/Clinical/tests/ — Missing test coverage for PHARMACOTHERAPY_TYPE / THERAPEUTIC_AGENTS optionality (coverage)
    The test data does not demonstrate a Therapy record with TREATMENT_TYPE: ["Surgical Procedure"] that omits both PHARMACOTHERAPY_TYPE and THERAPEUTIC_AGENTS and loads without error. Without this, the fix for issue #165 is not automatically validated. Add a test case asserting these fields are required: false at the class level and that a non-pharmacotherapy record omitting them passes validation.

  • modules/Clinical/tests/ — New "Not Reported" value in ECOGPerformanceStatusEnum lacks test coverage (coverage)
    The new permissible value "Not Reported" added to ECOGPerformanceStatusEnum is not exercised in any visible test data. Add at least one test that loads a valid FollowUp (or equivalent) instance with ECOG_PERFORMANCE_STATUS: "Not Reported" and confirms no error is raised. An invalid-value rejection test (e.g., "6") would further strengthen coverage.

Informational

  • diagnosis.yaml / therapy.yaml — Schema version bump advisable for breaking changes (biological + coverage)
    This PR adds a new required field (AJCC_TUMOR_STAGING_AVAILABLE), makes a previously required field optional (AJCC_STAGING_SYSTEM_EDITION), and changes the requiredness of two slots in Therapy — all backward-incompatible changes for existing data consumers. If the schema headers carry a version: field, incrementing it would communicate the breaking change clearly. No action required unless the team's versioning policy mandates it.

  • therapy.yaml — Generic slot name RESPONSE noted for future refactoring (coverage)
    The slot RESPONSE was not introduced in this PR but its slot_usage entry was touched (changed to required: false). For awareness: a more specific name such as TREATMENT_RESPONSE would reduce risk of slot overloading in future cross-module work. No action needed now.

Verdict

REQUEST_CHANGES


Rules defined in CLAUDE.md · To update review rules, edit CLAUDE.md and open a PR to main

Auto-generated Python classes from LinkML schema updates.

**Auto-generated by GitHub Actions workflow**
[skip ci]
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 9, 2026

Python classes auto-updated!

The Python classes have been automatically regenerated and committed to this PR branch.

**Updated files:**
```
modules/Biospecimen/src/htan_biospecimen/datamodel/biospecimen.py

modules/Clinical/src/htan_clinical/datamodel/clinical.py
modules/DigitalPathology/src/htan_digitalpathology/datamodel/digital_pathology.py
modules/Imaging/src/htan_imaging/datamodel/imaging.py
modules/MultiplexMicroscopy/src/htan_multiplexmicroscopy/datamodel/multiplex_microscopy.py
modules/Sequencing/src/htan_sequencing/datamodel/sequencing.py
modules/SpatialOmics/src/htan_spatial/datamodel/spatial.py
modules/WES/src/htan_wes/datamodel/wes.py
modules/scRNA-seq/src/htan_scrna_seq/datamodel/scrna_seq.py
```

The generated classes are now up to date with the schema changes.

@aditigopalan
Copy link
Copy Markdown
Collaborator Author

Closing to start fresh on clinical changes.

@aditigopalan aditigopalan deleted the release-8-fixes branch April 14, 2026 16:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants