Update underlying template to v0.4.2 & linkml to v1.10.0#64

Draft
dalito wants to merge 4 commits intomainfrom
update-linkml-and-template

Conversation

@dalito
Collaborator

@dalito dalito commented Mar 3, 2026

  • pre-commit hooks were updated
  • various end-of-line/file fixes by pre-commit

@github-actions
Contributor

github-actions bot commented Mar 3, 2026

PR Preview Action v1.8.1

🚀 View preview at
https://nfdi-de.github.io/dcat-ap-plus/pr-preview/pr-64/

Built to branch gh-pages at 2026-03-04 00:03 UTC.
Preview will be ready when the GitHub Pages deployment is complete.

@HendrikBorgelt
Collaborator

Hi @dalito,

@StroemPhi and I ran into the same errors that you are seeing now. @StroemPhi has written up a bug analysis and a proposed fix, which I think we wanted to discuss with you tomorrow. If Philip hasn't contacted you yet, maybe wait until tomorrow so we can share what we found.

@dalito
Collaborator Author

dalito commented Mar 4, 2026

It looks to me like a linkml-runtime bug, but I need to verify.

@HendrikBorgelt
Collaborator

Yes, it is.

Here are some PRs touching on this issue:
github.com/linkml/linkml/pull/3165
linkml/linkml#3183
linkml/linkml#3182

@StroemPhi has narrowed it down a bit further. I will share more with you via RocketChat.

@StroemPhi
Member

Fallback key heuristic in gen_postinit breaks _normalize_inlined for inlined_as_list slots

Summary

PR #3165 removed the `and not slot.inlined_as_list` guard from the fallback identifier search in gen_postinit. As a result, the generated __post_init__ calls _normalize_inlined_as_list with a non-identifier required slot as key_name. This triggers latent bugs in yamlutils._normalize_inlined, which raise ValueError and TypeError at runtime when loading data via linkml-convert.

Affected versions

  • Works: linkml 1.9.6
  • Broken: linkml 1.10.0

Reproducer

Schema (minimal extract -- full schema at dcat-ap-plus):

classes:
  Entity:
    slots:
      - other_identifier
    slot_usage:
      other_identifier:
        range: Identifier
        multivalued: true
        inlined_as_list: true

  Identifier:
    slots:
      - notation
    slot_usage:
      notation:
        range: string
        required: true

Identifier has no `identifier: true` or `key: true` slot; notation is simply a required slot.

Data:

other_identifier:
  - notation: https://pubchem.ncbi.nlm.nih.gov/compound/26248854

Command:

linkml-convert -s schema.yaml -C Entity -t rdf data.yaml

Error (linkml 1.10.0):

File "linkml_runtime/utils/yamlutils.py", line 182, in _normalize_inlined
    order_up(list_entry[lek], slot_type(list_entry))
ValueError: notation must be supplied

The same data passes linkml validate without error, because validation uses JSON Schema and never instantiates the generated Python dataclasses.

Root cause

In gen_postinit (python_generator.py), the `elif slot.inlined:` block searches for a fallback identifier when no true identifier/key exists:

1.9.6 (working):

if not identifier and not slot.inlined_as_list:
    for range_slot_name in slot_range_cls.slots:
        range_slot = self.schema.slots[range_slot_name]
        if range_slot.required:
            identifier = range_slot.name
            break
    keyed = False

1.10.0 (broken, after PR #3165):

if not identifier:
    for range_slot_name in slot_range_cls.slots:
        range_slot = self.schema.slots[range_slot_name]
        if range_slot.required and range_slot.range not in self.schema.classes:
            identifier = range_slot.name
            break
    keyed = False

Removing `and not slot.inlined_as_list` means the fallback now also fires for inlined_as_list slots. A non-identifier required slot (notation) is then passed as key_name to _normalize_inlined_as_list, which enters a code path in yamlutils._normalize_inlined (line ~182) that calls `slot_type(list_entry)`, passing a dict as a single positional argument instead of unpacking it as keyword arguments. The generated dataclass constructor expects keyword arguments, so notation is never set, and __post_init__ reports the missing required slot via MissingRequiredField, which surfaces as the ValueError shown above.

Even after patching that line to slot_type(**as_dict(list_entry)), a second bug surfaces in form_1 (line ~160) where raw_obj[key_name] = key fails with TypeError: list indices must be integers or slices, not str when raw_obj happens to be a list.

Both are latent bugs in _normalize_inlined that were never triggered before because inlined_as_list slots without a true identifier never entered that code path.
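The first failure mode can be illustrated without linkml at all. The following is a minimal sketch, not the generated code: `IdentifierSketch` is a hypothetical stand-in for the generated class, and its extra `creator` field and the field order are assumptions chosen so that a positional dict lands on a field other than notation.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class IdentifierSketch:
    # Hypothetical stand-in for a generated dataclass (not linkml output).
    creator: Optional[Any] = None
    notation: Optional[str] = None

    def __post_init__(self) -> None:
        # Mimics the required-slot check done in a generated __post_init__.
        if self.notation is None:
            raise ValueError("notation must be supplied")


entry = {"notation": "https://pubchem.ncbi.nlm.nih.gov/compound/26248854"}

# Positional call: the whole dict is bound to the first field,
# so notation stays None and the required-slot check fails.
try:
    IdentifierSketch(entry)
    positional_error = None
except ValueError as exc:
    positional_error = str(exc)

# Keyword unpacking: each dict key is matched to the field of the same name.
unpacked = IdentifierSketch(**entry)
```

In this sketch the positional call fails with the same message as the traceback above, while the unpacked call succeeds.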

Why the fallback is conceptually wrong for lists

The fallback key heuristic exists to organize entries in a dict -- you need some key to index by. Lists don't need keys. The simple list comprehension that handles the no-identifier case works correctly:

sn = [v if isinstance(v, Type) else Type(**as_dict(v)) for v in sn]
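As a rough, self-contained illustration of why that comprehension needs no key, here is a sketch with a hypothetical `Identifier` dataclass and plain `dict(v)` standing in for linkml-runtime's `as_dict` (both are assumptions made for this example):

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Identifier:
    # Hypothetical stand-in for the generated class.
    notation: str


def normalize_list(values: List[Any]) -> List[Identifier]:
    # Keep existing instances as they are; build new ones from mappings.
    # No key or identifier slot is consulted at any point.
    return [
        v if isinstance(v, Identifier) else Identifier(**dict(v))
        for v in values
    ]


raw = [{"notation": "a"}, Identifier(notation="b")]
normalized = normalize_list(raw)
```

Order is preserved and each entry is handled independently, which is all a list-valued slot requires.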

Proposed fix

Move the guard inside the if not identifier block so it only restricts the fallback, not true identifiers:

if not identifier:
    if not slot.inlined_as_list:
        for range_slot_name in slot_range_cls.slots:
            range_slot = self.schema.slots[range_slot_name]
            if range_slot.required and range_slot.range not in self.schema.classes:
                identifier = range_slot.name
                break
    keyed = False

This preserves PR #3165's range_slot.range not in self.schema.classes refinement for dict cases while preventing the fallback from firing for list cases, where it causes the crash.

Behavior matrix:

| Slot has identifier/key? | inlined_as_list? | Old (1.9.6) | New (1.10.0) | Fix |
|---|---|---|---|---|
| Yes | Yes | _normalize_inlined_as_list | _normalize_inlined_as_list | _normalize_inlined_as_list (unchanged) |
| Yes | No | _normalize_inlined_as_dict | _normalize_inlined_as_dict | _normalize_inlined_as_dict (unchanged) |
| No | Yes | list comprehension | crash | list comprehension (restored) |
| No | No | fallback -> _normalize_inlined_as_dict | fallback -> _normalize_inlined_as_dict | fallback -> _normalize_inlined_as_dict (unchanged) |
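The "Fix" column can also be written down as a small decision function. This is only a sketch of the intended dispatch: `choose_normalizer` is a hypothetical helper, not part of linkml, and it assumes that in the no-identifier dict case a suitable required slot exists for the fallback to pick.

```python
def choose_normalizer(has_identifier: bool, inlined_as_list: bool) -> str:
    """Return the name of the normalization path the proposed fix selects."""
    # The fallback key heuristic only applies when there is no true
    # identifier/key AND the slot is not inlined as a list.
    use_fallback = not has_identifier and not inlined_as_list
    if has_identifier or use_fallback:
        if inlined_as_list:
            return "_normalize_inlined_as_list"
        return "_normalize_inlined_as_dict"
    # No identifier and inlined_as_list: plain list comprehension.
    return "list comprehension"


matrix = {
    (has_id, as_list): choose_normalizer(has_id, as_list)
    for has_id in (True, False)
    for as_list in (True, False)
}
```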

Secondary issue in yamlutils._normalize_inlined

Independent of the gen-python fix, there are two latent bugs in linkml_runtime.utils.yamlutils._normalize_inlined that should be tracked separately:

  1. Line ~182: slot_type(list_entry) passes a dict as a positional arg; should be slot_type(**as_dict(list_entry))
  2. Line ~160: form_1 doesn't handle the case where raw_obj is a list, causing TypeError

These are pre-existing bugs in linkml-runtime that are only exposed when _normalize_inlined_as_list is called with a non-identifier key. The gen-python fix avoids triggering them, but they should still be fixed defensively.
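The second failure mode is ordinary Python behavior and can be reproduced in isolation. In the sketch below, `set_key` is a hypothetical helper that only mirrors the `raw_obj[key_name] = key` assignment; it is not the actual form_1 code.

```python
from typing import Any, Optional


def set_key(raw_obj: Any, key_name: str, key: str) -> Optional[str]:
    """Try the assignment and return the TypeError message, or None on success."""
    try:
        raw_obj[key_name] = key
    except TypeError as exc:
        return str(exc)
    return None


as_dict_obj: dict = {}
as_list_obj: list = []

# Works when raw_obj is a mapping.
dict_error = set_key(as_dict_obj, "notation", "abc")
# Fails when raw_obj is a list, because lists cannot be indexed by a string.
list_error = set_key(as_list_obj, "notation", "abc")
```

A defensive fix would need to decide what a list-valued raw_obj should mean at that point; that decision is left open here.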

Environment

  • Python 3.12
  • linkml 1.10.0
  • linkml-runtime (as shipped with linkml 1.10.0)
  • OS: Windows/WSL2

Regression tests

The PR includes pytest tests covering all four cells of the behavior matrix. Each test generates Python from a minimal schema, compiles it, and loads data via yaml_loader. The critical regression test is test_no_identifier_inlined_as_list (cell 3) which reproduces the crash.

Related

  • PR #3165 (introduced the change)

@dalito
Collaborator Author

dalito commented Mar 4, 2026

With the fixes proposed for linkml-runtime in linkml/linkml#3247 the tests here will pass.
