[THIS-1049] Neat alpha autofix infrastructure by Magssch · Pull Request #1569 · cognitedata/neat

Magssch · 2026-02-04T20:23:55Z

Description

Add core infrastructure for automatically fixing data models. Introduces FixAction (an immutable data model for field-level changes), FixApplicator (which groups fixes by resource, checks for conflicts, and applies them efficiently via in-place mutation of a deep copy), and transform_physical on NeatStore to integrate fix application into the provenance pipeline. The orchestrator and session layer are wired up to support the read-validate-fix flow behind an alpha feature flag.
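
To make the description concrete, here is a minimal sketch of the group-by-resource, apply-to-a-deep-copy idea. It is not the code from this PR: the field names on FixAction, the nested-dict schema shape, and the _set helper are assumptions made for illustration, and conflict checking is left out for brevity.

```python
import copy
from collections import defaultdict
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class FixAction:
    """Immutable description of one field-level change (hypothetical fields)."""

    resource_id: str
    field_path: str  # dotted path, e.g. "constraints.Asset__auto"
    new_value: Any  # None means "remove the field"


class FixApplicator:
    """Applies fixes to a deep copy of a schema held as {resource_id: nested dict}."""

    def __init__(self, schema: dict[str, dict], fixes: list[FixAction]) -> None:
        self._schema = schema
        self._fixes = fixes

    def apply_fixes(self) -> dict[str, dict]:
        # Group fixes so each resource is looked up once.
        by_resource: dict[str, list[FixAction]] = defaultdict(list)
        for fix in self._fixes:
            by_resource[fix.resource_id].append(fix)

        # Mutate a deep copy in place; the caller's schema is never touched.
        fixed = copy.deepcopy(self._schema)
        for resource_id, fixes in by_resource.items():
            if resource_id not in fixed:
                raise RuntimeError(f"Resource {resource_id} not found in schema.")
            for fix in fixes:
                self._set(fixed[resource_id], fix.field_path, fix.new_value)
        return fixed

    @staticmethod
    def _set(resource: dict, field_path: str, value: Any) -> None:
        # Walk the dotted path, creating intermediate dicts when missing.
        *parents, leaf = field_path.split(".")
        target = resource
        for key in parents:
            target = target.setdefault(key, {})
        if value is None:
            target.pop(leaf, None)
        else:
            target[leaf] = value
```

Copying once and then mutating the copy avoids one copy per fix, while the original schema stays available for provenance.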

The code path for the fix functionality is disabled until exposed in a later PR.

Bump

Patch
Skip

github-actions · 2026-02-04T20:24:49Z

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines	Covered	Coverage	Threshold	Status
7016	6487	92%	90%	🟢

New Files

File	Coverage	Status
cognite/neat/_data_model/_fix.py	96%	🟢
TOTAL	96%	🟢

Modified Files

File	Coverage	Status
cognite/neat/_data_model/_shared.py	92%	🟢
cognite/neat/_data_model/_snapshot.py	100%	🟢
cognite/neat/_data_model/models/dms/__init__.py	100%	🟢
cognite/neat/_data_model/models/dms/_http.py	100%	🟢
cognite/neat/_data_model/rules/_base.py	100%	🟢
cognite/neat/_data_model/rules/dms/_orchestrator.py	97%	🟢
cognite/neat/_session/_physical.py	71%	🟢
cognite/neat/_store/_provenance.py	100%	🟢
cognite/neat/_store/_store.py	92%	🟢
TOTAL	95%	🟢

updated for commit: 07baacb by action🐍

codecov · 2026-02-04T20:25:37Z

Codecov Report

❌ Patch coverage is 85.84071% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.73%. Comparing base (0f7f8fa) to head (07baacb).

Files with missing lines	Patch %	Lines
cognite/neat/_session/_physical.py	50.00%	7 Missing ⚠️
cognite/neat/_data_model/_fix.py	93.05%	5 Missing ⚠️
cognite/neat/_data_model/_shared.py	57.14%	3 Missing ⚠️
...ognite/neat/_data_model/rules/dms/_orchestrator.py	66.66%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1569      +/-   ##
==========================================
- Coverage   91.76%   91.73%   -0.03%     
==========================================
  Files         121      122       +1     
  Lines        7065     7161      +96     
==========================================
+ Hits         6483     6569      +86     
- Misses        582      592      +10

Files with missing lines	Coverage Δ
cognite/neat/_data_model/_snapshot.py	`98.61% <100.00%> (+0.08%)`	⬆️
cognite/neat/_data_model/models/dms/__init__.py	`100.00% <100.00%> (ø)`
cognite/neat/_data_model/models/dms/_http.py	`100.00% <100.00%> (ø)`
cognite/neat/_data_model/rules/_base.py	`93.75% <100.00%> (ø)`
cognite/neat/_store/_provenance.py	`100.00% <100.00%> (ø)`
cognite/neat/_store/_store.py	`89.43% <100.00%> (+0.46%)`	⬆️
...ognite/neat/_data_model/rules/dms/_orchestrator.py	`96.87% <66.66%> (-3.13%)`	⬇️
cognite/neat/_data_model/_shared.py	`81.25% <57.14%> (-6.75%)`	⬇️
cognite/neat/_data_model/_fix.py	`93.05% <93.05%> (ø)`
cognite/neat/_session/_physical.py	`71.31% <50.00%> (+0.12%)`	⬆️

Add ability for validators to provide automatic fixes for issues they detect. This implements the core infrastructure and fixes for:

Infrastructure:
- FixAction class extending ResourceChange for atomic schema fixes
- Helper functions for generating auto IDs (constraints, indexes)
- Orchestrator support for apply_fixes parameter
- Tracking of applied fixes in provenance

Fixable validators:
- MissingRequiresConstraint: Adds requires constraints
- SuboptimalRequiresConstraint: Removes suboptimal constraints
- RequiresConstraintCycle: Removes constraints to break cycles
- MissingReverseDirectRelationTargetIndex: Adds indexes

All fixes use __auto suffix for easy identification and are only applied when explicitly enabled via the orchestrator.

Magssch · 2026-02-05T15:37:56Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a FixAction class for applying atomic fixes to schema issues, along with helper functions for generating constraint and index identifiers, ensuring they adhere to CDF's length limits by truncating and hashing if necessary. It also includes validators for identifying and resolving performance-related issues such as missing requires constraints and unindexed reverse direct relations, as well as breaking constraint cycles. The review comments suggest an iterative approach to applying fixes to avoid conflicts and ensure a more robust autofix mechanism, as applying one fix can change the data model in a way that invalidates other pending fixes.
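
The iterative approach suggested in the review can be sketched as a validate-fix loop that re-validates after every applied fix, so no fix acts on a stale model. This is an illustration only: fix_until_stable, validate, and apply_fix are hypothetical names, not part of NEAT.

```python
from typing import Callable, TypeVar

Model = TypeVar("Model")
Fix = TypeVar("Fix")


def fix_until_stable(
    model: Model,
    validate: Callable[[Model], list[Fix]],
    apply_fix: Callable[[Model, Fix], Model],
    max_rounds: int = 10,
) -> Model:
    """Apply one fix per round and re-validate before choosing the next one."""
    for _ in range(max_rounds):
        pending = validate(model)
        if not pending:
            return model
        # Only the first pending fix is applied; the rest are recomputed
        # against the updated model in the next round.
        model = apply_fix(model, pending[0])
    if validate(model):
        raise RuntimeError(f"Model still has pending fixes after {max_rounds} rounds")
    return model
```

The max_rounds bound guards against fixes that keep re-introducing each other's issues.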

cognite/neat/_data_model/rules/dms/_orchestrator.py

Magssch · 2026-02-05T15:45:12Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a new autofix infrastructure for DMS validators, allowing FixAction objects to automatically resolve validation issues. The changes include defining FixAction and helper functions for auto-generated IDs, integrating fix application into the DmsDataModelRulesOrchestrator, and implementing fix methods for several performance-related validators. The new fix methods correctly generate FixAction objects to add/remove constraints and indexes. Comprehensive unit and end-to-end tests have been added to verify the functionality of the fix actions and their ability to resolve validation issues.

cognite/neat/_data_model/rules/dms/_containers.py

doctrino

I like the FixAction structure. I suggest using pydantic's model_copy method for efficient updates. See comment.
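
The copy-with-update pattern behind this suggestion looks roughly as follows. To stay dependency-free, the sketch uses the standard library's dataclasses.replace as a stand-in for pydantic's model_copy(update=...); Container and with_constraint are hypothetical names, not classes from NEAT.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Container:
    """Hypothetical immutable resource with a mapping of constraints."""

    external_id: str
    constraints: dict[str, str] = field(default_factory=dict)


def with_constraint(container: Container, constraint_id: str, target: str) -> Container:
    """Return a new Container with one constraint added; the input is untouched."""
    updated = {**container.constraints, constraint_id: target}
    # replace() copies every field and overrides only the ones passed in.
    return replace(container, constraints=updated)
```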

cognite/neat/_data_model/_fix_actions.py

cognite/neat/_data_model/rules/dms/_orchestrator.py

cognite/neat/_data_model/rules/dms/_performance.py

cognite/neat/_data_model/rules/dms/_containers.py

nikokaoja

Comments in the first pass relate to:

- scope of the PR
- readability of the code

cognite/neat/_data_model/rules/dms/_containers.py

cognite/neat/_data_model/_fix_actions.py

cognite/neat/_data_model/_fix_helpers.py

cognite/neat/_data_model/_fix_actions.py

…xed snapshot

- Merge _fix_actions.py + _fix_helpers.py into _fix.py
- Move as_resource_update onto FixAction as a method (uses model_copy instead of in-place mutation)
- Group fixes by resource ID in orchestrator with conflict detection
- Return fixed_snapshot from orchestrator.run() for provenance
- Revert validator changes to separate follow-up PR

Co-authored-by: Cursor <cursoragent@cursor.com>

Magssch · 2026-02-10T18:00:19Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a robust infrastructure for automatically fixing data model validation issues. The core components like FixAction and FixApplicator are well-designed and the integration into the existing provenance and session management through transform_physical is clean. The changes are extensive but logical, and the addition of tests for the new components is great.

I've found a critical bug in the conflict detection logic within FixApplicator and a high-severity issue related to incorrect provenance tracking for applied fixes. Please see the detailed comments for suggestions on how to resolve these.

cognite/neat/_data_model/_fix.py

cognite/neat/_session/_physical.py

Magssch · 2026-02-11T13:06:47Z

cognite/neat/_data_model/_fix.py

+        for action in self._fix_actions:
+            fix_by_resource_id[action.resource_id].append(action)
+
+        resources_list_lookup: dict[type, dict[SchemaResourceId, DataModelResource]] = {


We should likely refactor RequestSchema to use dicts for more efficient lookup instead of having to do the lookup table here. Tracking this here: https://cognitedata.atlassian.net/browse/THIS-1068
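
As a rough illustration of the lookup table discussed here, the sketch below indexes a flat list of resources by identifier type and then by identifier. ViewId, ContainerId, Resource, and build_lookup are hypothetical stand-ins, not the NEAT classes.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Hashable


@dataclass(frozen=True)
class ViewId:
    external_id: str


@dataclass(frozen=True)
class ContainerId:
    external_id: str


@dataclass
class Resource:
    id: Hashable
    description: str = ""


def build_lookup(resources: list[Resource]) -> dict[type, dict[Hashable, Resource]]:
    """Index resources by identifier type, then by identifier, for O(1) access."""
    lookup: dict[type, dict[Hashable, Resource]] = defaultdict(dict)
    for resource in resources:
        lookup[type(resource.id)][resource.id] = resource
    return dict(lookup)
```

If the schema stored its resources in dicts to begin with, this table would not have to be rebuilt each time fixes are applied.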

nikokaoja

Good work. Some optional refactoring can be done in another PR (not critical).

nikokaoja · 2026-02-13T09:38:51Z

cognite/neat/_data_model/_fix.py

+        for action in self._fix_actions:
+            fix_by_resource_id[action.resource_id].append(action)
+
+        resources_list_lookup: dict[type, dict[SchemaResourceId, DataModelResource]] = {


Then we can avoid maintaining the SchemaSnapshot class, as that one is exactly the shape your FixApplicator needs.

nikokaoja · 2026-02-13T10:40:40Z

cognite/neat/_data_model/_fix.py

+        for action in self._fix_actions:
+            fix_by_resource_id[action.resource_id].append(action)
+
+        resources_list_lookup: dict[type, dict[SchemaResourceId, DataModelResource]] = {


I am wondering whether attaching a resource lookup to RequestSchema could be an alternative @doctrino

nikokaoja · 2026-02-13T10:40:55Z

cognite/neat/_data_model/_fix.py

+            if resource_lookup is None:
+                raise RuntimeError(
+                    f"{type(self).__name__}: Unsupported resource type {type(resource_id)}. This is a bug in NEAT."
+                )
+            resource = resource_lookup.get(resource_id)
+            if resource is None:
+                raise RuntimeError(
+                    f"{type(self).__name__}: Resource {resource_id} not found in schema. This is a bug in NEAT."
+                )


Nice catch!

nikokaoja · 2026-02-13T10:41:54Z

cognite/neat/_data_model/_fix.py

+    def _check_no_field_path_conflicts(self, changes: list[FieldChange]) -> None:
+        """Raise if any changes touch a field_path already modified by a previous change."""
+        seen_paths: set[str] = set()
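
The diff excerpt stops after the first line of the body. A minimal sketch of how such a check could continue is shown below, assuming exact-match comparison of paths. FieldChange is a hypothetical stand-in that carries only field_path, and the choice of ValueError is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldChange:
    """Hypothetical stand-in: only the field_path attribute matters here."""

    field_path: str


def check_no_field_path_conflicts(changes: list[FieldChange]) -> None:
    """Raise if two changes target the same field_path."""
    seen_paths: set[str] = set()
    for change in changes:
        if change.field_path in seen_paths:
            raise ValueError(f"Conflicting fixes for field_path {change.field_path!r}")
        seen_paths.add(change.field_path)
```

Exact matching does not catch a change to a parent path overlapping a change to one of its children; whether that case needs handling depends on how field paths are generated.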


nikokaoja · 2026-02-13T10:43:44Z

cognite/neat/_data_model/_fix.py

+def make_auto_id(base_id: str) -> str:
+    """Generate an auto-generated identifier with truncation if needed.
+
+    CDF has a 43-character limit on constraint/index identifiers. This function
+    ensures the ID stays within that limit while maintaining uniqueness.
+
+    Args:
+        base_id: The primary identifier to use (e.g., external_id or property_id).
+
+    Returns:
+        For short base_ids (≤37 chars): "{base_id}__auto"
+        For long base_ids (>37 chars): "{truncated_id}_{hash}__auto"
+    """
+    if len(base_id) <= MAX_BASE_LENGTH_NO_HASH:
+        return f"{base_id}{AUTO_SUFFIX}"
+
+    hash_suffix = hashlib.sha256(base_id.encode()).hexdigest()[:HASH_LENGTH]
+    truncated_id = base_id[:MAX_BASE_LENGTH_WITH_HASH]
+    return f"{truncated_id}_{hash_suffix}{AUTO_SUFFIX}"
+
+
+def make_auto_constraint_id(dst: ContainerReference) -> str:
+    """Generate a constraint identifier for auto-generated requires constraints."""
+    return make_auto_id(dst.external_id)
+
+
+def make_auto_index_id(property_id: str) -> str:
+    """Generate an index identifier for auto-generated indexes."""
+    return make_auto_id(property_id)


Feel free to move this into utils, under a dedicated module such as identifiers.
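
The excerpt above does not show the module constants. The sketch below fills them in so the length arithmetic can be checked: the 43-character limit and the "__auto" suffix come from the docstring and PR text, while HASH_LENGTH = 8 and the derived truncation length are assumptions.

```python
import hashlib

MAX_IDENTIFIER_LENGTH = 43  # limit quoted in the docstring
AUTO_SUFFIX = "__auto"
HASH_LENGTH = 8  # assumption: the real value is not shown in the diff
# 43 - len("__auto") = 37, matching the "≤37 chars" case in the docstring.
MAX_BASE_LENGTH_NO_HASH = MAX_IDENTIFIER_LENGTH - len(AUTO_SUFFIX)
# Leave room for "_" plus the hash: 37 - 8 - 1 = 28 (assumption).
MAX_BASE_LENGTH_WITH_HASH = MAX_BASE_LENGTH_NO_HASH - HASH_LENGTH - 1


def make_auto_id(base_id: str) -> str:
    """Return base_id with the auto suffix, truncating and hashing long ids."""
    if len(base_id) <= MAX_BASE_LENGTH_NO_HASH:
        return f"{base_id}{AUTO_SUFFIX}"
    hash_suffix = hashlib.sha256(base_id.encode()).hexdigest()[:HASH_LENGTH]
    truncated_id = base_id[:MAX_BASE_LENGTH_WITH_HASH]
    return f"{truncated_id}_{hash_suffix}{AUTO_SUFFIX}"
```

With these values a truncated identifier is 28 + 1 + 8 + 6 = 43 characters, exactly at the limit.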

nikokaoja · 2026-02-13T10:51:53Z

cognite/neat/_store/_store.py

+    def transform_physical(self, activity: Callable, on_success: OnSuccess | None = None) -> Change:
+        """Transform the current physical data model and record in provenance."""
+        change, transformed_model = self._do_activity(activity, on_success)
+
+        if transformed_model:
+            self.physical_data_model.append(transformed_model)
+
+        self.provenance.append(change)
+        return change


If possible, transform_physical should return None, with all updates to the provenance (change) happening here, as in read_physical.

This is optional refactoring and does not necessarily need to be done in this PR.
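
One way to read this suggestion: the store records the change, including the applied fixes, itself, so the caller never needs the returned Change. The sketch below is an assumption about that shape, not the NEAT implementation; Store, Change, and the fixes parameter are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Change:
    """Hypothetical provenance record."""

    description: str
    fixes: list = field(default_factory=list)


@dataclass
class Store:
    """Hypothetical store holding model versions and their provenance."""

    physical_data_model: list = field(default_factory=list)
    provenance: list = field(default_factory=list)

    def transform_physical(self, activity: Callable[[], dict], fixes: Optional[list] = None) -> None:
        """Run the activity and record both the result and the change here."""
        transformed = activity()
        self.physical_data_model.append(transformed)
        description = getattr(activity, "__name__", "activity")
        self.provenance.append(Change(description=description, fixes=list(fixes or [])))
```

With this shape, the session layer would pass the pending fixes in rather than assigning them to the returned change afterwards.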

nikokaoja · 2026-02-13T10:55:48Z

cognite/neat/_session/_physical.py

+            applicator = FixApplicator(self._store.physical_data_model[-1], on_success.pending_fixes)
+            post_fix_on_success = self._create_on_success()
+            change = self._store.transform_physical(applicator.apply_fixes, post_fix_on_success)
+            change.applied_fixes = on_success.pending_fixes


Optional refactoring (in another PR):

_create_on_success could be extended to create a data model transformer on_success object; _read_validate_fix would then become simpler.

nikokaoja · 2026-02-13T10:58:46Z

cognite/neat/_store/_provenance.py

    target_entity: str | None = field(default="FailedEntity")
    issues: IssueList | None = field(default=None)
    errors: IssueList | None = field(default=None)
+    applied_fixes: list[FixAction] | None = field(default=None)


Optional nitpick: keep it simple.

Suggested change

-    applied_fixes: list[FixAction] | None = field(default=None)
+    fixes: list[FixAction] | None = field(default=None)

Magssch force-pushed the feat/autofix-part1 branch 4 times, most recently from fb4de40 to 4a7321c · February 5, 2026 07:31

Magssch added 2 commits February 5, 2026 09:20

Refactoring

945ab97

Magssch force-pushed the feat/autofix-part1 branch from 4a7321c to 945ab97 · February 5, 2026 12:26

Magssch and others added 5 commits February 5, 2026 12:27

Linting and static code checks

3a99bb2

Add future fix support for ChangedField

2f75557

Minor refactoring

2521b2e

Refactoring

2cf0fa7

Fix test

ff77992

Magssch changed the title ~~[THIS-1049] Neat alpha autofix capability (part 1)~~ [THIS-1049] Neat alpha autofix infrastructure Feb 5, 2026

Merge branch 'main' into feat/autofix-part1

42c80e9

gemini-code-assist bot reviewed Feb 5, 2026

cognite/neat/_data_model/rules/dms/_orchestrator.py · Outdated

Change tests slightly

3234141

gemini-code-assist bot reviewed Feb 5, 2026

cognite/neat/_data_model/rules/dms/_containers.py · Outdated

cognite/neat/_data_model/rules/dms/_containers.py · Outdated

Minor refactoring

f468191

Magssch marked this pull request as ready for review February 5, 2026 16:00

Magssch requested a review from a team as a code owner February 5, 2026 16:00

Fix lint

526e09e

doctrino reviewed Feb 9, 2026

cognite/neat/_data_model/_fix_actions.py · Outdated

cognite/neat/_data_model/rules/dms/_orchestrator.py · Outdated

Magssch commented Feb 9, 2026

cognite/neat/_data_model/rules/dms/_performance.py · Outdated

Minor refactor

0bb23a2

nikokaoja reviewed Feb 10, 2026

cognite/neat/_data_model/rules/dms/_containers.py · Outdated

nikokaoja reviewed Feb 10, 2026

cognite/neat/_data_model/rules/dms/_containers.py · Outdated

nikokaoja reviewed Feb 10, 2026

cognite/neat/_data_model/rules/dms/_containers.py · Outdated

nikokaoja reviewed Feb 10, 2026

Magssch and others added 13 commits February 10, 2026 11:21

Refactoring to use provenance

a9a11e0

Linting and static code checks

03f17cb

Refactoring

7e17641

Refactoring

4ead64e

Refactoring

86cb08a

More minor refactoring

5f35122

Revert unintended changes

6878991

Revert some changes

c8ece61

Refactoring

79b4a71

Add back tests and refactor

7097051

Fix tests

2129d7d

Fix lint

49e7b14

Merge branch 'main' into feat/autofix-part1

5cd19ef

gemini-code-assist bot reviewed Feb 10, 2026

cognite/neat/_data_model/_fix.py · Outdated

cognite/neat/_session/_physical.py · Outdated

Magssch and others added 7 commits February 10, 2026 19:09

Address comments from gemini review

58b3ded

Minor refactoring

9db1919

Minor refactor

faf82c3

Linting and static code checks

9b6b05d

Test refactoring

7f4c971

Fix lint

34bdd32

Disable fix mechanism until exposed later

07baacb

Magssch requested review from doctrino and nikokaoja February 11, 2026 10:19

Magssch commented Feb 11, 2026

nikokaoja approved these changes Feb 13, 2026
