Skip to content

Bug(pbip-validator): M-expression vs table name collision check#24

Merged
data-goblin merged 2 commits into
releases/26.20from
bug/pbip-validation
May 13, 2026
Merged

Bug(pbip-validator): M-expression vs table name collision check#24
data-goblin merged 2 commits into
releases/26.20from
bug/pbip-validation

Conversation

@data-goblin
Copy link
Copy Markdown
Owner

Summary

Adds a new ERROR rule to validate_pbip.py that catches the case a user just hit on a client model: an M expression and a table sharing a name, which puts them in the same member namespace and fails the model load with:

Microsoft.Data.Mashup.Preview; This document contains a duplicate member '<name>'.

Before this PR, the pbip-validator passed the model cleanly even though Desktop refused to open it. None of the three layers (Python script, Rust tmdl-validate binary, pbir validate) was enumerating M-expression names and table names together.

What changed

  • plugins/pbip/skills/pbip/scripts/validate_pbip.py: new check_m_table_name_collisions runs as part of TMDL presence checks. Parses top-level declarations from expressions.tmdl and tables/*.tmdl, normalizes bare / single-quoted / double-quoted / escaped (#"name") identifiers, ignores nested expression keywords inside let blocks. Reports each collision as ERROR with exact remediation (rename M expression, reference via #"<Name> Query").
  • plugins/pbip/agents/pbip-validator.agent.md: new Step 3a documents the check and the fix workflow.
  • plugins/pbip/skills/tmdl/SKILL.md: critical-rules section now calls out the shared namespace.
  • release-notes/v26.20.md: adds the change under validator updates.

Out of scope (flagged for follow-up)

The bundled Rust tmdl-validate binary in plugins/pbip/hooks/bin/ does not yet ship this check. Its source lives in an out-of-tree tools/tmdl-validate/ workspace that is gitignored; that work happens separately.

Test plan

Tested locally against a synthetic PBIP:

  • Bare-identifier collision (expression Globals vs table Globals): caught, exit 2

  • Quoted collision (expression 'Other Source' vs table 'Other Source'): caught, exit 2

  • Indented expression = 1 inside a let block: not misclassified as a top-level declaration; no false positive

  • After renaming the M expression to Globals Query: 0 errors, validator clears

  • CI passes (validate-plugins)

Power BI Desktop puts M shared expressions and tables in the same
member namespace. A name declared by `expression <name>` in
definition/expressions.tmdl and `table <name>` in definition/tables/*.tmdl
collides, and Desktop fails the model load with:

    Microsoft.Data.Mashup.Preview; This document contains a duplicate
    member '<name>'.

User hit this on a client model: an M expression named `Globals` and a
table named `Globals`. The pbip-validator passed the model cleanly
before they opened it because no validator (the Python script, the
Rust tmdl-validate binary, or `pbir validate`) was enumerating these
two namespaces together.

Adds `check_m_table_name_collisions` to validate_pbip.py:
- Parses top-level declarations from expressions.tmdl and tables/*.tmdl
- Normalizes identifiers (bare, single-quoted, double-quoted, and the
  escaped `#"name"` form all reduce to the same bare string)
- Ignores `expression`/`table` keywords that appear nested inside `let`
  blocks or other constructs (top-level means column 0 only)
- Reports each collision as ERROR with the exact remediation: rename
  the M expression (suggested pattern: append ` Query`) and update
  dependent partitions to use `source = #"<Name> Query"`

Updates:
- pbip-validator.agent.md: new "Step 3a" documenting the check and the
  fix workflow; flags the same rule as a candidate for the Rust
  tmdl-validate binary in the out-of-tree tools/tmdl-validate/ source
- tmdl skill SKILL.md: critical-rules section now mentions the shared
  namespace and the rename convention
- release-notes/v26.20.md: adds a section under the validator changes

Tested against a synthetic PBIP:
- Bare-identifier collision (Globals vs Globals): caught, exit 2
- Quoted collision ('Other Source' vs 'Other Source'): caught, exit 2
- Indented `expression` inside a let block: not misclassified as a
  top-level declaration; no false positive
- After renaming the expressions: exit code clears
Adds a cross-file directory mode to the Rust tmdl-validate binary that
mirrors the new Python check. Invocation:

    tmdl-validate path/to/Model.SemanticModel/definition
    tmdl-validate path/to/Model.SemanticModel/definition --json

It reads definition/expressions.tmdl and every definition/tables/*.tmdl,
parses top-level `expression <name>` and `table <name>` declarations
(bare, single-quoted, double-quoted, and the escaped `#"name"` form all
normalize to the same identifier; nested keywords inside `let` blocks
are ignored), and reports each collision as a Severity::Error.

Single-file mode (`tmdl-validate path/to/file.tmdl`) is unchanged and
remains what the PostToolUse hook calls per edit.

Source edits in tools/tmdl-validate/src/main.rs (out-of-tree, gitignored).
Only plugins/pbip/hooks/bin/tmdl-validate-linux-x64 has been rebuilt with
the new mode in this commit; darwin-arm64, darwin-x64, and windows-x64
binaries still need to be rebuilt on their target platforms before they
pick up the check. Agent doc and release notes flag this.

Tested on the synthetic PBIP from the previous commit:
- Directory mode against the colliding model: exit 2, both collisions
  reported
- JSON mode: valid JSON with escaped quotes
- Single-file mode against a clean table file: exit 0, "Valid TMDL"
  (regression-clean)
@data-goblin data-goblin merged commit 806aa0d into releases/26.20 May 13, 2026
1 check passed
@data-goblin data-goblin deleted the bug/pbip-validation branch May 13, 2026 14:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant