diff --git a/plugins/pbip/agents/pbip-validator.agent.md b/plugins/pbip/agents/pbip-validator.agent.md index f6b81b7a..54f0b279 100644 --- a/plugins/pbip/agents/pbip-validator.agent.md +++ b/plugins/pbip/agents/pbip-validator.agent.md @@ -100,7 +100,7 @@ Flag reference: ### Step 3 — TMDL validation (only if `definition/` exists in `.SemanticModel/`) -The scripts do not parse TMDL. You handle it: +`validate_pbip.py` covers presence checks and the M-expression-vs-table name collision rule (see Step 3a). Everything else you handle: - `model.tmdl` has `ref table` entries for every file in `tables/`. - Each `tables/*.tmdl`: @@ -113,6 +113,32 @@ The scripts do not parse TMDL. You handle it: - `relationships.tmdl`: every referenced table/column exists. - `cultures/*.tmdl`: `ConceptualEntity` refs match table names. +### Step 3a — M-expression vs table name collision (ERROR) + +Power BI Desktop puts M shared expressions and tables in the same member namespace. A duplicate name fails the model load with: + +``` +Microsoft.Data.Mashup.Preview; This document contains a duplicate member ''. +``` + +`validate_pbip.py` enumerates top-level declarations in `definition/expressions.tmdl` (the `expression ` lines) and `definition/tables/*.tmdl` (the `table ` lines), then reports any name in the intersection as `m_table_name_collision` with severity `ERROR`. The rule handles bare, single-quoted, double-quoted, and escaped (`#"name"`) identifiers, and ignores `expression` keywords that appear inside nested `let` blocks. + +When you see this error: +- Rename the M expression. Common pattern: append ` Query` or ` Source` (e.g. `Globals` -> `Globals Query`). +- Update every partition whose `source = ` referenced the old expression to use M escaped-identifier syntax: `source = #"Globals Query"`. +- Re-run `validate_pbip.py` to confirm the collision clears. + +The Rust `tmdl-validate` binary now also supports a directory mode that runs the same check across `definition/expressions.tmdl` and `definition/tables/*.tmdl`: + +``` +tmdl-validate path/to/Model.SemanticModel/definition +tmdl-validate path/to/Model.SemanticModel/definition --json +``` + +Single-file mode (`tmdl-validate path/to/file.tmdl`) is unchanged and remains what the PostToolUse hook calls per edit. Use directory mode for whole-model checks alongside `validate_pbip.py`. + +Source: `tools/tmdl-validate/src/main.rs` (out-of-tree, gitignored). Only the Linux x64 binary in `plugins/pbip/hooks/bin/` has been rebuilt with the new mode; darwin-arm64, darwin-x64, and windows-x64 still need to be rebuilt on their target platforms before they pick up the check. + ### Step 4 — Cross-reference consistency Only when `--fields` output is insufficient or you are chasing a rename cascade: diff --git a/plugins/pbip/hooks/bin/tmdl-validate-linux-x64 b/plugins/pbip/hooks/bin/tmdl-validate-linux-x64 index 7902887a..0f0a1ad2 100755 Binary files a/plugins/pbip/hooks/bin/tmdl-validate-linux-x64 and b/plugins/pbip/hooks/bin/tmdl-validate-linux-x64 differ diff --git a/plugins/pbip/skills/pbip/scripts/validate_pbip.py b/plugins/pbip/skills/pbip/scripts/validate_pbip.py index 369fca46..f83b87dc 100644 --- a/plugins/pbip/skills/pbip/scripts/validate_pbip.py +++ b/plugins/pbip/skills/pbip/scripts/validate_pbip.py @@ -459,6 +459,100 @@ def check_tmdl_presence(def_dir: Path, result: Result) -> None: result.add(INFO, f"tmdl_{key}_absent", f"definition/{optional} absent (optional)", def_dir / optional) + check_m_table_name_collisions(def_dir, result) + + +#region TMDL declaration parsing + +def _tmdl_decl_name(line: str, keyword: str) -> str | None: + """Parse ` [...]` at the start of a TMDL line. + + Returns the bare name (quotes stripped) or None if the line is not a + declaration. Handles single-quoted, double-quoted, escaped (`#"name"`), + and bare identifier forms. The TMDL grammar allows annotations or + sub-clauses after the name on the same line, separated by whitespace; + we only consume the first token after the keyword. + """ + stripped = line.lstrip() + prefix = f"{keyword} " + if not stripped.startswith(prefix): + return None + rest = stripped[len(prefix):].strip() + if not rest: + return None + + # Quoted form: capture up to the matching closing quote + quoted = re.match(r"""^#?(['"])(.+?)\1""", rest) + if quoted: + return quoted.group(2) + + # Bare identifier: first whitespace-delimited token + return rest.split(None, 1)[0] + + +def _collect_tmdl_declarations(file_path: Path, keyword: str) -> set[str]: + """Read a TMDL file and collect all top-level ` ` names. + + Top-level means the declaration is at the start of a line (no leading + indentation), which is how TMDL distinguishes object declarations from + nested properties. Returns an empty set if the file does not exist. + """ + if not file_path.is_file(): + return set() + names: set[str] = set() + try: + text = file_path.read_text(encoding="utf-8-sig", errors="replace") + except OSError: + return names + for raw in text.splitlines(): + # Top-level declarations live at column 0 + if raw and raw[0].isspace(): + continue + name = _tmdl_decl_name(raw, keyword) + if name: + names.add(name) + return names + +#endregion + + +def check_m_table_name_collisions(def_dir: Path, result: Result) -> None: + """Detect M-expression names that collide with table names. + + Power BI Desktop puts M shared expressions and tables in the same + member namespace. A duplicate name triggers a fatal load error: + 'Microsoft.Data.Mashup.Preview; This document contains a + duplicate member .' + + Resolution: rename the M expression (common pattern: append " Query" + or " Source") and update any partition that references it via M + escaped-identifier syntax: Source = #"Renamed Expression". + """ + expressions_file = def_dir / "expressions.tmdl" + tables_dir = def_dir / "tables" + + expr_names = _collect_tmdl_declarations(expressions_file, "expression") + if not expr_names: + return + + table_names: set[str] = set() + if tables_dir.is_dir(): + for tmdl_file in sorted(tables_dir.glob("*.tmdl")): + table_names |= _collect_tmdl_declarations(tmdl_file, "table") + + collisions = sorted(expr_names & table_names) + for name in collisions: + result.add( + ERROR, + "m_table_name_collision", + (f"M expression '{name}' collides with table '{name}'. " + f"PBI Desktop will fail to load the model with " + f"'duplicate member {name}'. Rename the M expression " + f"(e.g. '{name} Query') and update dependent partitions " + f"to reference it via #\"{name} Query\"."), + expressions_file, + ) + #endregion diff --git a/plugins/pbip/skills/tmdl/SKILL.md b/plugins/pbip/skills/tmdl/SKILL.md index 2b65cc6e..b6785dd4 100644 --- a/plugins/pbip/skills/tmdl/SKILL.md +++ b/plugins/pbip/skills/tmdl/SKILL.md @@ -30,9 +30,10 @@ Activate only when the Tabular Editor CLI, Power BI MCP server, or `connect-pbid ## Critical -- **`///` (triple-slash) sets the `Description` property** on the object that immediately follows it. A `///` line must be immediately followed by a declaration (`measure`, `column`, `table`, etc.) — never by a blank line or another `///`. Use `//` for regular comments. +- **`///` (triple-slash) sets the `Description` property** on the object that immediately follows it. A `///` line must be immediately followed by a declaration (`measure`, `column`, `table`, etc.); never by a blank line or another `///`. Use `//` for regular comments. - **Indentation is semantic.** TMDL uses tabs for indentation, and depth equals nesting level. Properties of a table are indented one level; properties of a column (which belongs to a table) are indented two levels. Incorrect indentation will break the model. - **Name quoting rules:** Only quote names that contain spaces, special characters, or start with a digit. Simple names and underscore-prefixed names are unquoted. See the Name Quoting section for details. +- **M expressions and tables share a namespace.** A name declared by `expression ` in `expressions.tmdl` and a name declared by `table ` in `tables/*.tmdl` collide; Power BI Desktop fails the load with `'duplicate member '`. Pick distinct names; the conventional fix is to suffix the M expression with ` Query` or ` Source` and have partitions reference it via `source = #" Query"`. `validate_pbip.py` enforces this as an ERROR. ## TMDL File Types diff --git a/release-notes/v26.20.md b/release-notes/v26.20.md index 0a4f111e..d180ee62 100644 --- a/release-notes/v26.20.md +++ b/release-notes/v26.20.md @@ -34,6 +34,12 @@ Starting with this release the marketplace uses **YY.WW** versions (ISO year + I - `validate-plugins.yml` now triggers on `releases/**` push and PR events in addition to `main`, and on changes to the workflow file itself. Repo variable `ENABLE_PLUGIN_CI` set to `true` so the gated job actually runs. +## pbip-validator: M-expression vs table name collision (new ERROR rule) + +`validate_pbip.py` now detects when an M expression in `definition/expressions.tmdl` shares a name with a table in `definition/tables/*.tmdl`. Power BI Desktop puts both in the same member namespace, so a collision fails the model load with `Microsoft.Data.Mashup.Preview; This document contains a duplicate member ''`. Severity is ERROR; suggested fix is to rename the M expression (typical pattern: append ` Query`) and reference it from partitions via `source = #" Query"`. The `tmdl` skill and the `pbip-validator` agent both document the rule. + +The Rust `tmdl-validate` binary gains a directory mode that runs the same cross-file check: `tmdl-validate path/to/Model.SemanticModel/definition`. Single-file mode is unchanged. Only the Linux x64 binary in `plugins/pbip/hooks/bin/` has been rebuilt with the new mode in this release; darwin-arm64, darwin-x64, and windows-x64 still need to be rebuilt on their target platforms (source lives in the out-of-tree gitignored `tools/tmdl-validate/`). + ## Known limitations - The longpath PowerShell script has been static-tested only; smoke-test on a Windows machine before relying on it for a production install.