feat(csharp): add enable_fast_metadata_query flag for DESC TABLE EXTENDED#456
feat(csharp): add enable_fast_metadata_query flag for DESC TABLE EXTENDED#456jadewang-db wants to merge 3 commits into
Conversation
jadewang-db
left a comment
There was a problem hiding this comment.
Self-review
Overall the change is small and well-scoped. The Thrift-side opt-in plus warehouse-path gating matches the JIRA, and SEA was correctly identified as already wired via x-databricks-sea-can-run-fully-sync. One high-priority PR-hygiene issue around the branch state, plus a handful of low-priority simplifications inline.
Positive observations
UseFastMetadataQuery = _enableFastMetadataQuery && IsWarehousePathis the right composition; unit tests enumerate the full truth table.- Per-statement
RunAsyncOverrideis the right surface — mutating connection state would have been worse. - Existing 42601/20000 fallback in
GetColumnsExtendedAsyncstill protects against servers that reject the new shape, so rollout is safe. - Activity tag
fast_metadata_querywill make the flag's effect easy to confirm in production traces.
HIGH — branch is significantly behind main
The PR diff lists 11 files (+277/-9), but the commit f8f66af only touches 6 files (+200/-3). The extras come from telemetry refactors that landed on main while this branch was idle (e.g., PECO-2978/2988/2992 in DatabricksStatement.cs and ConnectionTelemetry.cs), plus d02dae2 which was already merged via #401.
$ git log origin/main..PECO-3021 --oneline
f8f66af feat(csharp): add enable_fast_metadata_query flag for DESC TABLE EXTENDED
d02dae2 fix(csharp): emit bare hostname for telemetry host_url to match JDBC ← already merged via #401
Recommend git rebase origin/main and force-push so the diff cleanly reflects only PECO-3021's work.
Test coverage note
Coverage of new code is exclusively unit-level. Wire behavior (server actually receiving RunAsync=false) is not asserted — relies on the existing E2E harness once runtime PR #198486 deploys. Worth mentioning the intentional E2E gap in the PR body.
jadewang-db
left a comment
There was a problem hiding this comment.
Self-review pass on PECO-3021. Overall the design is clean — the AND-gate of flag + warehouse-path on Thrift, the per-statement RunAsyncOverride to avoid mutating connection state, and the reuse of the existing 42601/20000 fallback for old servers are all the right shapes. The unit-test matrix on IsWarehousePath and UseFastMetadataQuery is thorough.
Findings below are mostly Low/Medium polish — no Critical or High issues. One scope/PR-hygiene note: this PR also carries commit d02dae2 (telemetry host_url bare-hostname fix, PECO-2987) which the description doesn't mention. Consider either splitting it out or extending the PR body so reviewers know both changes are in flight.
- Drop `_isWarehousePathCached` field; `IsWarehousePath` getter now
recomputes on each call. Removes the `Nullable<bool>` non-atomic
read/write concern and simplifies the property.
- Scope the `IndexOf('?')` query-string strip to the raw-Path branch
only; `Uri.AbsolutePath` already strips it for the URI branch.
- Replace the defensive `descStmt is DatabricksStatement` pattern in
`GetColumnsExtendedAsync` with an unconditional cast — if a future
refactor breaks the invariant, a loud `InvalidCastException` is
better than silently emitting a fast-metadata query that still goes
through WLM.
- Extract `var connection = (DatabricksConnection)Connection;` to
reuse a single local for `CanUseDescTableExtended` and
`UseFastMetadataQuery` instead of casting twice.
- Drop overly-strict `Assert.DoesNotContain(":", actualHostUrl)` from
the host_url regression test; the equality check and `"://"` check
together enforce the bare-hostname contract without rejecting IPv6
literals.
Co-authored-by: Isaac
| // telemetry event emitted at statement Dispose, instead of a meaningless 0. | ||
| internal long? CloseStatementRpcLatencyMs { get; set; } | ||
| internal Exception? CloseStatementRpcError { get; set; } | ||
| internal bool? RunAsyncOverride { get; set; } |
There was a problem hiding this comment.
Why do we need this extra flag? Who is passing this in?
Can we just rely on runAsyncInThrift?
There was a problem hiding this comment.
runAsyncInThrift is the user's connection-level setting (adbc.databricks.enable_run_async_in_thrift_op, defaults to true) — flipping it on the connection would affect all their statements, and mutating it around the desc-stmt call would race with any concurrent statements on the same connection.
RunAsyncOverride is an internal per-statement mechanism — no user surface — used exclusively by GetColumnsExtendedAsync to flip RunAsync=false on its own internal descStmt while the user's other statements keep their configured setting. The pairing with the STATIC ONLY SQL modifier is what tells the warehouse to route the desc-stmt off the WLM path; the connection-level flag alone can't express "this single statement, no others."
Open to a different shape if you have one in mind — happy to refactor if e.g. you'd prefer a constructor param or a dedicated ExecuteWithRunAsyncOff() helper instead of the override property.
msrathore-db
left a comment
There was a problem hiding this comment.
LGTM apart from Eric's comment
…NDED
Adds the SQL modifier STATIC ONLY (runtime PR #198486) to DESC TABLE EXTENDED
AS JSON so a DBSQL warehouse can skip Delta log / Mesa RPCs / other expensive
I/O. STATIC ONLY only takes effect when paired with off-WLM routing, so the
driver also flips the matching protocol signal:
- Thrift: RunAsync=false on the descStmt — gated on warehouse path
(/sql/1.0/warehouses/{id} or /sql/1.0/endpoints/{id}) so the signal is
never sent to general-purpose clusters.
- SEA: ExecuteMetadataSqlAsync already sends x-databricks-sea-can-run-fully-sync;
SEA is always-warehouse, so the flag alone gates the SQL change there.
Older servers that don't recognise STATIC ONLY return parse error 42601,
caught by the existing fallback in GetColumnsExtendedAsync.
Co-authored-by: Isaac
- Drop `_isWarehousePathCached` field; `IsWarehousePath` getter now
recomputes on each call. Removes the `Nullable<bool>` non-atomic
read/write concern and simplifies the property.
- Scope the `IndexOf('?')` query-string strip to the raw-Path branch
only; `Uri.AbsolutePath` already strips it for the URI branch.
- Replace the defensive `descStmt is DatabricksStatement` pattern in
`GetColumnsExtendedAsync` with an unconditional cast — if a future
refactor breaks the invariant, a loud `InvalidCastException` is
better than silently emitting a fast-metadata query that still goes
through WLM.
- Extract `var connection = (DatabricksConnection)Connection;` to
reuse a single local for `CanUseDescTableExtended` and
`UseFastMetadataQuery` instead of casting twice.
- Drop overly-strict `Assert.DoesNotContain(":", actualHostUrl)` from
the host_url regression test; the equality check and `"://"` check
together enforce the bare-hostname contract without rejecting IPv6
literals.
Co-authored-by: Isaac
The runtime grammar (SqlBaseParser.g4) accepts modifiers in this order:
identifierReference partitionSpec? describeColName? (AS JSON)? (STATIC ONLY)?
The driver was emitting `STATIC ONLY AS JSON` (modifiers swapped), which
the parser rejects with `42601 PARSE_SYNTAX_ERROR at 'AS'`. The existing
`42601` catch in `GetColumnsExtendedAsync` would have masked the bug —
falling back silently to the slow path even on warehouses that *have*
the runtime patch (PECO-3022 / runtime PR #198486), defeating the
optimization with no signal to the operator.
E2E-verified against a custom DBR image built from ~/runtime on the
`stack/describe-table-static-only` branch (commit 485e630), deployed
to a SQL warehouse on benchmarking-staging-aws-us-west-2:
- flag=true + warehouse: emits `DESC TABLE EXTENDED <t> AS JSON STATIC ONLY`,
status FINISHED, returns JSON metadata.
- flag=false + warehouse: emits `DESC TABLE EXTENDED <t> AS JSON`,
status FINISHED.
- flag=true + cluster path: `UseFastMetadataQuery` returns false
(existing unit test `UseFastMetadataQuery_RequiresFlagAndWarehousePath`).
Also adds `FastMetadataQueryE2ETest.cs` with three E2E checks (driver
emission, runtime acceptance of the keyword, flag-disabled control)
and tightens docstrings to spell out the emitted SQL explicitly so the
order is harder to flip in a future refactor.
Co-authored-by: Isaac
Summary
Adds opt-in
adbc.databricks.enable_fast_metadata_queryflag (defaultfalse) that pairs the new SQL modifierSTATIC ONLY(runtime PR #198486) onDESC TABLE EXTENDED AS JSONwith the matching off-WLM routing signal per protocol.Both signals are required together —
STATIC ONLYalone still queues through WLM; off-WLM routing alone still does the full metadata scan.DESC TABLE EXTENDED <t> STATIC ONLY AS JSONRunAsync=falseon descStmt/sql/1.0/warehouses/{id}or/sql/1.0/endpoints/{id}). General-cluster paths: flag is ignored.DESC TABLE EXTENDED <t> STATIC ONLY AS JSONx-databricks-sea-can-run-fully-sync: true(already sent byExecuteMetadataSqlAsync)Implementation
DatabricksConnection.IsWarehousePathmatches the path regex (cached) — only used to gate the Thrift opt-in.DatabricksStatement.RunAsyncOverrideis a new internal nullable bool; when non-null it takes precedence over the connection-levelRunAsyncInThriftinsideSetStatementProperties. The override is set on the per-call descStmt inGetColumnsExtendedAsyncrather than mutating connection state.StatementExecutionConnection.EnableFastMetadataQueryparses the same flag for SEA.Fallback safety
STATIC ONLYis rejected with parse error42601(INVALID_STATIC_ONLY_USAGE) on servers without PR #198486. The existingcatch (HiveServer2Exception ex) when (ex.SqlState == "42601" || ex.SqlState == "20000")inGetColumnsExtendedAsyncalready handles that — falls back toGetColumns + GetPrimaryKeys + GetCrossReference.Test plan
IsWarehousePathcovering/sql/1.0/warehouses/{id},/sql/1.0/endpoints/{id}(with/without trailing slash, with/without query strings),/sql/protocolv1/...clusters, empty path, and theAdbcOptions.Urifallback.UseFastMetadataQuerycovering all four combinations of(flag, warehouse-path).dotnet test --filter "FullyQualifiedName~Unit."suite: 638 passed, 0 failed.Closes PECO-3021
This pull request and its description were written by Isaac.