Schema-only dry-run feature - similar to a universal override of materialize: view
or LIMIT 0
#6989
Replies: 1 comment 1 reply
-
I have thought before about how to make the approach used in I spiked this out for a Postgres adapter here where we execute the query and then just fetch the first row. It seemed like it could work but ran out of time to take it any further. Once you have the predicted schema of your
|
Beta Was this translation helpful? Give feedback.
-
Feature idea
Especially in CI and as quick end-to-end "compile check" for column name spellings, refactored dependencies, etc., it would be helpful to have a built-in dbt feature to build all table schemas - or at least test that they are correctly generated from the SQL provided.
This allows users to quickly confirm that their SQL definitions are correct, free from spelling issues, and that there are no circular loops or missing source tables, for instance - without waiting for the entire DW to rebuild itself with data.
In CI and also in local development, we want to fail as quickly as possible if
column_a
is mispelled ascolumm_a
, for instance, and the priority here is to not wait 45 minutes (or longer) to find that compile-type SQL issue.Why have as a first-class feature
A benefit of having this as a first-class feature is that specific adapters can implement their own platform-optimized implementation.
materialize: view
ormaterialize: ephemeral
to all views, overriding the model-specific settings.dbt-bigquery
could use the BigQuery specific logic withindbt-dry-run
, which leverages BigQueries native 'dry run' feature to do zero-cost tests that bypass the movement of data.dbt-postgres
ordbt-duckdb
implementation might be able to run the entire dry-run locally without touching the actual remote server.Integration with other dbt features
Integration with profiles
It should be assumed that this feature would be used primarily in CI testing and localdev "compiling" use cases, where we just want a quick check to find compile errors (if any) in the generated SQL or in mappings between models. As such, this could optionally be handled as a special type of profile - or as a setting against an existing profile type.
To use the bigquery example from below, it could be functionally equivalent to specifying a
dry-run
profile name, which passes all of its logic to https://github.com/autotraderuk/dbt-dry-run instead of sending todbt-bigquery
.An advantage adding this as a feature at the profile level is that it then is very clear to the user that zero-data or view-only materializations are not going to affect other profiles like
devtest
- and then you could havecicd_nodata
as well ascicd_withdata
as separate profile names.Integration with
defer
Any of the above may be define themselves as compatible (or not) with
defer
feature, to skip over model definitions which are known to have not changed since last full build or since last dry-run.Integration with tests
Tests would either fail (if zero rows) or be extremely slow (if materialize: view) - so they would not be viable in this scenario.
However, similar to models, the SQL generated for each test could be validated in a schema-only way, so if a test references a non-existent column (for instance) we can fail the dry-run on the basis of that test query being invalid.
Compatibility with "pivot" logic
Any dynamic SQL code which pivots row data into column headers would require (1) that the models still compile without any data or (2) that a default behavior is provided in cases where 'dry-run' is detected, or no data exists.
Workarounds / Alternative Implementations
A hacky solution is to inject a special handling logic into the macro that determines the
materialize
setting, and to let certainvars
orenv vars
toggle everything tomaterialize: view
.A similar workaround approach is to inject a
LIMIT 0
feature into the macro that generates the SQL text - and have the limit only toggle on when a specific var or env var is detected.Related links
Note, this is different from:
There's a BigQuery-specific solution here:
Beta Was this translation helpful? Give feedback.
All reactions