21 changes: 21 additions & 0 deletions README.md
@@ -613,6 +613,27 @@ models:
where: "num_orders > 0"
```

### functional_dependency ([source](macros/generic_tests/functional_dependency.sql))

Asserts that a column is *functionally dependent* on one or more other columns: for each distinct combination of values in those columns, the tested column holds at most one distinct value.

This test is useful for denormalized source data, where relationships between fields are implicitly expected but do not always hold because of manual entry errors or merges from different systems. Broken functional dependencies often surface downstream as duplicate rows and other anomalies.

*Common misunderstanding*: Functional dependency is *not* uniqueness. Functional dependency checks that each group has at most one distinct value, but allows that value to appear many times. Uniqueness allows many distinct values, but checks that each value appears only once.
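The distinction can be illustrated outside dbt with a small Python sketch. The helper names `violates_functional_dependency` and `duplicated_values` are hypothetical and exist only for this illustration; they are not part of dbt_utils. The rows mirror the `customer_id` / `customer_name` example used in this section.

```python
from collections import defaultdict


def violates_functional_dependency(rows, column, depends_on):
    """Return the determinant groups that map to more than one distinct value."""
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[c] for c in depends_on)
        # Skip None to mirror SQL count(distinct ...), which ignores NULLs.
        if row[column] is not None:
            seen[key].add(row[column])
    return sorted(key for key, values in seen.items() if len(values) > 1)


def duplicated_values(rows, column):
    """Return the values that appear more than once (what a uniqueness check flags)."""
    counts = defaultdict(int)
    for row in rows:
        if row[column] is not None:
            counts[row[column]] += 1
    return sorted(value for value, n in counts.items() if n > 1)


rows = [
    {"customer_id": 1, "customer_name": "Ash"},
    {"customer_id": 2, "customer_name": "Brock"},
    {"customer_id": 2, "customer_name": "Brock"},
    {"customer_id": 3, "customer_name": "Ash"},
]

# Every customer_id maps to a single name, so the dependency holds.
fd_failures = violates_functional_dependency(rows, "customer_name", ["customer_id"])
# The names themselves repeat, so a uniqueness check on customer_name would fail.
unique_failures = duplicated_values(rows, "customer_name")

print(fd_failures)
print(unique_failures)
```

Here the functional dependency check finds no failing groups, while the uniqueness check flags both names. Adding a row that gives `customer_id` 1 a second name would break the dependency.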

**Usage:**

```yaml
models:
  - name: orders
    columns:
      - name: customer_name
        tests:
          - dbt_utils.functional_dependency:
              depends_on:
                - customer_id
```

----

### Grouping in tests
@@ -0,0 +1,6 @@
order_id,customer_id,customer_name
1001,1,Ash
1002,2,Brock
1003,2,Brock
1004,3,Ash
1005,4,
11 changes: 11 additions & 0 deletions integration_tests/data/schema_tests/schema.yml
@@ -19,3 +19,14 @@ seeds:
          - dbt_utils.sequential_values:
              interval: 1
              datepart: 'hour'


  - name: data_test_functional_dependency
    columns:
      - name: order_id
      - name: customer_id
      - name: customer_name
        data_tests:
          - dbt_utils.functional_dependency:
              depends_on:
                - customer_id
39 changes: 39 additions & 0 deletions macros/generic_tests/functional_dependency.sql
@@ -0,0 +1,39 @@
{% test functional_dependency(model, column_name, depends_on, quote_columns=False) %}
{{ return(adapter.dispatch('test_functional_dependency', 'dbt_utils')(model, column_name, depends_on, quote_columns)) }}
{% endtest %}


{% macro default__test_functional_dependency(model, column_name, depends_on, quote_columns=False) %}


{% if not quote_columns %}
    {%- set column_list=depends_on %}
{% elif quote_columns %}
    {%- set column_list=[] %}
    {%- for column in depends_on %}
        {#- append() mutates the list in place and returns None, so use `do` rather than `set` -#}
        {%- do column_list.append( adapter.quote(column) ) %}
    {%- endfor %}
{% else %}
    {{ exceptions.raise_compiler_error(
        "`quote_columns` argument for functional_dependency test must be one of [True, False]"
    ) }}
{% endif %}


{%- set columns_csv=column_list | join(', ') %}


with validation_errors as (

select {{ columns_csv }}
from {{ model }}
group by {{ columns_csv }}
having count(distinct {{ column_name }}) > 1

)

select *
from validation_errors


{% endmacro %}
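To see what the macro's query does, the sketch below runs a hand-rendered version of it against the seed rows using Python's built-in `sqlite3` module. Several things here are assumptions: the table name `orders` stands in for `{{ model }}`, the SQL is rendered by hand for `column_name=customer_name` and `depends_on=[customer_id]` rather than compiled by dbt, the empty `customer_name` in the seed is assumed to load as NULL, and the helper `failing_groups` is hypothetical.

```python
import sqlite3

# Hand-rendered macro body (assumption: column_name=customer_name,
# depends_on=[customer_id], and `orders` standing in for {{ model }}).
RENDERED_TEST = """
with validation_errors as (
    select customer_id
    from orders
    group by customer_id
    having count(distinct customer_name) > 1
)
select *
from validation_errors
"""


def failing_groups(rows):
    """Load rows into an in-memory table and return the groups the test would flag."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table orders (order_id integer, customer_id integer, customer_name text)"
    )
    conn.executemany("insert into orders values (?, ?, ?)", rows)
    result = conn.execute(RENDERED_TEST).fetchall()
    conn.close()
    return result


# Rows from the seed file; the blank customer_name is represented as NULL (None).
seed_rows = [
    (1001, 1, "Ash"),
    (1002, 2, "Brock"),
    (1003, 2, "Brock"),
    (1004, 3, "Ash"),
    (1005, 4, None),
]

# The seed satisfies the dependency: no customer_id has two distinct names.
passing = failing_groups(seed_rows)
# A hypothetical extra row gives customer_id 2 a second name, which breaks it.
failing = failing_groups(seed_rows + [(1006, 2, "Misty")])

print(passing)
print(failing)
```

Because `count(distinct ...)` ignores NULLs, the row with a missing name does not cause a failure by itself. The query returns one row per violating combination of the `depends_on` columns, so the test passes when the result is empty.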