Leoebfolsom databricks test #202

Open: leoebfolsom wants to merge 4 commits into master-databricks from leoebfolsom-databricks-test

Conversation

leoebfolsom (Collaborator)

No description provided.

@datafold datafold bot left a comment

✨ AI Overview

This AI-generated summary of your pull request analyzes your code changes and their potential impact on your data. As this feature is still experimental, please review the details carefully.

  • Major data transformation: The change to the org_id calculation modifies the identifier for organizations where org_id % 49 = 0, which affects org identification across the entire pipeline. The diff reports 12 exclusive primary keys on each branch out of 799 rows (about 1.5%), and the change affects 9 downstream dependencies, including financial dashboards in both Looker and Tableau.

  • User count and subscription plan changes: The modified num_users calculation and the new sub_plan logic significantly affect how organizations are classified. 70% of subscription plans will change, with organizations having ≤2 users now categorized as 'Individual' plans. This could substantially impact financial reporting in downstream dashboards.

  • Improved null handling: The change to use coalesce(price, 0) eliminates null prices (previously 298 null values) and standardizes the data. While this improves data quality, it affects 37% of rows and could impact financial calculations in downstream tables and dashboards if they were previously treating nulls differently than zeros.
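
To make the null-versus-zero concern concrete, here is a minimal Python sketch (not the model's actual SQL). It assumes downstream averages skip nulls the way SQL AVG does; the function names and the sample prices are made up for illustration.

```python
from typing import Optional, Sequence


def coalesce_price(price: Optional[float]) -> float:
    """Mimic coalesce(price, 0): a missing price becomes 0."""
    return 0.0 if price is None else price


def sql_avg(values: Sequence[Optional[float]]) -> Optional[float]:
    """Average that ignores missing values, as SQL AVG does."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


# Hypothetical sample, not data from this pull request.
prices = [10.0, None, 20.0]

avg_before = sql_avg(prices)                              # null is skipped
avg_after = sql_avg([coalesce_price(p) for p in prices])  # null counted as 0
```

With this sample the average drops once the null is replaced by zero, which is the kind of shift a downstream financial metric could see.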

🕵️ Details

Base branch: master-databricks (9e16b42)
Pull Request branch: leoebfolsom-databricks-test (b5e8fa4)

Tables modified: 1
  • Different: 1

demo.default.dim__orgs
Primary keys: org_id

DIFFERENCES        master-databricks    leoebfolsom-databric...
  Exclusive PKs    12                   12

3 column(s) with differing values
  column       number of rows    percent
  sub_plan     558               70.9%
  num_users    388               49.3%
  sub_price    294               37.4%

6 potential data app dependencies

Unchanged Attributes
  Total rows                 799
  Total columns              6
  Schema changes             0
  Common unique PKs          787
  Rows with NULL PKs         0
  Rows with duplicate PKs    0
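
The counts above can be cross-checked with a little arithmetic. The sketch below is an illustration in Python, not Datafold's implementation: pk_diff and pct_of_common are hypothetical helpers, the key values are invented, and using Common unique PKs as the denominator is inferred from the reported numbers rather than documented here.

```python
def pk_diff(base_pks, pr_pks):
    """Count primary keys exclusive to each side and common to both."""
    base, pr = set(base_pks), set(pr_pks)
    return {
        "exclusive_base": len(base - pr),
        "exclusive_pr": len(pr - base),
        "common": len(base & pr),
    }


def pct_of_common(differing_rows: int, common_pks: int) -> float:
    """Share of common rows whose value differs, rounded to one decimal."""
    return round(100 * differing_rows / common_pks, 1)


# Invented keys shaped like the report: 787 shared, 12 exclusive per side.
base_keys = range(799)
pr_keys = list(range(787)) + list(range(1000, 1012))
summary = pk_diff(base_keys, pr_keys)

# Differing-row counts taken from the report above.
shares = {
    "sub_plan": pct_of_common(558, 787),
    "num_users": pct_of_common(388, 787),
    "sub_price": pct_of_common(294, 787),
}
```

Under that assumption the three shares come out to the reported 70.9%, 49.3%, and 37.4%, and 787 common PKs plus 12 exclusive PKs matches the 799 total rows on each side.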

Skipped Data Diffs of downstream tables: 3
Add the "datafold:diff-all-downstream" label to this pull request to diff all affected tables.
  • demo.default.fct__monthly__financials (table)
  • demo.default.fct__yearly__financials (table)
  • demo.default.sales__sync (table)

@datafold datafold bot left a comment

✨ AI Overview

This AI-generated summary of your pull request analyzes your code changes and their potential impact on your data. As this feature is still experimental, please review the details carefully.

  • Modified Subscription Plans: A new rule automatically assigns the 'Individual' plan to organizations with 2 or fewer users. This affects 558 organizations (70% of records) and impacts several financial dashboards in both Looker and Tableau.

  • Changed User Count Formula: Adding org_id % 2 to the user count increases the average users per organization from 1.69 to 2.18. This affects 388 organizations (49% of records) and could impact analysis of user adoption metrics.

  • Organization ID Adjustments: Organizations with IDs divisible by 49 have 50,000,000 subtracted from their ID. This structural change affects organization identification across the system and impacts 12 organizations. Consider updating any downstream logic that assumes a specific org_id format.
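
Read together, the three rules in this overview can be sketched as below. This is a Python approximation of the described behavior, not the dbt model's SQL; the Org type, the apply_pr_rules name, the sample values, and the assumption that the plan rule sees the already adjusted user count are all illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Org:
    org_id: int
    num_users: int
    sub_plan: str


def apply_pr_rules(org: Org) -> Org:
    """Apply the three described rules to one organization row."""
    # User count: add org_id % 2 (0 for even IDs, 1 for odd IDs).
    num_users = org.num_users + org.org_id % 2
    # Plan: 2 or fewer users becomes 'Individual' (assumed to use the new count).
    sub_plan = "Individual" if num_users <= 2 else org.sub_plan
    # ID: subtract 50,000,000 when the ID is divisible by 49.
    org_id = org.org_id - 50_000_000 if org.org_id % 49 == 0 else org.org_id
    return Org(org_id, num_users, sub_plan)


# Hypothetical rows, not data from this pull request.
shifted = apply_pr_rules(Org(49 * 1_020_409, 1, "Business"))
untouched = apply_pr_rules(Org(100, 5, "Business"))
```

Because 50,000,000 is even, subtracting it does not change org_id % 2, so the order of the ID rule and the user-count rule does not matter in this sketch.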

🕵️ Details

Base branch: master-databricks (9e16b42)
Pull Request branch: leoebfolsom-databricks-test (b5e8fa4)

Tables modified: 1
  • Different: 1

demo.default.dim__orgs
Primary keys: org_id

DIFFERENCES        master-databricks    leoebfolsom-databric...
  Exclusive PKs    12                   12

3 column(s) with differing values
  column       number of rows    percent
  sub_plan     558               70.9%
  num_users    388               49.3%
  sub_price    294               37.4%

6 potential data app dependencies

Unchanged Attributes
  Total rows                 799
  Total columns              6
  Schema changes             0
  Common unique PKs          787
  Rows with NULL PKs         0
  Rows with duplicate PKs    0

Downstream tables: 3
  • Different: 1
  • Identical: 2

demo.default.sales__sync
Primary keys: org_id

DIFFERENCES        master-databricks    leoebfolsom-databric...    change
  Total rows       14                   2                          -85.7%
  Exclusive PKs    12                   0

Unchanged Attributes
  Total columns                    2
  Schema changes                   0
  Common unique PKs                2
  Rows with NULL PKs               0
  Rows with duplicate PKs          0
  Columns with different values    0

Modified upstream models: model.demo.dim__orgs
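
The row-count change for sales__sync follows directly from the figures above. A small Python check, with pct_change as a hypothetical helper name:

```python
def pct_change(base_rows: int, pr_rows: int) -> float:
    """Relative change in row count from base to PR branch, in percent."""
    return round(100 * (pr_rows - base_rows) / base_rows, 1)


# Figures from the sales__sync diff: 14 rows on base, 2 on the PR branch.
row_change = pct_change(14, 2)

# The 12 PKs exclusive to base account for every row that disappeared.
rows_left = 14 - 12
```

The result matches the reported -85.7%, and the 2 remaining rows equal the 2 common unique PKs.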

demo.default.fct__monthly__financials
Primary keys: date_month

Unchanged Attributes
  Total rows                       11
  Total columns                    3
  Schema changes                   0
  Common unique PKs                11
  Added/removed rows               0
  Rows with NULL PKs               0
  Rows with duplicate PKs          0
  Columns with different values    0

Modified upstream models: model.demo.dim__orgs

demo.default.fct__yearly__financials
Primary keys: date_year

Unchanged Attributes
  Total rows                       2
  Total columns                    3
  Schema changes                   0
  Common unique PKs                2
  Added/removed rows               0
  Rows with NULL PKs               0
  Rows with duplicate PKs          0
  Columns with different values    0

Modified upstream models: model.demo.dim__orgs

2 participants