From 9183c6bbfb003f61d0c313dffd53d16a2c679e1b Mon Sep 17 00:00:00 2001 From: Dave Moore Date: Mon, 18 Nov 2024 11:43:20 -0800 Subject: [PATCH] Fix typo from floewr to flower --- docs/tutorials/enriching-your-warehouse.mdx | 88 +++++++++---------- docs/tutorials/tutorials-intro.mdx | 69 +++++++-------- .../metadata/raw/raw_customers.sdf.yml | 9 +- .../metadata/raw/raw_customers.sdf.yml | 11 ++- 4 files changed, 87 insertions(+), 90 deletions(-) diff --git a/docs/tutorials/enriching-your-warehouse.mdx b/docs/tutorials/enriching-your-warehouse.mdx index 88739bf5..4ec30274 100644 --- a/docs/tutorials/enriching-your-warehouse.mdx +++ b/docs/tutorials/enriching-your-warehouse.mdx @@ -3,9 +3,9 @@ title:"Enriching Your Warehouse" --- ## Overview -In the previous tutorial, we set up guardrails based on checks. -In this tutorial, we will see how SDF's semantic understanding -can help transform your data warehouse from strings and numbers +In the previous tutorial, we set up guardrails based on checks. +In this tutorial, we will see how SDF's semantic understanding +can help transform your data warehouse from strings and numbers to real-world business logic: * Maintain business logic consistency * Control development environment and minimize mistakes from propagating @@ -25,7 +25,7 @@ Init for run commands: - If you haven't completed the [previous tutorial](/tutorials/deprecating-a-model), + If you haven't completed the [previous tutorial](/tutorials/deprecating-a-model), uncomment the relevant section to reference the metadata files: ``` yml workspace.sdf.yml @@ -38,14 +38,14 @@ Init for run commands: - path: checks # Checks against SDF's information schema type: check # <<<<<<< - ``` + ``` SDF's has the ability to annotate columns and tables with user defined types which represent - real-world business logic. Those types enrich the data warehouse and create new SQL types - + real-world business logic. 
Those types enrich the data warehouse and create new SQL types - instead of just BIGINTs, we can now have currencies, different types of IDs, zip-codes, and many more. - - SDF **automatically propagates** those types to downstream assets, enriching the entire + + SDF **automatically propagates** those types to downstream assets, enriching the entire data warehouse with a new layer of semantic understanding. @@ -54,27 +54,27 @@ Init for run commands: Let's focus back on Mom's Flower Shop. - - If you recall, V1 of `app_installs` had an incorrect `JOIN` between - mobile app in-app events in the `raw_inapp_events` table, and marketing + + If you recall, V1 of `app_installs` had an incorrect `JOIN` between + mobile app in-app events in the `raw_inapp_events` table, and marketing campaign events in the `raw_marketing_campaign_events` table: - + ``` sql ... - FROM inapp_events i + FROM inapp_events i LEFT OUTER JOIN raw.raw_marketing_campaign_events m - ON (i.event_id = m.event_id) + ON (i.event_id = m.event_id) ... ``` - Essentially, we were joining two elements that are completely different. - Like comparing Apples to Oranges. These kind of mistakes happen all the time. - Thankfully, we can leverage SDF's semantic understanding and smart propagation + Essentially, we were joining two elements that are completely different. + Like comparing Apples to Oranges. These kind of mistakes happen all the time. + Thankfully, we can leverage SDF's semantic understanding and smart propagation to set guardrails which will prevent future similar mistakes. Let's use SDF classifiers to add the missing business logic. - The column classifiers file `classifications/column_classifiers.sdf.yml` already contains + The column classifiers file `classifications/column_classifiers.sdf.yml` already contains the event classifiers. 
Take a look yourself: ```yml classifications/column_classifiers.sdf.yml @@ -86,7 +86,7 @@ Init for run commands: ``` - To assign the classifiers, uncomment the relevant section in each of the + To assign the classifiers, uncomment the relevant section in each of the files: ```yml metadata/raw/raw_inapp_events.sdf.yml @@ -115,7 +115,7 @@ Init for run commands: SDF actually propagates the classifiers automatically, so no extra steps are required. - + Let's compile to view our tables metadata: ```shell sdf compile --show result @@ -149,7 +149,7 @@ Schema moms_flower_shop.analytics.dim_marketing_campaigns Furthermore, we can see that `app_installs` inherited both `EVENT.inapp` and `EVENT.marketing`. This is due to the incorrect `JOIN` we found. - ```shell + ```shell sdf compile staging.app_installs --show result ```
@@ -194,14 +194,14 @@ Schema moms_flower_shop.staging.app_installs WHERE -- more than one EVENT classifier is assigned CAST(c.classifiers AS VARCHAR) LIKE '%EVENT%EVENT%' - ``` + ``` Let's run it: ```shell sdf check mixed_event_ids --show result ``` - This check will fail because `app_installs` still has an - incorrect `JOIN`. + This check will fail because `app_installs` still has an + incorrect `JOIN`.
 
@@ -219,28 +219,28 @@ Schema moms_flower_shop.staging.app_installs
     
     
         In the [previous tutorial](/tutorials/deprecating-a-model), we already resolved
-        any downstream dependencies of `app_installs` and we now fully support the newer 
+        any downstream dependencies of `app_installs` and we now fully support the newer
         version, `app_installs_v2`. It is safe to deprecate the model - just delete the file.
 
         Or course, you can run `sdf compile` to validate the change.
     
 
 
-## Bonus 
+## Bonus
 Classifiers can enrich your data warehouse in many ways.
 The following are just a few examples of added information layers
-to your static tables. 
+to your static tables.
 
-With each example, you can create checks and reports 
+With each example, you can create checks and reports
 to monitor your warehouse's health and compliance.
 
 
     
-        Privacy is critical when storing sensitive information. 
-        With SDF's smart classifiers propagation, it is easier than 
-        ever to track PII and other privacy related concerns. 
+        Privacy is critical when storing sensitive information.
+        With SDF's smart classifiers propagation, it is easier than
+        ever to track PII and other privacy related concerns.
 
-        Open `metadata/raw/raw_customers.sdf.yml` and uncomment 
+        Open `metadata/raw/raw_customers.sdf.yml` and uncomment
         all classifier sections in the file. They should look like this:
 
         ``` yml metadata/raw/raw_customers.sdf.yml
@@ -270,7 +270,7 @@ Schema moms_flower_shop.staging.customers
 ┌───────────────┬───────────┬─────────────┬────────────────────────────────────────────────────────────┐
 │ column_name   ┆ data_type ┆ classifier  ┆ description                                                │
 ╞═══════════════╪═══════════╪═════════════╪════════════════════════════════════════════════════════════╡
-│ customer_id   ┆ bigint    ┆             ┆ A unique identifier of a mom's floewr shop customer        │
+│ customer_id   ┆ bigint    ┆             ┆ A unique identifier of a mom's flower shop customer        │
 │ first_name    ┆ varchar   ┆ PII.name    ┆ The first name of the customer                             │
 │ last_name     ┆ varchar   ┆ PII.name    ┆ The last name of the customer                              │
 │ full_name     ┆ varchar   ┆ PII.name    ┆                                                            │
@@ -290,12 +290,12 @@ Schema moms_flower_shop.staging.customers
     
     
         We can set up table level and column level retention classifiers.
-        Let's look at a table level example. 
+        Let's look at a table level example.
 
         In the `table_classifiers.sdf.yml` file you will find a retention classifier:
 
         ```yml classifications/table_classifiers.sdf.yml
-        classifier: 
+        classifier:
             name: RETENTION
             labels:
                 - name: d7
@@ -306,7 +306,7 @@ Schema moms_flower_shop.staging.customers
         ```
 
         We can assign short term retention to our raw tables, while keeping infinite
-        retention for any analytics tables. 
+        retention for any analytics tables.
 
         For each raw table metadata found in `metadata/raw/*`, add:
         ``` yml
@@ -323,11 +323,11 @@ Schema moms_flower_shop.staging.customers
             - name: RETENTION.infinity
         ...
         ```
-        
-        
-        Notice that these classifiers are defined not to propagate downstream 
+
+
+        Notice that these classifiers are defined not to propagate downstream
         using the flag `propagate: false`:
-        ```yml 
+        ```yml
         classifier:
             ...
             propagate: false
@@ -343,8 +343,8 @@ Schema moms_flower_shop.staging.customers
         sdf compile models/analytics/ --show result
         ```
         
-        
-        For example, we can look at the `raw_addresses` output 
+
+        For example, we can look at the `raw_addresses` output
         from the first command:
 
@@ -397,8 +397,8 @@ Schema moms_flower_shop.analytics.dim_marketing_campaigns
 
 ## Summary
 This tutorial only shows the tip of the iceberg of what you can do with our
-semantic understanding. Anything that's possible with SQL is possible as a check or 
-report against the information schema. 
+semantic understanding. Anything that's possible with SQL is possible as a check or
+report against the information schema.
 
 
     We created the information schema to support custom checks and reports.
diff --git a/docs/tutorials/tutorials-intro.mdx b/docs/tutorials/tutorials-intro.mdx
index 44a35f83..c7406135 100644
--- a/docs/tutorials/tutorials-intro.mdx
+++ b/docs/tutorials/tutorials-intro.mdx
@@ -5,17 +5,17 @@ description:
 ---
 
 ## Overview
-The goal of this tutorials series is to provide a guided way for you to explore SDF and 
+The goal of this tutorial series is to provide a guided way for you to explore SDF and
 understand how it can be integrated into your data workflows. We built SDF to be an intuitive
 and easy to use and we hope you'll have fun exploring it.
 
-In our series of tutorial we will be working on a single SDF workspace - "Mom's Flower Shop". 
-In this page, we will provide a setup guide as well as an overview of this workspace. We 
+In our series of tutorials we will be working on a single SDF workspace - "Mom's Flower Shop".
+In this page, we will provide a setup guide as well as an overview of this workspace. We
 will even use SDF to conduct some initial exploration.
 
-This project was inspired by [Fleurette Studio](https://www.instagram.com/fleurette_studio/), 
-one of our co-founders [Elias'](https://www.linkedin.com/in/eliasdefaria/) mom's 
-boutique floral design studio located in Los Angeles, CA. 
+This project was inspired by [Fleurette Studio](https://www.instagram.com/fleurette_studio/),
+one of our co-founders [Elias'](https://www.linkedin.com/in/eliasdefaria/) mom's
+boutique floral design studio located in Los Angeles, CA.
 
 Let's get started!
 
@@ -23,8 +23,8 @@ Let's get started!
 * A Mac or Linux with a [valid installation](/introduction/install) of the latest SDF version running locally.
 * (Recommended) Having gone through our [Getting Started Guide](/introduction/getting-started).
 
-For the sake of the tutorials, there is no need to connect to a database or to leverage any 
-  compute engine other than your own laptop. 
+For the sake of the tutorials, there is no need to connect to a database or to leverage any
+  compute engine other than your own laptop.
 
 
   If using VSCode, SDF's YML schema is available for type and syntax checking via the [Red HAT YAML](https://marketplace.visualstudio.com/items?itemName=redhat.vscode-yaml). This will
@@ -42,14 +42,14 @@ For the sake of the tutorials, there is no need to connect to a database or to l
     sdf new --sample moms_flower_shop
     ```
 
-    After running the command, you will see the following output: 
-    
-        
-    
+    After running the command, you will see the following output:
+
+    
+
     That's it!
   
   
-    This workspace is powering the data warehouse of Mom's Flower Shop. 
+    This workspace is powering the data warehouse of Mom's Flower Shop.
 
     First, let's open our terminal and change the directory. Run:
     ```shell
@@ -122,13 +122,13 @@ For the sake of the tutorials, there is no need to connect to a database or to l
     ```
     
     
-    * Raw data seeds are available in the `seeds` folder. 
+    * Raw data seeds are available in the `seeds` folder.
     * Models (SQL files) are available in the `models` folder.
     * The workspace is defined in the `workspace.sdf.yml` configuration file.
 
     Let's ignore the rest of the directory for now. We will get back to those in later tutorials.
   
-   
+  
     Let's explore the tables. In your terminal, run:
     ```shell
     sdf compile models/raw
@@ -136,8 +136,8 @@ For the sake of the tutorials, there is no need to connect to a database or to l
 
     
     When we run SDF compile, our engine validates SQL syntax and dependencies correctness.
-    In the example above, SDF is set to compile models under `models/raw`, but if we 
-    run `sdf compile` alone we will instantly guarantee a successful execution of all models 
+    In the example above, SDF is set to compile models under `models/raw`, but if we
+    run `sdf compile` alone we will instantly guarantee a successful execution of all models
     in the warehouse with a single command, running locally with lightning speed.
     
 
@@ -159,7 +159,7 @@ Schema moms_flower_shop.raw.raw_customers
 ┌─────────────┬───────────┬────────────┬────────────────────────────────────────────────────────────┐
 │ column_name ┆ data_type ┆ classifier ┆ description                                                │
 ╞═════════════╪═══════════╪════════════╪════════════════════════════════════════════════════════════╡
-│ id          ┆ bigint    ┆            ┆ A unique identifier of a mom s floewr shop customer        │
+│ id          ┆ bigint    ┆            ┆ A unique identifier of a mom s flower shop customer        │
 │ first_name  ┆ varchar   ┆            ┆ The first name of the customer                             │
 │ last_name   ┆ varchar   ┆            ┆ The last name of the customer                              │
 │ email       ┆ varchar   ┆            ┆ The email of the customer                                  │
@@ -175,7 +175,7 @@ Schema moms_flower_shop.raw.raw_customers
     ```shell
     sdf run models/raw
     ```
-    
+
     For example, the results for `raw_customers` look like this:
 
 
@@ -206,12 +206,12 @@ Table moms_flower_shop.raw.raw_customers ** Note that these files are randomly generated and do not contain any real data - In this workspace we are using sample data stored locally for ease of use. - + In this workspace we are using sample data stored locally for ease of use. + When setting up your own workspace, you can connect existing data providers to SDF. Follow the relevant [provider's guide](/integrations/overview) to get started. - + We can also explore the other models that are found under the `models` directory. @@ -219,7 +219,7 @@ Table moms_flower_shop.raw.raw_customers ```shell sdf compile ``` - + The output should look like this:
@@ -238,21 +238,21 @@ Working set 11 model files, 22 .sdf files
 
- - SDF intelligently caches previous compilations. Since we compiled the source models - under `models/raw` in previous steps, those will not be re-compiled in this run. + + SDF intelligently caches previous compilations. Since we compiled the source models + under `models/raw` in previous steps, those will not be re-compiled in this run. - Let's see how it could've looked like without SDF's optimization. First we need + Let's see how it could've looked like without SDF's optimization. First we need to clean cache. Run: - ```shell + ```shell sdf clean - ``` - Now, compile again by running: + ``` + Now, compile again by running: ```shell sdf compile ``` - + Notice the difference? Imagine running unoptimized compilations on a warehouse of your magnitude.
@@ -274,15 +274,15 @@ Working set 11 model files, 22 .sdf files
 
 
- - To see the schema of each table, similarly to the source tables in the previous step, + + To see the schema of each table, similarly to the source tables in the previous step, simply add the flag `--show all` to the command: ```shell sdf compile --show all ``` - + ## Next Steps Let's continue in our journey to explore SDF: @@ -290,4 +290,3 @@ Let's continue in our journey to explore SDF: 2. [Debugging](/tutorials/debugging) 3. [Deprecating a model](/tutorials/deprecating-a-model) 4. [Enriching your warehouse](/tutorials/enriching-your-warehouse) - diff --git a/examples/moms_flower_shop/metadata/raw/raw_customers.sdf.yml b/examples/moms_flower_shop/metadata/raw/raw_customers.sdf.yml index efd29246..30527f56 100644 --- a/examples/moms_flower_shop/metadata/raw/raw_customers.sdf.yml +++ b/examples/moms_flower_shop/metadata/raw/raw_customers.sdf.yml @@ -1,7 +1,7 @@ table: name: raw_customers description: > - All relevant information related to customers known to mom s flower shop. + All relevant information related to customers known to mom s flower shop. This information comes from the user input into the mobile app. 
# Uncomment below to begin the "Enriching Your Warehouse" Tutorial >>>>> @@ -11,8 +11,8 @@ table: columns: - name: id - description: A unique identifier of a mom s floewr shop customer - + description: A unique identifier of a mom s flower shop customer + - name: first_name description: The first name of the customer # Uncomment to begin the "Enriching your Warehouse" tutorial >>>>> @@ -33,7 +33,7 @@ table: # classifiers: # - PII.email # <<<<< - + - name: gender description: The gender of the customer # Uncomment to begin the "Enriching your Warehouse" tutorial >>>>> @@ -47,4 +47,3 @@ table: # classifiers: # - PII.address # <<<<< - \ No newline at end of file diff --git a/examples/moms_flower_shop_completed/metadata/raw/raw_customers.sdf.yml b/examples/moms_flower_shop_completed/metadata/raw/raw_customers.sdf.yml index 69284c03..674912d0 100644 --- a/examples/moms_flower_shop_completed/metadata/raw/raw_customers.sdf.yml +++ b/examples/moms_flower_shop_completed/metadata/raw/raw_customers.sdf.yml @@ -1,18 +1,18 @@ table: name: raw_customers description: > - All relevant information related to customers known to mom's flower shop. + All relevant information related to customers known to mom's flower shop. This information comes from the user input into the mobile app. # Uncomment below to begin the "Enriching Your Warehouse" Tutorial >>>>> classifiers: - RETENTION.d7 # <<<<< - + columns: - name: id - description: A unique identifier of a mom's floewr shop customer - + description: A unique identifier of a mom's flower shop customer + - name: first_name description: The first name of the customer # Uncomment to begin the "Enriching your Warehouse" tutorial >>>>> @@ -33,7 +33,7 @@ table: classifiers: - PII.email # <<<<< - + - name: gender description: The gender of the customer # Uncomment to begin the "Enriching your Warehouse" tutorial >>>>> @@ -47,4 +47,3 @@ table: classifiers: - PII.address # <<<<< - \ No newline at end of file