Overview
The goal of this tutorials series is to provide a guided way for you to explore SDF and understand how it can be integrated into your data workflows. We built SDF to be an intuitive and easy to use and we hope you’ll have fun exploring it. In our series of tutorial we will be working on a single SDF workspace - “Mom’s Flower Shop”. In this page, we will provide a setup guide as well as an overview of this workspace. We will even use SDF to conduct some initial exploration. This project was inspired by Fleurette Studio, one of our co-founders Elias’ mom’s boutique floral design studio located in Los Angeles, CA. Let’s get started!Prerequisites
- A Mac or Linux with a valid installation of the latest SDF version running locally.
- (Recommended) Having gone through our Getting Started Guide.
Getting Started with Mom’s Flower Shop
1
Create the Sample Workspace
To create a new SDF Workspace containing Mom’s Flower Shop data, run the following command:After running the command, you will see the following output:
That’s it!
That’s it!2
Workspace Overview
This workspace is powering the data warehouse of Mom’s Flower Shop.First, let’s open our terminal and change the directory. Run:Let’s review the files and folders in the directory:
.
├── checks
│ └── README.txt
├── classifications
│ ├── column_classifiers.sdf.yml
│ └── table_classifiers.sdf.yml
├── metadata
│ ├── analytics
│ │ ├── agg_installs_and_campaigns.sdf.yml
│ │ └── dim_marketing_campaigns.sdf.yml
│ ├── raw
│ │ ├── raw_addresses.sdf.yml
│ │ ├── raw_customers.sdf.yml
│ │ ├── raw_inapp_events.sdf.yml
│ │ └── raw_marketing_campaign_events.sdf.yml
│ └── staging
│ └── app_installs.sdf.yml
├── models
│ ├── analytics
│ │ └── agg_installs_and_campaigns.sql
│ ├── raw
│ │ ├── raw_addresses.sql
│ │ ├── raw_customers.sql
│ │ ├── raw_inapp_events.sql
│ │ └── raw_marketing_campaign_events.sql
│ └── staging
│ ├── app_installs.sql
│ ├── app_installs_v2.sql
│ ├── customers.sql
│ ├── inapp_events.sql
│ ├── marketing_campaigns.sql
│ └── stg_installs_per_campaign.sql
├── reports
│ └── README.txt
├── seeds
│ ├── csv
│ │ └── README.txt
│ └── parquet
│ ├── addresses.parquet
│ ├── customers.parquet
│ ├── inapp_events.parquet
│ └── marketing_campaign_events.parquet
└── workspace.sdf.yml15 directories, 28 files- Raw data seeds are available in the
seedsfolder. - Models (SQL files) are available in the
modelsfolder. - The workspace is defined in the
workspace.sdf.ymlconfiguration file.
3
Source Tables
Let’s explore the tables. In your terminal, run:SDF will print out the full description of the raw tables (seeds) that exist in our workspace:We can also get a sample of each table if we run the command:For example, the results for ** Note that these files are randomly generated and do not contain any real data
- Customers
- Marketing campaigns
- Mobile in-app events
- Street addresses
raw_customers look like this:Schema moms_flower_shop.raw.raw_customers
All relevant information related to customers known to mom s flower shop. This information comes from the user input into the mobile app.┌─────────────┬───────────┬────────────┬────────────────────────────────────────────────────────────┐
│ column_name ┆ data_type ┆ classifier ┆ description │
╞═════════════╪═══════════╪════════════╪════════════════════════════════════════════════════════════╡
│ id ┆ bigint ┆ ┆ A unique identifier of a mom s floewr shop customer │
│ first_name ┆ varchar ┆ ┆ The first name of the customer │
│ last_name ┆ varchar ┆ ┆ The last name of the customer │
│ email ┆ varchar ┆ ┆ The email of the customer │
│ gender ┆ varchar ┆ ┆ The gender of the customer │
│ address_id ┆ bigint ┆ ┆ The address_id of the address associated with the customer │
└─────────────┴───────────┴────────────┴────────────────────────────────────────────────────────────┘raw_customers look like this:Table moms_flower_shop.raw.raw_customers
┌────┬────────────┬─────────────┬─────────────────────────────┬─────────────┬────────────┐
│ id ┆ first_name ┆ last_name ┆ email ┆ gender ┆ address_id │
╞════╪════════════╪═════════════╪═════════════════════════════╪═════════════╪════════════╡
│ 1 ┆ Willi ┆ Vannuchi ┆ wvannuchi0@engadget.com ┆ Male ┆ 73 │
│ 2 ┆ Opalina ┆ Christophle ┆ ochristophle1@sina.com.cn ┆ Female ┆ 17 │
│ 3 ┆ Parke ┆ Tackley ┆ ptackley2@deliciousdays.com ┆ Male ┆ 89 │
│ 4 ┆ Berton ┆ Klamp ┆ bklamp3@desdev.cn ┆ Male ┆ 99 │
│ 5 ┆ Taryn ┆ Brook ┆ tbrook4@flavors.me ┆ Female ┆ 86 │
│ 6 ┆ Moina ┆ Crake ┆ mcrake5@uol.com.br ┆ Female ┆ 38 │
│ 7 ┆ Robbyn ┆ Chezelle ┆ rchezelle6@themeforest.net ┆ Female ┆ 92 │
│ 8 ┆ Carce ┆ Clendening ┆ cclendening7@chron.com ┆ Male ┆ 64 │
│ 9 ┆ Pennie ┆ Shilton ┆ pshilton8@mapy.cz ┆ Genderqueer ┆ 13 │
│ 10 ┆ Brig ┆ Graal ┆ bgraal9@microsoft.com ┆ Male ┆ 38 │
└────┴────────────┴─────────────┴─────────────────────────────┴─────────────┴────────────┘
1000 rows, showing only 10 rows.
Run with —limit 0 to show all rows.4
Other Models
We can also explore the other models that are found under the The output should look like this:Let’s see how it could’ve looked like without SDF’s optimization. First we need
to clean cache. Run:Now, compile again by running:Notice the difference? Imagine running unoptimized compilations on a warehouse of your magnitude.To see the schema of each table, similarly to the source tables in the previous step,
simply add the flag
models directory.To see the full list of models displayed in the order of execution, we can run:Working set 11 model files, 22 .sdf files
Compiling moms_flower_shop.staging.inapp_events (./models/staging/inapp_events.sql)
Compiling moms_flower_shop.staging.marketing_campaigns (./models/staging/marketing_campaigns.sql)
Compiling moms_flower_shop.staging.app_installs (./models/staging/app_installs.sql)
Compiling moms_flower_shop.staging.app_installs_v2 (./models/staging/app_installs_v2.sql)
Compiling moms_flower_shop.analytics.agg_installs_and_campaigns (./models/analytics/agg_installs_and_campaigns.sql)
Compiling moms_flower_shop.staging.customers (./models/staging/customers.sql)
Compiling moms_flower_shop.staging.stg_installs_per_campaign (./models/staging/stg_installs_per_campaign.sql)
Finished 11 models [7 succeeded, 4 reused] in 0.931 secsWorking set 11 model files, 22 .sdf files
Compiling moms_flower_shop.raw.raw_addresses (./models/raw/raw_addresses.sql)
Compiling moms_flower_shop.raw.raw_customers (./models/raw/raw_customers.sql)
Compiling moms_flower_shop.raw.raw_marketing_campaign_events (./models/raw/raw_marketing_campaign_events.sql)
Compiling moms_flower_shop.raw.raw_inapp_events (./models/raw/raw_inapp_events.sql)
Compiling moms_flower_shop.staging.marketing_campaigns (./models/staging/marketing_campaigns.sql)
Compiling moms_flower_shop.staging.inapp_events (./models/staging/inapp_events.sql)
Compiling moms_flower_shop.staging.app_installs_v2 (./models/staging/app_installs_v2.sql)
Compiling moms_flower_shop.staging.app_installs (./models/staging/app_installs.sql)
Compiling moms_flower_shop.analytics.agg_installs_and_campaigns (./models/analytics/agg_installs_and_campaigns.sql)
Compiling moms_flower_shop.staging.customers (./models/staging/customers.sql)
Compiling moms_flower_shop.staging.stg_installs_per_campaign (./models/staging/stg_installs_per_campaign.sql)
Finished 11 models [11 succeeded] in 0.910 secs--show all to the command: