description
Detailed technical description of RudderStack's Warehouse Actions feature, with step-by-step instructions for configuration.

RudderStack Warehouse Actions

RudderStack provides the infrastructure to track, capture, transform, and route your customer event data to your preferred destinations. However, this data is typically raw, consisting mainly of customer interactions with your website, mobile apps, or other digital assets.

While customer event data is important for reconstructing your customers' behavior, you can further enrich customer profiles by processing the other customer-related data that resides in your company's main data infrastructure: the data warehouse.

With RudderStack's Warehouse Actions feature, you can leverage the already processed customer data residing in your data warehouse and route this enriched information to your desired destinations.

{% hint style="success" %} With this feature, you can configure your data warehouse as a source on the RudderStack dashboard, select the right data and then sync this data to all the destinations that are supported by RudderStack. {% endhint %}

{% hint style="info" %} RudderStack currently supports Google BigQuery, Amazon Redshift, PostgreSQL, ClickHouse, and Snowflake as sources. {% endhint %}

Configuring Warehouse Actions on RudderStack

{% hint style="info" %} Configuring Warehouse Actions on the RudderStack dashboard involves the following steps:

  1. Setting a name for the source and providing connection credentials for RudderStack to access the data warehouse.
  2. Defining the data that should be pulled out of the data warehouse and subsequently sent to the specified destinations through RudderStack.
  3. Choosing a data update schedule and defining how often the data synchronization should happen between the data warehouse and RudderStack. {% endhint %}
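The three steps above can be pictured as one configuration record. The following is a minimal sketch only: `WarehouseSourceConfig`, its field names, the minute-based interval, and the example values are all hypothetical and do not reflect RudderStack's actual settings or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class WarehouseSourceConfig:
    """Hypothetical container for the three configuration steps."""

    # Step 1: source name and connection credentials.
    name: str
    credentials: Dict[str, str]
    # Step 2: the data to pull out of the warehouse.
    schema: str
    table: str
    columns: List[str] = field(default_factory=list)
    # Step 3: how often to sync, in minutes (assumed unit).
    sync_interval_minutes: int = 60

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means complete."""
        issues = []
        if not self.name.strip():
            issues.append("source name is empty")
        if not self.credentials:
            issues.append("connection credentials are missing")
        if not self.schema or not self.table:
            issues.append("schema and table are required")
        if self.sync_interval_minutes <= 0:
            issues.append("sync interval must be positive")
        return issues


# Example values are placeholders, not real connection settings.
config = WarehouseSourceConfig(
    name="crm-profiles",
    credentials={"host": "warehouse.example.com", "user": "rudder"},
    schema="analytics",
    table="customer_profiles",
    columns=["email", "plan"],
)
```

Calling `config.problems()` on a fully specified record returns an empty list, which mirrors having completed all three steps in the dashboard.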

To configure your data warehouse as a source on the RudderStack dashboard, follow these steps:

  • Log into your RudderStack dashboard.
  • Navigate to Sources, present in the left panel of the dashboard.
  • Choose the data warehouse that you want to configure as a source. Then, click Next.

Specifying Connection Credentials

  • Assign a name to your source. If you are configuring your data warehouse in the RudderStack dashboard for the first time, click the Create credentials from scratch button.

{% hint style="info" %} If you have previously configured your data warehouse as a source on the RudderStack dashboard, you can simply use the existing credentials and proceed. {% endhint %}

  • Next, enter the connection credentials that RudderStack needs to connect to your data warehouse.

{% hint style="warning" %} RudderStack currently supports Google BigQuery, PostgreSQL, ClickHouse, Amazon Redshift, and Snowflake as sources. The connection settings will vary according to each warehouse. {% endhint %}

Specifying Warehouse Schema and Table

  • Next, enter the data warehouse schema and the table name. RudderStack will collect the data from this table.

{% hint style="warning" %} Please note that your source table must include at least one of the following columns for it to be considered a valid source:

  • email

  • user_id

  • anonymous_id {% endhint %}
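The validity rule above can be checked before you point RudderStack at a table. This is a minimal sketch, assuming you already have the table's column names as a list; the function name is hypothetical, and exact, case-sensitive matching is an assumption rather than documented behavior.

```python
# Identifier columns named in the requirement above.
IDENTIFIER_COLUMNS = {"email", "user_id", "anonymous_id"}


def is_valid_source_table(column_names):
    """Return True if at least one identifier column is present.

    Matching is exact and case-sensitive (an assumption of this sketch).
    """
    return bool(IDENTIFIER_COLUMNS & set(column_names))
```

For example, a table with the columns `email` and `plan` passes the check, while a table with only `name` and `country` does not.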

  • If the table is valid, you can then preview a snippet of the data.

  • You can also filter, select, and edit the names of the table columns to be included in the data source.

  • Once you've selected all the required table columns, click Next.
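Conceptually, selecting and renaming columns amounts to projecting each row through a column map. The sketch below illustrates that idea only; `project_row` and the map format (source column name to output name) are hypothetical and not part of RudderStack.

```python
def project_row(row, column_map):
    """Keep only the mapped columns of a row and rename them.

    column_map maps a source column name to the name to send downstream.
    """
    missing = [col for col in column_map if col not in row]
    if missing:
        raise KeyError(f"columns not present in row: {missing}")
    return {new_name: row[old_name] for old_name, new_name in column_map.items()}
```

With the map `{"email": "email", "plan_name": "plan"}`, any other column in the row is dropped and `plan_name` is sent as `plan`.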

Setting the Data Update Schedule

  • Specify how often RudderStack should sync the data, and set the synchronization time that suits your requirements.
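To see how an update frequency and a synchronization time combine, the sketch below lists the run times that follow from a first sync time and a fixed interval. The function name and the minute-based interval are assumptions made for illustration; the dashboard's actual scheduling options may differ.

```python
from datetime import datetime, timedelta


def upcoming_syncs(first_sync, interval_minutes, count):
    """Return `count` run times, starting at `first_sync`, spaced by the interval."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    step = timedelta(minutes=interval_minutes)
    return [first_sync + i * step for i in range(count)]


# A sync every 6 hours, starting at 02:00 on an arbitrary example date.
runs = upcoming_syncs(datetime(2024, 1, 1, 2, 0), 360, 3)
```

Here `runs` holds the 02:00, 08:00, and 14:00 run times on the example date.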

That's it! Your data warehouse is now configured and added as a RudderStack source.

{% hint style="info" %} Currently, all events sent to RudderStack from a warehouse source are identify() events. {% endhint %}
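To make the identify() behavior concrete, here is a rough sketch of how one warehouse row could be turned into an identify-style payload. The mapping is an assumption for illustration: the `userId`, `anonymousId`, and `traits` keys follow the general shape of an identify call, but how Warehouse Actions actually builds its events is not described here, and `row_to_identify` is a hypothetical helper.

```python
def row_to_identify(row):
    """Build an identify-style payload from one row (illustrative mapping only)."""
    id_columns = ("user_id", "anonymous_id")
    # The row must carry at least one of the identifier columns.
    if not any(row.get(col) for col in ("email",) + id_columns):
        raise ValueError("row has no email, user_id, or anonymous_id")

    event = {"type": "identify"}
    if row.get("user_id"):
        event["userId"] = str(row["user_id"])
    if row.get("anonymous_id"):
        event["anonymousId"] = str(row["anonymous_id"])
    # Every remaining column, including email, is carried as a trait (assumption).
    event["traits"] = {k: v for k, v in row.items() if k not in id_columns}
    return event
```

A row such as `{"user_id": 42, "email": "a@example.com", "plan": "pro"}` would yield a payload with `userId` set to `"42"` and the other two columns under `traits`.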

Now you can connect this source to any RudderStack destination of your choice. RudderStack will collect the enriched customer data from the specified table columns in your warehouse source and send it to the destination for your activation use-cases.

FAQs

What types of events does RudderStack Warehouse Actions support?

Currently, all events from RudderStack Warehouse Actions are identify() events.

Contact Us

To learn more about RudderStack's Warehouse Actions feature, feel free to contact us or start a conversation on our Slack channel. You can also see this feature in action by requesting a demo.