Optimize Your Dataflows with Replication

Overview 

Replication is a Wave Analytics feature that decouples data extraction from your dataflows. It lets you extract data on a separate schedule, so the dataflows have less work to do and run faster. After you enable replication, you can see the additional actions that are available for it.

Features of Data Replication

  1. Create multiple dataflows.
  2. Schedule the replication independently of the dataflows.
  3. Automatically update the replications.
  4. Remove fields.
  5. Filter rows.

What is Data Replication?

A replication is a stored set of data extracted from an existing source, together with the steps that you want the replication or dataflow to perform on it. We can view and edit our replication settings, create dataflows, and more from the Data Manager tab.


Fig: Data Replication Model

Enable Replication

Setup > Enter “Wave Analytics” in the Quick Find box > Settings > Enable Replication > Save.

Once we enable replication, we can view the Salesforce objects, set up a replication schedule, and run the replication.

Create a Dataflow (Replication)

Step 1: Go to Force.com App Menu > Wave Analytics > Click the Settings icon > Choose Data Manager


Fig: Select Data Manager window

Step 2: Click the Dataflow tab, then click Create Dataflow

  1. In the Data Manager window, create the new dataflow by opening the drop-down list and selecting the “Dataflow View” option.
  2. In this view, we add the transformations directly to the dataflow definition file.
  3. This file is in JSON format and contains the transformations that represent the dataflow logic.
  4. The dataflow definition file is saved with UTF-8 encoding.

Fig: Data flow View
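
As a rough illustration of the points above, the following Python sketch builds a small dataflow definition and saves it as UTF-8 encoded JSON. The node names, the object and field names, and the helper functions `build_dataflow` and `save_dataflow` are assumptions made for this example. The `sfdcDigest` and `sfdcRegister` actions are standard dataflow transformations, but check the parameters against the dataflow documentation for your release before relying on them.

```python
import json
import os
import tempfile


def build_dataflow(object_name, field_names, dataset_alias):
    """Return a minimal dataflow definition as a plain dict (illustrative only)."""
    extract_node = f"Extract_{object_name}"  # hypothetical node name
    return {
        # Extract the listed fields from a Salesforce object.
        extract_node: {
            "action": "sfdcDigest",
            "parameters": {
                "object": object_name,
                "fields": [{"name": name} for name in field_names],
            },
        },
        # Register the extracted data as a dataset.
        f"Register_{dataset_alias}": {
            "action": "sfdcRegister",
            "parameters": {
                "alias": dataset_alias,
                "name": dataset_alias,
                "source": extract_node,
            },
        },
    }


def save_dataflow(definition, path):
    """Write the definition as JSON, explicitly encoded as UTF-8."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(definition, handle, indent=2, ensure_ascii=False)


definition = build_dataflow("Opportunity", ["Id", "Name", "Amount"], "Opportunities")
path = os.path.join(tempfile.gettempdir(), "example_dataflow.json")
save_dataflow(definition, path)
print(json.dumps(definition, indent=2))
```

Each top-level key is one node of the dataflow, and a node refers to the node it reads from by name, which is how the file expresses the dataflow logic.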

Step 3: Click Schedule to set up a schedule for the dataflow.

  1. After a dataflow job runs for the first time, it runs on a daily schedule by default.
  2. We can schedule the dataflow to run at hourly, weekly, or monthly intervals.
  3. We can also unschedule the dataflow.

Fig: Schedule Data flow

Step 4: Enter the day and hours in the Schedule Dataflow window, then click the Save button.


Fig: Schedule Data flow (Day and Hours)
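
To make the day-and-hour schedule concrete, here is a minimal Python sketch of how the next start time of a weekly schedule could be worked out. It is only a local illustration of the scheduling idea, not how Wave Analytics computes its schedule; the function name `next_weekly_run` and its parameters are assumptions for this example.

```python
from datetime import datetime, timedelta


def next_weekly_run(now, weekdays, hour):
    """Return the first scheduled start time strictly after `now`.

    weekdays: set of ints using datetime.weekday() numbering (Monday=0 ... Sunday=6).
    hour: hour of the day (0-23) at which the job should start.
    """
    if not weekdays:
        raise ValueError("at least one weekday is required")
    # Looking 0 to 7 days ahead always covers every weekday at least once,
    # including the same weekday next week when today's slot has passed.
    for offset in range(8):
        day = now + timedelta(days=offset)
        candidate = day.replace(hour=hour, minute=0, second=0, microsecond=0)
        if candidate.weekday() in weekdays and candidate > now:
            return candidate
    raise AssertionError("unreachable: a matching weekday always exists")


# Example: schedule on Mondays and Thursdays at 09:00.
# 2 January 2017 is a Monday; at 10:00 that day's slot has already passed.
print(next_weekly_run(datetime(2017, 1, 2, 10, 0), {0, 3}, 9))
```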

Step 5: Edit and run the replication in the source data replication window

We can edit an existing replication and run it at a specific time.


Fig: Edit and Run the Replication

Step 6: We can create more than one dataflow at a time.

  1. We can create multiple dataflows for different purposes and run them on their own schedules.
  2. We can break large dataflows into smaller ones to build datasets faster.
  3. We can delete unwanted dataflows, or disable them to temporarily stop them from running.

Fig:  Many Dataflow Replication
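
Breaking a large dataflow into smaller ones can be sketched as follows. This Python example assumes the dataflow definition is a JSON object keyed by node name, and that a node names its inputs through the `source`, `left`, `right`, or `sources` parameters. The function names and the sample nodes are hypothetical. It produces one smaller definition per `sfdcRegister` node, keeping only the nodes that dataset depends on.

```python
def upstream_nodes(definition, node_name):
    """Collect node_name plus every node it depends on, directly or indirectly."""
    needed, stack = set(), [node_name]
    while stack:
        current = stack.pop()
        if current in needed:
            continue
        needed.add(current)
        params = definition[current].get("parameters", {})
        # Assumption: inputs are referenced through these parameter keys.
        for key in ("source", "left", "right"):
            if key in params:
                stack.append(params[key])
        stack.extend(params.get("sources", []))
    return needed


def split_by_register(definition):
    """Return one smaller dataflow definition per sfdcRegister node."""
    result = {}
    for name, node in definition.items():
        if node.get("action") == "sfdcRegister":
            keep = upstream_nodes(definition, name)
            # Preserve the original node order within each smaller dataflow.
            result[name] = {n: definition[n] for n in definition if n in keep}
    return result


large_dataflow = {
    "Extract_Account": {
        "action": "sfdcDigest",
        "parameters": {"object": "Account", "fields": [{"name": "Id"}, {"name": "Name"}]},
    },
    "Extract_Opportunity": {
        "action": "sfdcDigest",
        "parameters": {"object": "Opportunity", "fields": [{"name": "Id"}, {"name": "AccountId"}]},
    },
    "Augment_Opportunity_Account": {
        "action": "augment",
        "parameters": {
            "left": "Extract_Opportunity", "left_key": ["AccountId"],
            "right": "Extract_Account", "right_key": ["Id"],
            "relationship": "Account", "right_select": ["Name"],
        },
    },
    "Register_Accounts": {
        "action": "sfdcRegister",
        "parameters": {"alias": "Accounts", "name": "Accounts", "source": "Extract_Account"},
    },
    "Register_Opportunities": {
        "action": "sfdcRegister",
        "parameters": {"alias": "Opportunities", "name": "Opportunities",
                       "source": "Augment_Opportunity_Account"},
    },
}

small_dataflows = split_by_register(large_dataflow)
for register_name, nodes in small_dataflows.items():
    print(register_name, "->", list(nodes))
```

In this sample the Account extract ends up in both smaller dataflows. That kind of repeated extraction is the work replication is meant to take off the dataflows, by pulling the data once on its own schedule.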

Step 7: Add filters in the Replication Settings window

  1. We can create replication filters that restrict the extracted rows by field value, narrowing down the results.
  2. We can disable replication for a field here too. Hover over the field column and click the cross option.
  3. Here, we can also add more fields and remove fields.

Fig:  Adding filters
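
The effect of these settings can be pictured with a small Python sketch. It does not call Wave Analytics; it only mimics, on in-memory sample rows, what a row filter and a set of disabled fields do to the replicated data. The function `apply_replication_settings` and the sample records are made up for this illustration.

```python
def apply_replication_settings(rows, keep_row, excluded_fields):
    """Keep only rows that pass the filter, and drop the excluded fields."""
    return [
        {field: value for field, value in row.items() if field not in excluded_fields}
        for row in rows
        if keep_row(row)
    ]


# Sample source rows (illustrative data only).
opportunities = [
    {"Id": "001", "StageName": "Closed Won", "Amount": 1000, "Description": "Renewal"},
    {"Id": "002", "StageName": "Prospecting", "Amount": 500, "Description": "New lead"},
    {"Id": "003", "StageName": "Closed Won", "Amount": 250, "Description": "Upsell"},
]

replicated = apply_replication_settings(
    opportunities,
    keep_row=lambda row: row["StageName"] == "Closed Won",  # the row filter
    excluded_fields={"Description"},                        # the disabled field
)
for row in replicated:
    print(row)
```

Fewer rows and fewer fields mean less data to extract and store, which is why filtering at the replication stage helps the downstream dataflows.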

Pros:

  1. Increased reliability and availability.
  2. Reading from replicated copies of data is typically faster.
  3. Less communication overhead.

Cons:

  1. More storage space is needed.
  2. Update operation is costly.
  3. Maintaining data integrity is complex.

Summary 

This article illustrates how to optimize your dataflows and schedule dataflow replication in Wave Analytics. It also shows that replication makes it easy to create multiple dataflows.

Reference Link 

https://releasenotes.docs.salesforce.com/en-us/winter17/release-notes/rn_bi_integrate_replication.htm
https://help.salesforce.com/articleView?id=bi_integrate_understand_enable_replication.htm&language=en_US&type=0
