
Data Cloud
Academy
Get Certified

Session 2 of 8
Data Ingestion
Poll Time

1. Did you attend the last session (S01 - Data Cloud Overview)?
First, some logistics
Questions, answers and videos

How do you ask a question?


• Please post your question only in the Q&A section of your
Zoom window.
How do you turn on Closed Captions?
• At the bottom of your screen, click on “Closed
Captions”
Will this be available as a recording after today?
• Yes, a recording of this event will be available on
demand

Bookmark -> Program Guide

sfdc.co/DCAcademyGuide
Forward Looking Statements
This presentation contains forward-looking statements about, among other things, trend analyses and statements regarding future events, future financial performance, anticipated growth, industry prospects,
environmental, social and governance goals, our strategies, expectation or plans regarding our investments, including strategic investments or acquisitions, our beliefs or expectations regarding our competition, our
intentions regarding use of future earnings or dividends, and the expected timing of product releases and enhancements. The achievement or success of the matters covered by such forward-looking statements
involves risks, uncertainties and assumptions. If any such risks or uncertainties materialize or if any of the assumptions prove incorrect, Salesforce’s results could differ materially from the results expressed or implied by
these forward-looking statements. The risks and uncertainties referred to above include those factors discussed in Salesforce’s reports filed from time to time with the Securities and Exchange Commission, including,
but not limited to: our ability to maintain security levels and service performance that meet the expectations of our customers, and the resources and costs required to avoid unanticipated downtime and prevent,
detect and remediate performance degradation and security breaches; the expenses associated with our data centers and third-party infrastructure providers; our ability to secure additional data center capacity; our
reliance on third-party hardware, software and platform providers; uncertainties regarding AI technologies and its integration into our product offerings; the effect of evolving domestic and foreign government
regulations, including those related to the provision of services on the Internet, those related to accessing the Internet, and those addressing data privacy, cross-border data transfers and import and export controls;
current and potential litigation involving us or our industry, including litigation involving acquired entities, and the resolution or settlement thereof; regulatory developments and regulatory investigations involving us or
affecting our industry; our ability to successfully introduce new services and product features, including any efforts to expand our services; the success of our strategy of acquiring or making investments in
complementary businesses, joint ventures, services, technologies and intellectual property rights; our ability to complete, on a timely basis or at all, announced transactions; our ability to realize the benefits from
acquisitions, strategic partnerships, joint ventures and investments, and successfully integrate acquired businesses and technologies; our ability to compete in the markets in which we participate; the success of our
business strategy and our plan to build our business, including our strategy to be a leading provider of enterprise cloud computing applications and platforms; our ability to execute our business plans; our ability to
continue to grow unearned revenue and remaining performance obligation; the pace of change and innovation in enterprise cloud computing services; the seasonal nature of our sales cycles; our ability to limit
customer attrition and costs related to those efforts; the success of our international expansion strategy; the demands on our personnel and infrastructure resulting from significant growth in our customer base and
operations, including as a result of acquisitions; our ability to preserve our workplace culture, including as a result of our decisions regarding our current and future office environments or remote work policies; our
dependency on the development and maintenance of the infrastructure of the Internet; our real estate and office facilities strategy and related costs and uncertainties; fluctuations in, and our ability to predict, our
operating results and cash flows; the variability in our results arising from the accounting for term license revenue products; the performance and fair value of our investments in complementary businesses through our
strategic investment portfolio; the impact of future gains or losses from our strategic investment portfolio, including gains or losses from overall market conditions that may affect the publicly traded companies within
our strategic investment portfolio; our ability to protect our intellectual property rights; our ability to maintain and enhance our brands; the impact of foreign currency exchange rate and interest rate fluctuations on our
results; the valuation of our deferred tax assets and the release of related valuation allowances; the potential availability of additional tax assets in the future; the impact of new accounting pronouncements and tax
laws; uncertainties affecting our ability to estimate our tax rate; uncertainties regarding our tax obligations in connection with potential jurisdictional transfers of intellectual property, including the tax rate, the timing of
transfers and the value of such transferred intellectual property; uncertainties regarding the effect of general economic, business and market conditions, including inflationary pressures, general economic downturn or
recession, market volatility, increasing interest rates, changes in monetary policy and the prospect of a shutdown of the U.S. federal government; the potential impact of financial institution instability; the impact of
geopolitical events, including the ongoing armed conflict in Europe; uncertainties regarding the impact of expensing stock options and other equity awards; the sufficiency of our capital resources; our ability to execute
our share repurchase program; our ability to comply with our debt covenants and lease obligations; the impact of climate change, natural disasters and actual or threatened public health emergencies; expected
benefits of and timing of completion of the restructuring plan and the expected costs and charges of the restructuring plan, including, among other things, the risk that the restructuring costs and charges may be
greater than we anticipate, our restructuring efforts may adversely affect our internal programs and ability to recruit and retain skilled and motivated personnel, our restructuring efforts may be distracting to employees
and management, our restructuring efforts may negatively impact our business operations and reputation with or ability to serve customers, and our restructuring efforts may not generate their intended benefits to the
extent or as quickly as anticipated; and our ability to achieve our aspirations, goals and projections related to our environmental, social and governance initiatives, including our ability to comply with emerging
corporate responsibility regulations.

September 8, 2023
Today’s Agenda

Data Lifecycle & Ingestion

Ingestion Methods

Configuring Data Streams

Next Steps

Q&A
Your Salesforce Team

Deepthi Kamath - Partner Practice Development
Durgesh Dhoot - Partner Practice Development
Gaj Sisodia - Partner Product Success
Zafar Mohammad - Partner Practice Development
Quick Recap into Session 1
Recap - Data Cloud Overview
Data Cloud Overview:
A trusted hyperscale data engine inside Salesforce

Partner Opportunities:
Use Case driven approach to understand customer challenges

Data Cloud Credential:


Target persona, Exam outline, General tips

Solution overview:
Trust and Data Ethics, Right to be Forgotten through the Consent API,
what Data Cloud does and does not do, examples of industry use cases
Recap - Setup and Administration
Data Cloud Provisioning
● Home Org and Existing org
● Data Cloud Permission Sets (Data Cloud Admin and Data Cloud Marketing
Admin)
● Object Permissions
● Creating Profiles & adding users

Data Cloud Topology


● Data Cloud OOTB connectors for Marketing Cloud, CRM, B2C Commerce, cloud
storage, APIs & SDKs
● Data Spaces
Trailhead & Partner Learning Camp
Walkthrough
Recap of Homework - Trailhead

1. Verify access to the links below ✅
● Data Cloud Consultant Certification: http://sfdc.co/DCCert
● Data Cloud Consultant Exam Guide: http://sfdc.co/DCCertGuide

2. Start with Prepare for your Salesforce Data Cloud Consultant Credential ✅
● Salesforce Data Cloud Consultant Credential Trailmix

3. Complete ✅
● Data Cloud: Solution Overview
● Data Cloud: Setup and Administration

Bookmark -> Program Guide
sfdc.co/DCAcademyGuide
Recap of Homework - Partner Learning Camp

➔ Verify access to ✅
● Partner Community
● PLC (Partner Learning Camp)

➔ Enroll for ✅
● Data Cloud: Practical Experience Course: http://sfdc.co/DCPractical

➔ Register for Free S3 Trial Account ✅ (Trial Account)

➔ Join the #help-dc-academy-april2024 Slack channel ✅

➔ Request a Data Cloud Trial org* ✅ (sfdc.co/dctrialorgroe; *trial orgs are only available for 30 days)

➔ Complete ✅
● Activity: Partner Pocket Guide Data Cloud
● Activity: Join Collaboration Channels
● Activity: Request a Data Cloud Trial Org
● Activity: Review the Data Cloud Practical Experience FAQ
● Activity: Set Up Your Instance

Bookmark -> Program Guide
sfdc.co/DCAcademyGuide

NOTE: If you are not a Salesforce Partner, you can sign up for a free, 5-day Developer Edition org with Data Cloud
Frequently Asked Questions
I can’t access the PLC! What do I do?
● Review & follow the troubleshooting post in Slack
● Contact [email protected] if still blocked
How do I request a Trial org?
● Log in to Partner Learning Camp, browse to the demo org
section, and request a DCO org
● Wait ~24 hours for the org to be provisioned
I have a specific technical question!
● Ask in the Q&A quip or the #help-dc-academy-april2024
Slack channel
Can I review the recordings or deck?
● Recordings will be posted in the quip linked below

Bookmark -> Program Guide

sfdc.co/DCAcademyGuide
Data Ingestion
The Big Picture: Implementation Themes
Related to the components of Data Cloud

Provisioning
Provision and set up the Data Cloud instance, users and permissions; configure integrations to source/target systems, etc.

Data Preparation
● Data Ingestion: Set up data streams bringing data into Data Cloud from various supported sources, applying necessary transformations.
● Data Mapping (Harmonization): Map ingested data into the Customer 360 data model, making it available for unification, segmentation and activation.
● Identity Resolution (Unification): Configure rules for individual matching across sources of data; establish preferences for unified attribute reconciliation rules.

Data Consumption
● Insights & Analytics: Derive insights from your mapped data; explore and visualize it in analytical and business intelligence tools.
● Segmentation: Turn mapped data into useful audiences or segments, to understand, target or analyze customers at the unified level.
● Activation: Materialize created segments and publish to relevant activation/engagement platforms. Trigger relevant business processes based on data points identified within Data Cloud. Consume and expose data in relevant user experiences within other systems.
Let’s walk through how this works
A “day in the life” of customer data

Data Sources:
● Cloud Storage: Amazon S3, Google Cloud, Microsoft Azure
● Zero-Copy Federation: Snowflake, Google BigQuery
● Mobile & Web
● APIs & SDKs
● Legacy Systems

Connect & Prepare:
● Out-of-the-Box Connectors
● MuleSoft Anypoint Platform
● Data Bundles
● Streaming & Batch Data Ingestion
● Streaming & Batch Data Transforms

Harmonize (Customer 360):
● Data Spaces
● Data Models
● Data Mapping
● Customer Graph
● Identity Resolution

Act (Customer 360):
● Einstein / Einstein Studio
● Segmentation
● Calculated Insights
● Automations
● Analytics
● Third Party
Let’s walk through how this works
A “day in the life” of customer data

Data Sources:
● Cloud Storage: Amazon S3, Google Cloud, Microsoft Azure
● Zero-Copy Federation: Snowflake, Google BigQuery
● Mobile & Web
● APIs & SDKs
● Legacy Systems

● What data would you load?
● What connectors do you need?
● How much data would you load?
Hyperscale Data Store
The Data Lakehouse combines the best of the Data Lake and Data Warehouse worlds: Data Lake + Data Warehouse = Data Lakehouse

Type of data
● Data Lake: semi-structured and unstructured data
● Data Warehouse: structured data
● Data Lakehouse: structured, semi-structured, and unstructured data

Purpose
● Data Lake: applicable for machine learning and artificial intelligence tasks
● Data Warehouse: best for data analytics and BI, but limited to particular problem-solving
● Data Lakehouse: flexible storage; can be used for research, data analytics and ML

ACID compliance
● Data Lake: non-ACID compliant; data integrity issues
● Data Warehouse: ACID-compliant; ensures the integrity of data
● Data Lakehouse: ACID-compliant; ensures consistency of data read and written by multiple sources

Cost of storage
● Data Lake: cost-effective, fast, flexible
● Data Warehouse: expensive, time-consuming
● Data Lakehouse: cost-effective, easy, allows for a lot of flexibility, reduced data duplication
Data Ingestion Flow Overview

● Data Source Systems: where data resides (e.g. CRM, SFMC, etc.)
● Data Streams: an entity that can be extracted from a variety of Data Source Systems
● Data Source Objects (DSOs): the original, ingested data
● Formulas: ways to perform minor adjustments at the time of ingestion
● Data Lake Objects (DLOs): source data hydrated with transformations
● Bulk / Streaming Transformation: ways to perform major joins / filters / transformations on DLOs
● Data Model Objects (DMOs): either materialized or views on top of the Data Lake Objects
● Data Spaces: partitions of your prepared data and its utilized components
Data Cloud Objects

Data Source Object
An object of data ingested as-is from the original data source. This is the original file format (e.g. CSV) or transient data storage in the case of built-in connectors (e.g. Marketing Cloud).

Data Lake Object
When a data stream is created, Data Cloud automatically creates a locally stored Data Lake Object (DLO) with the same name. The DLO includes transforms and system source IDs in addition to the raw source data.

Data Model Object
A harmonized grouping of data created from data streams, insights, and other data sources. Data must first be mapped from a DLO to a DMO for use in segmentation, activation, and analytics.
Data Cloud Objects Contd.
Data progresses from bronze -> silver -> gold through data transformations.

Data is logically organized as 4 parts:

Data Source Objects (DSO): the original data sources. This is the customer’s original file format (e.g. CSV) or transient data storage in the case of built-in connectors (e.g. Marketing Cloud).
● Multi-format (JSON, CSV, Parquet, ORC)
● Multi-sourced: cloud storage, MuleSoft, Kafka
● Schema preserving
● Salesforce data comes directly into Data Lake Objects

Data Lake Objects (DLO): the data that is transformed and actually stored in the lake, generally as Parquet files.
● Schema enforced
● Parquet-formatted Iceberg tables
● Hydrated by transformations
● Typed (Profile vs. Engagement)
● Materialized tables
● Virtual BYOL tables

Data Model Objects (DMO): either materialized or views on top of the Data Lake Objects. These can be Customer 360 DMOs or materialized ones such as Unified Individual, Calculated Insights, transformations, etc.
● Semantic mapping establishes DLO to DMO
● Can optionally be materialized
● Insights and Unified Profiles are DMOs
● Simplified, curated data to power business applications

Data Spaces: once your data has been ingested, it is assigned to a Data Space that acts as a partition, allowing you greater control over how your data is organized.
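To make the DSO -> DLO -> DMO progression concrete, here is a minimal Python sketch of the three record shapes. This is purely conceptual (not Data Cloud internals), and every object and field name is an illustrative assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# DSO: the raw record exactly as it arrived from the source (e.g. one CSV row).
dso_record = {"id": "42", "email": "PAT@EXAMPLE.COM", "signup": "2024-02-01"}

# DLO: schema-enforced copy of the source data; types are applied and a
# system source ID is carried alongside the raw fields.
@dataclass
class DloCustomer:
    id: str
    email: str
    signup: datetime
    data_source_id: str  # system field added during ingestion

dlo_record = DloCustomer(
    id=dso_record["id"],
    email=dso_record["email"],
    signup=datetime(2024, 2, 1, tzinfo=timezone.utc),
    data_source_id="ECommerce_S3",  # hypothetical source name
)

# DMO: harmonized view; DLO fields are mapped onto Customer 360-style
# names (hypothetical here) for segmentation, activation, and analytics.
dmo_individual = {
    "Individual.Id": dlo_record.id,
    "ContactPointEmail.EmailAddress": dlo_record.email.lower(),
    "Individual.CreatedDate": dlo_record.signup,
}
print(dmo_individual)
```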
Data Streams & Sources
Ingestion - Marketing Cloud
What is it?
• Native integration with Marketing Cloud to bring any MC data into
Data Cloud
• Ingest data from any data extension in MC in a few clicks, plus any
channel-related data like Opens, Clicks, Bounces, etc.

Unique Value Proposition


• Seamless Native Integration with Salesforce Proprietary APIs
• Time To Value - Pre-Built Bundles
• All Engagement Data from MC available in CDP instantly.
• Clicks Not Code
• Ingest Data at High Scale & Velocity

Example Use Cases


Ingest Email Open, Click data to identify top engagers for segmentation
Ingest Einstein scores for AI Based Segmentation
Surface Marketing Insights to CRM Agents
Ingestion - Salesforce CRM
What is it?
• Ingest any data from CRM in a few clicks across Sales, Service, Loyalty
& any CRM Object.
• Pre-Made Bundles mapped to CDP Data Model
• Packaging capabilities to create industry specific/ISV bundles

Unique Value Proposition


• Seamless Native Integration with Salesforce Proprietary APIs
• Time To Value - Pre-Built Bundles
• All data from CRM accessible at big-data scale - competitors offer
integrations that only allow a subset of data to be ingested.
• Clicks Not Code

Example Use Cases


Ingest Accounts, Case, Leads, Loyalty data for segmentation.
Single repository of Data across CRM & Other sources for BI
Ingestion - B2C Commerce Cloud
What is it?
• Ingest Commerce Cloud Order Data and Related Customer and
Catalog Data with OOTB Connector

Unique Value Proposition


• This is a unique capability offered only between Salesforce CDP and
Commerce Cloud
• Time To Value - Pre-Built Bundles
• Clicks Not Code

Example Use Cases


● Unify Online data from Commerce with Offline data coming from
other sources to understand lifetime value of the customer.
● Leverage Order Data to create affinities based on previous purchasing
patterns within CDP
Ingestion - Web SDK
What is it?
• SDK/Tag to capture real-time customer events from the brand’s
website

Unique Value Proposition


• Unified SDK with Personalization allows Data Collection and
Actionability using same tag

Example Use Cases


● Collect Real-Time Web Behaviour - Views, Clicks, Add to Cart, Form
Submission, Watch Video, etc.
● Trigger actions based on real-time behavior on any channel - Email,
SMS, Push, Sales/Service Events, External Webhooks, Slack Message,
Stream to Warehouse & more
● Leverage Web Data for other use cases for Insights, Identity
Resolution, Segmentation, Activation, Business Intelligence,
Personalization
Ingestion - Mobile SDK
What is it?
• Mobile SDK to capture all mobile transactions, behaviors, and other
events.

Unique Value Proposition


• Unified SDK with Marketing Cloud allows same SDK to capture
mobile events as well as trigger personalized push, in-app
messages and more
• Fully Integrated with Journey Builder to trigger omni-channel
journeys

Example Use Cases


● Collect Real-Time Mobile Behaviour - Views, Clicks, Add to Cart,
Form Submission, Watch Video, etc.
● Trigger actions based on real-time behavior on any channel -
Email, SMS, Push, Sales/Service Events, External Webhooks, Slack
Message, Stream to Warehouse & more
● Leverage Mobile Data for other use cases: Insights, Identity
Resolution, Segmentation, Activation, Business Intelligence,
Personalization
Ingestion - APIs
What is it?
• Streaming and Bulk APIs
• Send data from any application to CDP

Unique Value Proposition


• Easily Configurable Schema
• Designed for High Scale, High Velocity
• Packaging support for reusability

Example Use Cases


● Ingest Real-Time POS data from Store
● Ingest Weather Updates
● Ingest Loyalty Data
● Ingest External Data Sources from any system
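As a client-side sketch of the streaming pattern, the Python snippet below POSTs a small batch of records to an Ingestion API endpoint with requests. The tenant endpoint, connector name, object name, and payload fields are hypothetical placeholders, and obtaining the OAuth token is out of scope; verify the exact endpoint contract and payload schema against the Ingestion API documentation for your connector.

```python
import requests

# Assumptions (replace with your org's values): tenant-specific endpoint,
# a configured Ingestion API connector, and a pre-obtained OAuth token.
TENANT_ENDPOINT = "https://<your-tenant>.example.salesforce.com"  # hypothetical
SOURCE_API_NAME = "pos_events"    # hypothetical connector API name
OBJECT_NAME = "transactions"      # hypothetical object defined in its schema

def stream_records(records, access_token):
    """Send a small batch of records to the streaming ingestion endpoint."""
    url = f"{TENANT_ENDPOINT}/api/v1/ingest/sources/{SOURCE_API_NAME}/{OBJECT_NAME}"
    resp = requests.post(
        url,
        json={"data": records},  # payload shape assumed from the connector schema
        headers={"Authorization": f"Bearer {access_token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.status_code

if __name__ == "__main__":
    stream_records(
        [{"transaction_id": "T-1001", "store_id": "S-07", "amount": 19.99}],
        access_token="<token>",
    )
```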
Ingestion - Personalization
What is it?
• Native integration to ingest data from Interaction Studio.
• Integrate multiple datasets to get a global view across brands
and regions.
• All types of events are ingested in a few clicks.
• Ingest anonymous and known data.

Unique Value Proposition


• Faster time to value with customers already using Personalization
• Native Connector with Personalization Engine to streamline Data
collection and Personalization.

Example Use Cases


● Build affinities within CDP using Calculated Insights based on raw
data from Personalization
● Create a superset of data to understand/segment on/report on
customer lifetime value, affinities across business units and
datasets for personalization.
Ingestion - Amazon S3
What is it?
• Ingest data from any system via S3 bucket
• Import data stored on public cloud seamlessly

Unique Value Proposition


• UI Driven Experience
• Clicks not Code
• Automatic delimiter, data type, and date time pattern detection
• Ability to transform incoming data easily
• Wildcard match to accommodate date-stamped or otherwise
changing file names
• High water mark tracking to allow only reading new files
• Customized scheduler (hourly, weekly, monthly)
• Compressed with Zip and GZ compression standards

Example Use Cases


● Ingest any and all external data sources in bulk.
● Ingest data available in customer’s data lake or other services on
AWS in a few clicks.
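To see how the file-side features above fit together (date-stamped names for wildcard matching, GZ compression), here is a minimal boto3 sketch of the producer side. The bucket, prefix, and field names are hypothetical; the Data Cloud side (wildcard pattern, scheduler, high-water mark) is configured in the data stream UI, not in code.

```python
import csv
import gzip
import io
from datetime import date

import boto3  # pip install boto3

BUCKET = "acme-datacloud-landing"  # hypothetical bucket
PREFIX = "orders/"                 # folder the data stream points at

def stage_daily_extract(rows):
    """Write rows as a gzipped CSV whose date-stamped name suits a
    wildcard data stream pattern like orders/orders_*.csv.gz."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        text = io.TextIOWrapper(gz, encoding="utf-8", newline="")
        writer = csv.DictWriter(text, fieldnames=["order_id", "email", "amount"])
        writer.writeheader()
        writer.writerows(rows)
        text.flush()
        text.detach()  # keep gz usable for its own clean close
    key = f"{PREFIX}orders_{date.today():%Y%m%d}.csv.gz"
    boto3.client("s3").put_object(Bucket=BUCKET, Key=key, Body=buf.getvalue())
    return key

stage_daily_extract([{"order_id": "O-1", "email": "pat@example.com", "amount": "19.99"}])
```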
Ingestion - Google Cloud Storage
What is it?
• Ingest data from any system via Google Cloud Storage.
• Import data stored on public cloud seamlessly.

Unique Value Proposition


• UI-Driven Experience
• Automatic delimiter, data type, and date time pattern detection
• Ability to transform incoming data easily
• Wildcard match to accommodate date-stamped or otherwise
changing file names
• High water mark tracking to allow only reading new files
• Customized scheduler (hourly, weekly, monthly)

Example Use Cases


● Ingest any and all external data sources in bulk.
● Ingest data available in customer’s data lake or other services on
GCP in a few clicks.
● Ingest Google Analytics data of your choice via the
GA -> BigQuery -> GCS ingestion route.
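The GCS side mirrors the S3 sketch above. A short google-cloud-storage example of staging a date-stamped extract (for instance, a file exported from BigQuery on the GA route); the bucket and file names are hypothetical.

```python
from datetime import date

from google.cloud import storage  # pip install google-cloud-storage

BUCKET = "acme-datacloud-landing-gcs"  # hypothetical bucket

def stage_extract(local_path: str) -> str:
    """Upload a local CSV with a date-stamped name that a wildcard
    data stream pattern such as ga_export_*.csv can match."""
    blob_name = f"ga_export_{date.today():%Y%m%d}.csv"
    client = storage.Client()  # uses application default credentials
    client.bucket(BUCKET).blob(blob_name).upload_from_filename(local_path)
    return blob_name
```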
Ingestion - MuleSoft
What is it?
• Native Integration with Mulesoft to ingest data using Streaming
and Bulk APIs

Unique Value Proposition


• MuleSoft opens an ecosystem of 250+ OOTB native connectors
• If you need a connector, chances are MuleSoft already has one.
• API First strategy for data integration and complex use cases
• Reduce Time to build custom integrations
• IT agility
• Accelerator Patterns focused on time to value

Example Use Cases


● Ingest data from legacy systems
● Ingest data from external systems like POS, OMS, Snowflake, Azure
and other connectors for which CDP does not have an OOTB approach
Data Ingestion Timings (EXAM TIP)

Connector | Lookback Window | Data Delivery | Latency | Refresh Mode
Marketing Cloud | 90 days | Batch | Hourly - 24 hours | Upsert or Full Refresh
Salesforce CRM | No limit | Batch | Hourly (Upsert); Bi-weekly (Full Refresh) | Upsert or Full Refresh
Cloud File Storage (S3, GCS, Azure) | None | Batch | Hourly | Upsert or Full Refresh
B2C Commerce | 30 days | Batch | Sales Order and Customer - Hourly; Others - Daily | Sales Order - Upsert; All others - Full Refresh
Marketing Cloud Personalization | 0 days | Near Real Time | Profile - 15 minutes; Events/Engagement - 2 minutes | Users - Upsert; All others - Insert
Ingestion API (Batch and Streaming) | - | Near Real Time | 15 minutes | Upsert
Web and Mobile SDK | - | Near Real Time | User Profiles - Hourly; Engagement - 15 minutes | -
Mulesoft (using Ingestion API) | - | Near Real Time | 15 minutes | -
Data Object Type Categories (EXAM TIP)

Profile
Segment-oriented data set. A data set which contains any population you wish to segment by, or use as the starting population for a segment.
Examples: Party Identification, Contact Point Email, Contact Point Mobile, Profile Attributes

Engagement
Time-series-oriented data set. An Event Time field must be defined as part of set-up. The date field chosen for Event Time should be immutable, otherwise records will be duplicated.
Examples: Order, Case, Visit, App Download

Other
Data sets which are related to Profile or Engagement data, and time-series data sets which do not have an immutable date field.
Examples: Catalog, Lookups, Price Books

Important
You cannot change the category after saving the data stream.
Data Field Types (EXAM TIP)

Text
Stores any kind of text data. It can contain both single-byte and multibyte characters that the locale supports. Zero-length strings (“”) and no value are treated as empty strings.

Number
Stores numbers with a fixed scale of 18 and precision of 38. Scale represents the number of fractional digits. Precision represents the count of digits, regardless of the location of the decimal point.

Date
Holds the calendar date without a time part or time zone. If the incoming data record includes a time part for a field configured as type date, the time part is ignored.

DateTime
Stores an instant in time expressed as a calendar date and time of day. A valid datetime must include the time part and time zone (following the ISO-8601 standard). If the time part and time zone are not included, it is inferred as 00:00:00 UTC.

See the full list of expressions by field type.
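The Date vs. DateTime rules above can be illustrated with a short sketch. This is plain Python mimicking the stated behavior, not Data Cloud’s actual parser.

```python
from datetime import datetime, timezone

def as_date(value: str):
    """Date field: keep the calendar date; any time part is ignored."""
    return datetime.fromisoformat(value).date()

def as_datetime(value: str):
    """DateTime field: a missing time part / time zone is inferred as
    00:00:00 UTC, per the rule above."""
    dt = datetime.fromisoformat(value)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)  # assume UTC when absent
    return dt

print(as_date("2024-02-01T09:30:00"))            # 2024-02-01 (time ignored)
print(as_datetime("2024-02-01"))                 # 2024-02-01 00:00:00+00:00
print(as_datetime("2024-02-01T09:30:00+05:30"))  # offset preserved
```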


Data Field Types (added February 2024)

Email
Stores email addresses. The email data type is modeled on the text data type. You can use any valid text value for ingestion into an email data type field. Data Cloud doesn’t validate the format.

Percent
Holds percentage values. The percent data type is modeled on the number data type. Only valid numeric values are accepted for ingestion into a percent data type field.

Phone
Stores phone numbers. Data Cloud doesn’t validate the format of the phone number. The phone data type is modeled on the text data type. Any valid text value is accepted for ingestion into a phone data type field.

URL
Stores URL values. Data Cloud doesn’t parse or interpret the ingested URL value. It also doesn’t validate the value for correctness. Data Cloud doesn’t store any metadata related to the URL.

See the full list of expressions by field type.

Formulas
Leverage formula fields to enhance or enrich source data for mapping.

Primary Keys
Create needed primary keys for mapping. Consider functions like CONCAT().

Set Picklist-type Values
Create fields that bucket values to simplify segmentation. Consider functions like IF(), AND(), or NOT().

Standardization
Ensure consistent data formatting for activation. Consider functions like PROPER() or REPLACE().

Transform Data Sources with Formula Fields
Supplemental fields can be hard-coded or derived from other fields in the data stream.
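A few illustrative expressions along those lines are sketched below. The function names come from this slide; the sourceField['...'] field-reference syntax and the field names are our assumptions, so verify the exact form against the Data Cloud formula expression library.

```
// Primary key: composite key for mapping (hypothetical source fields)
CONCAT(sourceField['AccountId'], '-', sourceField['Email'])

// Picklist-type bucketing to simplify segmentation
IF(sourceField['LifetimeSpend'] >= 1000, 'High Value', 'Standard')

// Standardization for consistent activation formatting
PROPER(sourceField['FirstName'])
REPLACE(sourceField['Phone'], '-', '')
```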
Streaming Data Transforms
Transform data in near real-time

A streaming data transform reads one record in a source data lake object, reshapes the record data, and writes one or more
records to a target data lake object.
The source and target objects must be different objects.
A streaming data transform runs continuously as a streaming process, picking up new or changed data.

Use Case: Normalize data with UNION
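Conceptually, a streaming transform behaves like a record-at-a-time function between DLOs. Here is a minimal Python analogy of the UNION use case, normalizing two differently shaped source DLOs into one target DLO; all names are illustrative, and real streaming transforms are defined inside Data Cloud, not in client code.

```python
def normalize_web(record):
    """Reshape a web-events DLO record into the unified target shape."""
    return {"individual_id": record["cookie_id"],
            "event": record["action"],
            "channel": "web"}

def normalize_mobile(record):
    """Reshape a mobile-events DLO record into the same target shape."""
    return {"individual_id": record["device_id"],
            "event": record["event_name"],
            "channel": "mobile"}

def union_stream(web_events, mobile_events):
    """UNION: one target DLO fed from two sources, one record in,
    one (or more) records out; in reality this runs continuously."""
    for rec in web_events:
        yield normalize_web(rec)
    for rec in mobile_events:
        yield normalize_mobile(rec)

target_dlo = list(union_stream(
    [{"cookie_id": "c1", "action": "AddToCart"}],
    [{"device_id": "d9", "event_name": "OpenApp"}],
))
print(target_dlo)
```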
Batch Data Transforms
Use a batch data transform to create a repeatable series of operations to transform your data and
load it into a target for further usage.
● Does a full refresh of data
● Can use multiple source objects
● Can be used with DLOs or DMOs
Batch Transforms
● Does a full refresh
● Repeatable process; can be scheduled or triggered manually
● Works with DLOs or DMOs as source objects
● Does not replace Calculated Insights

Streaming Transforms
● Acts on one row of data at a time
● Transforms data as it’s ingested
● Works only with DLOs
● Does not replace Calculated Insights
Recap: Ingestion
Steps for Configuring Data Streams

1. Select Data Source: choose a previously connected bundle, or authenticate a new data source (cloud storage).
2. Select Data Source Object (Dataset): choose the starting object, or specify a filename.
3. Define Data Stream Properties: name or label, developer name; select the object and data category.
4. Confirm Data Source Object Schema: verify fields and data types; set the primary key.
5. Apply Transforms & Data Space: optionally add formula fields to cleanse your source data or derive new fields, and assign to a data space.
6. Configure Updates to Data Source Object: configure the refresh mode and set the schedule.

Note: Data spaces are not currently mentioned in PLC.
Let’s See Data Ingestion in Action
Knowledge Check

1. Data Cloud is _____________

a. Data Lake
b. Data Lakehouse
c. Data Warehouse
d. CRM
Knowledge Check - Answer

1. Data Cloud is _____________

a. Data Lake
b. Data Lakehouse ✅
c. Data Warehouse
d. CRM
Knowledge Check

2. True or False: The starter data bundle can be ingested multiple times.

a. True
b. False
Knowledge Check - Answer

2. True or False: The starter data bundle can be ingested multiple times.

a. True
b. False ✅
Knowledge Check

3. Which type of data extraction setting is best for when the data changes frequently, but
the majority of the data stays the same?

a. Delta Extract by Number


b. Upsert
c. Delta Extract by Date
d. Full Refresh
Knowledge Check - Answer

3. Which type of data extraction setting is best for when the data changes frequently, but
the majority of the data stays the same?

a. Delta Extract by Number


b. Upsert ✅
c. Delta Extract by Date
d. Full Refresh
Knowledge Check

4. Which Data Object Type Category will you use to ingest Purchase Order Details?

a. Profile
b. Sales Order
c. Engagement
d. Other
Knowledge Check - Answer

4. Which Data Object Type Category will you use to ingest Purchase Order Details?

a. Profile
b. Sales Order
c. Engagement ✅
d. Other
Your Next Activity
Tip: Do not rush through the hands-on activity without completing the Trails.
Step 1: Trailhead Modules

Prepare to Build Your Data Model
● Define key terms related to data ingestion and modeling.
● Identify the benefits of using Data Cloud.

Review Data Ingestion and Modeling Phases
● Examine how data is ingested into Data Cloud.
● Configure key qualifiers to help interpret ingested data.
● Apply basic data modeling concepts to your account.

Create Data Streams
● Bring standard and custom data sources into Data Cloud.
● Create a data stream in Marketing Cloud.
● Manipulate your data to better suit your marketing tasks.
● Learn from formula expression use cases.

Important Note: Follow along with these exercises, but wait until the hands-on exercise instructions before configuring your Trial Org account!
Step 2: Trailhead Rise Article

DATA CLOUD INGESTION: Data Streams
● Select Data Source
● Select Data Source Objects
● Define Data Stream Properties
● Confirm the Data Source Object Schema
● Apply necessary Row-level Transformations
● Configure updates to Data Source Object

EXPLORE SALESFORCE CRM DATA INGESTION
● Starter Bundle
● Direct Object Ingestion
● Data Kits for CRM Data Streams

Important Note: Follow along with these exercises, but wait until the hands-on exercise instructions before configuring your Trial Org account!
Step 3: Partner Learning Camp

DATA CLOUD PRACTICAL EXPERIENCE
● Activity: Set Up Your Instance
● Activity: Prepare Your Data
● Activity: Configure Data Ingestion
● Activity: Configure Batch Data Transforms

Important Note: Follow along with these exercises, but wait until the hands-on exercise instructions before configuring your Trial Org account!
Hands-on Exercise Use Case

Scenario
RAV Group is a company that has several lines of business operating under one main brand. They use the following products:
● Salesforce Marketing Cloud
● Salesforce CRM: contains data from the vehicle rental business
● eCommerce platform: with transactions from their retail brand selling sports equipment

Requirements
The RAV Group wants to combine data from two systems: Salesforce CRM, which contains data from their vehicle rental business, and an eCommerce platform with transactions from their retail brand selling sports equipment. These brands operate as independent entities from the customer experience perspective. After the data is brought into the Data Cloud platform, the RAV Group wants to identify and merge records of the people that exist in both systems.

The combined audience should enable users from either of the brands to create segments of customers with certain characteristics that are sourced from either of the systems. These segments can then be consumed for further actions, such as analytics, retargeting on the eCommerce site, activating segments in Marketing Cloud, and creating a segment that can be used in a campaign in Marketing Cloud. Create other segments and calculated insights for analysis.

Solution
● Use an independent Data Cloud org that will ingest data from multiple data sources and, post segmentation, activate audiences into relevant target systems.
● Ingest data from all data sources.
● Extend the standard data model.
● Perform harmonization and unification of the data.
● Create required insights and segments from unified profiles.
● Activate segments to be consumed by a target system.
Let’s walk through how this works
A “day in the life” of RAV Group

Data Sources:
● Salesforce CRM: contains data from the vehicle rental business
● Amazon S3: transactions from their retail brand selling sports equipment

Connect & Prepare:
● Out-of-the-Box Connectors
● Data Bundles
● Batch Data Ingestion

Harmonize (Customer 360):
● Data Spaces
● Data Models
● Data Mapping
● Identity Resolution

Act:
● Segmentation
● Calculated Insights
● Automations
● Analytics
● Third Party
Data Ingestion Exercise Goals

● Install a data bundle to assist with ingesting data from a Salesforce org.
● Connect Data Cloud to a Salesforce org for data ingestion.
● Create data streams to ingest data from Salesforce orgs.
● Create data streams to ingest data from Amazon S3.
● Learn how to transform data during ingestion to cleanse or add business logic to source data.
Activity: Prepare Your Data
Salesforce CRM Data

Common Mistake
Many consultants skip the “Set Audit Fields” step in the Setup exercise; this will cause your installation to fail. Make sure to set this field!

Found in Setup: go to User Interface -> User Interface.
Explore Salesforce CRM Data Ingestion
Bundles: quickly integrate common data sets from multiple sources

● Native Connectors
● Data Sources
● Pre-configured Data Model Mappings
● Unified Data Model
Activity: Configure Data Ingestion
CRM Starter Data Bundle

Expect a roadblock!
We have designed this exercise to show you what happens when
the integration user has insufficient permissions on the object you
want to ingest.
Activity: Prepare Your Data
Amazon S3 Data Sources

Common Mistake
Many consultants skip the guide where we
configure the Amazon S3 bucket, its
Access Policy, and the User we will use to
connect. Without this step complete, you
won’t be able to connect to your S3
Bucket!

S3 Guide here:
https://salesforce.quip.com/Ge0zAXFPcYLE
Activity: Configure Data Ingestion
Applying Data Transformations

Common Mistakes
Assign a Data Space: with the addition of Data
Spaces, make sure to assign this new DLO to your
default data space!
Reference your Org Id: Replace the placeholder
ORGID fields with your actual Data Cloud Org Id.

[Diagram: a denormalized record containing two data points is split by a data transform into Data Point 1 and Data Point 2]
Activity: Configure Data Ingestion
Ingest Data from S3 Bucket

Common Mistakes

● Make sure your S3 user has the appropriate permissions to read the files in your bucket.
● Set your refresh rate to none - this data will not change, and daily refreshes may impact your free S3 account.
● Validate your data types! You cannot map two different types of data to the same field in a DMO.
● Do not initially refresh the Product SKU Lookup and Related Activity Lookup data sources! We will make more configuration changes before these should be refreshed.
Ingestion - Exam Tips
Salesforce Certified Data Cloud Consultant

Data Cloud Consultant: Total Questions: 60 | Allotted Time: 105 min | Passing Score: 62%

Exam Outline
● Solution Overview: 18%
● Setup & Administration: 12%
● Data Ingestion & Modeling: 20%
● Identity Resolution: 14%
● Segmentation and Insights: 18%
● Activation: 18%

Test takers are strongly advised to complete the Data Cloud Partner Learning Camp Curriculum before attempting the exam.
Salesforce Certified Data Cloud Consultant
Data Cloud Consultant: Total Questions: 60 | Allotted Time: 105 min | Passing Score: 62%

Study Tips for Ingestion

Things that require special attention - study these concepts!


● Key Terminology: Especially DSO, DLO, DMO
● Data Spaces: This is a new feature launched since the PLC was last revised
● Native Connectors: Understand what’s available & review hard limits
● Cloud Storage: Understand how to ingest data from Cloud Storage & how S3 differs from
Azure, GCS
● Bundles: Understand bundle offerings & objects ingested - watch this video on SFMC bundles
● Schedules & Modes: Understand ingestion timings & the differences between a full refresh and an upsert
● Data Categories: Understand the three types & benefits
● Formulas & Transformations: Understand when & how they’re applied & limitations
● Field Types: Understand what’s available (especially the difference between Date & DateTime)

Salesforce recommends a combination of hands-on experience, training courses in Partner Learning


Camp, and self-study in the areas listed in the Exam Outline section of this exam guide
Salesforce Certified Data Cloud Consultant
Data Cloud Consultant: Total Questions: 60 | Allotted Time: 105 min | Passing Score: 62%

Study Tips for Ingestion

Ingestion considerations - remember these concepts!

● Incorrect policy settings are the most common issue when facing errors with S3 ingestion.
● Understand bundle offerings & the objects ingested.
● Understand ingestion timings & the differences between a full refresh and an upsert.
● Understand what field types are available, particularly the difference between Date & DateTime.
● Carefully select the data category (Profile, Engagement, or Other); you cannot change it after saving the data stream.
● Understand when & how formulas and transformations are applied, and their limitations.
● Make sure to assign all new DLOs to the correct data space (the default must be assigned if there is only one).

Salesforce recommends a combination of hands-on experience, training courses in Partner Learning Camp, and self-study in the areas listed in the Exam Outline section of this exam guide.
Salesforce recommends a combination of hands-on experience, training courses in Partner Learning
Camp, and self-study in the areas listed in the Exam Outline section of this exam guide
Next Steps
Your Homework for Next Time
Goal (Homework) post this call - Trailhead

1. Verify access to the links below
● Data Cloud Consultant Certification: http://sfdc.co/DCCert
● Data Cloud Consultant Exam Guide: http://sfdc.co/DCCertGuide

2. Start with Prepare for your Salesforce Data Cloud Consultant Credential
● Salesforce Data Cloud Consultant Credential Trailmix

3. Complete
● Data Cloud: Ingestion
● Data Cloud: CRM Data Ingestion

Bookmark -> Program Guide
sfdc.co/DCAcademyGuide
Goal (Homework) post this call - Partner Learning Camp

➔ Register for Free S3 Trial Account (Trial Account)

➔ Complete
● Activity: Set Up Your Instance
● Activity: Prepare Your Data
● Activity: Configure Data Ingestion
● Activity: Configure Batch Transforms

➔ Extra Credit
Watch the last Marketing Cloud Moments featuring Data Cloud & Bundles:
https://mcmoments.hubs.vidyard.com/
Data Cloud
Pocket Guide
sfdc.co/datacloudpocketguide
Vouchers

❏ Voucher eligibility for the Data Cloud Consultant Exam
❏ Participants need to be Salesforce Partners
❏ Academy Participation
❏ Trail-mix Completion
❏ Activity Completion

❏ We will share the entire process towards the end of this series (i.e. the 2nd week of May). Keep watching ...
❏ All details will be shared in the Program Guide.
Bookmark -> Program Guide

sfdc.co/DCAcademyGuide
Q&A
We will try to answer most queries here in this sheet:
http://sfdc.co/DCAcademyQnA

Bookmark -> Program Guide

sfdc.co/DCAcademyGuide
Thank you
Please provide your feedback after closing this Zoom session; your feedback is very valuable to us.