01 - Identifying the strategy for SAP Datasphere
With the announcement of these strategic partnerships, we clearly underline our openness to integrating with external tools that many of you might already be using. Each of these strategic partners brings the unique strengths of its ecosystem:
• Collibra: brings data cataloging and data governance
• Databricks: integrates SAP data with its Data Lakehouse platform
• Confluent: sets your data in motion with real-time event and streaming data
• DataRobot: empowers organizations to leverage augmented intelligence with AutoML
• Google: using SAP Datasphere together with Google's data cloud, customers can build an end-to-end data cloud that combines data from across the enterprise landscape with AI and Cloud Storage/BigQuery
SAP Datasphere provides a multi-cloud, multisource business semantic service for enterprise analytics and planning. SAP Datasphere is the latest innovation in SAP's data warehousing portfolio. It is based on SAP HANA Cloud and follows a clear data warehouse as a service (DWaaS) approach in the public cloud, with fast release cycles.
The figure, SAP Datasphere Architecture, shows a high-level architecture of SAP Datasphere.
Business modeling (self-service): Use graphical low-code or no-code tools that support self-service modeling needs for business users, multi-dimensional modeling with powerful analytical capabilities, and a built-in data preview.
Data modeling: Use graphical low-code or no-code tools, powerful built-in SQL, and data-flow
editors for modeling, transformation, and replication needs. Enrich existing datasets with external
data, coming from the Data Marketplace, CSV uploads, and third-party sources.
Data Marketplace: Data Marketplace is fully integrated into SAP Datasphere. It is tailored for businesses to easily integrate third-party data. You can search and purchase analytical data from data providers. The data comes in the form of objects packaged as data products, which you can use in spaces of your SAP Datasphere tenant. Data products are either provided for free or require the purchase of a license. Some data products are available as one-time shipments; others are updated regularly by the data providers.
Data space: To provide secure modeling environments for different departments and use cases,
centrally create and provision spaces. Allocate disk and in-memory storage to spaces, set their
priority, add users, and use monitoring and logging tools to manage spaces.
Data catalog: A catalog is a comprehensive solution that collects and organizes data and metadata, enabling business and technical users to make confident data-driven decisions. The catalog improves productivity and efficiency by building trust in enterprise metadata through consistent data quality and governance.
Data quality (governance): Publish high-quality trusted data and analytic assets, glossary terms,
and key performance indicators to a catalog. This supports self-service discovery and promotes their
reuse.
Data orchestration: SAP Datasphere can connect to SAP, non-SAP cloud, and on-premise sources, including data lakes, to federate, replicate, transform, and load data. Reuse and migrate trusted metadata and data models residing in SAP Business Warehouse and SAP SQL data warehousing implementations.
SAP is committed to innovating and providing industry-leading data warehousing technologies to enable the Intelligent Enterprise. SAP data warehousing provides continuous innovations in SAP Datasphere and SAP BW/4HANA.
Typical SAP Datasphere use cases include extending existing on-premise data warehouses by enhancing existing on-premise data with cloud sources to define new data models for broader insights. Another flavor is providing flexible data marts under the control of business users, who can create new and agile models based on different sources and take the burden off the IT departments. In addition, tools are available to connect data without duplication across business applications to provide a new and consistent view of the data.
SAP BW/4HANA is the strategic solution of SAP for on-premise or private cloud deployments. Customers with SAP Business Warehouse (BW) as a strategic asset benefit from future improvements and investment protection until at least 2040. SAP BW has been available since 1998 and is in use by a large customer base. It represents an integrated data warehouse application that provides predefined content (LoB- and industry-specific), tools, and processes to design and operate an enterprise data warehouse end-to-end. This approach simplifies the development, administration, and user interface of your data warehouse, resulting in enhanced business agility for building a data warehouse in a standardized way. SAP BW/4HANA was released in 2016 as the logical successor of SAP NetWeaver-based BW, with a strong focus on integrating existing SAP ABAP application components into the SAP HANA platform. It follows a simplified data warehouse approach, with agile and flexible data modeling, SAP HANA-optimized processes, and user-friendly interfaces.
SAP's data warehouse approaches are not isolated concepts. In fact, SAP puts a special focus on providing options to integrate them with each other. The interoperability of the application-driven approaches of SAP BW powered by SAP HANA, and later SAP BW/4HANA, with SAP HANA data models began as early as 2014. It is based on external SAP HANA views generated from InfoProviders on the one hand, and integration options for SAP HANA artifacts in SAP BW or SAP BW/4HANA on the other hand.
SAP Datasphere provides multiple integration options and thus supports hybrid use cases as well. Unlike in the past, customers do not have to convert SAP BW/4HANA to the next data warehouse solution, SAP Datasphere. Different approaches are available, and customers can choose the way that suits them best.
Some customers also come to the conclusion that they want to start completely new with SAP Datasphere, because their current BW data models are outdated and not worth bringing into the new architecture. That means building up a business data fabric without any of the constraints of a legacy deployment or approach, with a complete focus on modernization and a simplified data landscape. Other customers decide, for example, to start on a greenfield and develop a common data layer on legacy data in proven SAP BW technology, with all its unique functions such as transforming data, managing data loading with partitioning, monitoring, and error handling.
When it comes to securing investments and skills, there is no getting around SAP BW bridge in SAP Datasphere. You can move your SAP BW models and queries to the SAP BW bridge with a tool-supported conversion.
The last option shown here is aimed specifically at customers who want to release data models from SAP BW/4HANA and transfer them to SAP Datasphere.
Note
For more details, refer to the following sources:
• SAP Note 2741041: Maintenance for SAP NetWeaver 7.x Business Warehouse and SAP BW/4HANA
• Blog: How to use SAP Datasphere as a source for SAP BW/4HANA, https://blogs.sap.com/2022/12/02/bestpractice-how-to-use-sap-data-warehouse-cloud-as-source-for-bw-4hana/
• Blog: SAP Datasphere & SAP BW/4HANA Model Transfer – Step by Step Guidance, https://blogs.sap.com/2020/11/30/sap-data-warehouse-cloud-sap-bw-4hana-model-transfer-step-by-step-guidance/
• SAP Note 2932647: Supported and unsupported features with SAP BW/4HANA/SAP BW Bridge Model Transfer in SAP Datasphere
• SAP Note 3141688: Conversion from SAP BW or SAP BW/4HANA to SAP Datasphere, SAP BW Bridge
• Blog: BW Query Transfer, https://blogs.sap.com/2023/01/25/entity-import-model-transfer-for-single-query-sap-bw-bridge/
Introducing SAP Datasphere Spaces
Spaces in SAP Datasphere
A space is a secure area that an administrator creates. In it, members can acquire, prepare, and model data. The administrator allocates disk storage and in-memory storage to the space, sets its priority, and can limit how much memory and how many threads its statements can consume.
If the administrator assigns one or more space administrators as members of the space, they can
then assign other members, create connections to source systems, secure data with data access
controls, and manage other aspects of the space.
Spaces are virtual work environments with their own databases. Spaces are decoupled, but are open
for flexible access, thus enabling your users to collaborate without being concerned about sharing
their data. To model your data and create stories, you must start off with a space. You can decide
how much and what kind of storage you need, as well as how important your space is compared to
other spaces. You also add space members and set up connections here. You might already have
transformed data you want to access through SAP Datasphere. Or, you may have data in SAP
Datasphere that you want to use in other tools or apps. In these cases, you can set up an open SQL
schema or a space schema.
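As an illustration, an external SQL client connected to an Open SQL schema can read views that the space exposes and create its own tables for use as sources in the space. The following is a minimal sketch, assuming a space named SALES with an Open SQL schema named SALES#ETL; all view and table names are hypothetical:

  -- Read a view that the SALES space exposes for consumption
  SELECT "Region", SUM("NetAmount") AS "TotalNet"
  FROM "SALES"."SalesOrders_View"
  GROUP BY "Region";

  -- Create a staging table in the Open SQL schema; it can then be
  -- used as a source in the space's Data Builder
  CREATE TABLE "SALES#ETL"."STG_EXCHANGE_RATES" (
    "Currency" NVARCHAR(3) PRIMARY KEY,
    "Rate"     DECIMAL(15, 5)
  );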
Note
If you do not define a default role, then the system assigns a user the minimum required
permissions. The user can log in and request a role, but only if you configure one or more roles
for self-service and assign users a manager. For more information, refer to "Control User Access
to Your Space" in Integrating Data and Managing Spaces.
Spaces are secured virtual work environments which:
• Provide isolation for metadata objects and space resources
• Define storage quota, control resource usage, and workload class settings per space
• Maintain space-specific source system connections and a common time dimension
• Manage user access for space members in combination with scoped roles
• Enable sharing of data and currency conversion settings with other spaces
Watch this video to understand how to create spaces.
With SAP Datasphere Spaces, you can seamlessly share information across departments. The
sharing mechanism simplifies this process, allowing you to model master data once for use by
multiple departments.
Introducing SAP Datasphere Integration Options
Integration of Data into SAP Datasphere
Introduction
SAP Datasphere provides a large set of built-in connectors to access data from a wide range of sources, in the cloud or on-premise, from SAP, from non-SAP sources, or from partner tools.
Connections provide access to data from a wide range of remote systems, cloud as well as on-
premise, SAP as well as non-SAP, and partner tools. They allow users assigned to a space to use
objects from the connected remote system as source to acquire, prepare, and access data from those
sources in SAP Datasphere. In addition, you can use certain connections to define targets for
replication flows.
To learn more about integrating SAP applications, refer to the how-to paper by SAP at:
https://help.sap.com/docs/SAP_DATASPHERE/be5967d099974c69b77f4549425ca4c0/8f98d3c917
f94452bafe288055b60b35.html?locale=en-US.
Watch this video to understand how to create connections for integrating SAP applications.
Create a connection to allow users assigned to a space to use the connected source or target for data
modeling and data access in SAP Datasphere.
Each connection type supports a defined set of features. Depending on the connection type and the connection configuration, you can use a connection for one or more of the following features:
• Remote Tables
The remote tables feature supports building views. After you have created a connection, a modeler can add a source object (usually a database table or view) from the connection to a view in the graphical view editor of the Data Builder. The source object is then deployed as a remote table.
During import, the tables are deployed as remote tables. Depending on the connection type, you can use remote tables for the following tasks:
◦ Directly access data in the source (remote access)
◦ Copy the full set of data (snapshot or scheduled replication)
◦ Copy data changes in real time (real-time replication)
• Data Flows, Replication Flows, and Transformation Flows
The flow feature supports building data flows, replication flows, and transformation flows. After you have created a connection, a modeler can add a source object from the connection to a data flow in the respective flow editors of the Data Builder to integrate and transform your data.
• External Tools
SAP Datasphere is open to SAP and non-SAP tools for integrating data into SAP Datasphere.
By default, when you import a remote table, its data is not replicated and is accessed via federation from the remote system each time it is used. You can improve performance by replicating the data to SAP Datasphere, and you can schedule regular updates (or, for many connection types, enable real-time replication) to keep the data fresh and up-to-date.
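Conceptually, a remote table is comparable to an SAP HANA virtual table created via smart data access. The following SAP HANA SQL sketch illustrates the idea only; in SAP Datasphere, the import wizard performs these steps for you, and the remote source, schema, and table names here are hypothetical:

  -- Federated access: a virtual table delegates every query
  -- to the remote system
  CREATE VIRTUAL TABLE "MYSPACE"."VT_CUSTOMERS"
  AT "MY_REMOTE_SOURCE"."<NULL>"."SOURCE_SCHEMA"."CUSTOMERS";

  -- Replicated access: persist a snapshot locally so queries
  -- no longer require a remote round trip
  CREATE TABLE "MYSPACE"."RT_CUSTOMERS" AS
  (SELECT * FROM "MYSPACE"."VT_CUSTOMERS");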
Many connections (including most connections to SAP systems) support importing remote tables to
federate or replicate data (see Integrating Data via Connections).
You can import remote tables to make the data available in your space from the Data Builder start
page, in an entity-relationship model, or directly as a source in a view.
• By default, remote tables federate data, and each time the data is used, a call is made to the
remote system to load it. You can improve performance by enabling replication to store the
data in SAP Datasphere. Some connections support real-time replication and for others, you
can keep your data fresh by scheduling regular updates (see Replicate Remote Table Data).
• To optimize replication performance and reduce your data footprint, you can remove unnecessary columns and set filters (see Restrict Remote Table Data Loads).
• To maximize access performance, you can store the replicated data in-memory (see
Accelerate Table Data Access with In-Memory Storage).
• Once a remote table is imported, it is available for use by all users of the space and can be
used as a source for views.
• You can automate sequences of data replication and loading tasks with task chains (see
Creating a Task Chain).
Certain connections support loading data from multiple source objects to SAP Datasphere via a
replication flow. You can enable a single initial load or request initial and delta loads and perform
simple projection operations (see Creating a Replication Flow).
Figure 12: SAP Datasphere replication flow
Create a replication flow to copy multiple data assets from a source to a target.
You can use replication flows to copy data from the following source objects:
• CDS views (in ABAP-based SAP systems) that are enabled for extraction
• Tables that have a primary key
• Objects from ODP providers, such as extractors or SAP BW artifacts (from any SAP system
that is based on SAP NetWeaver and has a suitable version of the DMIS add-on, see SAP
Note 3412110).
For more information about available connection types, sources, and targets, see Integrating Data
via Connections.
Create a transformation flow to load data from one or more source tables, apply transformations
(such as a join), and output the result in a target table. You can load a full set of data from one or
more source tables to a target table. You can add local tables and also remote tables located in SAP
BW Bridge spaces. Note that remote tables located in SAP BW Bridge spaces must be shared with
the SAP Datasphere space you are using to create your transformation flow. You can also load delta
changes (including deleted records) from one source table to a target table.
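To make the transformation-flow concept concrete, here is a rough SQL equivalent of a simple flow that joins two source tables, derives a column, and performs a full load into a target table. All table and column names are hypothetical; in SAP Datasphere you would model this in the graphical or SQL transformation editors rather than write it by hand:

  -- Full load: join two sources, derive a measure, write to the target
  INSERT INTO "MYSPACE"."TGT_SALES_ENRICHED"
  SELECT o."OrderID",
         o."CustomerID",
         c."Region",
         o."NetAmount" * o."ExchangeRate" AS "NetAmountEUR"
  FROM "MYSPACE"."SRC_ORDERS" AS o
  JOIN "MYSPACE"."SRC_CUSTOMERS" AS c
    ON o."CustomerID" = c."CustomerID";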
Group multiple tasks into a task chain and run them manually once, or periodically, through a
schedule. You can create linear task chains in which one task is run after another. (You can also nest
other task chains within a task chain.) Or, you can create task chains in which individual tasks are
run in parallel and successful continuation of the entire task chain run depends on whether ANY or
ALL parallel tasks are completed successfully. In addition, when creating or editing a task chain,
you can also set up email notification for deployed task chains to notify selected users of task chain
completion.
Introducing Data Modeling in the Data Builder
The Data Builder
Initial Steps of Modeling in SAP Datasphere
IT-driven users with the DW Modeler role can import data directly into the Data Builder from connections and other sources, and use flows to replicate, extract, transform, and load data. Business users with the DW Modeler role can use the Business Builder editors to combine, refine, and enrich Data Builder objects and expose lightweight, tightly-focused perspectives for consumption by SAP Analytics Cloud and MS Excel.
SAP Datasphere offers multiple modeling capabilities that address different user groups – from business analysts with deep business understanding to tech-savvy developers and power users. In a typical end-to-end scenario, the following two closely related components of SAP Datasphere are used:
• The SAP Datasphere Data Layer contains the basic modeling of the underlying data
sources and tables. The related set of tools is available in the SAP Datasphere Data
Builder. Here, developers and modelers use tables, views, and intelligent lookups to
combine, clean, and prepare data. You can expose views directly to SAP Analytics Cloud
and other analytics clients.
• The SAP Datasphere Business Layer enables users to create their own business models on top, based on business terms. The SAP Datasphere Business Builder provides the related set of tools. The Business Builder can consume Data Builder objects for further modeling and consumption. As a more semantic approach, business users create their models using the Business Builder editors. They combine, refine, and enrich the SAP Datasphere Data Builder artifacts with further semantic information before exposing lightweight, tightly-focused objects. SAP Analytics Cloud and other analytic clients consume these objects.
In the Data Builder, you assign a semantic usage to each entity to indicate how it should be consumed:
• Fact - to indicate that your entity contains one or more measures that can be analyzed
• Dimension - to indicate that your entity contains attributes that are used to analyze and categorize measures defined in other entities
• Hierarchy - to indicate that your entity contains parent-child relationships for members in a dimension
• Text - to indicate that your entity contains strings with language identifiers to translate text attributes
SAP Datasphere provides an editor to model Data Builder artifacts in an intuitive graphical
interface. You can drag and drop sources from the Source Browser and join them as appropriate.
You can add other operators to remove or create columns and filter or aggregate data. You can also
specify measures and other aspects of your output structure in the output node.
Figure 17: Editor to model Data Builder artifacts
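The same kind of logic can also be written as an SQL view in the Data Builder. A minimal sketch with hypothetical tables and columns, combining a join, a filter, and an aggregated measure:

  -- Join, filter, and aggregate: the SQL counterpart of a simple
  -- graphical view with projection and aggregation nodes
  SELECT p."Category",
         o."Year",
         SUM(o."Quantity") AS "TotalQuantity"
  FROM "MYSPACE"."Orders" AS o
  JOIN "MYSPACE"."Products" AS p
    ON o."ProductID" = p."ProductID"
  WHERE o."Year" >= 2020
  GROUP BY p."Category", o."Year";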
SAP Datasphere provides various methods for importing tables or database views into your space.
Examples:
• Data Builder start page: Use the import wizard to import remote tables from any connection.
• E/R Model: Import tables into the space and add them to the E/R model diagram.
• Graphical or SQL view: Import tables into the space and use them directly as sources in the
view.
Watch this video to see how to build a new Analytical View in the IT-driven space.
Next, let's look at how you can access data from the spaces assigned to you.
Enable the "Expose for Consumption" switch (reporting layer) to make a view available for
consumption in SAP Analytic Cloud, other analytic clients, and in ETL and other tools. You can
only access views that have the Expose for Consumption switch enabled outside the space.
Note
SAP recommends using the model type Analytic Model for consumption with SAP Analytics Cloud.
The Analytic Model is now the object for building stories in SAP Analytics Cloud.
The Analytic Model replaces the Analytical Dataset that is exposed for consumption. Analytical Datasets are still available, but you should now use the Analytic Model.
The Analytic Model offers more calculations and greater functionality. You can reduce what you expose in your object, avoiding unnecessary calculations and, as a result, achieving better performance. It also offers calculated measures, restricted measures, and an analytical preview. Although Analytical Datasets are still available, new features are only developed for the Analytic Model. You can easily create an Analytic Model on top of Analytical Datasets.
The sources for analytic models are facts (or analytical datasets). These facts can contain
dimensions, texts, and hierarchies.
Analytic Model
• Rich multidimensional & analytical modeling reusing predefined measures, hierarchies,
filters, parameters & associations
• Built-in analytical preview incl. filtering, pivoting, hierarchies, etc.
• Seamlessly integrates with SAP Analytics Cloud and Office 365 to analyze data
Procedure
1. To open the editor, from the side navigation, choose Data Builder, select a space (if
necessary), and choose New Analytic Model.
2. Add a source.
3. To select or deselect any measures, associated dimensions, or attributes in the properties
panel on the right, choose your fact source on the canvas.
4. To edit the properties of the analytic model, choose the background of the canvas, showing
the analytic model's properties in the side panel.
5. To edit the properties of the fact source, choose the fact source on the canvas, showing its
properties in the side panel.
6. To edit the properties of a dimension, choose the dimension on the canvas, showing its
properties in the side panel.
7. When you click on a dimension or the fact source on the canvas, you can change the alias of
this item. The alias is the name that is shown in the story in SAP Analytics Cloud.
8. Choose Preview to check that the data appears as expected. There are two options, one of which is a simple preview available via the context menu on the analytic model in the editor.
Figure 24: Consuming data in SAP Analytics Cloud via live connection
• Direct consumption of data models in SAP Analytics Cloud via live connection
• Ability to connect live to source data sets in SAP Datasphere and create stories in SAP
Analytics Cloud
• SAP Datasphere tenants can be connected to any number of SAP Analytics Cloud systems,
and vice versa
• The live connectivity needs to be set up for each SAP Analytics Cloud system
• Metadata translation can be enabled for SAP Analytics Cloud stories
• SAP Note 2832606 - Limitations with live connections
Watch this video to see how to consume SAP Datasphere in SAP Analytics Cloud.
A story lets you analyze the data made available through the connections and models you created during the modeling phase. You then visualize it and tell a story based on the data. With SAP Datasphere, you can ensure you have the fresh data you need. You can also ensure your data is ready to be interpreted without spending valuable time organizing it.
In SAP Datasphere, building a story is usually the last step in data analysis. Here, you visually
discover and communicate important data insights. Creating a story is as easy as choosing the charts
and selecting the appropriate dimensions and measures. However, you can deploy different
strategies and best practices to build engaging stories that memorably communicate exactly what
you want.
Define Your Goals
The best way to approach story building is to outline exactly what you are looking for in the data. This provides a clear jumping-off point from which you can find other insights. Your primary objective could be to learn what customers are buying the most by looking for possible buying trends. It could also be to discover which salesperson has the highest sales. It could be both. To do this, focus on answering each question with charts; then, after you answer the big questions, uncover additional hidden insights.
Each business entity created in the Business Builder consumes data from a Data Builder entity. Because you can switch the data source of a business entity to a different Data Builder entity at any time, this loose coupling allows you to maintain stable business entities for reporting, even as your physical data sources change.
Business entities can define measures or attributes. Measures are quantifiable values that refer to an
aggregable field of the underlying model. An attribute is a descriptive element of a business entity
and provides meaningful business insights into measures. The underlying model is a view or a table,
which has been created in the Data Builder.
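For illustration, consider a hypothetical Data Builder view underlying a business entity. Aggregable numeric fields are candidates for measures, while descriptive fields serve as attributes:

  -- Hypothetical underlying view: the comments mark which fields
  -- qualify as measures and which as attributes
  SELECT "OrderID",       -- key
         "CustomerName",  -- attribute (descriptive)
         "Region",        -- attribute (descriptive)
         "Quantity",      -- measure candidate (aggregable)
         "NetAmount"      -- measure candidate (aggregable)
  FROM "MYSPACE"."SalesOrders";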
To model a meaningful consumption model, business entities define associations between each other. All potential association targets can be predefined on the data layer to provide business users with a variety of modeling options to choose from when preparing their use case-specific consumption model.
Business entities can be modeled as a Dimension or as a Fact. The definitions themselves hardly differ; what differs is the intended usage within the Fact Model. Dimensions are generally used to contain master data and must have a key defined. Facts are generally used to contain transactional data and must have at least one measure defined, as the sketch below illustrates.
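A sketch of the underlying tables, using hypothetical names, makes the distinction tangible: the dimension carries master data with a mandatory key, while the fact carries transactional data with at least one measure:

  -- Dimension: master data, key required
  CREATE TABLE "MYSPACE"."DIM_PRODUCT" (
    "ProductID"   NVARCHAR(10) PRIMARY KEY,
    "ProductName" NVARCHAR(60),
    "Category"    NVARCHAR(30)
  );

  -- Fact: transactional data, at least one measure required
  CREATE TABLE "MYSPACE"."FACT_SALES" (
    "OrderID"   NVARCHAR(10),
    "ProductID" NVARCHAR(10),      -- association to DIM_PRODUCT
    "Quantity"  INTEGER,           -- measure
    "NetAmount" DECIMAL(15, 2)     -- measure
  );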