
SAP HANA Database: Data Management For Modern Business Applications

This document discusses the SAP HANA database, which is positioned as the core of the SAP HANA Appliance. The SAP HANA database consists of multiple data processing engines that provide full spectrum data processing capabilities. It supports both relational and non-relational data like graphs and text. The database also provides domain-specific languages and built-in business functions to better support modern application requirements beyond traditional SQL.

SAP HANA database: Data management for modern business applications

Article  in  ACM SIGMOD Record · December 2011


DOI: 10.1145/2094114.2094126 · Source: DBLP


All content following this page was uploaded by Stefan Sigg on 01 June 2014.



SAP HANA Database - Data Management for Modern
Business Applications

Franz Färber #1, Sang Kyun Cha +2, Jürgen Primsch ?3, Christof Bornhövd ∗4, Stefan Sigg #5, Wolfgang Lehner #6

# SAP – Dietmar-Hopp-Allee 16 – 69190, Walldorf, Germany
+ SAP – 63-7 Banpo 4-dong, Seochoku – 137-804, Seoul, Korea
? SAP – Rosenthaler Str. 30 – 10178, Berlin, Germany
∗ SAP – 3412 Hillview Ave – Palo Alto, CA 94304, USA

1 [email protected], 2 [email protected], 3 [email protected], 4 [email protected], 5 [email protected], 6 [email protected]

ABSTRACT

The SAP HANA database is positioned as the core of the SAP HANA Appliance to support complex business analytical processes in combination with transactionally consistent operational workloads. Within this paper, we outline the basic characteristics of the SAP HANA database, emphasizing the distinctive features that differentiate the SAP HANA database from other classical relational database management systems. On the technical side, the SAP HANA database consists of multiple data processing engines with a distributed query processing environment to provide the full spectrum of data processing, from classical relational data supporting both row- and column-oriented physical representations in a hybrid engine to graph and text processing for semi- and unstructured data management within the same system.

From a more application-oriented perspective, we outline the specific support that the SAP HANA database provides for multiple domain-specific languages with a built-in set of natively implemented business functions. SQL, as the lingua franca for relational database systems, can no longer be considered to meet all requirements of modern applications, which demand tight interaction with the data management layer. Therefore, the SAP HANA database permits the exchange of application semantics with the underlying data management platform that can be exploited to increase query expressiveness and to reduce the number of individual application-to-database round trips.

1. INTRODUCTION

Data management requirements for enterprise applications have changed significantly in the past few years. For example, it is no longer reasonable to continue the classical distinction between transactional and analytical access patterns. From a business perspective, queries in transactional environments, on the one hand, build the sums of already-delivered orders or calculate the overall liabilities per customer. On the other hand, analytical queries require the immediate availability of operational data to enable accurate insights and real-time decision making. Furthermore, applications demand a holistic, consistent, and detailed view of their underlying business processes, thus leading to huge data volumes that have to be kept online, ready for querying and analytics. Moreover, non-standard applications like planning or simulations require a flexible and graph-based data model, e.g., to compute the maximum throughput of typical business relationship patterns within a partner network. Finally, text retrieval technology is a must-have in state-of-the-art data management platforms to link unstructured or semi-structured data or results of information retrieval queries to structured business-related contents.

In a nutshell, the spectrum of required application support is tremendously heterogeneous and exhibits a huge variety of interaction patterns. Since classical SQL-based data management engines are too narrow for these application requirements, the SAP HANA database presents itself as a first step towards a holistic data management platform providing robust and efficient data management services for the specific needs of modern business applications [5].

The SAP HANA database is a component of the overall SAP HANA Appliance that provides the data management foundation for renovated and newly developed SAP applications (see Section 4). Figure 1 outlines the different components of the SAP HANA Appliance. The SAP HANA Appliance comprises replication and data transformation services to easily move SAP and non-SAP data into the HANA system, modeling services to create the business models that can be deployed and leveraged during runtime, and the SAP

SIGMOD Record, December 2011 (Vol. 40, No. 4) 45




 

 
HANA database as its core. For the rest of this paper, we specifically focus on the SAP HANA database.

Figure 1: Components of the SAP HANA Appliance

Core Distinctive Features of the SAP HANA Database

Before diving into the details, we will outline some general distinctive features and design guidelines to show the key differentiators with respect to common relational, SQL-based database management systems. We believe that these features represent the cornerstones of the philosophy behind the SAP HANA database:

• Multi-engine query processing environment: In order to cope with the requirements of managing enterprise data with different characteristics in different ways, the SAP HANA database comprises a multi-engine query processing environment. In order to support the core features of enterprise applications, the SAP HANA database provides SQL-based access to relationally structured data with full transactional support. Since more and more applications require the enrichment of classically structured data with semi-structured, unstructured, or text data, the SAP HANA database provides a text search engine in addition to its classical relational query engine. The HANA database engine supports “joining” semi-structured data to relations in the classical model, in addition to supporting direct entity extraction procedures on semi-structured data. Finally, a graph engine natively provides the capability to run graph algorithms on networks of data entities to support business applications like production planning, supply chain optimization, or social network analyses. Section 2 will outline some of the details.

• Representation of application-specific business objects: In contrast to classical relational databases, the SAP HANA database is able to provide a deep understanding of the business objects used in the application layer. The SAP HANA database makes it possible to register “semantic models” inside the database engine to push down more application semantics into the data management layer. In addition to registering semantically richer data structures (e.g., OLAP cubes with measures and dimensions), SAP HANA also provides access to specific business logic implemented directly deep inside the database engine. The SAP HANA Business Function Library encapsulates those application procedures. Section 3 will explain this feature from different perspectives.

• Exploitation of current hardware developments: Modern data management systems must consider current developments with respect to large amounts of available main memory, the number of cores per node, cluster configurations, and SSD/flash storage characteristics in order to efficiently leverage modern hardware resources and to guarantee good query performance. The SAP HANA database is built from the ground up to execute in parallel and main-memory-centric environments. In particular, providing scalable parallelism is the overall design criterion for algorithms from the system level up to the application level [6, 7].

• Efficient communication with the application layer: In addition to running generic application modules inside the database, the system is required to communicate efficiently with the application layer. To meet this requirement, plans within SAP HANA development are, on the one hand, to provide shared-memory communication with SAP proprietary application servers and more closely align the data types used within each. On the other hand, we plan to integrate novel application server technology directly into the SAP HANA database cluster infrastructure to enable interweaved execution of application logic and database management functionality.

2. ARCHITECTURE OVERVIEW

The SAP HANA database is a memory-centric data management system that leverages the capabilities of modern hardware, especially very large amounts of main memory, multi-core CPUs, and SSD storage, in order to improve the performance of analytical and transactional applications. The HANA database provides the high-performance data storage and processing engine within the HANA Appliance.

Figure 2 shows the architecture of the HANA database system. The Connection and Session Management component creates and manages sessions and connections for the database clients. Once a session has been established, database clients can use SQL (via



JDBC or ODBC), SQL Script, MDX or other domain-specific languages like SAP’s proprietary language FOX for planning applications, or WIPE, which combines graph traversal and manipulation with BI-like data aggregation, to communicate with the HANA database. SQL Script is a powerful scripting language to describe application-specific calculations inside the database. SQL Script is based on side-effect-free functions that operate on database tables using SQL queries, and it has been designed to enable optimization and parallelization.

Figure 2: The SAP HANA database architecture

As outlined in our introduction, the SAP HANA database provides full ACID transactions. The Transaction Manager coordinates database transactions, controls transactional isolation, and keeps track of running and closed transactions. For concurrency control, the SAP HANA database implements the classical MVCC principle that allows long-running read transactions without blocking update transactions. MVCC, in combination with a time-travel mechanism, allows temporal queries inside the Relational Engine.

Client requests are parsed and optimized in the Optimizer and Plan Generator layer. Based on the optimized execution plan, the Execution Engine invokes the different In-Memory Processing Engines and routes intermediate results between consecutive execution steps.

SQL Script and supported domain-specific languages are translated by their specific compilers into an internal representation called the “Calculation Model”. The execution of these calculation models is performed by the Calculation Engine. The use of calculation models facilitates the combination of data stored in different In-Memory Storage Engines as well as the easy implementation of application-specific operators in the database engine.

The Authorization Manager is invoked by other HANA database components to check whether a user has the required privileges to execute the requested operations. A privilege grants the right to perform a specified operation (such as create, update, select, or execute). The database also supports analytical privileges that represent filters or hierarchy drill-down limitations for analytical queries and that control access to values with a certain combination of dimension attributes. Users are either authenticated by the database itself, or the authentication is delegated to an external authentication provider, such as an LDAP directory.

Metadata in the HANA database, such as table definitions, views, indexes, and the definition of SQL Script functions, is managed by the Metadata Manager. Such metadata of different types is stored in one common catalog for all underlying storage engines.

The center of Figure 2 shows the three In-Memory Storage Engines of the HANA database, i.e., the Relational Engine, the Graph Engine, and the Text Engine. The Relational Engine supports both row- and column-oriented physical representations of relational tables. The Relational Engine combines SAP’s P*Time database engine and SAP’s TREX engine currently being marketed as SAP BWA to accelerate BI queries in the context of SAP BW. Column-oriented data is stored in a highly compressed format in order to improve the efficiency of memory resource usage and to speed up the data transfer from storage to memory or from memory to CPU. A system administrator specifies at definition time whether a new table is to be stored in a row- or in a column-oriented format. Row- and column-oriented database tables can be seamlessly combined into one SQL statement, and subsequently, tables can be moved from one representation form to the other [4]. As a



rule of thumb, user and application data is stored in a column-oriented format to benefit from the high compression rate and from the highly optimized access for selection and aggregation queries. Metadata or data with very few accesses is stored in a row-oriented format.

The Graph Engine supports the efficient representation and processing of data graphs with a flexible typing system. A new dedicated storage structure and a set of optimized base operations are introduced to enable efficient graph operations via the domain-specific WIPE query and manipulation language. The Graph Engine is positioned to optimally support resource planning applications with huge numbers of individual resources and complex mash-up interdependencies. The flexible type system additionally supports the efficient execution of transformation processes, like data cleansing steps in data-warehouse scenarios, to adjust the types of the individual data entries, and it enables the ad-hoc integration of data from different sources.

The Text Engine provides text indexing and search capabilities, such as exact search for words and phrases, fuzzy search (which tolerates typing errors), and linguistic search (which finds variations of words based on linguistic rules). In addition, search results can be ranked, and federated search capabilities support searching across multiple tables and views. This functionality is available to applications via specific SQL extensions. For text analyses, a separate Preprocessor Server is used that leverages SAP’s Text Analysis library.

The Persistency Layer, illustrated at the bottom of Figure 2, is responsible for the durability and atomicity of transactions. It manages data and log volumes on disk and provides interfaces for writing and reading data that are leveraged by all storage engines. This layer is based on the proven persistency layer of MaxDB, SAP’s commercialized disk-centric relational database. The persistency layer ensures that the database is restored to the most recent committed state after a restart and that transactions are either completely executed or completely undone. To achieve this efficiently, it uses a combination of write-ahead logs, shadow paging, and savepoints.

To enable scalability in terms of data volumes and the number of application requests, the SAP HANA database supports scale-up and scale-out. For scale-up scalability, all algorithms and data structures are designed to work on large multi-core architectures, especially focusing on cache-aware data structures and code fragments. For scale-out scalability, the SAP HANA database is designed to run on a cluster of individual machines, allowing the distribution of data and query processing across multiple nodes. The scalability features of the SAP HANA database are heavily based on the proven technology of the SAP BWA product.

3. SAP HANA DATABASE: BEYOND SQL

As outlined in our introduction, the SAP HANA database is positioned as a modern data management and processing layer to support complex enterprise-scale applications and data-intensive business processes. In addition to all optimizations and enhancements at the technical layer (modern hardware exploitation, columnar and row-oriented storage, support for text and irregularly structured data, etc.), the core benefit of the system is its ability to understand and directly work with business objects stored inside the database. Being able to exploit the knowledge of complex-structured business objects and to perform highly SAP application-specific business logic steps deep inside the engine is an important differentiator of the SAP HANA database with respect to classical relational stores.

More specifically, the “Beyond SQL” features of the SAP HANA database are revealed in multiple ways. On a smaller scale, specific SQL extensions expose the capabilities of the specific query processing engines. For example, an extension in the WHERE clause allows the expression of fuzzy search queries against the text engine. An explicit “session” concept supports the planning of processes and what-if analyses. Furthermore, SQL Script provides a flexible programming language environment as a combination of imperative and functional expressions of SQL snippets. The imperative part allows one to easily express data and control flow logic by using DDL, DML, and SQL query statements as well as imperative language constructs like loops and conditionals. Functional expressions, on the other hand, are used to express declarative logic for the efficient execution of data-intensive computations. Such logic is internally represented as data flows that can be executed in parallel. As a consequence, operations in a data flow graph must be free of side effects and must not change any global state, neither in the database nor in the application. This condition is enforced by allowing only a limited subset of language features to express the logic of the procedure.

On a larger scale, domain-specific languages can be supported by specific compilers that target the same logical construct of a “Calculation Model” [2]. For example, MDX will be natively translated into the internal query processing structures by resolving complex dimensional expressions during the compile step as much as possible by consulting the registered business object structures stored in the metadata catalog. In contrast to classical BI application stacks, there is no need for an extra OLAP server to generate complex SQL statements. In addition, the database optimizer is not required to “guess” the semantics of the SQL statements in order to generate the best plan; the SAP HANA database can directly exploit the knowledge of the OLAP models, which carry much more semantics than plain relational structures.

Figure 3: Modeling of currency conversion within SAP HANA. (a) SAP Information Modeler; (b) Currency conversion dialog

As an additional example of this “Beyond SQL” feature, consider the disaggregation step in financial planning processes [3]. In order to distribute coarse-grained planning figures to atomic entries (for example, from business unit level to department level), different distribution schemes have to be supported: relative to the actual values of the previous period, following constant distribution factors, etc. Since disaggregation is such a crucial operation in planning, the SAP HANA database provides a special operator, available within its domain-specific programming language, for planning scenarios. Obviously, such an operator is not directly accessible via SQL. Following this principle, the SAP HANA database also provides a connector framework to work with “external” language packages like the statistical programming environment R [1].

In addition to specifically tailored operators, the SAP HANA database also provides a built-in Business Function Library (BFL) that offers SAP-specific application code. All business logic modules are natively integrated into the database kernel with a maximum degree of parallelism. Compared to classical stored procedures or stored functions, the BFL is included in the database engine using all the technical advantages of deep integration. A prominent example of an application-specific algorithm is the procedure of currency conversion. Though supposedly simple in nature (a scalar multiplication of a monetary figure with the conversion rate), the actual implementation covering the complete application semantics of currency conversion comprises more than 1,000 lines of code. Figure 3(a) illustrates the graphical tool to create a “Calculation Model” in the SAP HANA database by applying a currency conversion function to an incoming data stream. As can be seen, the data source itself comprises not only simple columns but also comprehensive metadata such as type information with respect to plain, calculated, or derived measures.

The application designer creates a logical view using the Information Modeler and applies pre-defined application logic provided by the BFL. As shown in the modeling dialog of Figure 3(b), the parameters of the currency conversion function can be set in multiple ways to instrument the business logic. In the current example, the function performs a conversion to the currency with respect to the specific company code (given in column AT_COMPANY_CODE.WAERS).

To summarize, the SAP HANA database provides a classical SQL interface including all transactional properties required from a classical database management system. In addition, the SAP HANA database positions itself as a system “Beyond SQL” by providing an ecosystem for domain-specific languages with particular internal support on the level of individual operators. Moreover, the concept of a BFL to provide a set of complex, performance-critical, and standardized application logic modules deep inside the database kernel creates clear benefits for SAP and customer-specific applications.
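The requirement from Section 3 that operations in a data flow graph be free of side effects can be illustrated with a small sketch. The Python code below is not SQL Script and does not reflect the Calculation Engine's internals; it only shows why pure operations make parallel scheduling safe. The function `run_dataflow`, the node names, and the sample data are assumptions introduced for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_dataflow(nodes, inputs):
    """Evaluate a data-flow graph of side-effect-free functions.

    nodes  : {name: (function, [names of the inputs it consumes])}
    inputs : {name: value} for the source data sets
    """
    results = dict(inputs)   # values computed so far
    pending = dict(nodes)    # nodes that still have to run
    with ThreadPoolExecutor() as pool:
        while pending:
            # A node is ready once all of its inputs have been computed.
            ready = [name for name, (_, deps) in pending.items()
                     if all(dep in results for dep in deps)]
            if not ready:
                raise ValueError("cycle or missing input in data flow")
            # Ready nodes share no mutable state, so they may run concurrently.
            futures = {
                name: pool.submit(pending[name][0],
                                  *[results[dep] for dep in pending[name][1]])
                for name in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
                del pending[name]
    return results


# Hypothetical example: two independent aggregates feed a third node.
orders = [100, 80, 120]
flow = {
    "total": (sum, ["orders"]),
    "count": (len, ["orders"]),
    "average": (lambda total, count: total / count, ["total", "count"]),
}
result = run_dataflow(flow, {"orders": orders})
print(result["average"])
```

Because `total` and `count` only read their input and return a new value, the scheduler is free to run them at the same time; a node that modified shared state would make the result depend on execution order.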

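The two distribution schemes named for the disaggregation step in Section 3 can be written down in a few lines. This is a sketch in Python, not the planning operator of the SAP HANA database; the function names `disaggregate_by_reference` and `disaggregate_by_factors` and all figures are made up for illustration.

```python
from fractions import Fraction


def disaggregate_by_reference(total, reference):
    """Split `total` in proportion to reference values,
    e.g. the actual values of the previous period."""
    ref_sum = sum(reference.values())
    if ref_sum == 0:
        raise ValueError("reference values must not sum to zero")
    return {key: Fraction(total) * Fraction(value) / ref_sum
            for key, value in reference.items()}


def disaggregate_by_factors(total, factors):
    """Split `total` using constant distribution factors that sum to 1."""
    if sum(factors.values()) != 1:
        raise ValueError("distribution factors must sum to 1")
    return {key: Fraction(total) * factor for key, factor in factors.items()}


# Hypothetical plan: 1000 units at business unit level, two departments.
previous_period = {"dept_a": 300, "dept_b": 100}
by_reference = disaggregate_by_reference(1000, previous_period)
by_factors = disaggregate_by_factors(
    1000, {"dept_a": Fraction(1, 4), "dept_b": Fraction(3, 4)})
print(by_reference, by_factors)
```

Exact fractions are used so that the distributed parts always add up to the original figure; a production operator would additionally need a rounding rule for currency amounts.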
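The core arithmetic of the currency conversion example from Section 3, a multiplication by a conversion rate with the target currency chosen per company code, can be sketched as follows. This is a toy Python version under stated assumptions and not the BFL function, which the text says covers far more application semantics. The lookup tables, the rates, and the name `convert_to_company_currency` are hypothetical; the company-code table stands in for the role of the column AT_COMPANY_CODE.WAERS.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical lookup data; the values are invented for this sketch.
COMPANY_CURRENCY = {"1000": "EUR", "2000": "USD"}   # company code -> currency
RATES = {
    ("USD", "EUR"): Decimal("0.75"),
    ("EUR", "USD"): Decimal("1.3333"),
}


def convert_to_company_currency(amount, source_currency, company_code):
    """Convert `amount` into the currency assigned to `company_code`."""
    target = COMPANY_CURRENCY[company_code]
    if source_currency == target:
        rate = Decimal("1")
    else:
        rate = RATES[(source_currency, target)]
    # Scalar multiplication, then rounding to two decimal places.
    converted = (Decimal(amount) * rate).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP)
    return converted, target


print(convert_to_company_currency("200.00", "USD", "1000"))
```

Even this sketch has to decide on a rounding rule and on how to handle identical source and target currencies, which hints at why a complete implementation grows well beyond a single multiplication.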


Figure 4: Planned SAP HANA roadmap. (a) Supporting local BI; (b) Running SAP BW; (c) Platform for new applications

4. THE HANA ROADMAP

Although, from a technology perspective, the SAP HANA database is based on the SAP BWA system with its outstanding record of successful installations, the generally novel approach of a highly distributed system with an understanding of semantic business models requires time for customers to fully leverage their data management infrastructure. SAP intends to pursue an evolutionary, step-wise approach to introduce the technology to the market.

In a first step, the SAP HANA Appliance is positioned to support local BI scenarios. During this step, customers can familiarize themselves with the technology, exploiting the power of the new solution without taking any risk for existing mission-critical applications. SAP data of ERP systems will be replicated to the SAP HANA Appliance in real-time fashion. Data within the SAP HANA Appliance can be optionally enhanced by external non-SAP data sources and consumed using the SAP BOBJ analytical tools. Aside from new analytical applications on top of HANA, the primary use case here is the acceleration of operational reporting processes directly on top of ERP data.

The plan for the second phase of the roadmap, as shown in Figure 4(b), comprises supporting the full SAP BW application stack. Step by step, the customer is able to move more critical applications (like data warehousing) to the SAP HANA Appliance. This phase also positions the SAP HANA Appliance as the primary persistent storage layer for managed analytical data. Switching the data management platform will be a non-disruptive move from the application’s point of view. In addition to providing the data management layer for a centralized data-warehouse infrastructure, the SAP HANA Appliance is also planned to be used to consolidate local BI data marts, exploiting a built-in multi-tenancy feature.

The third step in the current roadmap, which introduces the SAP HANA Appliance to the market in an evolutionary way, consists of extending the HANA ecosystem with new applications using the modeling and programming paradigm of the SAP HANA database in combination with application servers. Depending on the specific customer setup, long-term plans are to put HANA also under the classical SAP ERP software stack.

To summarize, the basic steps behind the HANA roadmap are designed to integrate with customers’ SAP installations without disrupting existing software landscapes. Starting small with local BI installations and then putting the complete BW stack on top of HANA, in combination with a framework to consolidate local BI installations, is considered a cornerstone in the SAP HANA roadmap.

5. SUMMARY

Providing efficient solutions for enterprise-scale applications requires a robust and efficient data management and processing platform with specialized support for transactional, analytical, graph traversal, and text retrieval processing. Within the SAP HANA Appliance, the HANA database represents the first step towards a new generation of database systems designed specifically to provide answers to questions raised by demanding enterprise applications. The SAP HANA database, therefore, should not be compared to classical SQL or typical key-value, document-centric, or graph-based NoSQL databases. HANA is a flexible data storage, manipulation, and analysis platform, comprehensively exploiting current trends in hardware to achieve outstanding query performance and throughput at the same time. The different engines within the distributed data processing framework provide an adequate solution for different application requirements. In this paper, we outlined our overall idea of the SAP HANA database, sketched out its general architecture, and finally gave some examples to illustrate how an SAP HANA database positions itself “Beyond SQL” by natively supporting performance-critical application logic as an integral part of the database engine.



6. ACKNOWLEDGMENTS

We would like to express our sincere thanks to all of our NewDB colleagues for making the HANA story a reality. We also would like to thank Glenn Pauley, SIGMOD Record Editor for Industrial Perspectives, for his helpful comments.

7. REFERENCES

[1] P. Grosse, W. Lehner, T. Weichert, F. Färber, and W.-S. Li. Bridging two worlds with RICE. In VLDB Conference, 2011.
[2] B. Jaecksch, F. Färber, F. Rosenthal, and W. Lehner. Hybrid Data-Flow Graphs for Procedural Domain-Specific Query Languages. In SSDBM Conference, pages 577–578, 2011.
[3] B. Jaecksch, W. Lehner, and F. Färber. A plan for OLAP. In EDBT Conference, pages 681–686, 2010.
[4] J. Krüger, M. Grund, C. Tinnefeld, H. Plattner, A. Zeier, and F. Faerber. Optimizing Write Performance for Read Optimized Databases. In DASFAA Conference, pages 291–305, 2010.
[5] H. Plattner and A. Zeier. In-Memory Data Management: An Inflection Point for Enterprise Applications. Springer, Berlin Heidelberg, 2011.
[6] J. Schaffner, B. Eckart, D. Jacobs, C. Schwarz, H. Plattner, and A. Zeier. Predicting In-Memory Database Performance for Automating Cluster Management Tasks. In ICDE Conference, pages 1264–1275, 2011.
[7] C. Weyerhäuser, T. Mindnich, F. Färber, and W. Lehner. Exploiting Graphic Card Processor Technology to Accelerate Data Mining Queries in SAP NetWeaver BIA. In ICDM Workshops, pages 506–515, 2008.
