Big Data Analytics 2014
:Trends
First Mover Advantage
:Think Tank
Anti Financial Crime Management
:Practice
OTTO, PAYD, dm drogerie
:Contents
Preface
Executive summary
Leading edge knowledge
Trends
First Mover Advantage
Think Tank
On the data treadmill
Hype or disruption?
Anti Financial Crime Management
Tools
Getting started with Big Data
Cross-check
Practice
OTTO: Automated decisions
Viewpoint
Clarifying legal issues
References
Glossary
Publication data
Disclaimer: All information in this booklet has been carefully researched. The
editor, publisher and distributor accept no liability for the accuracy and
completeness of the content or for any interim changes.
March 2014
Steria Mummert Consulting GmbH
Hans-Henny-Jahnn-Weg 29, 22085 Hamburg
ISBN: 978-3-89981-387-6
:Preface
<<
Huge opportunities arise as a result
of the intelligent handling of very
large amounts of data. Thus, targeted
customer groups experience a
paradigm change: using Big Data,
companies like ours can make the
right offer to clients at the right time
through the right channels independent of the environment or
the hitherto commonly-used online
targeting methods.
Bernhard Brugger, CEO Central
Europe, PAYBACK GmbH
>>
Using Best Practice, the Big Data Analytics
management compass reveals the possibilities that
are opened up by new data analysis methods.
However, these are still early days for the subject,
and company decision-makers need to throw their
own ideas into the mix. In principle, every sector and
business segment can use Big Data. You are limited
only by your own imagination.
<<
Big Data has been integrated into
the logistics sector for a long time.
A good example of this is our
Resilience 360 solution, in which
the analysis of aggregated data
helps to improve supply chains and
protect them against disruption.
This ensures smooth operations
and improves customer
satisfaction.
>>
The benchmarks of value-orientated company
management form seven central management
disciplines. Each article in this document contains a
management compass with the relevant disciplines
highlighted. The cross-check on page 14 provides a
general overview.
Big Data Analytics provides new insights into your
business, some of which may be counter-intuitive. In
this sense, Big Data can really open our eyes. As
Goethe said, you only see what you know. We hope
you find this booklet interesting.
1 Cost Management
2 Transformation Management
3 Process Management
4 Innovation Management
5 Customer Management
6 Cooperation Management
7 Risk Management
:Executive summary
1 : Management recommendation
Hype or revolution: either way, you should use the new
methods and technologies for analysing large volumes of
diverse, polystructured data. As best practices
from different sectors show, Big Data has the
potential to create both new business models and new
competitors. As a disruptive technology, Big Data
may unbalance existing markets. It is therefore
worth taking a closer look.
Big Data delivers specific information with commercial
relevance. Thus banks, for example, can plan which
customers they should contact, when and about what
financial product. If specific information is brought
together, more precise statements about the
creditworthiness of people or companies can be
made. Attempted insurance fraud can be discovered
using specific data patterns, etc.
Data scientists working with Big Data take a different
approach from traditional analysts. Instead of first gaining
a causal understanding of relationships and then
testing it against reality, they let the data speak
for itself by applying intelligent algorithms to
large volumes of highly varied data. New knowledge
arises through creativity and experimentation,
through testing hypotheses, or by chance.
Big Data goes beyond Business Intelligence (BI). It
requires new technologies, as well as new forms of
analysis and presentation, of which many IT
departments have had no experience. Compared to
BI, Big Data has greater tool flexibility. Companies
can assemble their toolbox individually and integrate
existing BI applications into it. Much of this is still in
development.
2 : Management recommendation
Big Data supports the control of fast-growing data
volumes. It helps with in-depth analysis of these data
volumes and in the rapid production of
recommendations for your business. Without suitable
analysis tools, this would be like looking for a needle
in a haystack.
In Big Data applications, the processing of datasets is
spread among computer clusters. This also enables
the efficient analysis of large volumes of data. One
consequence of this is that data specialists do not
have to take any statistical samples for their analyses,
but can work with the entire data stock. This makes
prognoses more reliable.
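The difference between sampling and working with the entire data stock can be sketched in a few lines. The example below is only an illustration: the records, the flagged ids and the function names are made up, and a thread pool stands in for the nodes of a computer cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def make_records(n, flagged):
    """Build n toy records; ids listed in `flagged` carry the rare pattern."""
    return [{"id": i, "suspicious": i in flagged} for i in range(n)]


def partition(records, parts):
    """Split the records into `parts` roughly equal slices."""
    size = -(-len(records) // parts)  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def count_suspicious(chunk):
    """Work carried out independently on one partition."""
    return sum(1 for record in chunk if record["suspicious"])


def full_scan(records, parts=4):
    """Count over every record by adding up the per-partition counts."""
    with ThreadPoolExecutor(max_workers=parts) as pool:
        return sum(pool.map(count_suspicious, partition(records, parts)))


def sample_estimate(records, step):
    """Extrapolate the total from a sample of every `step`-th record."""
    return count_suspicious(records[::step]) * step


# Hypothetical data: three rare cases hidden in 10,000 records.
records = make_records(10_000, flagged={1_234, 5_678, 9_001})
print("full scan:", full_scan(records))
print("1% sample estimate:", sample_estimate(records, 100))
```

In this toy data the sample happens to contain none of the three rare cases, so the estimate is zero, while the partitioned scan over all records finds every one of them.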
3 : Management recommendation
Use additional data sources to gain a deeper
understanding of your business. As well as analysing
structured data from your operational systems, Big
Data enables you to consider unstructured data like
text, speech, photos and videos from internal and
external sources. For example, an analysis of social
media comments can teach you more about your
customers' needs.
The bulk of existing data is unstructured. A content-based classification through textual analysis
transforms unstructured data into a structured form.
This is how quantitative and qualitative analyses
become possible. For instance, companies already
use unstructured social media data in order to gain
advance warning of market trends and reputational
risks.
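A content-based classification of this kind can be illustrated with a deliberately simple keyword approach. The categories, keywords and comments below are invented for the example; real projects would typically use far richer text analysis.

```python
import re
from collections import Counter

# Hypothetical categories and keywords, chosen only for illustration.
CATEGORIES = {
    "delivery": {"delivery", "late", "shipping", "parcel"},
    "price": {"price", "expensive", "cheap", "discount"},
    "quality": {"broken", "quality", "defect", "faulty"},
}


def classify(comment):
    """Turn one free-text comment into a structured record."""
    words = set(re.findall(r"[a-z]+", comment.lower()))
    hits = {name: len(words & keywords) for name, keywords in CATEGORIES.items()}
    best = max(hits, key=hits.get)
    category = best if hits[best] > 0 else "other"
    return {"text": comment, "category": category}


comments = [
    "My parcel arrived two weeks late",
    "Far too expensive for what you get",
    "The zip was broken after one day",
    "Love the new colours",
]

records = [classify(comment) for comment in comments]
counts = Counter(record["category"] for record in records)
print(counts)
```

Once every comment carries a category, the free text can be counted, filtered and tracked over time like any other structured field.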
4 : Management recommendation
Test the possibilities of Big Data for real-time data
analysis. This is especially important if you offer
digital and mobile services, for which you must react
quickly to customer wishes to gain a competitive
advantage. Big Data Analytics can also be integrated
into business processes in order to make decisions
automatically.
Processes supported by Big Data can provide
customers with timely offers that are suitable for their
situations, based on the location data of their mobile
devices or their digital surfing patterns. Credit card
fraud attempts and the like can be automatically
stopped.
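How an automated decision might sit inside a payment process can be sketched with a single rule. The rule below (too many uses of one card within a short time window) and its thresholds are assumptions made for this illustration, not a description of any real fraud system.

```python
from collections import defaultdict, deque


class VelocityRule:
    """Stop a card that is used too often within a short time window."""

    def __init__(self, max_tx=3, window_seconds=60):
        self.max_tx = max_tx
        self.window = window_seconds
        # card id -> timestamps of recently accepted transactions
        self.recent = defaultdict(deque)

    def allow(self, card_id, timestamp):
        """Return True if the transaction may proceed, False to stop it."""
        seen = self.recent[card_id]
        # Forget transactions that have dropped out of the window.
        while seen and timestamp - seen[0] >= self.window:
            seen.popleft()
        if len(seen) >= self.max_tx:
            return False
        seen.append(timestamp)
        return True


rule = VelocityRule(max_tx=3, window_seconds=60)
# Hypothetical event stream: (card id, time in seconds).
events = [("card-A", 0), ("card-A", 10), ("card-A", 20),
          ("card-A", 30), ("card-B", 31), ("card-A", 95)]
decisions = [rule.allow(card, ts) for card, ts in events]
print(decisions)
```

The fourth use of card-A within one minute is stopped automatically; once the window has passed, the card is accepted again.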
5 : Management recommendation
The 3Vs of Big Data
Volume (data volume): Since IT-supported business processes continuously produce data, more and more
companies and institutions are holding gigantic, petabyte-sized data mountains.
Alongside internal data, a variety of external data sources, devices and machines are
producing a constant data stream.
Variety (data variety): Data comes from many sources and is of various kinds. In broad terms, it can be
classified as unstructured, semi-structured and structured. The technological
evaluation of such polystructured data through textual analysis or image recognition
has improved massively.
Velocity (data speed): Data-supported processes require data collection, integration and analysis to be
carried out increasingly quickly - often in real time - in order to reach relevant
conclusions or induce business actions. Furthermore, data structures, sources and
interfaces are changing very quickly.
:Trends
Eric Czotscher is Head of Research/Market research and editor-in-chief of the
F.A.Z. Institute for Management, Market and Media Information.
: Think Tank
Jens Kaufmann is on the staff of the Department of Business IT, esp. Business Intelligence,
at the Mercator School of Management, University of Duisburg-Essen.
[Figure: Big Data. Recoverable labels: Velocity (on time, in real time, streaming); Variety]
: Recognising patterns
Graphic representations of complicated situations make analysis easier
: Profitable Predictions
Most sectors could benefit from Big Data - the only
question is when. One of the most exciting and
biggest challenges for many companies now appears
to be within reach: the ability to look into the
future. Under the heading of Predictive Analytics,
today's algorithms can, for instance, not only check
several tens of thousands of fraud cases or loans that
have been carefully prepared and fed into the
database, but also include individual characteristics in
the check in order to make predictions about possible
fraudsters.
In Germany specifically, even where strict data
protection laws aim to prevent the transparent citizen,
there is wide scope for applying Predictive Analytics.
Telecommunications providers, for example, can predict
their customers' general willingness to change providers
on the basis of millions of items of connection data and
entries in social media; they can then very quickly
identify changes in user behaviour in order to make early
contact with potential leavers. According to Bitkom (the
German association for information technology,
telecommunication and new media), Telecom Italia
continuously evaluates a pool of 500 million items of
connection data for this purpose.
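Identifying a change in user behaviour can be sketched as a comparison of a customer's recent usage with their own history. The usage figures, the threshold and the function name below are hypothetical; the providers mentioned above will use far richer models.

```python
from statistics import mean, stdev


def behaviour_change(weekly_minutes, recent_weeks=2, threshold=2.0):
    """Flag a customer whose recent usage drops sharply below their baseline."""
    baseline = weekly_minutes[:-recent_weeks]
    recent = weekly_minutes[-recent_weeks:]
    if len(baseline) < 2:
        return False  # not enough history to judge
    spread = stdev(baseline)
    if spread == 0:
        return mean(recent) < mean(baseline)
    z = (mean(recent) - mean(baseline)) / spread
    return z <= -threshold  # only a sharp drop counts as a warning sign


# Made-up weekly call minutes for two customers.
usage = {
    "customer-1": [120, 130, 125, 118, 122, 127, 60, 40],
    "customer-2": [80, 85, 78, 82, 81, 79, 83, 80],
}
at_risk = [name for name, minutes in usage.items() if behaviour_change(minutes)]
print(at_risk)
```

Here only the first customer, whose usage collapses in the last two weeks, would be put forward for early contact.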
: Data protection
In spite of strict regulations, Big Data analyses are possible
: Credit cards
Big Data is quicker than the fraudsters
[Figure. Recoverable labels: analysis demand not met; analysis demand comfortably met; IT (technology for analysis)]
: Think Tank
Hype or disruption?
The term Big Data is on everyone's lips. The prospect of one's business being
able to garner valuable gems of information from a host of varied data makes
Big Data attractive. However, is there really something completely new behind
the concept, something that differs from current approaches to data
analysis and opens up new possibilities? Or is this just more marketing
hype?
[Figure. Recoverable labels: Market; Company; Market analysis; More efficient processes and management; Intelligent products; Mass Customisation]
Mass Customisation:
Analysis of customer behaviour (360-degree view) for more
customised customer communications or tailored services (e.g.
Next Best Offer, campaign management, use-based billing) and
Fraud Management.
Market analysis:
Identifying opinion-shapers and trends relating to companies and
products (e.g. advertising impact, brand perception, sentiment
analysis).
Intelligent products:
Machines or devices that regulate themselves using sensor analysis
(e.g. driverless cars, self-regulating houses).
: Tools
Setting targets
Why does the department concerned want to introduce the data analysis? Which challenges can it overcome with
Big Data that cannot be managed by using existing Business Intelligence solutions (see page 5, the 3Vs of Big
Data)? What Best Practices from other companies could provide ideas for introducing Big Data?
Which commercial expectations are associated with a Big Data solution? Will additional revenue be generated or
costs saved? Will risks be reduced? Will product quality or service levels be improved? Precise quantitative or
qualitative targets will enable a subsequent actual vs target comparison (e.g. Cross-/Up-Selling volumes, customer
churn rate, error ratio) and the calculation of the RoI.
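The actual vs target comparison and the RoI calculation mentioned above come down to simple arithmetic. The sketch below uses the common definition RoI = (gain - cost) / cost; all figures are invented for illustration.

```python
def roi(gain, cost):
    """Return on investment as a fraction of cost: (gain - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (gain - cost) / cost


def target_gap(actual, target, lower_is_better=False):
    """Relative deviation from target; a positive value means the target was beaten."""
    if target == 0:
        raise ValueError("target must be non-zero")
    gap = (actual - target) / abs(target)
    return -gap if lower_is_better else gap


# Hypothetical project figures in EUR.
project_cost = 200_000
additional_revenue = 150_000
cost_savings = 110_000
print("RoI:", roi(additional_revenue + cost_savings, project_cost))

# Hypothetical churn rate in percent: target 8.0, actual 6.0 (lower is better).
print("Churn vs target:", target_gap(6.0, 8.0, lower_is_better=True))
```

Fixing such targets before the project starts is what makes the later comparison meaningful.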
Data situation
Which data sources are available for the desired analyses? What additional data can be used? Examples: internal
data produced by existing business processes but not used; external data from suppliers and customers; external
data from private and public data providers.
Is data from social media (posts, text, photos, videos) used to obtain meaningful results? Marketing and
distribution can benefit from insights into (potential) customers' tastes and purchasing behaviour, as can product
development and reputation management.
What granularity and quality does the available data have? Are there possibilities for improvement here?
How are data protection and data security guaranteed in the evaluation of the intended data sources (e.g.
anonymisation)? Are all legal and internal requirements met?
IT infrastructure
Data systems should be as simple, robust and error-tolerant as possible and allow for expansion to new functions.
Ad hoc requests should also be possible. When real-time processing is necessary, the latency period must be low.
Systems often used for Big Data are Hadoop (parallel data processing shared across computer clusters using
MapReduce) or in-memory computing.
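The MapReduce principle can be illustrated on a single machine with a word count. This is a teaching sketch of the map, shuffle and reduce phases, not Hadoop's API; the documents and function names are made up.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word, 1) for word in document.lower().split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    """Run the three phases in sequence over all documents."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


documents = ["big data big insight", "data beats opinion"]
print(map_reduce(documents))
```

In a real cluster, each document (or block of data) would be mapped on a different machine, and the reduce step would combine the partial results.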
Can an existing BI platform be used as a basis for the first Big Data project? Can new analytical tools be linked
to it? The advantage of this is a uniform solution for BI and Big Data.
Are there cost-efficient solutions to close gaps? Which resources can be used in the Public Cloud (e.g. storage
capacity, processor capacities, data analysis tools)?
IT alignment
How can the cooperation between IT and other departments be optimised in order to increase the company's
agility? The company management should expressly support the Big Data project, since the results may make it
necessary to change the processes and the business model.
Complex results should be represented graphically, so that they are easier to grasp and practically useful.
Implementation
How is the timely use of new information in the business guaranteed? Which processes must be adapted? Is it
possible to automate processes in connection with Big Data?
: Cross-check
COST MANAGEMENT
RISK MANAGEMENT
COOPERATION MANAGEMENT
Big Data does not just provide added value for your
own business. Cooperation partners and customers
can also benefit from the results of the analysis. For
instance, DHL offers a predictive tool for companies
that estimates future sales in a particular region.
Other data, including data on goods delivered,
geodata and company code numbers, is also fed into
this process. Conversely, companies should test
which external data from suppliers, customers and
service providers could be useful for their own
analysis.
CUSTOMER MANAGEMENT
TRANSFORMATION MANAGEMENT
PROCESS MANAGEMENT
INNOVATION MANAGEMENT
: Think Tank
Benjamin Rische is Risk Consultant, Finance & Compliance at Steria Mummert Consulting.
[Management compass. Labels: Costs; Transformation; Process; Innovation; Customers; Cooperation; Risk]
[Figure: Substantial savings. Recoverable labels: 10 million customers; 110 million transactions; time: >100 hours (standard system), <12 minutes (in-memory system)]
[Figure: Potential benefit over time. Recoverable labels: Technology transformation; Adaptation of model; Paradigm change; Time savings; Quality improvement; Damage prevention]
3. Paradigm change through improved
transparency and dynamic loss prevention. New
analytical methods and check methods are introduced
as a complement or as a substitute, e.g. customer
segmentation based on transaction behaviour.
Detection is supplemented with independent pattern
identification mechanisms. Predictive Analytics
methods allow suspicious items to be collected and
fraud to be predicted. Risky transactions can be
subjected to a second checking and authorisation
process before being carried out. This prevents
losses. Expertise becomes essential at this point at
the latest, but it can also be built into the first two steps.
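The combination of customer segmentation and a second checking process can be sketched as follows. The segments, their limits and the amounts are hypothetical and only show the shape of such a rule, not an actual Anti Financial Crime model.

```python
from statistics import median

# Hypothetical segments: (name, upper bound of typical amount, limit before a second check).
SEGMENTS = [
    ("low", 100.0, 500.0),
    ("medium", 1_000.0, 5_000.0),
    ("high", float("inf"), 50_000.0),
]


def segment_for(history):
    """Assign a customer to a segment by the median of past transaction amounts."""
    typical = median(history) if history else 0.0
    for name, upper, limit in SEGMENTS:
        if typical <= upper:
            return name, limit


def route(amount, history):
    """Decide before execution: carry out directly or hold for a second check."""
    name, limit = segment_for(history)
    decision = "second_check" if amount > limit else "execute"
    return name, decision


small_spender = [40, 55, 60, 45, 50]
large_spender = [2_000, 3_500, 1_800]
print(route(58, small_spender))
print(route(900, small_spender))
print(route(900, large_spender))
```

The same amount of 900 is held for a second check for one customer and executed directly for another, because it is judged against each customer's own segment.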
This staged procedure allows potential benefits of Big
Data technologies to be accessed gradually, thus
streamlining investment and process changes.
Big Data technologies allow detection, investigation
and prevention, as well as case management and
reporting on different subject areas, to be integrated
into one application. Under the heading Financial
Crime there is now a single solution for
comprehensive MaRisk compliance.
: Practice
Michael Sinn is Director of Category Support at OTTO.
[Figure: Closed loop. Recoverable labels: Trend identification; Early identification; New information sources; Which product will customers want in the future?; Sales optimisation; Flexible pricing; Recommendation engine; Returns management; Efficient stock control; Prediction; Sourcing; Sale/return prediction; Publication management; Planning; Creation of product range; Design of product range; Volume estimates; Commercial control over internal management. Source: OTTO.]
: Cycle-oriented optimisation
: Returns factor
Competitive advantage versus costs
: Lowering the returns rate
Result of Big Data analysis
: Practice
Janina Röttger is Senior Insurance Manager at Steria Mummert Consulting.
Andreas Behrens-Ziegler is Senior Insurance Consultant at Steria Mummert Consulting.
[Figure. Recoverable labels: Segment 2: used cars, with retrofitting of OBU 1); Stock actions; Services I+II: Application Management, Testing, Infrastructure management, BPO Services 2); Technology II: Connection and management of external IT.
1) On-board unit.
2) Business Process Management.
Source: Steria Mummert Consulting]
: Fear of data abuse
Greater trust through transparency
: Practice
Roman Melcher is IT Director at dm-drogerie markt.
: External data
Alongside the daily turnover, the distribution centres'
pallet delivery predictions and parameters specific to
individual branches, such as opening times, are also
included in the plan. Both are needed in order to
predict the staff requirement as accurately as
possible. Even incoming goods have a significant
impact on planning.
In addition, expected turnover is essential for capacity
planning. On days when we expect a higher turnover,
more of our colleagues are in place in order to serve
our customers with the least delay possible.
Furthermore, additional data relevant to capacity is
included, even external data. This data includes
market days, holidays in neighbouring countries, or
perhaps a construction site on an access road. Even
the weather forecast could be taken into account in
the future. Predictions are now so close to reality that
we can base our operational processes on them and
plan for the future. Meanwhile, dm is also applying its
predictive analytics solution in another
area: supplier requirements.
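The idea of deriving a staff requirement from expected turnover and external factors can be sketched with a least-squares line and simple adjustment factors. This is not dm's actual solution; the history, the adjustment factors and all names are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up history for one branch: daily turnover (thousand EUR) and staff hours needed.
turnover = [10, 12, 15, 18, 20]
staff_hours = [40, 46, 55, 64, 70]
intercept, slope = fit_line(turnover, staff_hours)

# Hypothetical adjustment factors for external events.
EXTERNAL_FACTORS = {"market_day": 1.10, "road_works": 0.95}


def forecast_hours(expected_turnover, events=()):
    """Predict staff hours from expected turnover, adjusted for external events."""
    hours = intercept + slope * expected_turnover
    for event in events:
        hours *= EXTERNAL_FACTORS.get(event, 1.0)
    return round(hours, 1)


print(forecast_hours(16))
print(forecast_hours(16, ["market_day"]))
```

Further inputs named in the article, such as pallet deliveries or opening times, would enter a real model as additional variables rather than as fixed factors.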
: Viewpoint
: Data use and data protection
: Open Innovation
Data
den
and
: References
Eric Redmond and Jim R. Wilson: Seven weeks, seven databases - a guide to modern databases and the
NoSQL movement. O'Reilly 2012.
A book for IT developers looking for the right database for their Big Data problem. The authors present seven
open source databases from different fields of application: Redis, Neo4j, CouchDB, MongoDB, HBase, Riak and
PostgreSQL. Readers can test what they learn immediately on their own computers with the help of downloads;
they can also learn how to create platforms from several databases.
Lateral thinkers
Nate Silver: The Signal and the Noise: Why Most Predictions Fail but Some Don't. Heyne 2013.
The statistician Nate Silver shows why predictions by experts often fail and how prediction can be improved
through a more open, self-critical method. He warns against overrating Big Data. The flood of data may obscure
our view of causal relationships. Correlations alone are not enough to make good predictions.
Daniel C. Dennett: Intuition Pumps. Allen Lane 2013 (English).
The cognitive scientist and philosopher Daniel Dennett uses selected tools for thinking to show how people
think, decide and act. His intuition pumps critically examine received wisdom about human behaviour. The
book is a useful companion for data scientists wanting to generate new knowledge from Big Data.
: Glossary
Algorithm
Step-by-step solution to a problem through the
application of precisely defined calculation rules.
Big Data
Methods and technologies for the highly scalable
capture, storage and analysis of polystructured data.
Large volumes of data of highly varied structure and
origin, obtained partly in real time, can be used by Big
Data technologies for complex analyses. Big Data
Analytics includes data mining methods for the
analysis of this data.
Data mining
Systematic use of statistical/mathematical methods in
order to identify patterns and correlations in data sets.
Data scientist
New profession connected with Big Data. Compared
to conventional data analysts, data scientists have
more mathematical and technological knowledge, as
well as business knowledge and special creativity.
They are tasked with using data mining and intelligent
algorithms to track down business-relevant
information in the data. Data artists visualise this
information by depicting these complicated
relationships graphically.
Data warehouse
Historical operational data from various data silos is
brought together in databases so that it can be
analysed (data mining) and used to prepare
management decisions. In hybrid models, the data
warehouse sets the context for Big Data analyses.
Geofencing
Linking of geoinformation systems and localisation of
objects in order to act, if a specific object leaves or
enters a pre-defined area.
Hadoop
Open source data system by the Apache Software
Foundation, based on MapReduce. Large analysis
tasks are split into small jobs for distributed computer
clusters to solve in parallel.
MapReduce
Algorithm that shares the processing of large datasets
among several computers working in parallel. First,
the jobs are shared among various computer clusters
(map); next, the individual results are combined into an
overall result (reduce). Different data sources
and formats can be used in the process.
MaRisk
Circular 10/2012 (BA) of the German regulatory
authority BaFin, setting out minimum requirements
for risk management by credit institutions and
financial services institutions in Germany.
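mTAN
Mobile transaction authentication number: a one-time
code sent to the customer's mobile phone by SMS to
authorise an online banking transaction. A security
procedure widely used by German banks.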
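The Geofencing entry in this glossary can be illustrated with a small sketch that reports when an object enters or leaves a circular area. The coordinates, the radius and the function names are assumptions made for the example; distances use the standard haversine formula.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


def fence_events(track, centre, radius_km):
    """Report 'enter' or 'leave' whenever consecutive positions cross the fence."""
    events = []
    inside = None
    for lat, lon in track:
        now_inside = haversine_km(lat, lon, centre[0], centre[1]) <= radius_km
        if inside is not None and now_inside != inside:
            events.append("enter" if now_inside else "leave")
        inside = now_inside
    return events


# Hypothetical fence of 5 km around a point in Hamburg and a made-up track.
centre = (53.55, 10.00)
track = [(53.55, 10.00), (53.56, 10.01), (53.70, 10.00), (53.55, 10.02)]
print(fence_events(track, centre, radius_km=5.0))
```

Each reported event is the point at which a business action, such as a notification or an offer, could be triggered.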
: Current studies
Managementkompass Demographiemanagement
Demographic change is draining the available labour force.
Companies therefore need to orient their staff planning over the
long term. Topics: strategic staff planning, talent management, life-phase-oriented staff policies, Generation Z. With articles by
Fraport, the City of Munich and Deutsche Bank.
Managementkompass Customer Centricity
Customer focus means more than excellent services. In the
internet, mobile and social media era, the design of customer
relations has reached a new dimension with a host of possibilities
and requirements. With articles by the Generali insurance group,
SCHUFA and Fidor Bank.
Branchenkompass 2013 Versicherungen
Recent survey of 100 top decision-makers in 100 of the largest
insurance firms and brokers on sector trends, as well as growth
strategies and investment goals up to 2016. Core topics:
marketing and customer management, new regulations and M&A.
Contact:
Steria Mummert Consulting GmbH
Corporate Communications
Birgit Eckmüller
Hans-Henny-Jahnn-Weg 29
22085 Hamburg
Phone: 0 40 / 2 27 03 - 52 19
Fax: 0 40 / 2 27 03 - 12 19
E-Mail: [email protected]
F.A.Z.-Institut für Management-,
Markt- und Medieninformationen GmbH
Eric Czotscher
P.O. Box 20 01 63
60605 Frankfurt am Main
Phone: 0 69 / 75 91 - 32 75
Fax: 0 69 / 75 91 - 19 66
E-Mail: [email protected]