DSS Chapter Six
6.1. INTRODUCTION
Today, many organisations are collecting, storing, and analysing massive
amounts of data. This data is commonly referred to as “big data” because of its
volume, the velocity with which it arrives, and the variety of forms it takes. Big
data is creating a new generation of decision support data management.
Businesses are recognising the potential value of this data and are putting the
technologies, people, and processes in place to capitalise on the opportunities. A
key to deriving value from big data is the use of analytics. Collecting and storing
big data creates little value; it is only data infrastructure at this point. It must be
analysed and the results used by decision makers and organisational processes in
order to generate value.
Volume
Volume refers to the magnitude of the data being generated and collected. It is increasing at an ever faster rate, from terabytes to petabytes (1 petabyte = 1024 terabytes). As storage capacities increase, what cannot be captured and stored today will become possible in the future. Classifying data as Big Data on the basis of volume alone is relative: it depends on the type of data generated and on the point in time. In addition, the type of data, often referred to as Variety, helps to define Big Data. Two datasets of the same volume but of different types, for instance text and video, may require different data management technologies.
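As a quick illustration of these scales, the short Python sketch below converts between binary (1024-based) storage units; the unit list and helper function are invented purely for the example.

```python
# Illustrative sketch of the binary storage units used above
# (1 petabyte = 1024 terabytes). The unit list is invented for the example.
UNITS = ["bytes", "kilobytes", "megabytes", "gigabytes", "terabytes", "petabytes"]

def to_bytes(amount, unit):
    """Convert an amount in the given unit to bytes."""
    return amount * (1024 ** UNITS.index(unit))

# 1 petabyte expressed in terabytes: prints 1024.0
print(to_bytes(1, "petabytes") / to_bytes(1, "terabytes"))
```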
Variety
Big Data is generated in multiple varieties. Compared with traditional data such as phone numbers and addresses, the newest data arrives as photos, videos, audio and many other formats, with the result that about 80% of the data is completely unstructured.
Veracity
Veracity refers to the degree of reliability that the data has to offer. Since a major part of the data is unstructured and much of it is irrelevant, Big Data systems need ways to filter such data out or to transform it, because reliable data is crucial to business decisions.
Value
Value is the issue we ultimately need to concentrate on. What matters is not simply the amount of data that we store or process, but the amount of valuable, reliable and trustworthy data that is stored, processed and analysed to yield insights.
Velocity
Last but not least, Velocity plays a role as important as the others: there is no point in investing heavily only to end up waiting for the data. A defining requirement of Big Data is therefore to deliver data on demand and at a fast pace.
Descriptive analysis
With the help of descriptive analysis, we analyse and describe the features of a dataset. It deals with the summarisation of information. Descriptive analysis, when coupled with visual analysis, provides a comprehensive picture of the data. In descriptive analysis we work with past data to draw conclusions, typically presenting the results in the form of dashboards. In business, descriptive analysis is used to determine key performance indicators (KPIs) that evaluate the performance of the organisation.
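To make this concrete, here is a minimal descriptive-analysis sketch in Python using pandas; the sales figures and the KPI chosen (average monthly revenue) are invented for illustration.

```python
# Minimal descriptive-analysis sketch using pandas. The sales figures and
# the KPI chosen (average monthly revenue) are invented for illustration.
import pandas as pd

sales = pd.DataFrame({
    "month":   ["Jan", "Feb", "Mar", "Apr"],
    "revenue": [12000, 15500, 9800, 14200],
})

# Summarise the past data: count, mean, spread, extremes.
print(sales["revenue"].describe())

# A simple KPI: average monthly revenue over the period.
print(f"Average monthly revenue (KPI): {sales['revenue'].mean():.2f}")
```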
Predictive analysis
With the help of predictive analysis, we estimate future outcomes. By analysing historical data, we are able to forecast what is likely to happen; predictive analysis builds on descriptive analysis to generate predictions about the future. Advances in technology and machine learning make it possible to obtain such predictive insights. Predictive analytics is a complex field that requires a large amount of data, the skilled implementation of predictive models, and careful tuning of those models to obtain accurate predictions. This in turn requires a workforce well versed in machine learning to develop effective models.
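The sketch below illustrates the idea on a toy scale, assuming scikit-learn is available; the monthly sales figures are invented, and a real predictive model would need far more data and careful tuning.

```python
# Toy predictive-analysis sketch: fit a model to historical data and
# forecast the next period. Assumes scikit-learn; the monthly sales
# figures are invented, and a real model would need far more data.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1], [2], [3], [4], [5], [6]])   # month index
y = np.array([100, 110, 125, 130, 148, 160])   # past units sold

model = LinearRegression().fit(X, y)

# Predict month 7 from the trend learned on the historical data.
print(model.predict(np.array([[7]])))
```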
Improve efficiency
Organisations can use predictive analytics to effectively forecast inventory levels and required production rates, while also using past data to estimate potential production failures. They can then use these estimates to prevent the same errors from occurring.
Reduce risk
Depending on the industry, predictive analytics can contribute considerably to risk reduction. Sectors such as finance and insurance use predictive analytics to help construct a valid picture of a person or business they are screening, based on all the data available to them. This produces a more reliable interpretation of that person, business or incident, which can then be used to make sensible, effective decisions.
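A hedged sketch of how such screening might look in code: a simple logistic-regression scorer trained on labelled historical cases. The features, figures and labels are invented for illustration and are not from any real screening system.

```python
# Hypothetical risk-screening sketch: a logistic-regression scorer trained
# on labelled historical cases. Features (income, debt) and labels are
# invented; 1 means the case turned out to be high risk.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[55, 5], [30, 20], [80, 2], [25, 25], [60, 10], [20, 30]])
y = np.array([0, 1, 0, 1, 0, 1])

model = LogisticRegression().fit(X, y)

# Estimated probability that a new applicant is high risk.
print(model.predict_proba(np.array([[40, 15]]))[0, 1])
```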
Detect fraud
One of the most beneficial uses of predictive analytics is fraud detection. The technique is particularly well suited to fraud detection and prevention because it recognises patterns in behaviour. By tracking changes in behaviour on a site or network, it can spot anomalies that may indicate a threat or fraud, which can then be flagged and prevented.
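One common way to spot such anomalies is an unsupervised outlier detector. The sketch below uses scikit-learn's IsolationForest on invented transaction amounts; real fraud detection would use far richer behavioural features.

```python
# Anomaly-detection sketch using scikit-learn's IsolationForest on invented
# transaction amounts. A prediction of -1 flags a transaction as anomalous.
import numpy as np
from sklearn.ensemble import IsolationForest

amounts = np.array([[25], [30], [22], [28], [27], [31], [950]])  # one outlier

detector = IsolationForest(contamination=0.15, random_state=0).fit(amounts)
print(detector.predict(amounts))  # the 950 transaction is flagged with -1
```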
NB: Research more practical examples that explain how industry has benefited from predictive analytics.
In-memory analytics
In-memory BI tools also move data between disk (e.g., in the data warehouse) and local desktop memory so that the most frequently used data (so-called "hot data") is available in memory. Some applications are especially well suited for in-memory analytics. For example, with OLAP (often incorporated in reports and dashboards/scorecards), users want to "slice and dice" data to look at the business from different perspectives, such as comparing this year's and last year's sales in different locations. When all of the data is in memory, this analysis can be done very quickly, providing analysis at "the speed of thought". Not all applications require, or are well suited for, in-memory analytics. Some are not good candidates: for example, a market basket analysis that runs weekly, or applications that require more data than current in-memory technologies can hold. And while the cost and reliability of in-memory technology continue to improve, it is still relatively expensive and prone to failure.
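The pandas sketch below imitates this kind of in-memory "slice and dice" on a toy dataset; the locations and sales figures are invented for illustration.

```python
# In-memory "slice and dice" sketch with pandas: compare this year's and
# last year's sales across locations. Locations and figures are invented.
import pandas as pd

sales = pd.DataFrame({
    "year":     [2019, 2019, 2020, 2020],
    "location": ["North", "South", "North", "South"],
    "sales":    [120, 90, 150, 85],
})

# Pivot the data: one in-memory "view" of the cube, locations by years.
view = sales.pivot_table(index="location", columns="year", values="sales")
print(view)
print(view[2020] - view[2019])  # year-on-year change per location
```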
Dealing with data growth
Much of that data is unstructured, meaning that it does not reside in a database. Documents, photos, audio, videos and other unstructured data can be difficult to search and analyse. To deal with data growth, organisations are turning to a number of different technologies. When it comes to storage, converged and hyper-converged infrastructure and software-defined storage can make it easier for companies to scale their hardware. And technologies such as compression, deduplication and tiering can reduce the amount of space required and the costs associated with big data storage.
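As a small illustration of how deduplication works in principle, the sketch below hashes file contents so that duplicate copies can be stored only once; the file names and contents are invented.

```python
# Illustration of content-based deduplication: identical contents hash to
# the same digest, so only one copy needs storing. Files here are invented.
import hashlib

files = {
    "report_v1.doc":   b"quarterly figures ...",
    "report_copy.doc": b"quarterly figures ...",  # duplicate content
    "photo.jpg":       b"\x89 binary image data",
}

seen = {}
for name, content in files.items():
    digest = hashlib.sha256(content).hexdigest()
    if digest in seen:
        print(f"{name} duplicates {seen[digest]}: store a reference instead")
    else:
        seen[digest] = name
```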
Generating insights in a timely manner
Of course, organisations do not just want to store their big data — they want to
use that big data to achieve business goals. According to the NewVantage
Partners survey, the most common goals associated with big data projects
included the following:
All of those goals can help organisations become more competitive — but only if
they can extract insights from their big data and then act on those insights
quickly.
Integrating disparate data sources
Big data comes from many different places: enterprise applications, social media streams, email systems, employee-created documents, etc. Combining all that data and reconciling it so that it can be used to create reports can be incredibly difficult. Vendors offer a variety of Extract, Transform and Load (ETL) and data integration tools designed to make the process easier, but many enterprises say that they have not yet solved the data integration problem.
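A minimal ETL sketch in Python, assuming pandas rather than a real vendor tool: extract records from two sources with inconsistent schemas, transform them into a common shape, and load them into one integrated table. All names and data are invented.

```python
# Minimal ETL sketch with pandas, not a real vendor tool. Extract records
# from two sources with inconsistent schemas, transform them to a common
# shape, and load them into one table. All names and data are invented.
import pandas as pd

# Extract: two sources naming the same fields differently.
crm = pd.DataFrame({"cust_id": [1, 2], "Email": ["a@x.com", "b@x.com"]})
erp = pd.DataFrame({"customer": [2, 3], "email_addr": ["b@x.com", "c@x.com"]})

# Transform: reconcile the schemas so the records can be combined.
crm = crm.rename(columns={"cust_id": "customer_id", "Email": "email"})
erp = erp.rename(columns={"customer": "customer_id", "email_addr": "email"})

# Load: one integrated table, deduplicated on the shared key.
print(pd.concat([crm, erp]).drop_duplicates("customer_id"))
```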
In response, many enterprises are turning to new technology solutions. In the IDG
report, 89 percent of those surveyed said that their companies planned to invest in
new big data tools in the next 12 to 18 months. When asked which kind of tools
they were planning to purchase, integration technology was second on the list,
behind data analytics software.
Validating data
Closely related to the idea of data integration is the idea of data validation. Organisations often get similar pieces of data from different systems, and the data in those different systems does not always agree. For example, the e-commerce system may show daily sales at a certain level while the enterprise resource planning (ERP) system reports a slightly different number. Or a hospital's electronic health record (EHR) system may have one address for a patient, while a partner pharmacy has a different address on record.
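A validation check of the kind just described can be as simple as comparing totals from the two systems and flagging disagreements, as in this illustrative pandas sketch (the figures are invented):

```python
# Illustrative validation check: compare daily sales totals reported by the
# e-commerce and ERP systems and flag days where they disagree. Figures
# are invented for the example.
import pandas as pd

ecommerce = pd.Series({"2020-03-01": 10500, "2020-03-02": 9800})
erp       = pd.Series({"2020-03-01": 10500, "2020-03-02": 9450})

diff = (ecommerce - erp).abs()
for day, gap in diff[diff > 0].items():
    print(f"{day}: systems disagree by {gap}; the records need reconciling")
```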
The process of getting those records to agree, as well as making sure the records
are accurate, usable and secure, is called data governance. Solving data
governance challenges is very complex and usually requires a combination of
policy changes and technology. Organisations often set up a group of people to
oversee data governance and write a set of policies and procedures. They may
also invest in data management solutions designed to simplify data governance
and help ensure the accuracy of big data stores — and the insights derived from
them.
Securing big data
Big data stores are attractive targets for hackers and for advanced persistent threats (APTs). However, most organisations seem to believe that their existing data security methods are sufficient for their big data needs as well.
Organisational resistance
It is not only the technological aspects of big data that can be challenging —
people can be an issue too. Organisations typically confront the following issues in their bid to create a data-driven culture:
Insufficient organisational alignment
Lack of middle management adoption and understanding
Business resistance or lack of understanding