Big Data and BDA
Massive sets of unstructured/semi-structured data
from Web traffic, social media, sensors, etc.
Information from multiple internal and external
sources:
• Transactions
• Social media
• Enterprise content
• Sensors
• Mobile devices
In the last minute there were…
• 204 million emails sent
• 61,000 hours of music listened to on Pandora
• 20 million photo views
• 100,000 tweets
• 6 million views and 277,000 Facebook logins
• 2+ million Google searches
• 3 million uploads on Flickr
🞂 Lots of data is being collected and warehoused:
◦ Web data, e-commerce
◦ Purchases at department/grocery stores
◦ Bank/credit card transactions
◦ Social networks
🞂 Data Volume
◦ 44x increase from 2009 to 2020
◦ From 0.8 zettabytes to 35 ZB
🞂 Data volume is increasing exponentially
[Figure: everyone is generating data at scale -
• 12+ TBs of tweet data every day
• 4.6 billion camera phones worldwide
• 30 billion RFID tags today (1.3B in 2005)
• 100s of millions of GPS-enabled devices sold annually
• 25+ TBs of log data every day
• ? TBs of data every day
• 5+ billion people on the Web by end 2020
• 76 million smart meters in 2009… 300M by 2020]
🞂 Relational Data (Tables/Transaction/Legacy
Data)
🞂 Text Data (Web)
🞂 Semi-structured Data (XML)
🞂 Graph Data
◦ Social Network, Semantic Web (RDF), …
🞂 Streaming Data
◦ You can only scan the data once
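To make the one-pass constraint concrete, here is a minimal Python sketch (an illustration, not from the slides): aggregates over a stream must be maintained incrementally, because the stream cannot be rewound and scanned a second time.

```python
# One-pass processing: the stream is consumed as it arrives, so
# aggregates must be updated incrementally rather than recomputed.
def stream_stats(stream):
    count, total = 0, 0.0
    for value in stream:   # each value is seen exactly once
        count += 1
        total += value
    mean = total / count if count else 0.0
    return count, mean

# A generator stands in for a live feed (e.g., sensor readings);
# once consumed, it cannot be replayed.
readings = (x * 0.5 for x in range(1, 1001))
n, mean = stream_stats(readings)
print(n, mean)  # 1000 250.25
```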
[Figure: our known history is captured everywhere - social media, banking, finance, gaming, entertainment, purchases]
🞂 Data is being generated fast and needs to be processed fast
🞂 Online Data Analytics
🞂 Late decisions mean missing opportunities
🞂 Examples
◦ E-Promotions: Based on your current location, your purchase history, and what you like, send promotions right now for the store next to you
Mobile devices
(tracking all objects all the time)
Old Model: Few companies are generating data, all others are consuming data
New Model: All of us are generating data, and all of us are consuming data
- Optimizations and predictive analytics
- Complex statistical analysis
- All types of data, and many sources
- Very large datasets
- More real-time in nature
🞂 Let us use the analogy of a restaurant to understand the problems associated with Big Data and how Hadoop solved them.
There are three components of Hadoop:
🞂 Hadoop HDFS - the Hadoop Distributed File System (HDFS) is the storage unit of Hadoop.
🞂 Hadoop MapReduce - Hadoop MapReduce is the processing unit of Hadoop.
🞂 Hadoop YARN - Hadoop YARN is the resource management unit of Hadoop.
🞂 Data is stored in a distributed manner
in HDFS.
🞂 There are two components of HDFS – name
node and data node. While there is only one
name node, there can be multiple data nodes.
🞂 Master and slave nodes form the
HDFS cluster. The name node is called the
master, and the data nodes are called the
slaves.
🞂 The name node is responsible for
the workings of the data nodes. It also
stores the metadata.
🞂 The data nodes read, write, process, and replicate the data. They also send signals, known as heartbeats, to the name node. These heartbeats show the status of the data node.
🞂 Consider that 30TB of data is loaded into the name node. The name node distributes it across the data nodes, and this data is replicated among the data nodes. As the slide's diagram shows, the blue, grey, and red data blocks are replicated among the three data nodes.
🞂 Replication of the data is performed three times
by default. It is done this way, so if a commodity
machine fails, you can replace it with a new
machine that has the same data.
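As a toy illustration of this placement idea (the names and the round-robin policy below are invented for the sketch; this is not the real HDFS placement code), the following Python snippet mimics the name node's metadata, mapping each block to three distinct data nodes:

```python
import itertools

# Toy model of HDFS block placement: the name node keeps metadata
# mapping each block to the data nodes holding its replicas, and no
# two replicas of a block land on the same data node.
REPLICATION = 3                      # HDFS default replication factor
DATA_NODES = ["dn1", "dn2", "dn3", "dn4"]

def place_blocks(blocks, nodes, replication=REPLICATION):
    rotation = itertools.cycle(range(len(nodes)))
    metadata = {}
    for block in blocks:
        start = next(rotation)
        # choose `replication` distinct nodes, wrapping around the cluster
        metadata[block] = [nodes[(start + i) % len(nodes)]
                           for i in range(replication)]
    return metadata

# A 30 TB file split into 128 MB blocks would yield roughly 245,000
# blocks; placing just three shows the shape of the metadata.
print(place_blocks(["blk_1", "blk_2", "blk_3"], DATA_NODES))
# {'blk_1': ['dn1', 'dn2', 'dn3'], 'blk_2': ['dn2', 'dn3', 'dn4'],
#  'blk_3': ['dn3', 'dn4', 'dn1']}
```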
🞂 It is the processing unit of Hadoop. In the MapReduce approach, the processing is done at the slave nodes, and the final result is sent to the master node.
🞂 Instead of moving the data, a small piece of code is sent to process the entire dataset. This code is usually very small in comparison to the data itself: you only need to send a few kilobytes' worth of code to perform a heavy-duty process on computers.
• The input dataset is first split into chunks of data. In this example, the input has three lines of text - “bus car train,” “ship ship train,” “bus ship car.” The dataset is split into three chunks, one per line, and the chunks are processed in parallel.
• In the map phase, the data is assigned a key and a value of 1. In this
case, we have one bus, one car, one ship, and one train.
• These key-value pairs are then shuffled and sorted together based
on their keys. At the reduce phase, the aggregation takes place, and
the final output is obtained.
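The same flow can be sketched in a few lines of plain Python (an in-memory illustration, not actual Hadoop code), using the exact three splits from the example:

```python
from collections import defaultdict

# The three input splits from the example above.
splits = ["bus car train", "ship ship train", "bus ship car"]

# Map phase: emit a (word, 1) pair for every word in a split.
def map_phase(split):
    return [(word, 1) for word in split.split()]

mapped = [pair for split in splits for pair in map_phase(split)]

# Shuffle/sort phase: group the emitted pairs by key.
groups = defaultdict(list)
for word, one in mapped:
    groups[word].append(one)

# Reduce phase: aggregate each group into the final count.
counts = {word: sum(ones) for word, ones in sorted(groups.items())}
print(counts)  # {'bus': 2, 'car': 2, 'ship': 3, 'train': 2}
```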
🞂 It stands for Yet Another Resource Negotiator.
🞂 It is the resource management unit of Hadoop
and is available as a component of Hadoop
version 2.
🞂 Hadoop YARN acts like an OS to Hadoop. It is not a file system itself; it sits on top of HDFS and manages how jobs use the cluster.
🞂 It is responsible for managing cluster resources
to make sure you don't overload one machine.
🞂 It performs job scheduling to make sure that the
jobs are scheduled in the right place.
• Suppose a client machine wants to do a query or fetch some code for data analysis. This job request goes to the resource manager (Hadoop YARN), which is responsible for resource allocation and management.
• In the node section, each of the nodes has its node
managers. These node managers manage the nodes and
monitor the resource usage in the node. The containers
contain a collection of physical resources, which could be
RAM, CPU, or hard drives. Whenever a job request comes in,
the app master requests the container from the node
manager. Once the node manager gets the resource, it goes
back to the Resource Manager.
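A toy Python model of this request flow (class and method names are invented for illustration; this is not the Hadoop YARN API) shows why per-node bookkeeping prevents any one machine from being overloaded:

```python
# Toy model of the YARN flow: a resource manager hands out containers
# by asking node managers, each of which tracks its own free resources.
class NodeManager:
    def __init__(self, name, ram_gb):
        self.name, self.free_ram = name, ram_gb

    def launch_container(self, ram_gb):
        # Grant a container only if this node has the resources.
        if self.free_ram >= ram_gb:
            self.free_ram -= ram_gb
            return f"container on {self.name} ({ram_gb} GB)"
        return None

class ResourceManager:
    def __init__(self, node_managers):
        self.node_managers = node_managers

    def submit_job(self, ram_gb):
        # Try each node in turn, skipping nodes without free resources.
        for nm in self.node_managers:
            container = nm.launch_container(ram_gb)
            if container:
                return container
        return "job queued: no node has enough free resources"

# Two 8 GB nodes and three 6 GB job requests: the first two land on
# different nodes, and the third is queued.
rm = ResourceManager([NodeManager("node1", 8), NodeManager("node2", 8)])
for _ in range(3):
    print(rm.submit_job(ram_gb=6))
```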
🞂 Big Data requires tools and methods that can be
applied to analyze and extract patterns from
large-scale data.
🞂 Big Data Analytics refers to the process of collecting, organizing, and analyzing large data sets to discover patterns and other useful information.
🞂 Big data analytics is a set of technologies and techniques that require new forms of integration to disclose large hidden values from datasets that are different from the usual ones: more complex, and of an enormous scale.
🞂 It mainly focuses on solving new problems, or old problems in better and more effective ways.
🞂 Big data is more real-time
in nature than traditional
DW applications
🞂 Traditional DW architectures (e.g., Exadata, Teradata) are not well-suited for big data apps
🞂 Shared-nothing, massively parallel processing, scale-out architectures are well-suited for big data apps
Traditional Analytics (BI) vs. Big Data Analytics:
🞂 Focus: descriptive and diagnostic analytics (BI) vs. predictive analytics and data science (Big Data)
🞂 Data sets: limited, cleansed data sets and simple models (BI) vs. large-scale data sets, more types of data, raw data, and complex data models (Big Data)
🞂 Supports: causation - what happened, and why (BI) vs. correlation - new insight (Big Data)
1. Descriptive Analytics:
It consists of asking the question: What is happening? It is a preliminary stage of data processing that creates a summary of historical data. Data mining methods organize the data and help uncover patterns that offer insight. Descriptive analytics summarizes past data to give a clear picture of what has already happened.
2. Diagnostic Analytics:
It consists of asking the question: Why did it
happen? Diagnostic analytics looks for the root
cause of a problem. It is used to determine why
something happened. This type attempts to find
and understand the causes of events and behaviors.
3. Predictive Analytics:
It consists of asking the question: What is likely
to happen? It uses past data in order to predict
the future. It is all about forecasting. Predictive
analytics uses many techniques like data mining and
artificial intelligence to analyze current data and make
scenarios of what might happen.
4. Prescriptive Analytics:
It consists of asking the question: What should be
done? It is dedicated to finding the right action to be
taken. Descriptive analytics provides historical data, and predictive analytics helps forecast what might happen. Prescriptive analytics uses these inputs to find the best solution.
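As a minimal illustration of the contrast between the first and third types, this Python sketch (with made-up sales numbers) computes a descriptive summary of past data, then fits a simple linear trend to forecast the next value:

```python
import numpy as np

# Hypothetical monthly sales figures for six months (toy data).
months = np.arange(1, 7)
sales = np.array([100, 110, 125, 130, 145, 155.0])

# Descriptive analytics: what happened? Summarize the historical data.
print("mean:", sales.mean(), "min:", sales.min(), "max:", sales.max())

# Predictive analytics: what is likely to happen? Fit a linear trend
# to the past data and extrapolate one month ahead.
slope, intercept = np.polyfit(months, sales, deg=1)
print("forecast for month 7:", slope * 7 + intercept)
```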
DESCRIPTIVE ANALYTICS
• Descriptive analytics, such as reporting/OLAP,
dashboards, and data visualization, have been widely used
for some time.
• They are the core of traditional BI.
Apache Spark is one of the most powerful open source big data analytics tools. It offers over 80 high-level operators that make it easy to build parallel apps, and it is used at a wide range of organizations to process large datasets.
Features:
• It can run an application in a Hadoop cluster up to 100 times faster in memory, and ten times faster on disk.
• It offers lightning-fast processing.
• Support for sophisticated analytics.
• Ability to integrate with Hadoop and existing Hadoop data.
• It provides built-in APIs in Java, Scala, or Python.
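As a small sketch of those high-level operators, here is the classic word count in PySpark; it assumes a local Spark installation, and "input.txt" is a placeholder path for your own data:

```python
from pyspark.sql import SparkSession

# Start (or reuse) a local Spark session.
spark = SparkSession.builder.appName("WordCount").getOrCreate()
sc = spark.sparkContext

# Read a text file, split lines into words, and count each word.
counts = (sc.textFile("input.txt")       # placeholder input path
            .flatMap(lambda line: line.split())
            .map(lambda word: (word, 1))
            .reduceByKey(lambda a, b: a + b))

for word, n in counts.collect():
    print(word, n)

spark.stop()
```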
Plotly is one of the big data analysis tools that lets
users create charts and dashboards to share online.
Features:
• Easily turn any data into eye-catching
and informative graphics.
• It provides audited industries with fine-
grained information on data provenance.
• Plotly offers unlimited public file hosting
through its free community plan.
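A minimal Plotly Express sketch (with made-up numbers) shows how a few lines turn data into an interactive, shareable chart:

```python
import plotly.express as px

# Toy data for illustration; a dict, list, or DataFrame all work.
data = {"tool": ["Spark", "Plotly", "HDInsight", "Talend"],
        "mentions": [120, 80, 60, 45]}

# One call builds an interactive bar chart; fig.show() opens it in the
# browser, and fig.write_html("chart.html") saves a shareable page.
fig = px.bar(data, x="tool", y="mentions", title="Toy tool mentions")
fig.show()
```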
Azure HDInsight is a Spark and Hadoop service in the cloud. It provides
big data cloud offerings in two categories, Standard and Premium. It
provides an enterprise-scale cluster for the organization to run their big
data workloads.
Features:
• Reliable analytics with an industry-leading SLA.
• It offers enterprise-grade security and monitoring.
• Protect data assets and extend on-premises security and governance
controls to the cloud.
• High-productivity platform for developers and data scientists.
• Integration with leading productivity applications.
• Deploy Hadoop in the cloud without purchasing new hardware or
paying other up-front costs.
Skytree is one of the best big data analytics tools; it empowers data scientists to build more accurate models faster. It offers accurate predictive machine learning models that are easy to use.
Features:
• Highly scalable algorithms.
• Artificial intelligence for data scientists.
• It allows data scientists to visualize and understand the logic behind ML decisions.
• Model interpretability.
• It is designed to solve robust predictive problems with data preparation capabilities.
• Programmatic and GUI access: use Skytree via the easy-to-adopt GUI or programmatically in Java.
Talend is a big data analytics software that simplifies and
automates big data integration. Its graphical wizard generates
native code. It also allows big data integration, master data
management and checks data quality.
Features:
• Simplify ETL & ELT for big data.
• Talend Big Data Platform simplifies using MapReduce and Spark
by generating native code.
• Smarter data quality with machine learning and
natural language processing.
• Agile DevOps to speed up big data projects.
• Streamline all the DevOps processes.
🞂 Big data refers to the sets of digital data produced by the use of new technologies for personal or professional purposes.
🞂 Big Data analytics is the process of
examining these data in order to uncover
hidden patterns, market trends, customer
preferences and other useful information in
order to make the right decisions.
🞂 Big Data Analytics is a fast-growing technology. It has been adopted by the most unexpected industries and has become an industry of its own.
🞂 But the analysis of this data within the Big Data framework can sometimes seem quite intrusive.
🞂 Analytics is a data science.
🞂 BI takes care of the decision-making
part while Data Analytics is the process of
asking questions.
🞂 Analytics tools are used when a company needs to do forecasting and wants to know what will happen in the future, while BI tools help transform those forecasts into common language.
🞂 More often, Big Data is considered as
the successor to Business Intelligence.
Thank you