Big Data Analytics: Jaydip Sen

This document defines big data and discusses its key characteristics. It begins by defining common data size units from kilobytes to yottabytes. It notes there is no single definition of big data, but provides an example definition. The main characteristics of big data discussed are scale/volume, variety/complexity, and velocity. Examples are given of how much data volume is increasing and the different data formats. It discusses how big data is generated from sources like social media, scientific instruments, mobile devices, and sensors. Challenges in handling big data are the need for new architectures, algorithms, and technical skills. Strategic uses of analytics are discussed for questions about employees, products, finances, and customers.


Big Data Analytics
Jaydip Sen
Data Size Units

 1 Kilobyte = 10^3 bytes
 1 Megabyte = 10^6 bytes
 1 Gigabyte = 10^9 bytes
 1 Terabyte = 10^12 bytes
 1 Petabyte = 10^15 bytes
 1 Exabyte = 10^18 bytes
 1 Zettabyte = 10^21 bytes
 1 Yottabyte = 10^24 bytes
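
The decimal units above can be applied mechanically. The following is a minimal Python sketch, assuming a hypothetical helper named `format_bytes` (not part of the slides), that expresses a raw byte count in the largest unit that fits:

```python
# Decimal (SI) units from the slide: each step up is a factor of 10^3.
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def format_bytes(n: float) -> str:
    """Express a byte count in the largest unit that keeps the value >= 1."""
    value = float(n)
    index = 0
    # Divide by 1000 until the value is below 1000 or we run out of units.
    while value >= 1000 and index < len(UNITS) - 1:
        value /= 1000
        index += 1
    return f"{value:g} {UNITS[index]}"


print(format_bytes(10**12))       # 1 TB
print(format_bytes(35 * 10**21))  # 35 ZB
```

The sketch uses powers of 1000, matching the slide, rather than the binary powers of 1024 that some tools report.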
Big Data Definition

 No single standard definition…

“Big Data” is data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it…
Characteristics of Big Data: 1-Scale (Volume)
 Data Volume
 44x increase from 2009 to 2020
 From 0.8 zettabytes to 35 zettabytes
 Data volume is increasing exponentially
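
As a rough check of the figures above, the implied yearly growth can be worked out in a few lines of Python. This sketch assumes a constant compound growth rate between 2009 and 2020, which the slides do not state; the variable names are illustrative only:

```python
# Figures from the slide: 0.8 ZB in 2009 growing to 35 ZB in 2020.
start_zb, end_zb = 0.8, 35.0
years = 2020 - 2009

# Overall multiple between the two endpoints (the slide rounds this to 44x).
growth_factor = end_zb / start_zb

# Compound annual growth rate, ASSUMING the same rate applies every year.
annual_rate = growth_factor ** (1 / years) - 1

print(f"Overall growth: about {growth_factor:.0f}x")
print(f"Implied annual growth: about {annual_rate:.0%}")
```

Under that assumption the volume grows by roughly 41% per year, which is what "increasing exponentially" means here: a fixed percentage each year rather than a fixed amount.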
Characteristics of Big Data: Complexity (Variety)
 Various formats, types, and structures
 Text, numerical, images, audio, video, sequences, time series, social media data, multi-dim arrays, etc.
 Static data vs. streaming data
 A single application can be generating/collecting many types of data

To extract knowledge, all these types of data need to be linked together
Characteristics of Big Data: Speed (Velocity)

 Data is being generated fast and needs to be processed fast
 Online Data Analytics
 Late decisions mean missed opportunities
Examples
 E-Promotions: Based on your current location, your purchase history, and what you like, send promotions right now for the store next to you
 Healthcare monitoring: sensors monitor your activities and body; any abnormal measurements require immediate reaction
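
The healthcare example above can be sketched as a small streaming check. This is only an illustration under stated assumptions: the function name `monitor`, the window size, the 20% tolerance, and the sample readings are all made up and do not come from the slides.

```python
from collections import deque
from statistics import mean


def monitor(readings, window=5, tolerance=0.2):
    """Yield (index, value) for readings that deviate from the recent average.

    A reading is flagged when it differs from the mean of the previous
    `window` accepted readings by more than `tolerance` (a fraction of
    that mean).
    """
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(recent) == window:
            baseline = mean(recent)
            if abs(value - baseline) > tolerance * baseline:
                yield i, value  # react immediately, before reading more data
                continue        # keep the outlier out of the baseline
        recent.append(value)


# Made-up heart-rate samples (beats per minute), for illustration only.
heart_rate = [72, 74, 71, 73, 72, 75, 130, 74, 73]
alerts = list(monitor(heart_rate))
print(alerts)  # [(6, 130)]
```

Because `monitor` is a generator, each reading is evaluated as it arrives instead of after the whole dataset has been collected, which is the point of the velocity characteristic.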
Big Data: 3V’s
Some Make it 4V’s
Harnessing Big Data

 OLTP: Online Transaction Processing (DBMSs)
 OLAP: Online Analytical Processing (Data Warehousing)
 RTAP: Real-Time Analytics Processing (Big Data Architecture & Technology)
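
To make the first two categories concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for a real DBMS or data warehouse. The `orders` table and its rows are hypothetical; real-time analytics (RTAP) is not covered by this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)

# OLTP-style work: many small writes, each touching a single row.
with conn:
    conn.executemany(
        "INSERT INTO orders (region, amount) VALUES (?, ?)",
        [("north", 10.0), ("south", 5.0), ("north", 7.0)],
    )

# OLAP-style work: one read that scans and aggregates many rows.
totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM orders GROUP BY region")
)
print(totals)
```

The same table serves both workloads here only because it is tiny; the distinction on the slide is about which access pattern a system is designed around.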
Who’s Generating Big Data ?

 Social media and networks (all of us are generating data)
 Scientific instruments (collecting all sorts of data)
 Mobile devices (tracking all objects all the time)
 Sensor technology and networks (measuring all kinds of data)
• Progress and innovation are no longer hindered by the ability to collect data
• But by the ability to manage, analyze, summarize, visualize, and discover knowledge from the collected data in a timely manner and in a scalable fashion
The Model Has Changed…
 The model of generating/consuming data has changed (Web 2.0)
Old Model: A few companies generate data; all others consume data
New Model: All of us generate data, and all of us consume data
What’s driving Big Data ?

- Optimizations and predictive analytics
- Complex statistical analysis
- All types of data, and many sources
- Very large datasets
- More of a real-time

- Ad-hoc querying and reporting
- Data mining techniques
- Structured data, typical sources
- Small to mid-size datasets
Value of Big Data Analytics

 Big data is more real-time in nature than traditional DW applications
 Traditional DW architectures (e.g. Exadata, Teradata) are not well-suited for big data apps
 Shared nothing, massively parallel processing, scale out architectures are well-suited for big data apps
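
The shared-nothing, scale-out idea can be sketched in a few lines: split the data by key, let each "node" aggregate only its own partition, then merge the partial results. In this illustration threads stand in for separate machines, and the function names and sample records are assumptions made for the example, not anything from the slides.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_nodes):
    """Assign each (key, amount) record to one node by hashing its key."""
    parts = [[] for _ in range(n_nodes)]
    for key, amount in records:
        parts[hash(key) % n_nodes].append((key, amount))
    return parts


def local_totals(part):
    """Work done by one node, using only its own partition."""
    totals = Counter()
    for key, amount in part:
        totals[key] += amount
    return totals


def scale_out_totals(records, n_nodes=3):
    """Aggregate per-key totals by combining independent partial results."""
    parts = partition(records, n_nodes)
    # Threads stand in for separate machines in this sketch.
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(local_totals, parts))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds the partial sums
    return dict(merged)


# Made-up sales records, for illustration only.
sales = [("north", 10), ("south", 5), ("north", 7), ("east", 3)]
print(scale_out_totals(sales))
```

Because all records with the same key land on the same node, no node needs another node's data until the final merge, so adding nodes adds capacity.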
Challenges in Handling Big Data

 The bottleneck is in technology
 New architecture, algorithms, techniques are needed
 Also in technical skills
 Experts in using the new technology and dealing with big data
Strategic use of Analytics

 Strategic Employee Questions
 Strategic Product Questions
 Strategic Financial Questions
 Strategic Customer Questions
Strategic Employee Questions

 Who are the most productive salespeople and employees?
 Who has the right skills for the next key product line?
 Which employees have the strongest customer relationships?
 Which managers have the highest retention rates? What do they do?
 Which hires work out the best (faculty)?
 What is our retention rate? Why do people leave?
 What is the cost of turnover?
 Why do people join the organization?
Strategic Product Questions

 What are our most/least profitable products?
 What are our production costs & how can we lower them?
 What is our quality level & how can we improve it (FedEx)?
 What is our cycle time & how can we lower it?
 What are the sources of product innovation?
 What impacts the demand for our products?
Strategic Financial Questions

 How accurate are the financial forecasts?
 How much financial data is used to support business decisions?
 What items are affecting our margins the most (Wal-Mart)?
Strategic Customer Questions

 Who are the most/least profitable customers?
 Who are the most/least satisfied customers?
 What is the fastest/slowest customer segment?
 What types of ads bring the most customers?
 What is our customer experience like & how can we improve it?
 What is the cost of customer acquisition?
 What are the reasons for losing customers?
 What are the costs of customer transactions?
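
Two of the questions above reduce to simple arithmetic once the data is in one place. The sketch below uses made-up numbers and assumes the simplest possible definitions (profit as revenue minus cost to serve, acquisition cost as marketing spend divided by customers gained); a real analysis would need the organization's own definitions.

```python
# Made-up records for illustration: (customer, revenue, cost_to_serve).
customers = [
    ("A", 1200.0, 700.0),
    ("B", 450.0, 500.0),
    ("C", 900.0, 300.0),
]

# Profit per customer under the simple revenue-minus-cost assumption.
profit = {name: revenue - cost for name, revenue, cost in customers}
most_profitable = max(profit, key=profit.get)
least_profitable = min(profit, key=profit.get)

# Acquisition cost = marketing spend / customers gained (hypothetical numbers).
marketing_spend, new_customers = 6000.0, 40
acquisition_cost = marketing_spend / new_customers

print(most_profitable, least_profitable, acquisition_cost)  # C B 150.0
```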
Has this technique been successful elsewhere?

 Revenue Management (airlines, hotels)
 Logistics
 Customer turnover (Churn Analysis)
 Customer service
 Pricing of Products
 Trading of Stocks
 Product selection (pharmaceutical companies)
 Employee performance (Baseball, Soccer)
Thank You
