
Introduction To Big Data BS (CS) 6 Lecture # 2: Dr. Syed Attique Shah (PH.D.)

This document provides an introduction to big data. It defines big data as large datasets that require new technologies to capture, manage and process the data. It discusses how big data is generated from a variety of sources, including people, machines, and organizations. Specifically, it notes that big data comes from social media posts, sensors in devices, and structured records from business transactions. It also explains some of the challenges around analyzing unstructured data from people and machine sources for insights.

Uploaded by

Ahsan Iqbal

Introduction to Big Data

BS (CS) 6th
Lecture # 2

Dr. Syed Attique Shah (Ph.D.)


Assistant Professor
Department of Information Technology,
Faculty of Information & Communication Technologies (FICT), 
BUITEMS, Quetta, Pakistan
Big Data
• Big data is a blanket term for the non-traditional
strategies and technologies needed to gather,
organize, process, and gather insights from large
datasets.

• Big Data may well be the Next Big Thing in the ICT
world.

• Like many new information technologies, big data
can bring about dramatic cost reductions,
substantial improvements in the time required to
perform a computing task, or new product and
service offerings.
• Big Data generates value from the storage and
processing of very large quantities of digital
information that cannot be analyzed with traditional
computing techniques.

• But bigger data requires different approaches:
– Techniques, tools and architecture
Definition:
“Big data is high-volume, high-velocity and high-
variety information assets that demand cost-effective,
innovative forms of information processing for
enhanced insight and decision making.” -- Gartner
The Structure of Big Data
• Structured
Structured data is data that adheres to a pre-defined data
model and is therefore straightforward to analyse
• Semi-structured
Semi-structured data is a form of structured data that does
not conform to the formal structure of data models
associated with relational databases or other forms of data
tables, but nonetheless contains tags or other markers to
separate semantic elements and enforce hierarchies of
records and fields within the data.
• Unstructured
Unstructured data is information that either does not have
a predefined data model or is not organised in a pre-
defined manner.
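
As a minimal sketch of the three categories above, the snippet below handles one small sample of each kind using only the Python standard library. All sample values (the orders table, the JSON record, the review text) are hypothetical and serve only to illustrate the distinction.

```python
import csv
import io
import json

# Structured: rows that follow a fixed, pre-defined schema (hypothetical orders table).
structured_csv = "order_id,customer,amount\n1,Alice,25.50\n2,Bob,12.00\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))
total = sum(float(r["amount"]) for r in rows)

# Semi-structured: no fixed table schema, but keys mark fields and nesting.
semi_structured = '{"user": "alice", "tags": ["iot", "sensors"], "profile": {"city": "Quetta"}}'
record = json.loads(semi_structured)
city = record["profile"]["city"]

# Unstructured: free text with no data model; any structure must be imposed by analysis.
unstructured = "Loved the new tracker, battery lasts all week!"
word_count = len(unstructured.split())

print(total, city, word_count)
```

The structured sample can be queried by column name right away, the semi-structured one is navigated through its keys, and the unstructured one offers nothing more than raw text to split and count.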
Why Big Data
• The growth of Big Data is driven by:
– Increase of storage capacities
– Increase of processing power
– Availability of data (different data types)
• Every day we create 2.5 quintillion bytes of data;
90% of the data in the world today has been created
in the last two years alone
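
To put the 2.5 quintillion bytes figure in more familiar units, the short calculation below converts it to exabytes per day and terabytes per second. It is only an arithmetic sketch: it takes the quoted figure at face value and assumes decimal units (1 EB = 10^18 bytes, 1 TB = 10^12 bytes).

```python
BYTES_PER_DAY = 2.5e18           # 2.5 quintillion bytes, the figure quoted above
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds

# Decimal units are assumed: 1 EB = 10**18 bytes, 1 TB = 10**12 bytes.
exabytes_per_day = BYTES_PER_DAY / 1e18
terabytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / 1e12

print(f"{exabytes_per_day:.1f} EB/day")      # 2.5 EB/day
print(f"{terabytes_per_second:.1f} TB/s")    # 28.9 TB/s
```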
How Is Big Data Different?
1) Automatically generated by a machine
(e.g. sensor embedded in an engine)
2) Typically an entirely new source of data
(e.g. use of the internet)
3) Not designed to be friendly
(e.g. text streams)
4) May not have much value
Need to focus on the important part
Data generation points Examples
Big Data sources
• Big Data generated by People

• Big Data generated by Machines

• Big Data generated by Organizations


Big Data generated by People

The Unstructured Challenge

• Size of data generated by people on different platforms
• Traditional data storage models and tools are unable to cope with these huge datasets
• 80 to 90% of all data is unstructured
• How people-generated data can be utilized to gain information and value
• How unstructured data can be modeled for analysis
• What tools, data formats, and skilled people will be required to gain value from unstructured Big Data
• What challenges can be faced in getting value out of unstructured Big Data
Big Data sources
• Big Data generated by People

• Big Data generated by Machines

• Big Data generated by Organizations


Big Data generated by Machines
• It's Everywhere and there's a lot
• Machine data is the largest source of big data!
• Sensing Smart
What makes a smart device smart?
Example smart device: activity tracker
With the increasing number of machines that can sense and the
concept of the Internet of Things (IoT), huge volumes of data are
generated that need to be analyzed.
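
As a rough illustration of machine-generated data, the sketch below models a few readings from an activity tracker and reduces them to two summary values. The `Reading` class, its fields, and the sample values are all hypothetical; a real device would emit far more readings in its own format.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Reading:
    timestamp: int    # seconds since the start of the session (hypothetical)
    heart_rate: int   # beats per minute
    steps: int        # steps counted since the previous reading


# One reading per minute; a real tracker samples much more often.
readings = [
    Reading(0, 72, 0),
    Reading(60, 95, 110),
    Reading(120, 110, 130),
    Reading(180, 102, 120),
]

total_steps = sum(r.steps for r in readings)
avg_heart_rate = mean(r.heart_rate for r in readings)

print(total_steps, avg_heart_rate)
```

Even this tiny stream shows the pattern: each device produces small records continuously, so volume grows with the number of devices and the sampling rate.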
Big Data sources
• Big Data generated by People

• Big Data generated by Machines

• Big Data generated by Organizations


Big Data generated by Organizations
• Structured but often siloed
• How organizations produce data
An example: sales transaction data
Sales transaction records: highly structured data
How sales transaction data can be used to get better predictions
and improve the overall business.
What are the possible formats for structured data?
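
A relational table is one common format for structured data. The sketch below stores a few sales transactions in an in-memory SQLite database and totals revenue per product. The table name, column names, and sample rows are assumptions made for illustration, not taken from the lecture.

```python
import sqlite3

# In-memory database; the schema and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "transaction_id INTEGER PRIMARY KEY, "
    "product TEXT, quantity INTEGER, unit_price REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1001, "keyboard", 2, 20.0),
        (1002, "mouse", 1, 10.0),
        (1003, "keyboard", 1, 20.0),
    ],
)

# Because every row follows the same schema, aggregation is a single query.
revenue_by_product = dict(
    conn.execute(
        "SELECT product, SUM(quantity * unit_price) "
        "FROM sales GROUP BY product ORDER BY product"
    )
)
conn.close()

print(revenue_by_product)
```

Per-product totals like these are the kind of input a business could feed into forecasting, which is where the fixed schema of transaction records pays off.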
Thank You!
