Unit 1 Big Data Tutorial

The document discusses big data including what it is, types of data, examples and use cases, architecture, and analytics. Big data refers to large and complex datasets that are difficult to process with traditional tools. The types are structured, unstructured, and semi-structured data. Examples of use cases include transportation, advertising, banking, government, and more. The architecture involves data sources, storage, batch processing, stream processing, and analytics.

Engineering in One Video (EIOV) Watch video on

Topics to be covered...
Evolution of Technology
What is Big Data?
Types of Big Data
Big Data Examples & Use Cases
Big Data architecture
When to use this architecture
5Vs of Big Data
Big Data technology
Big Data importance
Big Data applications
Big Data Analytics
Need for Big Data Analytics
What is Big Data Analytics
Types of Big Data Analytics
Happy Ending!

What is Big Data?


Big data is the term for a collection of data sets so large and
complex that it becomes difficult to process using on-hand
database management tools or traditional data processing
applications.

Types of Big Data


Structured
Unstructured
Semi-structured

Structured
Any data that can be stored,
accessed, and processed in
a fixed format is termed
'structured' data.

[Figure: example of structured data shown as a table]

Unstructured
Any data whose form or
structure is unknown is classified as
unstructured data.

Semi-structured
Semi-structured data is information that
does not reside in a relational database
or any other data table, but nonetheless
has some organizational properties to
make it easier to analyze, such as
semantic tags.
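
The three types above can be contrasted with a short Python sketch. The sample records, field names, and the `total_amount` helper are hypothetical and exist only for illustration; they are not taken from the slides.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields (hypothetical sample).
structured_csv = "id,name,amount\n1,Asha,250\n2,Ravi,400\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: no fixed table layout, but keys (tags) describe the content.
semi_structured = json.loads(
    '{"user": "Asha", "tags": ["sale", "mobile"], "meta": {"lang": "en"}}'
)

# Unstructured: free text with no predefined fields.
unstructured = "Customer called to say the delivery arrived late but the product works fine."

def total_amount(records):
    """Sum the 'amount' column; possible because the layout is known in advance."""
    return sum(int(record["amount"]) for record in records)

print(total_amount(rows))               # query by column name
print(semi_structured["meta"]["lang"])  # navigate by tag
print(len(unstructured.split()))        # only generic operations, such as a word count
```

The structured rows can be queried by column, the semi-structured record can be navigated by its tags, and the unstructured text supports only generic operations until further processing extracts some structure from it.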

8 Big Data Examples & Use Cases


Transportation.
Advertising and Marketing.
Banking and Financial Services.
Government.
Media and Entertainment.
Meteorology.
Healthcare.
Cybersecurity.

Big Data architecture


A big data architecture is designed to handle the ingestion,
processing, and analysis of data that is too large or
complex for traditional database systems.

[Diagram components: Data Sources; Data Storage; Batch Processing; Real-Time Message Ingestion; Stream Processing; Analytical Data Store; Analytics and Reporting; Orchestration]

Data sources: All big data solutions start with one or more
data sources.
Examples include:
-> Application data stores, such as relational databases.
-> Static files produced by applications, such as web server log files.
-> Real-time data sources, such as IoT devices.

Data storage: Data for batch processing operations is typically stored
in a distributed file store that can hold high volumes of large files in
various formats. This kind of store is often called a data lake. Options
for implementing this storage include Azure Data Lake Store or blob
containers in Azure Storage.

Batch processing: Because the data sets are so large, often a big data
solution must process data files using long-running batch jobs to
filter, aggregate, and otherwise prepare the data for analysis. Usually
these jobs involve reading source files, processing them and writing
the output to new files. Options include running U-SQL jobs in Azure
Data Lake Analytics, using Hive, Pig, or custom Map/Reduce jobs in
an HDInsight Hadoop cluster, or using Java, Scala, or Python
programs in an HDInsight Spark cluster.
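
The read, filter, aggregate, and write pattern described above can be sketched in plain Python. This is a minimal single-machine illustration, not a U-SQL, Hive, or Spark job; the log line layout, the file names, and the `run_batch_job` function are assumptions made for the example.

```python
import csv
import tempfile
from collections import Counter
from pathlib import Path

def run_batch_job(source_dir: Path, output_file: Path) -> Counter:
    """Read every *.log file, keep ERROR lines, count them per service, write a CSV."""
    error_counts = Counter()
    for log_file in sorted(source_dir.glob("*.log")):
        with log_file.open() as handle:
            for line in handle:
                # Assumed line layout: "<service> <level> <message>"
                parts = line.split(maxsplit=2)
                if len(parts) < 2:
                    continue  # skip blank or malformed lines
                service, level = parts[0], parts[1]
                if level == "ERROR":            # filter
                    error_counts[service] += 1  # aggregate
    # Write the prepared result to a new file for later analysis.
    with output_file.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["service", "error_count"])
        for service, count in sorted(error_counts.items()):
            writer.writerow([service, count])
    return error_counts

# Small demonstration with made-up log lines in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp)
    (src / "a.log").write_text("web ERROR timeout\nweb INFO ok\ndb ERROR lock\n")
    (src / "b.log").write_text("web ERROR refused\n\n")
    out = src / "errors.csv"
    counts = run_batch_job(src, out)
    report = out.read_text()
    print(report)
```

In a real big data solution the same steps would run in parallel across many files on a cluster; the sketch only shows the shape of the job.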

Stream processing: After capturing real-time messages, the solution
must process them by filtering, aggregating, and otherwise
preparing the data for analysis. The processed stream data is then
written to an output sink. Azure Stream Analytics provides a
managed stream processing service based on perpetually running
SQL queries that operate on unbounded streams. You can also use
open source Apache streaming technologies like Storm and Spark
Streaming in an HDInsight cluster.
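
Filtering and aggregating an unbounded stream can be illustrated with a tumbling-window average written as a Python generator. This is a sketch under stated assumptions and does not use the Azure Stream Analytics, Storm, or Spark Streaming APIs; the event layout, the sensor names, and the `tumbling_window_average` function are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Assumed event layout: (timestamp in seconds, sensor id, reading)
Event = Tuple[int, str, float]

def tumbling_window_average(
    events: Iterable[Event], window_seconds: int
) -> Iterator[Tuple[int, Dict[str, float]]]:
    """Yield (window_start, {sensor: average}) each time a window closes.

    Assumes events arrive in timestamp order. Results are emitted as the
    stream advances, so the input does not need to be finite.
    """
    current_start = None
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp, sensor, reading in events:
        window_start = timestamp - timestamp % window_seconds
        if current_start is not None and window_start != current_start:
            # The previous window is complete: emit it to the output sink.
            yield current_start, {s: sums[s] / counts[s] for s in sums}
            sums.clear()
            counts.clear()
        current_start = window_start
        if reading >= 0:  # filter: drop readings treated as invalid here
            sums[sensor] += reading
            counts[sensor] += 1
    if current_start is not None:
        # Flush the last open window when a finite sample ends.
        yield current_start, {s: sums[s] / counts[s] for s in sums}

sample = [(0, "t1", 20.0), (3, "t1", 22.0), (4, "t2", -1.0),
          (10, "t1", 30.0), (12, "t2", 10.0)]
results = list(tumbling_window_average(sample, window_seconds=10))
print(results)
```

Because the function is a generator, each window is produced as soon as a later event shows that it has closed, which mirrors how a continuously running query keeps emitting results.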

Analytical data store: Many big data solutions prepare data for
analysis and then serve the processed data in a structured format
that can be queried using analytical tools. The analytical data store
used to serve these queries can be a Kimball-style relational data
warehouse, as seen in most traditional business intelligence (BI)
solutions.
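
A relational analytical store of this kind usually separates fact tables from dimension tables. The sketch below uses Python's built-in `sqlite3` module with an in-memory database purely as a stand-in for a data warehouse; the table names, columns, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension table: descriptive attributes used for grouping and filtering.
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    category   TEXT
);
-- Fact table: one row per sale, pointing at the dimension.
CREATE TABLE fact_sales (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
INSERT INTO fact_sales (product_id, amount) VALUES (1, 10.0), (1, 15.0), (2, 40.0);
""")

# A typical analytical query: aggregate facts, grouped by a dimension attribute.
totals = conn.execute("""
    SELECT d.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()
print(totals)
conn.close()
```

Analytical tools issue queries of this shape against the served data, so preparing it in a structured, joinable form is what makes it queryable.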

Analysis and reporting: The goal of most big data solutions is to
provide insights into the data through analysis and reporting. To
empower users to analyze the data, the architecture may include a
data modelling layer, such as a multidimensional OLAP cube or
tabular data model in Azure Analysis Services. It might also support
self-service BI, using the modelling and visualization technologies in
Microsoft Power BI or Microsoft Excel. Analysis and reporting can
also take the form of interactive data exploration by data scientists or
data analysts.

Orchestration: Most big data solutions consist of repeated data
processing operations, encapsulated in workflows, that transform
source data, move data between multiple sources and sinks, load the
processed data into an analytical data store, or push the results
straight to a report or dashboard. To automate these workflows, you
can use an orchestration technology such as Azure Data Factory or
Apache Oozie and Sqoop.
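
The idea of a workflow whose steps run in dependency order can be sketched with the standard-library `graphlib` module (Python 3.9 or later). This is only an illustration of the concept and does not reflect the Azure Data Factory or Apache Oozie APIs; the task names and the `run_workflow` function are hypothetical.

```python
from graphlib import TopologicalSorter

run_log = []  # records the order in which tasks actually ran

def ingest():
    run_log.append("ingest")      # placeholder for copying source data

def transform():
    run_log.append("transform")   # placeholder for a batch transformation

def load():
    run_log.append("load")        # placeholder for loading the analytical store

def report():
    run_log.append("report")      # placeholder for refreshing a dashboard

tasks = {"ingest": ingest, "transform": transform, "load": load, "report": report}

# Each task maps to the set of tasks that must finish before it starts.
dependencies = {
    "transform": {"ingest"},
    "load": {"transform"},
    "report": {"load"},
}

def run_workflow(tasks, dependencies):
    """Run every task after all of its prerequisites and return the order used."""
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        tasks[name]()
    return order

executed = run_workflow(tasks, dependencies)
print(executed)
```

A production orchestrator adds scheduling, retries, and monitoring on top of this ordering, but the dependency graph is the core of the workflow definition.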

When to use this architecture


1. Store and process data in volumes too large for a traditional
database.
2. Transform unstructured data for analysis and reporting.
3. Capture, process, and analyse unbounded streams of data in real
time, or with low latency.
4. Use Azure Machine Learning or Microsoft Cognitive Services.

Characteristics (5 Vs) of Big Data


1. Volume
2. Veracity
3. Variety
4. Value
5. Velocity

Big Data technology



Big Data importance


1. Cost Savings
2. Time-Saving
3. Understand the market conditions
4. Social Media Listening
5. Boost Customer Acquisition and Retention
6. Solve Advertisers' Problems
7. Drive Innovation and Product Development

Big Data applications


1. Banking and Securities
2. Communications, Media and Entertainment
3. Healthcare Providers
4. Education
5. Government
6. Insurance
7. Retail and Wholesale trade
8. Transportation

Need for Big Data Analytics


1. Optimize business operations by analyzing
customer behaviour
2. Next Generation Products

What is Big Data Analytics?


Big data analytics is the use of advanced analytic
techniques against very large, diverse data sets
that include structured, semi-structured and
unstructured data, from different sources, and in
different sizes from terabytes to zettabytes.

Types of Big Data Analytics


1. Descriptive Analysis
2. Predictive Analysis
3. Prescriptive Analysis
4. Diagnostic Analysis

1. Descriptive Analysis
What is happening now based on
incoming data.

2. Predictive Analysis
What might happen in the future.

3. Prescriptive Analysis
What action should be taken.

Google's self-driving car is a good
example of Prescriptive Analysis.

4. Diagnostic Analysis
Why did it happen.
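
The four types of analysis above can be compared on one small data set. The sketch below is a toy illustration in plain Python: the sales and advertising figures are invented, the straight-line model is a deliberately simple assumption, and the `predict_sales` and `recommend_spend` helpers are hypothetical names.

```python
from statistics import mean

# Hypothetical daily figures, invented for this illustration.
daily_ad_spend = [10, 11, 12, 5, 14]
daily_sales = [100, 110, 120, 90, 140]

# Descriptive: summarise what the data shows.
average_sales = mean(daily_sales)

# Diagnostic: look for a possible cause of the weakest day.
worst_day = daily_sales.index(min(daily_sales))
ad_spend_on_worst_day = daily_ad_spend[worst_day]

# Predictive: least-squares line, sales = slope * ad_spend + intercept.
mean_x, mean_y = mean(daily_ad_spend), mean(daily_sales)
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(daily_ad_spend, daily_sales))
    / sum((x - mean_x) ** 2 for x in daily_ad_spend)
)
intercept = mean_y - slope * mean_x

def predict_sales(ad_spend):
    """Estimate sales for a given ad spend using the fitted line."""
    return slope * ad_spend + intercept

# Prescriptive: recommend the smallest spend predicted to reach a target.
def recommend_spend(target_sales, options):
    for spend in sorted(options):
        if predict_sales(spend) >= target_sales:
            return spend
    return None  # no option is predicted to reach the target

print(average_sales, worst_day, ad_spend_on_worst_day)
print(recommend_spend(125, [10, 12, 13, 15]))
```

Each step builds on the previous one: the summary describes the data, the diagnostic step points at a likely cause, the fitted line predicts an outcome, and the recommendation turns that prediction into an action.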
