Hadoop
Topic #   Topic Name   Learning Objective #
5 Pig
1
2
7 Hive
1
2
3
8 Hive Cont.
1
2
3
4
11 Spark
1
2
3
4
5
6
7
8
9
10
11
16 Case Study
1
Big Data: Module Table of Contents
Hive Metastore
Hive Architecture
Hive UDF
Estimated Time Duration for this Topic
Partitioning
Bucketing
Indexing
Different Performance Optimization techniques
Estimated Time Duration for this Topic
NoSQL discussion
Architecture and role of HBase
Other NoSQL databases and their use cases
YARN vs. Hadoop 1.x
Need for real-time data analysis and benefits of using Storm and Spark
Estimated Time Duration for this Topic
Introduction
Building blocks
Diving for Data
Wrapping up
Need for real-time data analysis and benefits of using Storm and Spark
Doubt Clarification
Estimated Time Duration for this Topic
Introduction to Spark
Transformations
Key Value Methods and Caching Data
Distribution and Instrumentation
Spark Streaming
Optimization
Data Exploration and Analysis
Transforming and Cleaning Unstructured Data
Summarizing Data Along Dimensions
Modeling Relationships
Doubt Clarification
Estimated Time Duration for this Topic
Introduction
Querying Data with DataFrames
Improving Type Safety with Datasets
Processing Data with the Streaming API
Optimizing, Structured Streaming, and Spark 2.x
Doubt Clarification
Estimated Time Duration for this Topic
Introduction
Batch Layer with Apache Spark
Speed Layer with Spark Streaming
Advanced Streaming Operations
Streaming Ingest with Spark Streaming
Doubt Clarification
Estimated Time Duration for this Topic
Case Study
Estimated Time Duration for this Topic
60 60 120
60 60 120
500 300
400
0 0 0
0
0
0 0 0
180 180
300 100 400
100 80 180
60 80 140
100 100
260 160 420
0
0 0 0
120 120
420 360 780
0
0 0 0
650 650
0 650 650