Hadoop Architecture
Dr. C.V. Suresh Babu
(CentreforKnowledgeTransfer)
institute
Discussion Topics
• Introduction
• Components of Hadoop
• MapReduce
• Map Task
• Reduce Task
• Anatomy of a Map Reduce
Introduction
• Hadoop is a framework written in Java that uses a large cluster of
commodity hardware to store and process very large data sets.
• Hadoop is built on the MapReduce programming model, which was introduced
by Google.
• Today many large companies use Hadoop in their organizations
to deal with big data, e.g. Facebook, Yahoo, Netflix, and eBay.
Components of Hadoop
The Hadoop architecture mainly consists of 4 components:
• MapReduce
• HDFS (Hadoop Distributed File System)
• YARN (Yet Another Resource Negotiator)
• Common Utilities, or Hadoop Common
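A Hadoop cluster consists of a single master and multiple slave nodes.
The master node includes JobTracker, TaskTracker, NameNode, and DataNode,
whereas each slave node includes DataNode and TaskTracker.
(JobTracker and TaskTracker are the Hadoop 1.x daemons; under YARN their
roles are taken over by the ResourceManager, NodeManager, and a
per-application ApplicationMaster.)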
MapReduce
• MapReduce is a programming model (an algorithm rather than a data
structure) that runs on top of the YARN framework.
• The major feature of MapReduce is that it performs distributed processing in
parallel across a Hadoop cluster, which is what makes Hadoop fast.
• When you are dealing with Big Data, serial processing is no longer practical.
• MapReduce has 2 main tasks, which run phase by phase:
  - Map Task
  - Reduce Task
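The contrast between serial and parallel processing can be sketched on a single machine. This is only an illustration under stated assumptions: the function names (`count_words`, `split_into_chunks`, `parallel_word_total`) are hypothetical, and a thread pool stands in for the cluster nodes, so it shows the split/process/merge shape rather than real distributed speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Work done independently on one chunk of the input."""
    return sum(len(line.split()) for line in chunk)


def split_into_chunks(lines, num_chunks):
    """Divide the input into roughly equal slices, one per worker."""
    size = max(1, -(-len(lines) // num_chunks))  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def parallel_word_total(lines, num_workers=3):
    """Process chunks concurrently, then merge the partial results."""
    chunks = split_into_chunks(lines, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_totals = list(pool.map(count_words, chunks))  # "map" side
    return sum(partial_totals)  # "reduce" side


lines = [
    "big data",
    "hadoop cluster of commodity hardware",
    "map reduce",
    "yarn",
]
print(parallel_word_total(lines))  # prints 10
```

Each chunk is processed without knowledge of the others, which is the property that lets a real cluster hand the chunks to different machines.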
Map Task
Here, the input is provided to the Map() function, its output is then used
as the input to the Reduce() function, and after that we receive the final
output.
In the first phase Map is used, and in the next phase Reduce is used.
Map()
• The input to Map() is a set of data; because we are working with
Big Data, it arrives as data blocks.
• The Map() function breaks these data blocks into tuples, which are
simply key-value pairs.
• These key-value pairs are then sent as input
to Reduce().
Reduce()
• The Reduce() function groups the tuples (key-value pairs) by key,
forms a new set of tuples, and performs an operation on each group,
such as sorting or summation. The result is then
sent to the final output node.
• Finally, the output is obtained.
Note: The processing done in the Reducer depends on the business requirement
of the industry concerned. This is how Map() and then Reduce() are applied one after the other.
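The Map() and Reduce() steps described above can be sketched with a word count. This is a minimal in-memory illustration, not Hadoop's API: `map_fn`, `reduce_fn`, and `run_job` are hypothetical names, and the grouping step in the middle is done by the framework in a real job.

```python
from collections import defaultdict


def map_fn(line):
    """Map(): break one input record into (key, value) tuples."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_fn(key, values):
    """Reduce(): combine all values that share a key (here, by summation)."""
    return key, sum(values)


def run_job(lines):
    # Map phase: every record is turned into key-value pairs.
    intermediate = []
    for line in lines:
        intermediate.extend(map_fn(line))

    # Group values by key (handled by the framework in a real cluster).
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)

    # Reduce phase: one call per key, visiting keys in sorted order.
    return [reduce_fn(key, groups[key]) for key in sorted(groups)]


result = run_job(["big data big cluster", "data node"])
print(result)
# [('big', 2), ('cluster', 1), ('data', 2), ('node', 1)]
```

Swapping the body of `reduce_fn` (for example, taking `max(values)` instead of `sum(values)`) changes the business logic without touching the map side.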
Map Task
• RecordReader: The RecordReader breaks the input into records and supplies
them to the Map() function as key-value pairs. The key is the record's location
information (for text input, its byte offset) and the value is the data associated with it.
• Map: A map is a user-defined function that processes the
tuples obtained from the RecordReader. For each input, Map() may emit no
key-value pairs at all, or one or more pairs.
• Combiner: The Combiner groups data within the Map workflow. It acts like
a local reducer: the intermediate key-value pairs generated by Map are
combined on the map side before being passed on. Using a combiner is
optional.
• Partitioner: The Partitioner takes the key-value pairs generated in the
Mapper phase and produces the shard corresponding to each reducer.
It takes the hash code of each key and computes its modulus with the
number of reducers: key.hashCode() % (number of
reducers).
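The map-side stages above can be sketched end to end. This is an illustration under assumptions, not Hadoop code: all function names are hypothetical, `java_string_hash` re-implements Java's `String.hashCode()` so that results are deterministic (Python's built-in `hash()` for strings varies between runs), and the sign bit is masked before the modulus, as Hadoop's default HashPartitioner also does, so the shard index is never negative.

```python
from collections import Counter


def record_reader(text):
    """Yield (byte_offset, line) pairs: key = location, value = record."""
    offset = 0
    for line in text.splitlines(keepends=True):
        yield offset, line.rstrip("\n")
        offset += len(line.encode("utf-8"))


def map_fn(offset, line):
    """Emit zero or more (word, 1) pairs for one record."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Local reducer: sum the counts per key on the map side."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


def java_string_hash(s):
    """Java's String.hashCode() as a signed 32-bit integer (BMP characters)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


def partition(key, num_reducers):
    """Shard index = non-negative hash code modulo the number of reducers."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reducers


text = "big data\nbig cluster\n"
pairs = []
for offset, line in record_reader(text):
    pairs.extend(map_fn(offset, line))

shards = {}
for key, value in combine(pairs):
    shards.setdefault(partition(key, 2), []).append((key, value))
print(shards)
```

Because the shard depends only on the key, every pair with the same key reaches the same reducer, whichever mapper produced it.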
Reduce Task
• Shuffle and Sort: The Reducer's task starts with this step. Shuffling is
the process of transferring the intermediate key-value pairs generated
by the Mapper to the Reducer task. During shuffling the system
sorts the data by key. Shuffling begins as soon as some of the map
tasks are done instead of waiting for all of them to complete,
which makes the process faster.
• Reduce: The main task of Reduce is to gather the tuples
generated by Map and then perform sorting and aggregation
on those key-value pairs, grouped by key.
• OutputFormat: Once all the operations are performed, the key-value
pairs are written to a file by the RecordWriter, one
record per line, with the key and value separated by a delimiter
(a tab by default for text output).
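The three reduce-side stages can be sketched in memory as follows. The function names are hypothetical, the aggregation is assumed to be a sum, and the separator is passed explicitly: a tab matches the default of Hadoop's text output, while `" "` would give space-separated lines.

```python
from itertools import groupby
from operator import itemgetter


def shuffle_and_sort(map_outputs):
    """Merge the outputs of several mappers and sort the pairs by key."""
    merged = [pair for output in map_outputs for pair in output]
    return sorted(merged, key=itemgetter(0))


def reduce_phase(sorted_pairs):
    """Aggregate the values of each run of equal keys (here, by summation)."""
    for key, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield key, sum(value for _, value in group)


def format_output(records, separator):
    """Record writer: one record per line, key and value split by separator."""
    return "".join(f"{key}{separator}{value}\n" for key, value in records)


mapper_1 = [("data", 1), ("big", 1)]
mapper_2 = [("big", 1), ("node", 1)]

sorted_pairs = shuffle_and_sort([mapper_1, mapper_2])
reduced = list(reduce_phase(sorted_pairs))
print(format_output(reduced, "\t"), end="")
```

`groupby` only merges adjacent equal keys, which is why the sort in the shuffle step has to come first.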
Anatomy of a Map Reduce