MapReduce Lifecycle
JOB CLIENT
The Job Client prepares a job for execution. When a MapReduce job is submitted to Hadoop, the local Job Client does the following (a driver sketch follows this list):
• Validates the job configuration.
• Generates the input splits.
• Copies the job resources to a shared location (an HDFS directory) accessible to the Job Tracker and Task Trackers.
• Submits the job to the Job Tracker.
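As a concrete illustration, the following driver sketch shows a typical job submission; the WordCount class names and argument paths are assumptions for illustration, and job.waitForCompletion() is the call that triggers the Job Client steps above.

// Minimal driver sketch (hypothetical WordCount classes and paths for illustration).
// job.waitForCompletion() performs the Job Client steps described above: validation,
// input split generation, copying resources to HDFS, and submission to the Job Tracker.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(WordCountMapper.class);      // hypothetical mapper class
        job.setReducerClass(WordCountReducer.class);    // hypothetical reducer class
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));    // input path in HDFS
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output path in HDFS
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}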
JOB TRACKER
The Job Tracker coordinates the job: it assigns map and reduce tasks to Task Trackers, tracks their status, and makes completed map results available to the reduce tasks.
TASK TRACKER
A Task Tracker manages the tasks assigned to it and reports status to the Job Tracker. A Task Tracker runs on its own cluster node, which does not need to be the same host as the Job Tracker.
When the Job Tracker assigns a map or reduce task to a Task Tracker, the Task Tracker -
• Fetches job resources locally.
• Launches a child JVM on the node to execute the map or reduce task (see the sketch below).
• Reports status to the Job Tracker.
The child task launched by the Task Tracker runs the job's map or reduce function.
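The memory given to these child JVMs can be tuned from the job configuration. A minimal sketch, assuming the classic (MRv1) mapred.child.java.opts property applies to the Hadoop version in use:

// Sketch: passing JVM options to the child JVMs launched by Task Trackers.
// mapred.child.java.opts is the classic MRv1 property; newer releases split it
// into separate per-map and per-reduce properties.
Configuration conf = new Configuration();
conf.set("mapred.child.java.opts", "-Xmx512m");   // 512 MB heap per child task JVM
Job job = Job.getInstance(conf, "word count");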
MAP TASK
The Hadoop MapReduce framework creates a map task to process each InputSplit. The map task -
• Creates input key-value pairs by using the InputFormat to fetch the input data locally.
• Applies the job-supplied map function to each key-value pair (see the sketch after this list).
• Performs local sorting and aggregation of the results.
• Runs the Combiner for further aggregation if the job includes a Combiner.
• Stores the results locally in memory and on the local file system.
• Communicates with the Task Tracker about progress and status.
• Notifies the Task Tracker when the task is complete.
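As an example of a job-supplied map function, here is a minimal word-count Mapper; the class and field names are illustrative assumptions, not part of the framework.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical word-count mapper: the framework calls map() once per input
// key-value pair produced by the InputFormat for this task's InputSplit.
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);   // emit (word, 1) for local sort and combine
            }
        }
    }
}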
Map task results undergo a local sort by key to prepare the data for the reduce tasks. If a Combiner is configured for the job, it also runs in the map task. The Combiner consolidates the map output and reduces the amount of data that must be transferred to the reduce tasks.
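Because a Combiner implements the Reducer interface, enabling one is typically a single driver call; WordCountReducer is the hypothetical reducer from the driver sketch, and a sum-of-counts reducer is safe to reuse as a combiner.

// Sketch: register a Combiner so the map task can pre-aggregate its output
// before it is transferred to the reduce tasks (assumes the Job from the driver sketch).
job.setCombinerClass(WordCountReducer.class);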
When a map task notifies the Task Tracker that it has completed, the Task Tracker passes the notification on to the Job Tracker. The Job Tracker then makes the results available to the reduce tasks.
REDUCE TASK
The reduce phase aggregates the results from the map phase into final results. Normally the result set is smaller than the input set, but this is application dependent. The reduction is carried out by parallel reduce tasks.
The reduce input keys and values need not have the same type as the output keys and values. The reduce phase is optional, and a job can be configured to stop after the map phase completes (as sketched below). Reduce is carried out in three phases - copy, sort and reduce.
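For the map-only case mentioned above, the driver simply sets the number of reduce tasks to zero; with no reducers, the copy, sort and reduce phases are skipped and map output is written directly to the output path.

// Sketch: configure a map-only job (assumes the Job instance from the driver sketch).
job.setNumReduceTasks(0);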
A reduce task -
• Fetches the job resources locally.
• Runs the copy phase to fetch local copies of its share of the map results from the map worker nodes.
• Once the copy phase completes, runs the sort phase to merge the copied results into a single sorted set of (key, value-list) pairs.
• Once the sort phase completes, runs the reduce phase, invoking the job-supplied reduce function on each (key, value-list) pair.
• Saves the end results to the output destination (HDFS).
The input to a reduce function is a set of key-value pairs where each value is a list of values sharing the same key. When a reduce task notifies the Task Tracker that it has completed, the Task Tracker passes the notification on to the Job Tracker, and the end results are saved at the output destination (HDFS).
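To make the (key, value-list) input concrete, here is a minimal word-count Reducer matching the earlier sketches; the class name and summing logic are illustrative assumptions.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical word-count reducer: reduce() is invoked once per key with the
// merged, sorted list of values produced by the copy and sort phases.
public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();   // aggregate all values that share this key
        }
        result.set(sum);
        context.write(key, result);   // saved to the output destination (HDFS)
    }
}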