
Name: Vivek Kumar    Roll No: 11212652

EXPERIMENT NO: 04

AIM: Run a Java program based on parallel programming to implement the concept of the
MapReduce paradigm.

DESCRIPTION:

MapReduce is the heart of Hadoop. It is this programming paradigm that allows for massive
scalability across hundreds or thousands of servers in a Hadoop cluster. The MapReduce concept
is fairly simple to understand for those who are familiar with clustered scale-out data-processing
solutions. The term MapReduce actually refers to two separate and distinct tasks that Hadoop
programs perform. The first is the map job, which takes a set of data and converts it into another
set of data in which individual elements are broken down into tuples (key/value pairs). The reduce
job takes the output from a map as its input and combines those data tuples into a smaller set of
tuples. As the order of the words in the name MapReduce implies, the reduce job is always
performed after the map job.
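As a concrete illustration (a hypothetical one-line input, used here only for illustration), word
counting flows through the two jobs as follows; between them, the framework groups the map
output by key:

    Input line:       "hello world hello"
    Map output:       (hello, 1), (world, 1), (hello, 1)
    Grouped by key:   (hello, [1, 1]), (world, [1])
    Reduce output:    (hello, 2), (world, 1)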

ALGORITHM: MAPREDUCE PROGRAM

WordCount is a simple program that counts the number of occurrences of each word in a
given text input data set. WordCount fits very well with the MapReduce programming model,
making it a great example for understanding the Hadoop Map/Reduce programming style. Our
implementation consists of three main parts:

1. Mapper: The mapper is a function that processes the input data and emits its
   output as many small chunks of intermediate data. The input to the map
   function is in the form of (key, value) pairs, even though the input to a
   MapReduce program as a whole is a file or directory (which is stored in HDFS).
2. Reducer: The reducer receives the intermediate (key, value) pairs emitted by
   the mappers, grouped by key, and combines the list of values for each key
   into a smaller set of output pairs: (key, [values]) => (key, result). In
   WordCount, it sums the counts recorded for each word.
3. Driver: There is one final component of a Hadoop MapReduce program,
   called the Driver. The driver initializes the job, instructs the
   Hadoop platform to execute your code on a set of input files, and controls where
   the output files are placed.
Step-1. Write a Mapper: A Mapper overrides the "map" function from the class
org.apache.hadoop.mapreduce.Mapper, which provides <key, value> pairs as the input. A
Mapper implementation may output <key, value> pairs using the provided Context. The input
value for the WordCount map task is a line of text from the input data file, and the key is
the offset of that line within the file: <line_offset, line_of_text>. The map task outputs
<word, one> for each word in the line of text. Pseudo-code:

void Map(key, value) {
    for each word x in value:
        output.collect(x, 1);
}
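A minimal runnable version of this mapper, sketched against the org.apache.hadoop.mapreduce
API (the class name WordCountMapper and the whitespace tokenization are illustrative choices,
not taken from the original report):

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // key: offset of the line within the file; value: one line of text
        StringTokenizer tokens = new StringTokenizer(value.toString());
        while (tokens.hasMoreTokens()) {
            word.set(tokens.nextToken());
            context.write(word, ONE);  // emit <word, 1> for each word
        }
    }
}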
Step-2. Write a Reducer: A Reducer collects the intermediate <key, value> output from
multiple map tasks and assembles a single result. Here, the WordCount program will sum up
the occurrences of each word into pairs of the form <word, occurrence>. Pseudo-code:

void Reduce(keyword, <list of value>) {
    sum = 0;
    for each x in <list of value>:
        sum += x;
    output.collect(keyword, sum);
}
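A runnable counterpart to this pseudo-code (again a sketch; the class name WordCountReducer
is illustrative):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;                   // running total for this word
        for (IntWritable v : values) {
            sum += v.get();
        }
        result.set(sum);
        context.write(key, result);    // emit <word, total occurrences>
    }
}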

Step-3. Write a Driver: The Driver program configures and runs the MapReduce job. We use the
main program to perform basic configuration such as:
• Job Name: the name of this job.
• Executable (Jar) Class: the main executable class; here, WordCount.
• Mapper Class: the class that overrides the "map" function; here, Map.
• Reducer Class: the class that overrides the "reduce" function; here, Reduce.
• Output Key: the type of the output key; here, Text.
• Output Value: the type of the output value; here, IntWritable.
• File Input Path
• File Output Path
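A driver covering these configuration points might look as follows (a sketch assuming the
mapper and reducer classes named above; the class and path names are placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");           // Job Name
        job.setJarByClass(WordCount.class);                      // Executable (Jar) Class
        job.setMapperClass(WordCountMapper.class);               // Mapper Class
        job.setReducerClass(WordCountReducer.class);             // Reducer Class
        job.setOutputKeyClass(Text.class);                       // Output Key: Text
        job.setOutputValueClass(IntWritable.class);              // Output Value: IntWritable
        FileInputFormat.addInputPath(job, new Path(args[0]));    // File Input Path
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // File Output Path
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Once packaged into a jar, the job would typically be launched with something like
hadoop jar WordCount.jar WordCount <hdfs_input_path> <hdfs_output_path>, where the two
paths are placeholders for the HDFS input and output directories.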

INPUT: A set of text data related to Shakespeare's comedies, glossary, and poems.



OUTPUT:
