BDC Output 3
        throws IOException, InterruptedException
    {
        int sum = 0;
        for (IntWritable x : values)
        {
            sum += x.get();
        }
        context.write(key, new IntWritable(sum));
    }
}

public static void main(String[] args) throws Exception
{
    Configuration conf = new Configuration();
    Job job = new Job(conf, "My Word Count Program");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    Path outputPath = new Path(args[1]);
    // Configuring the input/output path from the filesystem into the job
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    // Deleting the output path automatically from HDFS so that we don't have to delete it explicitly
    outputPath.getFileSystem(conf).delete(outputPath);
    // Exiting with status 0 only if the job completes successfully
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}
The entire MapReduce program can be fundamentally divided into three parts:
A. Mapper Phase Code
B. Reducer Phase Code
C. Driver Code
We will walk through the code for each of these three parts in turn.
Mapper Code:
public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
• We define the data types of the input and output key/value pairs after the class declaration using angle brackets.
• Input:
◦ The key is nothing but the byte offset of each line in the text file: LongWritable
◦ The value is each individual line of the file: Text
• Output:
◦ The key is the tokenized word: Text
◦ The value is hardcoded to 1 in our case: IntWritable
◦ Example – Dear 1, Bear 1, etc.
• We have written Java code that tokenizes each word and assigns it a hardcoded value of 1 (a sketch of this map() method is given below).
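The body of the map() method is not reproduced on this page. A minimal sketch of what it looks like, assuming the usual StringTokenizer approach (and that java.util.StringTokenizer is imported along with the Hadoop classes), is:
public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException
{
    // Break the incoming line into individual words
    StringTokenizer tokenizer = new StringTokenizer(value.toString());
    while (tokenizer.hasMoreTokens())
    {
        // Emit each word with the hardcoded count of 1
        context.write(new Text(tokenizer.nextToken()), new IntWritable(1));
    }
}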
Reducer Code:
public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable>
{
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException
    {
        // Add up all the 1s emitted by the mapper for this word
        int sum = 0;
        for (IntWritable x : values)
        {
            sum += x.get();
        }
        // Emit the word together with its total count
        context.write(key, new IntWritable(sum));
    }
}
• We have created a class Reduce which extends the class Reducer, just as we did for the Mapper.
• We define the data types of the input and output key/value pairs after the class declaration using angle brackets, as done for the Mapper.
• Both the input and the output of the Reducer are key/value pairs.
• Input:
◦ The key is nothing but the unique word that has been generated after the sorting and shuffling phase: Text
◦ The value is the list of counts (the 1s) emitted by the mapper for that word: Iterable<IntWritable>
• Output:
◦ The key is the word and the value is its aggregated count, e.g. Bear 2: Text, IntWritable
Driver Code:
Configuration conf = new Configuration();
Job job = new Job(conf, "My Word Count Program");
job.setJarByClass(WordCount.class);
job.setMapperClass(Map.class);
job.setReducerClass(Reduce.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
Path outputPath = new Path(args[1]);
// Configuring the input/output path from the filesystem into the job
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
• In the driver class, we set the configuration of our MapReduce job so that it can run on Hadoop.
• We specify the name of the job and the data types of the input/output of the mapper and reducer.
• We also specify the names of the mapper and reducer classes.
• The paths of the input and output folders are also specified.
• The method setInputFormatClass() specifies how the Mapper will read the input data, i.e. what the unit of work will be. Here we have chosen TextInputFormat, so that a single line is read by the mapper at a time from the input text file.
• The main() method is the entry point for the driver. In this method, we instantiate a new Configuration object for the job.
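Note that on Hadoop 2.x and later the Job(Configuration, String) constructor used above is deprecated; a minimal sketch of the same driver setup using the Job.getInstance() factory method (everything else stays exactly as in the listing above) might look like this:
// Sketch only: same driver configuration via the non-deprecated factory method
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "My Word Count Program");
job.setJarByClass(WordCount.class);
job.setMapperClass(Map.class);
job.setReducerClass(Reduce.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
// Remaining input/output format and path configuration is unchanged.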