BDA3

1) The document describes an experiment to write code for a MapReduce word count program using Hadoop. 2) It involves creating driver, mapper, and reducer classes in Eclipse and adding Hadoop libraries. 3) The mapper class splits input lines by spaces and emits (word, 1) pairs, the reducer class sums the 1s for each word, and the driver class runs the MapReduce job.

Uploaded by

nikithakatta0


Big data Analytics LAB

Experiment_no-03

Aim: Write driver code, mapper code, and reducer code to count the number of
words in a given file. (Hint: WordCount MapReduce program)

Description:

1) Open Oracle VM VirtualBox -> export cloudera -> start

2) Cloudera -> Settings -> System -> set processors to “2” (the default is “1”).

3) To launch “Cloudera Express”:

(i) Open a terminal in Cloudera and start the server with

“sudo service cloudera-scm-server start”

(ii) After the server starts successfully, click Cloudera Express -> Cloudera Manager

(iii) The username and password are both “cloudera”; enter them and click Login

4) In the browser, type “localhost:50070/dfshealth.jsp”

5) In Eclipse -> File -> New -> Java Project -> Project name “WordCount” -> Finish

6) Create three classes.

Right click on WordCount -> New->class->Name “WCDriver”

Right click on WordCount -> New->class->Name “WCMapper”

Right click on WordCount -> New->class->Name “WCReducer”

7) Add Hadoop libraries.

Right click on WordCount -> Build Path -> Configure Build Path -> Add External JARs
(/usr/lib/hadoop/hadoop-common-2.6.0-cdh5.13.0.jar,
/usr/lib/hadoop/hadoop-core-2.6.0-cdh5.13.0.jar)

Program:

WCMapper.java

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class WCMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    // Map function
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output, Reporter rep)
            throws IOException {

        String line = value.toString();

        // Splitting the line on spaces
        for (String word : line.split(" ")) {
            if (word.length() > 0) {
                output.collect(new Text(word), new IntWritable(1));
            }
        }
    }
}
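
The map step can be tried without a cluster. The following is a minimal plain-Java sketch, not part of the lab program: the class name MapStepSketch and the sample line are made up for illustration, and an in-memory list stands in for Hadoop's OutputCollector. It applies the same split-on-space rule as WCMapper and shows why the length check is needed.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the map step; no Hadoop classes are used.
public class MapStepSketch {

    // Same rule as WCMapper.map: split on single spaces, skip empty
    // tokens, and pair every remaining token with the count 1.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.split(" ")) {
            if (word.length() > 0) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // The two spaces between "big" and "data" yield an empty token,
        // which the length check drops.
        for (Map.Entry<String, Integer> pair : map("big  data big")) {
            System.out.println(pair.getKey() + "\t" + pair.getValue());
        }
    }
}
```

Because only the space character is a delimiter, a line such as "a<TAB>b" stays a single token, and punctuation remains attached to words. Splitting on the regular expression "\\s+" would be one way to treat tabs as separators too, if that behavior is wanted.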

WCReducer.java

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class WCReducer extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {

    // Reduce function
    public void reduce(Text key, Iterator<IntWritable> value,
                       OutputCollector<Text, IntWritable> output, Reporter rep)
            throws IOException {

        int count = 0;

        // Counting the frequency of each word
        while (value.hasNext()) {
            IntWritable i = value.next();
            count += i.get();
        }

        output.collect(key, new IntWritable(count));
    }
}
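
Between the map and reduce steps, the framework groups all values emitted for the same key and hands each group to the reducer. The sketch below imitates that in plain Java under stated assumptions: ReduceStepSketch and its shuffle helper are hypothetical names, a TreeMap stands in for the sorted grouping that Hadoop performs, and the word list is invented sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the shuffle and reduce steps; no Hadoop classes.
public class ReduceStepSketch {

    // Same loop as WCReducer.reduce: add up every value delivered for one key.
    public static int reduce(Iterator<Integer> values) {
        int count = 0;
        while (values.hasNext()) {
            count += values.next();
        }
        return count;
    }

    // Stand-in for the shuffle: collect one 1 per occurrence, grouped by
    // word, with keys kept in sorted order.
    public static Map<String, List<Integer>> shuffle(List<String> words) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String word : words) {
            grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Invented mapper output keys, in arrival order.
        List<String> words = Arrays.asList("big", "data", "big");
        for (Map.Entry<String, List<Integer>> e : shuffle(words).entrySet()) {
            System.out.println(e.getKey() + "\t" + reduce(e.getValue().iterator()));
        }
    }
}
```

For the sample list, the group for "big" holds two 1s and the group for "data" holds one, so the sums are 2 and 1.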

WCDriver.java

import java.io.IOException;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WCDriver extends Configured implements Tool {

    public int run(String[] args) throws IOException {
        // Expect the input path and the output path
        if (args.length < 2) {
            System.out.println("Please give valid inputs");
            return -1;
        }

        // Start from the configuration that ToolRunner passes in
        JobConf conf = new JobConf(getConf(), WCDriver.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        conf.setMapperClass(WCMapper.class);
        conf.setReducerClass(WCReducer.class);
        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(IntWritable.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        JobClient.runJob(conf);
        return 0;
    }

    // Main Method
    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new WCDriver(), args);
        System.out.println(exitCode);
        // Report the result to the calling shell as well
        System.exit(exitCode);
    }
}
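
What the configured job computes end to end can be previewed locally. This is only a sketch under assumptions: LocalWordCountSketch is a hypothetical class, the two input lines are invented, and a sorted in-memory map replaces HDFS and the MapReduce runtime. It prints each word and its count separated by a tab, which is the layout the default text output format uses for key/value pairs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local preview of the word count job; no Hadoop classes.
public class LocalWordCountSketch {

    // Applies the mapper's tokenizing rule and the reducer's summing
    // rule to every line, keeping the words in sorted order.
    public static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                if (word.length() > 0) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Invented sample input standing in for the file passed as args[0].
        List<String> lines = Arrays.asList("hadoop map reduce", "map reduce map");
        for (Map.Entry<String, Integer> e : count(lines).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

On a real cluster the same counts would be produced by WCMapper and WCReducer running as separate tasks, and the output directory given as the second argument must not already exist when the job starts.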

Right click on WordCount -> Export -> Java -> JAR file -> JAR file: “WordCount” -> Finish

OUTPUT:

In Terminal
