exp5bda
Theory:
MapReduce is a style of computing that has been implemented in several systems, including Google's internal
implementation (simply called MapReduce) and the popular open-source implementation Hadoop, which can be
obtained, along with the HDFS file system, from the Apache Foundation. You can use an implementation of
MapReduce to manage many large-scale computations in a way that is tolerant of hardware faults. All you need to
write are two functions, called Map and Reduce, while the system manages the parallel execution, coordination of
tasks that execute Map or Reduce, and also deals with the possibility that one of these tasks will fail to execute. In
brief, a MapReduce computation executes as follows:
1. Some number of Map tasks each are given one or more chunks from a distributed file system. These Map
tasks turn the chunk into a sequence of key-value pairs. The way key-value pairs are produced from the input data is
determined by the code written by the user for the Map function.
2. The key-value pairs from each Map task are collected by a master controller and sorted by key. The keys are
divided among all the Reduce tasks, so all key-value pairs with the same key wind up at the same Reduce task.
3. The Reduce tasks work on one key at a time, and combine all the values associated with that key in some way.
The manner of combination of values is determined by the code written by the user for the Reduce function.
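As a concrete illustration of the two user-written functions, the classic word-count example is sketched below in Java using Hadoop's org.apache.hadoop.mapreduce API: the Map function emits a (word, 1) pair for every word it sees, and the Reduce function sums all the counts that arrive for one word. Class names here are illustrative.

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Map: one input line in, a (word, 1) pair out for every token on the line.
class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer tokens = new StringTokenizer(value.toString());
        while (tokens.hasMoreTokens()) {
            word.set(tokens.nextToken());
            context.write(word, ONE);          // emit (word, 1)
        }
    }
}

// Reduce: all counts for one word arrive at the same Reduce task; sum them.
class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum)); // emit (word, total count)
    }
}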
Matrix Multiplication
Suppose we have an n x n matrix M, whose element in row i and column j will be denoted by M_ij. Suppose we also
have a vector v of length n, whose jth element is v_j. Then the matrix-vector product is the vector x of length n, whose ith
element is

    x_i = Σ_j M_ij * v_j

Let A and B be the two matrices to be multiplied and let the result be matrix C. If matrix A has dimensions p x q and
matrix B has dimensions q x r, then C = A x B has dimensions p x r, and its element in row i and column k is

    C_ik = Σ_j A_ij * B_jk    (j running over 1, ..., q)
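For comparison with the distributed version developed below, a plain single-machine implementation of C_ik = Σ_j A_ij * B_jk looks like this (a sketch in Java):

// Reference (non-distributed) matrix multiplication: C = A x B,
// where A is p x q and B is q x r.
static double[][] multiply(double[][] a, double[][] b) {
    int p = a.length, q = b.length, r = b[0].length;
    double[][] c = new double[p][r];
    for (int i = 0; i < p; i++) {
        for (int k = 0; k < r; k++) {
            double sum = 0;
            for (int j = 0; j < q; j++) {
                sum += a[i][j] * b[j][k];   // C_ik = sum over j of A_ij * B_jk
            }
            c[i][k] = sum;
        }
    }
    return c;
}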
● A MapReduce program executes in three stages, namely the map stage, the shuffle stage, and the reduce stage.
● Map stage − The map or mapper's job is to process the input data. Generally, the input data is in the form of a
file or directory and is stored in the Hadoop file system (HDFS). The input file is passed to the mapper function line
by line. The mapper processes the data and creates several small chunks of data.
● Reduce stage − This stage is the combination of the shuffle stage and the reduce stage. The reducer's job is
to process the data that comes from the mapper. After processing, it produces a new set of output, which is stored
in HDFS.
● During a MapReduce job, Hadoop sends the Map and Reduce tasks to the appropriate servers in the cluster;
the job itself is configured and submitted by a driver class, sketched after this list.
● The framework manages all the details of data-passing, such as issuing tasks, verifying task completion, and
copying data around the cluster between the nodes.
● Most of the computing takes place on nodes where the data already resides on local disks, which reduces network traffic.
● After completion of the given tasks, the cluster collects and reduces the data to form an appropriate result, and
sends it back to the Hadoop server.
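A minimal driver sketch for the matrix multiplication job follows. The class names MatrixDriver, MatrixMapper, and MatrixReducer, and the configuration keys p, q, r, are illustrative choices for this experiment (the mapper and reducer themselves are sketched under Step 2 below), not names fixed by Hadoop:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MatrixDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Dimensions of A (p x q) and B (q x r); illustrative values for a 2 x 2 case.
        conf.set("p", "2");
        conf.set("q", "2");
        conf.set("r", "2");

        Job job = Job.getInstance(conf, "matrix multiplication");
        job.setJarByClass(MatrixDriver.class);
        job.setMapperClass(MatrixMapper.class);
        job.setReducerClass(MatrixReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. /input
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // e.g. /output
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}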
Steps to follow:
Step 1: Create a folder in C:\ as ‘hadoop_project’ => C:\hadoop_project
Step 2: Inside the folder, right-click -> New -> Text Document and create three Java source files (mapper
code, reducer code, driver code), for example the sketches below and the driver shown earlier.
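A sketch of what the mapper and reducer files might contain, using the classic one-pass MapReduce matrix multiplication scheme. It assumes each input line has the form matrixName,row,column,value (e.g. A,0,1,2.0) and that the dimensions p, q, r are read from the Configuration set by the driver; the class names and input format are assumptions for this example, not requirements of Hadoop.

// --- MatrixMapper.java ---
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Map: each element of A is emitted once per column k of the result; each
// element of B is emitted once per row i of the result. The key (i,k)
// routes everything needed to compute C_ik to the same Reduce task.
public class MatrixMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        Configuration conf = context.getConfiguration();
        int p = Integer.parseInt(conf.get("p"));
        int r = Integer.parseInt(conf.get("r"));
        String[] t = value.toString().split(",");   // name,row,col,value
        if (t[0].equals("A")) {                     // t = "A", i, j, A_ij
            for (int k = 0; k < r; k++) {
                // key = (i,k), value = ("A", j, A_ij)
                context.write(new Text(t[1] + "," + k),
                              new Text("A," + t[2] + "," + t[3]));
            }
        } else {                                    // t = "B", j, k, B_jk
            for (int i = 0; i < p; i++) {
                // key = (i,k), value = ("B", j, B_jk)
                context.write(new Text(i + "," + t[2]),
                              new Text("B," + t[1] + "," + t[3]));
            }
        }
    }
}

// --- MatrixReducer.java ---
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Reduce: for one key (i,k), pair up A_ij and B_jk on the shared index j
// and sum the products to obtain C_ik.
public class MatrixReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        Configuration conf = context.getConfiguration();
        int q = Integer.parseInt(conf.get("q"));
        Map<Integer, Double> aRow = new HashMap<>();
        Map<Integer, Double> bCol = new HashMap<>();
        for (Text v : values) {
            String[] t = v.toString().split(",");   // name, j, value
            if (t[0].equals("A")) {
                aRow.put(Integer.parseInt(t[1]), Double.parseDouble(t[2]));
            } else {
                bCol.put(Integer.parseInt(t[1]), Double.parseDouble(t[2]));
            }
        }
        double sum = 0;
        for (int j = 0; j < q; j++) {
            sum += aRow.getOrDefault(j, 0.0) * bCol.getOrDefault(j, 0.0);
        }
        context.write(key, new Text(Double.toString(sum))); // "i,k <tab> C_ik"
    }
}

The reducer thus receives, for each output cell (i,k), every A_ij and B_jk it needs, matches them on the shared index j, and sums the products, which is exactly the formula C_ik = Σ_j A_ij * B_jk from above.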
Step 3: Open Command Prompt (cmd) and navigate to the project folder.
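For example, using the folder from Step 1:

cd C:\hadoop_project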
Step 4: Compile the Java files with the Hadoop dependencies and create a JAR file.
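One way to do this, assuming HADOOP_HOME points to the Hadoop installation (jar locations vary between Hadoop versions, so treat this classpath as a sketch):

javac -classpath "%HADOOP_HOME%\share\hadoop\common\*;%HADOOP_HOME%\share\hadoop\mapreduce\*" MatrixMapper.java MatrixReducer.java MatrixDriver.java
jar cf matrixmul.jar *.class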
Step 5: Prepare the input data file as matrix_input.txt.
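A small matrix_input.txt in the matrixName,row,column,value format assumed by the mapper sketch above, encoding two 2 x 2 matrices:

A,0,0,1.0
A,0,1,2.0
A,1,0,3.0
A,1,1,4.0
B,0,0,5.0
B,0,1,6.0
B,1,0,7.0
B,1,1,8.0

The job can then be run by copying the file into HDFS and launching the jar, for example:

hadoop fs -mkdir /input
hadoop fs -put matrix_input.txt /input
hadoop jar matrixmul.jar MatrixDriver /input /output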