exp5bda

The document outlines an experiment to implement a simple Map-Reduce algorithm for matrix multiplication using Hadoop. It describes the MapReduce process, including the Map and Reduce stages, and provides detailed steps for setting up the project, compiling Java code, and executing the MapReduce job. The conclusion emphasizes the practical application of these concepts through a hands-on example of multiplying 2x2 matrices.

Uploaded by

jpurva23ecs

EXPERIMENT NO : 5

Aim : To implement a simple algorithm in MapReduce: matrix multiplication / word count.

Software used : Hadoop

Theory :

MapReduce is a style of computing that has been implemented in several systems, including Google’s internal
implementation (simply called MapReduce) and the popular open-source implementation Hadoop, which can be
obtained, along with the HDFS file system, from the Apache Software Foundation. You can use an implementation of
MapReduce to manage many large-scale computations in a way that is tolerant of hardware faults. All you need to
write are two functions, called Map and Reduce. The system manages the parallel execution and the coordination of
the tasks that execute Map or Reduce, and also deals with the possibility that one of these tasks will fail to execute. In
brief, a MapReduce computation executes as follows:

1. Some number of Map tasks are each given one or more chunks from a distributed file system. These Map
tasks turn the chunk into a sequence of key-value pairs. The way key-value pairs are produced from the input data is
determined by the code written by the user for the Map function.

2. The key-value pairs from each Map task are collected by a master controller and sorted by key. The keys are
divided among all the Reduce tasks, so all key-value pairs with the same key wind up at the same Reduce task.

3. The Reduce tasks work on one key at a time, and combine all the values associated with that key in some way.
The manner of combination of values is determined by the code written by the user for the Reduce function.
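Steps 2 and 3 above can be sketched in plain Java. This is a minimal in-memory illustration, not Hadoop code: the class name ShuffleSketch, its method names, and the sample pairs are hypothetical, and summing is just one possible Reduce function. The partition rule (hash of the key modulo the number of Reduce tasks) mirrors the default behaviour of Hadoop's HashPartitioner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of steps 2 and 3; this is not Hadoop code.
class ShuffleSketch {

    // Pick the Reduce task for a key. Equal keys always give the same index.
    static int partition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Step 2: group the Map output by key (sorted), one table per Reduce task.
    static List<TreeMap<String, List<Integer>>> shuffle(
            List<Map.Entry<String, Integer>> mapOutput, int numReduceTasks) {
        List<TreeMap<String, List<Integer>>> perTask = new ArrayList<>();
        for (int t = 0; t < numReduceTasks; t++) {
            perTask.add(new TreeMap<>());
        }
        for (Map.Entry<String, Integer> pair : mapOutput) {
            int task = partition(pair.getKey(), numReduceTasks);
            perTask.get(task)
                   .computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return perTask;
    }

    // Step 3: a user-defined Reduce function; here it simply sums the values.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Run every Reduce task over its keys and merge the results for display.
    static TreeMap<String, Integer> run(
            List<Map.Entry<String, Integer>> mapOutput, int numReduceTasks) {
        TreeMap<String, Integer> result = new TreeMap<>();
        for (TreeMap<String, List<Integer>> task : shuffle(mapOutput, numReduceTasks)) {
            task.forEach((key, values) -> result.put(key, reduce(key, values)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical key-value pairs, as if produced by several Map tasks.
        List<Map.Entry<String, Integer>> mapOutput = List.of(
                Map.entry("b", 2), Map.entry("a", 1),
                Map.entry("b", 3), Map.entry("c", 5));
        System.out.println(run(mapOutput, 2)); // prints {a=1, b=5, c=5}
    }
}
```

Both pairs with key "b" reach the same Reduce task, which is what allows that task to combine them into a single result.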

Matrix Multiplication

Suppose we have an n x n matrix M, whose element in row i and column j is denoted by $M_{ij}$. Suppose we also
have a vector v of length n, whose jth element is $v_j$. Then the matrix-vector product is the vector x of length n, whose ith
element $x_i$ is given by

$x_i = \sum_{j=1}^{n} M_{ij} v_j$
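A minimal in-memory sketch of how the matrix-vector product fits the Map and Reduce pattern, assuming the whole vector v is available to every Map task. The Map step turns each matrix element in row i, column j into the pair (i, element times v_j), and the Reduce step sums the values for each key i. The class and method names are hypothetical, and no Hadoop API is used.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration only; class and method names are hypothetical.
class MatrixVectorSketch {

    static int[] multiply(int[][] m, int[] v) {
        // Map: each element in row i, column j produces the pair (i, m[i][j] * v[j]).
        // Grouping by key i and the Reduce step (a sum) are folded into merge().
        Map<Integer, Integer> sumsByRow = new TreeMap<>();
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < v.length; j++) {
                sumsByRow.merge(i, m[i][j] * v[j], Integer::sum);
            }
        }
        // Each reduced pair (i, x_i) is one element of the result vector.
        int[] x = new int[m.length];
        sumsByRow.forEach((row, sum) -> x[row] = sum);
        return x;
    }

    public static void main(String[] args) {
        int[][] m = {{1, 2}, {3, 4}};
        int[] v = {5, 6};
        System.out.println(Arrays.toString(multiply(m, v))); // prints [17, 39]
    }
}
```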

Let A and B be the two matrices to be multiplied, and let the result be matrix C. Matrix A has dimensions
L x M and matrix B has dimensions M x N, so C has dimensions L x N. In the Map phase:
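One common single-pass scheme, given here as an assumption because the experiment text does not list the steps, works as follows. The Map function sends each element of A in row i, column j to every cell (i, k) of C, and each element of B in row j, column k to every cell (i, k) of C, tagging the value with its source matrix and the shared index j. The Reduce function for a key (i, k) multiplies the A and B values that share the same j and adds the products. The sketch below simulates this in memory with zero-based indices; the class MatrixMultiplySketch and its members are hypothetical and no Hadoop API is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration only; not the Hadoop API.
class MatrixMultiplySketch {

    // One Map output value: source matrix, shared index j, and the element.
    static final class Tagged {
        final char matrix;
        final int j;
        final int value;

        Tagged(char matrix, int j, int value) {
            this.matrix = matrix;
            this.j = j;
            this.value = value;
        }
    }

    // Map phase plus grouping: the key "i,k" names the cell of C that a value feeds.
    static Map<String, List<Tagged>> mapPhase(int[][] a, int[][] b) {
        int rowsA = a.length, inner = b.length, colsB = b[0].length;
        Map<String, List<Tagged>> grouped = new TreeMap<>();
        for (int i = 0; i < rowsA; i++) {
            for (int j = 0; j < inner; j++) {
                for (int k = 0; k < colsB; k++) {
                    // a[i][j] contributes to every cell (i, k) in row i of C.
                    grouped.computeIfAbsent(i + "," + k, key -> new ArrayList<>())
                           .add(new Tagged('A', j, a[i][j]));
                    // b[j][k] contributes to every cell (i, k) in column k of C.
                    grouped.computeIfAbsent(i + "," + k, key -> new ArrayList<>())
                           .add(new Tagged('B', j, b[j][k]));
                }
            }
        }
        return grouped;
    }

    // Reduce phase for one key: pair values by j, multiply, and sum.
    static int reduce(List<Tagged> values, int inner) {
        int[] fromA = new int[inner];
        int[] fromB = new int[inner];
        for (Tagged t : values) {
            if (t.matrix == 'A') {
                fromA[t.j] = t.value;
            } else {
                fromB[t.j] = t.value;
            }
        }
        int sum = 0;
        for (int j = 0; j < inner; j++) {
            sum += fromA[j] * fromB[j];
        }
        return sum;
    }

    static int[][] multiply(int[][] a, int[][] b) {
        int[][] c = new int[a.length][b[0].length];
        mapPhase(a, b).forEach((key, values) -> {
            String[] ik = key.split(",");
            c[Integer.parseInt(ik[0])][Integer.parseInt(ik[1])] = reduce(values, b.length);
        });
        return c;
    }

    public static void main(String[] args) {
        int[][] a = {{1, 2, 3}, {4, 5, 6}};      // 2 x 3
        int[][] b = {{7, 8}, {9, 10}, {11, 12}}; // 3 x 2
        System.out.println(Arrays.deepToString(multiply(a, b))); // prints [[58, 64], [139, 154]]
    }
}
```

Every cell of C receives 2M values in this scheme, so it trades extra network traffic for a single MapReduce pass.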

Workflow of a MapReduce program to count words:


● During a MapReduce job, Hadoop sends the Map and Reduce tasks to the appropriate servers in the cluster.
Generally, the MapReduce paradigm is based on sending the computation to where the data resides.

● A MapReduce program executes in three stages, namely the map stage, the shuffle stage, and the reduce stage.

● Map stage − The mapper’s job is to process the input data. Generally the input data is in the form of a
file or directory and is stored in the Hadoop Distributed File System (HDFS). The input file is passed to the mapper function line
by line. The mapper processes the data and creates several small chunks of data.

● Reduce stage − This stage is the combination of the Shuffle stage and the Reduce stage. The Reducer’s job is
to process the data that comes from the mapper. After processing, it produces a new set of output, which will be stored
in the HDFS.

● The framework manages all the details of data-passing such as issuing tasks, verifying task completion, and
copying data around the cluster between the nodes.

● Most of the computing takes place on nodes that hold the data on local disks, which reduces network traffic.

● After completion of the given tasks, the cluster collects and reduces the data to form an appropriate result, and
sends it back to the Hadoop server.

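The word-count workflow can be imitated in plain Java to make the three stages concrete. This is a hedged in-memory sketch, not the Hadoop Mapper and Reducer classes: the class WordCountSketch, its method names, the lower-casing of words, and the sample lines are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of the map, shuffle, and reduce stages; not Hadoop code.
class WordCountSketch {

    // Map stage: called once per input line; emits one (word, 1) pair per word.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle stage: collect the values of equal keys together, sorted by key.
    static TreeMap<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        TreeMap<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce stage: add up the 1s collected for one word.
    static int reduce(String word, List<Integer> ones) {
        int total = 0;
        for (int one : ones) {
            total += one;
        }
        return total;
    }

    // Pass the input to the mapper line by line, then shuffle and reduce.
    static TreeMap<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : lines) {
            mapped.addAll(map(line));
        }
        TreeMap<String, Integer> counts = new TreeMap<>();
        shuffle(mapped).forEach((word, ones) -> counts.put(word, reduce(word, ones)));
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("Hadoop runs MapReduce", "MapReduce runs on Hadoop");
        System.out.println(wordCount(lines)); // prints {hadoop=2, mapreduce=2, on=1, runs=2}
    }
}
```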

Steps to follow:
Step 1 : Create a folder named ‘hadoop_project’ in C:\ => C:\hadoop_project
Step 2 : Inside the folder, right-click -> New -> Text Document and create three Java source files (mapper
code, reducer code, driver code)

Step 3 : Open the command prompt (cmd) and navigate to the project folder

Step 4 : Compile the Java files with the Hadoop dependencies and create a JAR file
Step 5 : Prepare the input data file as matrix_input.txt

Step 6 : Create a directory in Hadoop (HDFS) named matrix_input

Step 7 : Put the input file in the created directory

Step 8 : Run the JAR file

C:\Users\Administrator>hadoop jar C:\hadoop_project\matrix-multiplication.jar MatrixMultiplicationDriver /matrix_input /matrix_output

Step 9 : Check the output folder for the output file

Step 10 : View the output

Step 11 : Get the output as a text file on the desktop


Consider matrices A and B of dimension 2 x 2 and perform matrix multiplication using MapReduce.
Write out all the steps as discussed in class.
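To make the 2 x 2 case concrete, the sketch below traces the map, shuffle, and reduce steps in memory for one pair of example matrices. Everything specific in it is an assumption for illustration: the values A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]], the input line format "matrix,row,col,value", the zero-based indices, and the class name TwoByTwoTrace. It does not use the Hadoop API and is not necessarily the method discussed in class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory trace for two 2 x 2 matrices; the input line format is hypothetical.
class TwoByTwoTrace {
    static final int SIZE = 2; // both matrices are 2 x 2 in this exercise

    // Map: one line "matrix,row,col,value" becomes SIZE pairs keyed by a cell "i,k" of C.
    static List<String[]> map(String line) {
        String[] f = line.split(",");
        int row = Integer.parseInt(f[1]);
        int col = Integer.parseInt(f[2]);
        List<String[]> out = new ArrayList<>();
        for (int n = 0; n < SIZE; n++) {
            if (f[0].equals("A")) {
                // An element of A feeds every cell in its row of C; the value keeps j = col.
                out.add(new String[] {row + "," + n, "A," + col + "," + f[3]});
            } else {
                // An element of B feeds every cell in its column of C; the value keeps j = row.
                out.add(new String[] {n + "," + col, "B," + row + "," + f[3]});
            }
        }
        return out;
    }

    // Reduce: for one cell of C, multiply the A and B values with equal j and sum.
    static int reduce(List<String> values) {
        int[] a = new int[SIZE];
        int[] b = new int[SIZE];
        for (String v : values) {
            String[] f = v.split(",");
            int j = Integer.parseInt(f[1]);
            if (f[0].equals("A")) {
                a[j] = Integer.parseInt(f[2]);
            } else {
                b[j] = Integer.parseInt(f[2]);
            }
        }
        int sum = 0;
        for (int j = 0; j < SIZE; j++) {
            sum += a[j] * b[j];
        }
        return sum;
    }

    // Map every line, group by key (shuffle), then reduce each key to "i,k<TAB>value".
    static List<String> run(List<String> inputLines) {
        Map<String, List<String>> shuffled = new TreeMap<>();
        for (String line : inputLines) {
            for (String[] kv : map(line)) {
                shuffled.computeIfAbsent(kv[0], key -> new ArrayList<>()).add(kv[1]);
            }
        }
        List<String> output = new ArrayList<>();
        shuffled.forEach((cell, values) -> {
            System.out.println("shuffle " + cell + " -> " + values);
            output.add(cell + "\t" + reduce(values));
        });
        return output;
    }

    public static void main(String[] args) {
        List<String> input = List.of(
                "A,0,0,1", "A,0,1,2", "A,1,0,3", "A,1,1,4",
                "B,0,0,5", "B,0,1,6", "B,1,0,7", "B,1,1,8");
        run(input).forEach(System.out::println);
    }
}
```

For these example values the reduce step yields 19, 22, 43, and 50 for cells (0,0), (0,1), (1,0), and (1,1), which matches the product computed by hand: for instance, cell (0,0) is 1 x 5 + 2 x 7 = 19.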
Conclusion :
