ccpractical 7
Practical-7
Aim: Demonstrate the use of map and reduce tasks.
Theory:
MapReduce
MapReduce is a programming model for processing large data sets in parallel across a
distributed cluster. Google introduced it in the 2004 paper "MapReduce: Simplified Data
Processing on Large Clusters." The model has two phases: the map phase and the reduce
phase. The mapper takes its input as key-value pairs and emits intermediate key-value
pairs. These intermediate pairs are grouped by key and fed to the reducer as input. The
reducer starts only after the map phase has finished. It also takes its input in
key-value format, and its output is the final output.
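The two phases can be illustrated with a small word count written in plain Python. This is only a minimal sketch that runs in a single process and does not use Hadoop. The function names mapper, shuffle, reducer, and word_count are assumptions chosen for illustration, and the shuffle step stands in for the grouping that a real framework performs between the two phases.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group the mapped values by key (hypothetical stand-in for the framework's shuffle)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Reduce phase: combine all values collected for one key into one result."""
    return key, sum(values)


def word_count(lines):
    # Run every mapper to completion before any reducer starts.
    mapped = [pair for line in lines for pair in mapper(line)]
    grouped = shuffle(mapped)
    return dict(reducer(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["map reduce map", "reduce tasks"]
    print(word_count(sample))  # {'map': 2, 'reduce': 2, 'tasks': 1}
```

Here the mapper output for the first line is ('map', 1), ('reduce', 1), ('map', 1). After grouping, the reducer receives 'map' with the values [1, 1] and returns ('map', 2).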
Step 5: Download the latest Hadoop version from the Apache Hadoop
mapred-site.xml
yarn-site.xml
Conclusion:
Thus, we have successfully demonstrated the use of map and reduce
tasks.