HADOOP 1.X Installation Steps On Ubuntu

This document outlines 46 steps to install Hadoop 1.x in pseudo-distributed mode on Ubuntu. It covers installing Java, configuring environment variables, creating a dedicated Hadoop user, setting up SSH keys for passwordless access, downloading and extracting the Hadoop tarball, configuring the core-site.xml, hdfs-site.xml, and mapred-site.xml files, and starting the daemons that make up the fully configured Hadoop installation.


HADOOP INSTALLATION STEPS (Pseudo-Distributed Mode)

1. $ uname -m  to find whether the OS is 32-bit or 64-bit


2. $ sudo apt-get update  to refresh the package lists in Ubuntu
3. $ sudo apt-get install openjdk-7-jre  to install the JRE
4. $ java -version  to confirm that Java installed successfully and to see its version
5. $ which java  to find the path where Java is installed
6. $ su root  to switch to the root user
7. $ export JAVA_HOME=/usr  to set JAVA_HOME (based on the path found in step 5)
8. $ export PATH=$JAVA_HOME/bin:$PATH  to put JAVA_HOME on the PATH
9. $ echo $PATH  to confirm that the path is correctly set for Java
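
A quick way to double-check steps 7-9 before moving on (a minimal sketch; it assumes Java landed under /usr as found in step 5):

$ echo $JAVA_HOME               # expect /usr
$ $JAVA_HOME/bin/java -version  # should print the same version as step 4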
10. $ sudo apt-get install openssh-server  SSH is required for inter-machine connections
11. $ sudo apt-get install openssh-client  SSH is required for inter-machine connections
12. $ sudo addgroup hadoop  create a dedicated group for Hadoop users
13. $ sudo adduser --ingroup hadoop hduser  create a user named hduser and add it to the hadoop group
14. $ su hduser  switch to hduser
15. $ ssh-keygen -t rsa -P ""  create a passwordless RSA key for hduser
16. $ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys  append the generated public key to the machine's authorized keys, so that communication happens seamlessly
17. $ ssh localhost  add localhost to the list of known hosts for seamless communication between machines
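
If step 17 still prompts for a password, the usual cause is overly open permissions on the .ssh files (OpenSSH refuses keys it considers unsafe). A hedged fix-up, assuming the default StrictModes setting:

$ chmod 700 $HOME/.ssh
$ chmod 600 $HOME/.ssh/authorized_keys
$ ssh localhost   # should now log in without asking for a password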
18.
19. $ ifconfig  to find the IP address of the VM
20. Copy or download the Hadoop tarball (hadoop-1.x.tar.gz) from www.apache.org
21. $ sudo cp <hadoop-1.x.tar.gz from the download folder> /usr/local/  copy the tarball to the standard folder /usr/local/
22. $ su <user>  switch to a user who has sudo privileges
23. $ cd /usr/local/  get into the folder where Hadoop is to be installed
24. $ sudo tar xzf hadoop-1.x.tar.gz  unpack the Hadoop tar file (it creates a folder)
25. $ sudo mv hadoop-1.x hadoop  rename the folder carrying the version number to plain hadoop (for simplicity)
26. $ sudo chown -R hduser:hadoop hadoop  give hduser of the hadoop group full ownership of the hadoop folder
27. $ cd /home/hduser  get to the home directory of hduser
28. $ ls -al  list all files, including hidden ones
29. $ gedit .bashrc  modify hduser's .bashrc to set the path and environment for hduser
30. $ export HADOOP_HOME=/usr/local/hadoop  (add to .bashrc, together with the lines in steps 31-32)

31. $ export JAVA_HOME=/usr  (add to .bashrc)
32. $ export PATH=$PATH:$HADOOP_HOME/bin  (add to .bashrc)
33. $ sudo mkdir -p /app/hadoop/tmp  this folder is created to act as temporary storage for Hadoop
34. $ sudo chown -R hduser:hadoop /app/hadoop/tmp  give hduser full privileges on the folder
35. $ cd /usr/local/hadoop/conf/  change to the folder where Hadoop configuration is done
36. $ nano hadoop-env.sh  open the file in the editor and add the line below
    a. $ export JAVA_HOME=/usr  setting JAVA_HOME
37. $ nano core-site.xml  open in the editor and add the following lines

<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>Default FS, NN Machine and the port#</description>
</property>
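
The fs.default.name value makes hdfs://localhost:54310 the filesystem that every hadoop command talks to by default. A small sanity check, runnable once the daemons are up (steps 42-43); both commands below should list the same contents:

$ hadoop fs -ls /                          # uses fs.default.name implicitly
$ hadoop fs -ls hdfs://localhost:54310/    # spells the same filesystem out explicitly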

38. $ nano hdfs-site.xml  open in the editor and add the following lines

<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
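
dfs.replication is set to 1 because a pseudo-distributed cluster has only one DataNode to hold block copies. As the description notes, it is only a default; a sketch of overriding it per file (hypothetical paths, and a factor above 1 only makes sense on a real multi-node cluster):

$ hadoop fs -D dfs.replication=2 -put localfile /user/hduser/remotefile  # set at create time
$ hadoop fs -setrep -w 2 /user/hduser/remotefile                         # change it afterwards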

39. $ nano mapred-site.xml  open in the editor and add the following lines

<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
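
Besides the RPC ports configured above (54310 for the NameNode, 54311 for the JobTracker), the Hadoop 1.x daemons also serve web UIs, by default on port 50070 (NameNode) and 50030 (JobTracker). A quick check once the daemons are running (steps 42-43):

$ curl -s http://localhost:50070 | head   # NameNode status page
$ curl -s http://localhost:50030 | head   # JobTracker status page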

40. $ su hduser  change to the user under whom Hadoop is configured to run
41. $ hadoop namenode -format  create the HDFS structure for Hadoop
42. $ start-dfs.sh  a shell script that starts the DFS-related daemons
43. $ start-mapred.sh  a shell script that starts the MapReduce-related daemons
44. $ jps  show all the active Hadoop daemons and their process IDs
45. $ su <user>  change to a user with sudo privileges
46. $ sudo apt-get install openjdk-7-jdk  to install jps (in case it is not installed)
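
In a healthy pseudo-distributed setup, step 44 typically reports the five Hadoop 1.x daemons plus jps itself (the process IDs shown here are illustrative; yours will differ):

$ jps
2287 NameNode
2421 DataNode
2563 SecondaryNameNode
2651 JobTracker
2789 TaskTracker
2890 Jps

As a final end-to-end check, the bundled examples jar can run a tiny MapReduce job (the jar name carries the release's version number, so adjust the pattern to match the tarball that was unpacked):

$ hadoop jar $HADOOP_HOME/hadoop-examples-*.jar pi 2 5   # estimates pi with 2 maps, 5 samples each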

####################################################################################################
