HADOOP INSTALLATION GUIDE

Name: Fazal Rahim Daftani

Roll No: 1272241004
Teacher: Veshal Pawar

Step 1: Install Java Development Kit

To start, you'll need to install the Java Development Kit (JDK) on your Ubuntu system. The
default Ubuntu repositories offer both Java 8 and Java 11, but it's recommended to use Java
8 for compatibility with Hive. You can use the following command to install it:

sudo apt update && sudo apt install openjdk-8-jdk

Step 2: Verify Java Version

Once the Java Development Kit is successfully installed, you should check the version to
ensure it's working correctly:

java -version
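
Because this guide relies on Java 8, it can help to check the major version in a script rather than by eye. The following is a minimal sketch, not part of the original guide; the helper name 'java_major' is an assumption introduced for illustration. It relies on Java 8 reporting its version as 1.8.x, while later releases report 11.x, 17.x, and so on:

```shell
# Hypothetical helper: print the major version for a Java version string.
# Java 8 reports itself as "1.8.0_xxx"; Java 9 and later report "11.0.x", "17.0.x", ...
java_major() {
    case "$1" in
        1.*) jm_rest="${1#1.}"; echo "${jm_rest%%.*}" ;;
        *)   echo "${1%%[.-]*}" ;;
    esac
}

# Only query the JVM if one is installed; `java -version` writes to stderr.
if command -v java >/dev/null 2>&1; then
    version="$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
    echo "Java major version: $(java_major "$version")"
fi
```

If the reported major version is not 8, the JAVA_HOME path used later in this guide will probably need adjusting.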

Step 3: Install SSH

Hadoop's start and stop scripts use SSH (Secure Shell) to launch and manage the daemons on
each node of the cluster, including the local machine in a single-node setup. Install the SSH
client and server:

sudo apt install ssh

Step 4: Create the Hadoop User

Create a dedicated user for running the Hadoop components. The following command creates
the user and prompts you to set a password:

sudo adduser hadoop

Step 5: Switch User

Switch to the newly created 'hadoop' user using the following command:

su - hadoop

Step 6: Configure SSH

Next, set up password-less SSH access for the 'hadoop' user so that Hadoop's scripts can
connect without prompting. Generate an SSH key pair, pressing Enter at each prompt to accept
the default file location and an empty passphrase:

ssh-keygen -t rsa

Step 7: Set Permissions

Append the generated public key to the 'authorized_keys' file and set the proper permissions:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

chmod 640 ~/.ssh/authorized_keys
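
The mode above keeps the file writable only by its owner, which is what the SSH server requires. A stricter, commonly used alternative is to give the owner exclusive access to both the directory and the key file. The following is a minimal sketch under that assumption; the helper name 'secure_ssh_dir' is hypothetical and not part of the original guide:

```shell
# Hypothetical helper: apply owner-only permissions to an SSH directory.
# Defaults to ~/.ssh when no directory is given.
secure_ssh_dir() {
    ssh_dir="${1:-$HOME/.ssh}"
    chmod 700 "$ssh_dir" && chmod 600 "$ssh_dir/authorized_keys"
}

# Usage (as the 'hadoop' user):
#   secure_ssh_dir
#   ls -ld ~/.ssh ~/.ssh/authorized_keys
```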

Step 8: SSH to the localhost

Connect to localhost over SSH to confirm that the key works. On the first connection you will
be asked to confirm the host's key fingerprint and add it to the known hosts. Type 'yes' and
hit Enter:

ssh localhost

Step 9: Switch User

Switch to the 'hadoop' user again using the following command:

su - hadoop

Step 10: Install Hadoop

To begin, download Hadoop version 3.3.6 using the 'wget' command:

wget https://dlcdn.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz
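
Before extracting the archive, you may want to confirm that the download is intact by comparing its SHA-512 digest with the one published for the release. The following is a minimal sketch; the helper name 'verify_sha512' is hypothetical, and the expected digest is a placeholder that you must copy from the Apache download page yourself:

```shell
# Hypothetical helper: succeed only if FILE's SHA-512 digest equals EXPECTED.
# $1 = file to check, $2 = expected digest in hex (upper or lower case).
verify_sha512() {
    vs_actual="$(sha512sum "$1" | awk '{print $1}')"
    vs_expected="$(printf '%s' "$2" | tr 'A-F' 'a-f')"
    [ "$vs_actual" = "$vs_expected" ]
}

# Usage: paste the published digest in place of the placeholder.
#   verify_sha512 hadoop-3.3.6.tar.gz "<published-sha512-hex>" && echo "checksum OK"
```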

Once the download is complete, extract the archive using the 'tar' command. Then rename the
extracted folder to 'hadoop'; the paths used in the rest of this guide assume
/home/hadoop/hadoop:

tar -xvzf hadoop-3.3.6.tar.gz

mv hadoop-3.3.6 hadoop

Next, you need to set up environment variables for Java and Hadoop in your system. Open
the '~/.bashrc' file in your preferred text editor. If you're using 'nano', you can paste
with 'Ctrl+Shift+V', then save and exit with 'Ctrl+X', 'Y', and 'Enter':

nano ~/.bashrc

Append the following lines to the file:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

Load the above configuration into the current environment:

source ~/.bashrc
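
A mistyped path in '~/.bashrc' usually shows up only later as a confusing startup error, so a quick check at this point can save time. The following is a minimal sketch; the helper name 'check_env_dir' is hypothetical and not part of the original guide:

```shell
# Hypothetical helper: report whether a variable's value is an existing directory.
check_env_dir() {
    # $1 = variable name (used in the message), $2 = its current value
    if [ -z "$2" ]; then
        echo "$1 is not set"; return 1
    elif [ ! -d "$2" ]; then
        echo "$1 points to a missing directory: $2"; return 1
    fi
    echo "$1 OK: $2"
}

# Report on the two variables the rest of the guide depends on.
check_env_dir JAVA_HOME "${JAVA_HOME:-}" || true
check_env_dir HADOOP_HOME "${HADOOP_HOME:-}" || true
```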

Additionally, you should configure the 'JAVA_HOME' in the 'hadoop-env.sh' file. Edit this file
with a text editor:

nano $HADOOP_HOME/etc/hadoop/hadoop-env.sh

Search for the line containing 'export JAVA_HOME', uncomment it, and set it to the Java 8 path:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

Step 11: Configuring Hadoop

Create the namenode and datanode directories within the 'hadoop' user's home directory
using the following commands:

cd hadoop/

mkdir -p ~/hadoopdata/hdfs/{namenode,datanode}

Next, edit the 'core-site.xml' file. If your machine is not reached as 'localhost', replace
'localhost' in the value below with your system hostname:

nano $HADOOP_HOME/etc/hadoop/core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

Save and close the file. Then, edit the 'hdfs-site.xml' file:

nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml

Change the NameNode and DataNode directory paths as shown below:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///home/hadoop/hadoopdata/hdfs/namenode</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///home/hadoop/hadoopdata/hdfs/datanode</value>
    </property>
</configuration>
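
Once 'core-site.xml' and 'hdfs-site.xml' are saved, you can ask Hadoop which values it actually reads, which catches typos in property names. The following is a minimal sketch using the 'hdfs getconf -confKey' command; the helper name 'show_conf' is hypothetical and not part of the original guide:

```shell
# Hypothetical helper: print "key = value" for each configuration key given,
# using `hdfs getconf -confKey`, which reads the effective Hadoop configuration.
show_conf() {
    for sc_key in "$@"; do
        printf '%s = %s\n' "$sc_key" "$(hdfs getconf -confKey "$sc_key")"
    done
}

# Run the check only when the `hdfs` command is on the PATH.
if command -v hdfs >/dev/null 2>&1; then
    show_conf fs.defaultFS dfs.replication dfs.namenode.name.dir dfs.datanode.data.dir
fi
```

The printed values should match the ones entered in the two files above.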

Save and close the file. Then, edit the 'mapred-site.xml' file:

nano $HADOOP_HOME/etc/hadoop/mapred-site.xml

Make the following changes, so that HADOOP_MAPRED_HOME points to the Hadoop installation
directory for the application master, map tasks, and reduce tasks:

<configuration>
    <property>
        <name>yarn.app.mapreduce.am.env</name>
        <value>HADOOP_MAPRED_HOME=/home/hadoop/hadoop</value>
    </property>
    <property>
        <name>mapreduce.map.env</name>
        <value>HADOOP_MAPRED_HOME=/home/hadoop/hadoop</value>
    </property>
    <property>
        <name>mapreduce.reduce.env</name>
        <value>HADOOP_MAPRED_HOME=/home/hadoop/hadoop</value>
    </property>
</configuration>

Finally, edit the 'yarn-site.xml' file:

nano $HADOOP_HOME/etc/hadoop/yarn-site.xml

Make the following changes:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

Step 12: Start Hadoop Cluster

Before starting the Hadoop cluster for the first time, format the NameNode as the 'hadoop'
user. Do this only once, because reformatting erases the existing HDFS metadata:

hdfs namenode -format

Once the Namenode directory is successfully formatted with the HDFS file system, you will
see the message "Storage directory /home/hadoop/hadoopdata/hdfs/namenode has been
successfully formatted." Start the Hadoop cluster using:

start-all.sh

You can check that the Hadoop daemons are running by listing the Java processes with 'jps':

jps
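
On a single-node setup started with 'start-all.sh', the list is expected to contain NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager. The following is a minimal sketch that reports which of these are absent; the helper name 'missing_daemons' is hypothetical and not part of the original guide:

```shell
# Hypothetical helper: given the text printed by `jps`, print the names of the
# expected single-node daemons that do not appear in it.
missing_daemons() {
    md_missing=""
    for md_name in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <name>" per line; compare the name field exactly so
        # that "NameNode" is not satisfied by "SecondaryNameNode".
        if ! printf '%s\n' "$1" | awk -v n="$md_name" '$2 == n {found = 1} END {exit !found}'; then
            md_missing="$md_missing $md_name"
        fi
    done
    echo "${md_missing# }"
}

# Usage:
#   missing="$(missing_daemons "$(jps)")"
#   [ -z "$missing" ] && echo "all daemons running" || echo "not running: $missing"
```

If a daemon is reported as not running, its log file under $HADOOP_HOME/logs is the usual place to look for the reason.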

Step 13: Access Hadoop Namenode and Resource Manager

First, determine your IP address by running:

ifconfig

If the 'ifconfig' command is not found, install 'net-tools' and run it again:

sudo apt install net-tools

To access the Namenode, open your web browser and visit http://your-server-ip:9870.
Replace 'your-server-ip' with your actual IP address. You should see the Namenode web
interface.

To access the Resource Manager, open your web browser and visit the URL
http://your-server-ip:8088. You should see the Resource Manager web interface.

Step 14: Verify the Hadoop Cluster

The Hadoop cluster is now installed and configured. To test it, create some directories in
the HDFS filesystem using the following commands:

hdfs dfs -mkdir /test1

hdfs dfs -mkdir /logs

Next, run the following command to list the contents of the HDFS root directory:

hdfs dfs -ls /

The listing should include the '/test1' and '/logs' directories created above.

Also, put some files into the Hadoop file system. For example, put log files from the host
machine into the Hadoop file system:

hdfs dfs -put /var/log/* /logs/
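
Some files under /var/log may not be readable by the 'hadoop' user, so the command above can report errors even when HDFS itself is healthy. A smaller, self-contained check is to write one file and read it back. The following is a minimal sketch; the helper name 'hdfs_roundtrip' and the file name 'roundtrip.txt' are hypothetical and not part of the original guide:

```shell
# Hypothetical helper: write a small file into HDFS, read it back, and
# succeed only if the content survives the round trip.
hdfs_roundtrip() {
    rt_dir="${1:-/test1}"
    rt_tmp="$(mktemp)" || return 1
    echo "hello hdfs" > "$rt_tmp"
    # -f overwrites the target if an earlier run left it behind.
    hdfs dfs -put -f "$rt_tmp" "$rt_dir/roundtrip.txt" &&
        [ "$(hdfs dfs -cat "$rt_dir/roundtrip.txt")" = "hello hdfs" ]
    rt_status=$?
    rm -f "$rt_tmp"
    return "$rt_status"
}

# Usage (with the cluster running):
#   hdfs_roundtrip /test1 && echo "HDFS read/write OK"
```

The test file is left in HDFS afterwards; it can be removed with 'hdfs dfs -rm /test1/roundtrip.txt'.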

You can also verify the above files and directories in the Hadoop web interface. Go to the
Namenode web interface and click on Utilities => Browse the file system. You should see the
directories you created earlier.

Step 15: To Stop Hadoop Services

To stop all Hadoop services, run the following command as the 'hadoop' user:

stop-all.sh

In summary, you've learned how to install Hadoop on Ubuntu. Now, you're ready to unlock
the potential of big data analytics. Happy exploring!

Source: https://kongu.edu/support/hadoop/index.html (printed 11/30/24, 2:56 PM)
