
How to Install Apache Hadoop 3 on Ubuntu 22.04

By Rahul · 6 Mins Read
Understanding unstructured data and analyzing massive amounts of it is a different ball game today,
and businesses have turned to Apache Hadoop and related technologies to manage their unstructured
data more efficiently. Not just businesses but also individuals use Apache Hadoop for purposes such
as analyzing large datasets or building a website that can process user queries. However, installing
Apache Hadoop on Ubuntu may seem daunting to users new to the world of Linux servers. Fortunately,
you don’t need to be an experienced system administrator to do it.
The following step-by-step installation guide will take you through the entire process, from
downloading the software to configuring the server, with ease. In this article, we explain how to
install Apache Hadoop on an Ubuntu 22.04 LTS system; the same steps also work on other Ubuntu versions.

Step 1: Install Java Development Kit


Java is a necessary component of Apache Hadoop, so you need to download and install a Java
Development Kit on all the nodes in your network where Hadoop will be installed. You can download
either the JRE or the JDK. If you only want to run Hadoop, the JRE is sufficient, but if you want
to create applications that run on Hadoop, you will need the JDK. Hadoop 3.x supports Java 8 and
Java 11; you can verify this on Apache’s website and download the relevant version of Java for your OS.
1. The default Ubuntu repositories contain both Java 8 and Java 11. Use the following command to
install Java 11:
sudo apt update && sudo apt install openjdk-11-jdk

2. Once you have successfully installed it, check the current Java version:
java -version

3. You can find the location of the JAVA_HOME directory by running the following command. It
will be required later in this article.
dirname $(dirname $(readlink -f $(which java)))
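If you want to reuse the result later, you can capture it in a shell variable; a minimal sketch
(the path in the comment is just what this command typically returns on Ubuntu with OpenJDK 11):

# Resolve the JDK installation directory from the java binary on the PATH
JAVA_HOME_DIR=$(dirname $(dirname $(readlink -f $(which java))))
echo $JAVA_HOME_DIR    # e.g. /usr/lib/jvm/java-11-openjdk-amd64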

Step 2: Create User for Hadoop
All the Hadoop components will run as the user that you create for Apache Hadoop, and the same
account will be used to log in to Hadoop’s web interface. You can create the account with the
adduser command (run via sudo). Running Hadoop under a dedicated, unprivileged user like this is
more secure than running everything as root, and it keeps Hadoop’s files and processes isolated.
1. Run the following command to create a new user with the name “hadoop”:
sudo adduser hadoop


2. Switch to the newly created hadoop user:


su - hadoop

3. Now configure password-less SSH access for the newly created hadoop user. Generate an SSH
keypair first:
ssh-keygen -t rsa
4. Copy the generated public key to the authorized key file and set the proper permissions:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 640 ~/.ssh/authorized_keys

5. Now try to SSH to localhost.


ssh localhost

The first time you connect, you will be asked to confirm the host’s authenticity and add its
key to the known hosts file. Type yes and hit Enter to authenticate localhost:
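If ssh localhost still prompts for a password at this point, a common cause is over-permissive
permissions on the .ssh directory itself; a quick, hedged fix worth trying:

chmod 700 ~/.ssh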

Step 3: Install Hadoop on Ubuntu


Once you’ve installed Java, you can download Apache Hadoop and its related components,
including Hive, Pig, Sqoop, etc. You can find the latest version on the official Hadoop download
page. Make sure to download the binary archive (not the source).
1. Use the following command to download Hadoop 3.3.4:
wget https://dlcdn.apache.org/hadoop/common/hadoop-3.3.4/hadoop-3.3.4.tar.gz
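Optionally, verify the integrity of the download before extracting it. This sketch assumes the
mirror publishes a matching .sha512 checksum file alongside the archive:

# Fetch the published checksum and compare it to the local file's digest
wget https://dlcdn.apache.org/hadoop/common/hadoop-3.3.4/hadoop-3.3.4.tar.gz.sha512
sha512sum hadoop-3.3.4.tar.gz    # the digest should match the one in the .sha512 file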

2. Once you’ve downloaded the file, extract the archive:
tar xzf hadoop-3.3.4.tar.gz

3. Rename the extracted folder to remove version information. This is an optional step, but if you
don’t want to rename, then adjust the remaining configuration paths.
mv hadoop-3.3.4 hadoop

4. Next, you will need to configure Hadoop and Java Environment Variables on your system.
Open the ~/.bashrc file in your favorite text editor:
nano ~/.bashrc

Append the lines below to the file. You can find the JAVA_HOME location by running
dirname $(dirname $(readlink -f $(which java))) on the terminal.

export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
Save the file and close it.
5. Load the above configuration in the current environment.
source ~/.bashrc
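To confirm the variables took effect, run a quick check; if the PATH was updated correctly, the
hadoop command should now resolve:

echo $HADOOP_HOME    # should print /home/hadoop/hadoop
hadoop version       # prints the installed Hadoop release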

6. You also need to configure JAVA_HOME in hadoop-env.sh file. Edit the Hadoop
environment variable file in the text editor:
nano $HADOOP_HOME/etc/hadoop/hadoop-env.sh

Search for the line beginning with “export JAVA_HOME”, uncomment it if necessary, and
set it to the value found in step 1.
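For example, with the default Ubuntu OpenJDK 11 package used in step 1, the line would read
(adjust the path if your JDK lives elsewhere):

export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64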

Save the file and close it.

Step 4: Configuring Hadoop


Next, you need to configure the Hadoop configuration files available under the $HADOOP_HOME/etc/hadoop directory.
1. First, you will need to create the namenode and datanode directories inside the Hadoop user
home directory. Run the following command to create both directories:
mkdir -p ~/hadoopdata/hdfs/{namenode,datanode}
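You can confirm the resulting layout with a quick listing:

ls -R ~/hadoopdata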

2. Next, edit the core-site.xml file and update with your system hostname:
nano $HADOOP_HOME/etc/hadoop/core-site.xml

Change the fs.defaultFS value as per your system hostname (localhost works for a single-node setup):


<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

Save and close the file.
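If you later move to a multi-node cluster, point fs.defaultFS at the NameNode’s hostname rather
than localhost. In the hypothetical sketch below, namenode-host stands in for your actual
NameNode hostname:

<value>hdfs://namenode-host:9000</value>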


3. Then, edit the hdfs-site.xml file:
nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml

Set the replication factor and the NameNode and DataNode directory paths as shown below (a replication factor of 1 is appropriate for a single-node setup):


<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>file:///home/hadoop/hadoopdata/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>file:///home/hadoop/hadoopdata/hdfs/datanode</value>
  </property>
</configuration>

Save and close the file.


4. Then, edit the mapred-site.xml file:
nano $HADOOP_HOME/etc/hadoop/mapred-site.xml

Make the following changes:


<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

Save and close the file.


5. Then, edit the yarn-site.xml file:
nano $HADOOP_HOME/etc/hadoop/yarn-site.xml

Make the following changes:


<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

Save the file and close it.


Step 5: Start Hadoop Cluster
Before starting the Hadoop cluster, you will need to format the NameNode as the hadoop user.
 Run the following command to format the Hadoop Namenode:
hdfs namenode -format

Once the namenode directory is successfully formatted with the HDFS file system, you will see the
message “Storage directory /home/hadoop/hadoopdata/hdfs/namenode has been successfully
formatted”.

 Then start the Hadoop cluster with the following command.


start-all.sh
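Once the scripts finish, you can verify that all daemons are running with the JDK’s jps tool.
On a single-node setup you would expect output along these lines (PIDs omitted, order may vary):

jps
# Expected daemons on a healthy single-node cluster:
# NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager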

 Once all the services have started, you can access the Hadoop NameNode web interface at: http://localhost:9870
 And the Hadoop application (YARN ResourceManager) page is available at http://localhost:8088
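As a final smoke test, you can create a directory in HDFS and run one of the example MapReduce
jobs bundled with the release; a minimal sketch, assuming the hadoop-3.3.4 layout used above:

# Create and list a directory in HDFS
hdfs dfs -mkdir /test
hdfs dfs -ls /

# Run a bundled example job that estimates pi on YARN
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar pi 2 5

When you are done, stop-all.sh shuts the services down again.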
Conclusion
Installing Apache Hadoop on Ubuntu can be a tricky task for newbies, especially if they only follow
the instructions in the documentation. Thankfully, this article provides a step-by-step guide that will
help you install Apache Hadoop on Ubuntu with ease. All you have to do is follow the instructions
listed in this article, and you can be sure that your Hadoop installation will be up and running in no
time.
