Data Visualization Lab


Practical: 1

Aim: Configure a Hadoop cluster in pseudo-distributed mode and run basic Hadoop commands.

Installation of Hadoop 3.3.2 on Ubuntu 18.04 LTS

1. Installing Java

$ sudo apt update


$ sudo apt install openjdk-8-jdk openjdk-8-jre
$ java -version

Set JAVA_HOME in .bashrc

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$PATH:/usr/lib/jvm/java-8-openjdk-amd64/bin

Apply the .bashrc changes to the Ubuntu environment either by rebooting the system or by running:

$ source ~/.bashrc

2. Adding dedicated hadoop user

$ sudo addgroup hadoop


$ sudo adduser --ingroup hadoop hduser
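As a quick check that the account was created correctly (id is a standard Linux utility), the hadoop group should appear in the output:

$ id hduser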

3. Adding hduser to the sudoers file

$ sudo visudo

Add the following line to the file (visudo opens a temporary copy of /etc/sudoers, such as /etc/sudoers.tmp):

hduser ALL=(ALL:ALL) ALL

4. Now switch to hduser

$ su - hduser

5. Setting up SSH

Hadoop services such as the ResourceManager and NodeManager use SSH to share node status between slaves and the master, and between master daemons.

$ sudo apt-get install openssh-server openssh-client

After installing SSH, generate an SSH key pair and append the public key to ~/.ssh/authorized_keys.
Generate keys for secure communication:
$ ssh-keygen -t rsa -P ""
$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
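Before continuing, it is worth confirming that passwordless SSH to the local machine works; the login should not prompt for a password:

$ ssh localhost
$ exit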

6. Download the Hadoop 3.3.2 tar file and extract it into the /usr/local/hadoop folder.
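If the tarball is not already downloaded, one way to fetch it is from the Apache archive (the exact mirror URL is an assumption and may vary):

$ wget https://archive.apache.org/dist/hadoop/common/hadoop-3.3.2/hadoop-3.3.2.tar.gz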

$ sudo tar xvzf hadoop-3.3.2.tar.gz

$ sudo mv hadoop-3.3.2 /usr/local/hadoop

7. Change ownership to hduser and the hadoop group, and give them full permissions.

$ sudo chown -R hduser:hadoop /usr/local/hadoop
$ sudo chmod -R 777 /usr/local/hadoop

8. Hadoop Setup

This setup, also called pseudo-distributed mode, allows each Hadoop daemon to run as
a single Java process. A Hadoop environment is configured by editing a set of
configuration files:

bashrc, hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml

8.1 bashrc

$ sudo gedit ~/.bashrc


Add following lines at the end:

#Hadoop Related Options


export HADOOP_HOME=/usr/local/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

$ source ~/.bashrc
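With the new variables loaded, a quick check that the Hadoop binaries are on the PATH:

$ hadoop version

This should print a Hadoop 3.3.2 version banner.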

8.2 hadoop-env.sh

Let's change the working directory to the Hadoop configuration location:

$ cd /usr/local/hadoop/etc/hadoop/

$ sudo gedit hadoop-env.sh


Add this line:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
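If the JDK lives elsewhere on your machine, one way to discover the correct path (javac is used because it ships only with the JDK, not the JRE):

$ dirname $(dirname $(readlink -f $(which javac)))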

8.3 yarn-site.xml

$ sudo gedit yarn-site.xml


Add the following lines inside the <configuration> element:
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

8.4 hdfs-site.xml

$ sudo gedit hdfs-site.xml


Add the following lines inside the <configuration> element:
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/usr/local/hadoop/yarn_data/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/usr/local/hadoop/yarn_data/hdfs/datanode</value>
</property>

8.5 core-site.xml

$ sudo gedit core-site.xml


Add the following lines inside the <configuration> element (fs.default.name is deprecated in favour of fs.defaultFS, but both names still work):
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hduser/hadoop/tmp</value>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>

8.6 mapred-site.xml

$ sudo gedit mapred-site.xml


Add the following lines inside the <configuration> element (note that the correct property name is mapreduce.framework.name, not mapred.framework.name):
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>localhost:10020</value>
</property>
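On Hadoop 3.x, MapReduce jobs submitted to YARN sometimes fail with "Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster" unless the MapReduce home is also visible to the containers. A commonly used addition to mapred-site.xml (adjust the path if your installation directory differs) is:

<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
</property>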

9. Create temp directory, directory for datanode and namenode

$ sudo mkdir -p /home/hduser/hadoop/tmp


$ sudo chown -R hduser:hadoop /home/hduser/hadoop/tmp

$ sudo chmod -R 777 /home/hduser/hadoop/tmp

$ sudo mkdir -p /usr/local/hadoop/yarn_data/hdfs/namenode


$ sudo mkdir -p /usr/local/hadoop/yarn_data/hdfs/datanode
$ sudo chmod -R 777 /usr/local/hadoop/yarn_data/hdfs/namenode
$ sudo chmod -R 777 /usr/local/hadoop/yarn_data/hdfs/datanode
$ sudo chown -R hduser:hadoop /usr/local/hadoop/yarn_data/hdfs/namenode
$ sudo chown -R hduser:hadoop /usr/local/hadoop/yarn_data/hdfs/datanode

10. Format the Hadoop namenode for a fresh start


$ hdfs namenode -format
Start all Hadoop services by executing the commands one by one:

$ start-dfs.sh
$ start-yarn.sh

or
$ start-all.sh

Type this simple command to check if all the daemons are active and running as Java
processes:
$ jps

Following output is expected if all went well:

6960 SecondaryNameNode
7380 NodeManager
6632 NameNode
11066 Jps
7244 ResourceManager
6766 DataNode

Access Hadoop UI from Browser

The default port number 9870 gives you access to the Hadoop NameNode UI:

http://localhost:9870

The NameNode user interface provides a comprehensive overview of the entire cluster.

The default port 9864 is used to access individual DataNodes directly from your
browser:
http://localhost:9864
The YARN ResourceManager is accessible on port 8088:
http://localhost:8088
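With all daemons running, the basic Hadoop commands from the aim can be exercised against HDFS. The file and directory names below are only examples:

$ hdfs dfs -mkdir -p /user/hduser                  # create a home directory in HDFS
$ echo "hello hadoop" > sample.txt                 # make a small local test file
$ hdfs dfs -put sample.txt /user/hduser/           # copy it into HDFS
$ hdfs dfs -ls /user/hduser                        # list the directory
$ hdfs dfs -cat /user/hduser/sample.txt            # print the file contents from HDFS
$ hdfs dfs -get /user/hduser/sample.txt copy.txt   # copy it back to the local disk
$ hdfs dfs -rm /user/hduser/sample.txt             # delete it from HDFS

When finished, the daemons can be stopped with stop-dfs.sh and stop-yarn.sh (or stop-all.sh).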
