Hadoop Developer Training
Session 3 -
Installation and Commands
Page 1 | Classification: Restricted
Agenda
• Hadoop Installation and Commands
Installation Guide
• Step 1: Copy the Hadoop installation files from the shared folder to the virtual machine.
Command:
hadoop@hadoop-VirtualBox:~$ sudo cp -r /media/sf_Dee/ /home/hadoop/Desktop/
• Step 2: Grant permissions on the copied folder on the Desktop.
Command:
hadoop@hadoop-VirtualBox:~$ sudo chmod -R 777 /home/hadoop/Desktop/sf_Dee/
[sudo] password for hadoop:
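The recursive chmod above can be tried out safely in a throwaway directory before touching real files. This is a sketch using a scratch path, not the actual sf_Dee folder; note that 777 is convenient on a single-user training VM but 755 or 775 would be tighter on any shared machine.

```shell
# Demonstrate the effect of recursive chmod in a scratch directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/sf_Dee/sub"
touch "$tmp/sf_Dee/sub/file"
chmod -R 777 "$tmp/sf_Dee"                      # -R applies to every nested entry
perms=$(stat -c '%a' "$tmp/sf_Dee/sub/file")    # numeric mode of a nested file
echo "$perms"                                   # prints 777
rm -rf "$tmp"
```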
• Step 3: Create a work folder at /usr/local/work; Hadoop and Java will be installed inside it.
Command:
hadoop@hadoop-VirtualBox:~$ sudo mkdir /usr/local/work
• Step 4: Copy the Hadoop and Java tarballs to the work folder.
Command:
hadoop@hadoop-VirtualBox:~$ sudo cp -r /home/hadoop/Desktop/sf_Dee/jdk-8u60-linux-x64.tar.gz /usr/local/work/
hadoop@hadoop-VirtualBox:~$ sudo cp -r /home/hadoop/Desktop/sf_Dee/de/Setups/hadoop-2.6.0.tar.gz /usr/local/work/
hadoop@hadoop-VirtualBox:~$ cd /usr/local/work/
• Step 5: Untar the Java and Hadoop archives inside the work folder.
Command:
hadoop@hadoop-VirtualBox:/usr/local/work$ sudo tar -xzvf jdk-8u60-linux-x64.tar.gz
hadoop@hadoop-VirtualBox:/usr/local/work$ sudo tar -xzvf hadoop-2.6.0.tar.gz
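The tar flags used above break down as x = extract, z = gunzip, v = verbose, f = archive file. A quick round-trip in a scratch directory shows the pairing with the create-side flags (this uses a demo archive, not the real tarballs):

```shell
# Round-trip: create a gzipped tar, delete the source, extract it back.
tmp=$(mktemp -d)
cd "$tmp"
mkdir demo && echo hello > demo/readme.txt
tar -czf demo.tar.gz demo        # pack: c=create, z=gzip, f=archive file
rm -r demo
tar -xzf demo.tar.gz             # unpack (add v to list files as they extract)
content=$(cat demo/readme.txt)
echo "$content"                  # prints hello
```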
• Step 6: Rename the hadoop-2.6.0 and jdk1.8.0_60 directories to hadoop and java.
Command:
hadoop@hadoop-VirtualBox:/usr/local/work$ sudo mv hadoop-2.6.0 /usr/local/work/hadoop
hadoop@hadoop-VirtualBox:/usr/local/work$ sudo mv jdk1.8.0_60 /usr/local/work/java
• Step 7: Install ssh and rsync.
Command:
hadoop@hadoop-VirtualBox:/usr/local/work$ sudo apt-get install ssh
hadoop@hadoop-VirtualBox:/usr/local/work$ sudo apt-get install rsync
• Step 8: Generate an SSH key pair and append the public key to authorized_keys, so Hadoop can start its daemons over SSH without a password prompt.
Command:
hadoop@hadoop-VirtualBox:/usr/local/work$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
hadoop@hadoop-VirtualBox:/usr/local/work$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
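The append step can be sketched in a scratch directory. The key below is a stand-in; in the real step it comes from ssh-keygen. Note that modern OpenSSH releases disable DSA, so -t rsa or -t ed25519 is the safer choice on newer systems, and sshd silently ignores authorized_keys files with loose permissions:

```shell
# Sketch of the authorized_keys append with a placeholder public key.
tmp=$(mktemp -d)
mkdir -p "$tmp/.ssh"
echo "ssh-rsa AAAAB3_placeholder hadoop@hadoop-VirtualBox" > "$tmp/id_dsa.pub"
cat "$tmp/id_dsa.pub" >> "$tmp/.ssh/authorized_keys"
chmod 700 "$tmp/.ssh"                        # sshd requires restrictive perms
chmod 600 "$tmp/.ssh/authorized_keys"
lines=$(wc -l < "$tmp/.ssh/authorized_keys")
echo "$lines"                                # prints 1
```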
• Step 9: Register Java with the OS via update-alternatives (the trailing 1 is the priority).
Command:
hadoop@hadoop-VirtualBox:/usr/local/work$ sudo update-alternatives --install "/usr/bin/java" "java" "/usr/local/work/java/bin/java" 1
update-alternatives: using /usr/local/work/java/bin/java to provide /usr/bin/java (java) in auto mode
hadoop@hadoop-VirtualBox:/usr/local/work$ sudo update-alternatives --install "/usr/bin/javac" "javac" "/usr/local/work/java/bin/javac" 1
update-alternatives: using /usr/local/work/java/bin/javac to provide /usr/bin/javac (javac) in auto mode
hadoop@hadoop-VirtualBox:/usr/local/work$ sudo update-alternatives --install "/usr/bin/javaws" "javaws" "/usr/local/work/java/bin/javaws" 1
update-alternatives: using /usr/local/work/java/bin/javaws to provide /usr/bin/javaws (javaws) in auto mode
• Step 10: Set Java as the OS default.
Command:
(The command is missing from the original slide. With only one alternative installed, auto mode already selects it, so this step is typically a no-op; to make the choice explicit, something like sudo update-alternatives --set java /usr/local/work/java/bin/java would normally be used.)
• Step 11: Add the Java and Hadoop paths to the ~/.bashrc profile (the environment variables are listed at the end of this document).
Command:
hadoop@hadoop-VirtualBox:/usr/local/work$ sudo nano ~/.bashrc
hadoop@hadoop-VirtualBox:/usr/local/work$ source ~/.bashrc
• Step 12: Check that the Hadoop and Java paths are set correctly.
Command:
hadoop@hadoop-VirtualBox:/usr/local/work$ echo $HADOOP_PREFIX
Expected output: /usr/local/work/hadoop/
hadoop@hadoop-VirtualBox:/usr/local/work$ echo $JAVA_HOME
Expected output: /usr/local/work/java
• Step 13: Copy mapred-site.xml.template to mapred-site.xml in /usr/local/work/hadoop/etc/hadoop.
Command:
hadoop@hadoop-VirtualBox:/usr/local/work$ cd /usr/local/work/hadoop/etc/hadoop
hadoop@hadoop-VirtualBox:/usr/local/work/hadoop/etc/hadoop$ sudo cp mapred-site.xml.template mapred-site.xml
• Step 14: Edit the hadoop-env.sh file (from inside /usr/local/work/hadoop/etc/hadoop).
Command:
hadoop@hadoop-VirtualBox:/usr/local/work/hadoop/etc/hadoop$ sudo nano hadoop-env.sh
Set JAVA_HOME to the path of your Java installation:
export JAVA_HOME=/usr/local/work/java
• Step 15: Edit the core-site.xml file (from inside /usr/local/work/hadoop/etc/hadoop).
Command:
hadoop@hadoop-VirtualBox:/usr/local/work/hadoop/etc/hadoop$ sudo nano core-site.xml
Enter the following properties inside it
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:8020</value>
<final>true</final>
</property>
</configuration>
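fs.defaultFS names the NameNode RPC endpoint that every HDFS client and daemon connects to; hdfs://localhost:8020 means scheme hdfs, host localhost, port 8020. A quick split of that URI with plain parameter expansion (no external tools) makes the parts explicit:

```shell
# Split the fs.defaultFS URI into host and port.
uri="hdfs://localhost:8020"
rest=${uri#hdfs://}      # strip the scheme -> localhost:8020
host=${rest%%:*}         # everything before the colon
port=${rest#*:}          # everything after the colon
echo "$host $port"       # prints: localhost 8020
```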
• Step 16: Edit the hdfs-site.xml file (from inside /usr/local/work/hadoop/etc/hadoop).
Command:
hadoop@hadoop-VirtualBox:/usr/local/work/hadoop/etc/hadoop$ sudo nano hdfs-site.xml
Enter the following properties inside it
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/work/hadoop/hadoop_data/dfs/name</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>268435456</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/work/hadoop/hadoop_data/dfs/data</value>
</property>
</configuration>
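dfs.blocksize is given in bytes, so the value above is worth decoding: 268435456 bytes is 256 MiB, twice the Hadoop 2.x default of 128 MiB. The arithmetic:

```shell
# dfs.blocksize in bytes: 256 MiB = 256 * 1024 * 1024.
blocksize=$((256 * 1024 * 1024))
echo "$blocksize"        # prints 268435456
```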
• Step 17: Edit the mapred-site.xml file (from inside /usr/local/work/hadoop/etc/hadoop).
Command:
hadoop@hadoop-VirtualBox:/usr/local/work/hadoop/etc/hadoop$ sudo nano mapred-site.xml
Enter the following properties inside it
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
• Step 18: Edit the yarn-site.xml file (from inside /usr/local/work/hadoop/etc/hadoop).
Command:
hadoop@hadoop-VirtualBox:/usr/local/work/hadoop/etc/hadoop$ sudo nano yarn-site.xml
Enter the following properties inside it
<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>localhost:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>localhost:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>localhost:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>localhost:8088</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>file:/usr/local/work/hadoop/hadoop_data/yarn/yarn.nodemanager.local-dirs</value>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>file:/usr/local/work/hadoop/hadoop_data/yarn/logs</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
• Step 19: Format the NameNode.
Command:
hadoop@hadoop-VirtualBox:/usr/local/work/hadoop/etc/hadoop$ hadoop namenode -format
(In Hadoop 2.x this form is deprecated in favour of hdfs namenode -format, but it still works.)
• Step 20: Start the Hadoop services.
Command:
hadoop@hadoop-VirtualBox:/usr/local/work/hadoop/etc/hadoop$ start-all.sh
The following daemons should start (verify with jps, which also lists itself):
• NameNode
• SecondaryNameNode
• DataNode
• ResourceManager
• NodeManager
• Step 21: Stop all services.
Command:
hadoop@hadoop-VirtualBox:/usr/local/work/hadoop/etc/hadoop$ stop-all.sh
Note: variables to be added to the ~/.bashrc profile
• export HADOOP_PREFIX="/usr/local/work/hadoop/"
• export PATH=$PATH:$HADOOP_PREFIX/bin
• export PATH=$PATH:$HADOOP_PREFIX/sbin
• export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
• export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
• export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
• export YARN_HOME=${HADOOP_PREFIX}
• export JAVA_HOME="/usr/local/work/java"
• export PATH=$PATH:$JAVA_HOME/bin
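One way to add exports like the ones above without duplicating them when the setup is re-run is to guard the append on a marker string. This is a sketch with a condensed subset of the variables, and a temp file stands in for ~/.bashrc:

```shell
# Idempotent append: a second run detects the marker and does nothing.
bashrc=$(mktemp)
add_hadoop_env() {
  grep -q 'HADOOP_PREFIX' "$1" && return 0   # already present, do nothing
  cat >> "$1" <<'EOF'
export HADOOP_PREFIX="/usr/local/work/hadoop/"
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
export JAVA_HOME="/usr/local/work/java"
export PATH=$PATH:$JAVA_HOME/bin
EOF
}
add_hadoop_env "$bashrc"
add_hadoop_env "$bashrc"                     # second call is a no-op
exports=$(grep -c '^export' "$bashrc")
echo "$exports"                              # prints 4
```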
Topics to be discussed in the next session
• PIG
• PIG - Overview
• Installation and Running Pig
• Load in Pig
• Macros in Pig
Thank you!