Cloudera Hadoop (CDH 4)
Installation on Ubuntu 12.04 LTS
Sumitra Pundlik
Assistant Professor
Department of Computer Engineering
MIT College of Engineering
Kothrud, Pune 411038
asavari.deshpande@mitcoe.edu.in
Agenda
● Introduction to Hadoop
● Various components of Hadoop
● Installation steps for Cloudera Hadoop
Introduction to Hadoop
● The Apache Hadoop software library is a
framework that allows for the distributed
processing of large data sets across clusters
of computers using simple programming
models.
● It is designed to scale up from single servers
to thousands of machines, each offering local
computation and storage.
● The library itself is designed to detect and
handle failures at the application layer.
Various Components of Hadoop
The project includes these modules:
Hadoop Common: The common utilities that
support the other Hadoop modules.
Hadoop Distributed File System (HDFS™): A
distributed file system that provides high-throughput
access to application data.
Hadoop YARN: A framework for job scheduling and
cluster resource management.
Hadoop MapReduce: A YARN-based system for
parallel processing of large data sets.
Other Hadoop-related projects at Apache include:
Ambari™: A web-based tool for provisioning, managing, and
monitoring Apache Hadoop clusters which includes support for
Hadoop HDFS, Hadoop MapReduce, Hive, HCatalog, HBase,
ZooKeeper, Oozie, Pig and Sqoop.
Avro™: A data serialization system.
Cassandra™: A scalable multi-master database with no single
points of failure.
Chukwa™: A data collection system for managing large
distributed systems.
HBase™: A scalable, distributed database that supports
structured data storage for large tables.
Hive™: A data warehouse infrastructure that provides data
summarization and ad hoc querying.
Mahout™: A scalable machine learning and data mining
library.
● Pig™: A high-level data-flow language and
execution framework for parallel computation.
● Spark™: A fast and general compute engine for
Hadoop data. Spark provides a simple and
expressive programming model that supports a
wide range of applications, including ETL,
machine learning, stream processing, and graph
computation.
● Tez™: A generalized data-flow programming
framework, built on Hadoop YARN.
● ZooKeeper™: A high-performance coordination
service for distributed applications.
Cloudera Hadoop Installation
● What is Cloudera Hadoop?
● What is Cloudera Manager?
● Prerequisites for installation
● Installation steps with screenshots
What is Cloudera Hadoop
● CDH is the world’s most complete, tested, and
popular distribution of Apache Hadoop.
● CDH is 100% Apache-licensed open source.
● CDH bundles the Hadoop-related projects in one
place.
What is Cloudera Manager
● Cloudera Manager automates the installation
and configuration of CDH on an entire cluster.
● Prerequisites
  ● Update your Ubuntu machine
  ● Passwordless SSH
  ● Passwordless sudo
  ● Edit the hosts file
  ● Install a database (MySQL/PostgreSQL/Oracle)
  ● Install the JDBC connector for that database
Update Your Ubuntu Machine
● Run sudo apt-get update
● If the update fails, reset the apt package lists and retry:
sudo -i
apt-get clean
cd /var/lib/apt
mv lists lists.old
mkdir -p lists/partial
apt-get clean
apt-get update
● If the problem persists, contact your
technical assistant.
Passwordless SSH
● Secure Shell (SSH) is a cryptographic network protocol
for secure data communication, remote command-line
login, remote command execution, and other secure
network services between two networked computers.
● Install OpenSSH:
sudo apt-get install openssh-server openssh-client
● Open the sshd_config file in /etc/ssh/:
sudo gedit /etc/ssh/sshd_config
and set
PubkeyAuthentication yes
● Reload the SSH service so that the change takes effect:
sudo /etc/init.d/ssh reload
Passwordless SSH
● Run the following commands to set up passwordless SSH:
1 ssh-keygen
2 ssh-add
3 ssh-copy-id -i exam@172.20.55.67
4 ssh exam@172.20.55.67
● For a cluster, run commands 3 and 4 from the master machine once
for each client machine, using that client's hostname or
user_name@ip_address, so that the master can connect to every
client without a password.
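Repeating commands 3 and 4 by hand for every client is easy to get wrong, so the loop below is a minimal sketch of one way to generate them. It only prints the commands, which leaves room to review them before running anything. The second address, the variable names and the function name are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: print the ssh-copy-id / ssh commands for every client machine.
# The user name and the addresses are placeholders, not a real cluster.

CLUSTER_USER="exam"                    # account that exists on every client (assumed)
CLIENTS="172.20.55.67 172.20.55.68"    # client IP addresses or host names (assumed)

# Print the two commands for one client.
ssh_setup_commands() {
    user="$1"
    host="$2"
    # Command 3 from the slide: copy the public key to the client.
    echo "ssh-copy-id -i ${user}@${host}"
    # Command 4, made non-interactive: log in and print the remote host name.
    echo "ssh ${user}@${host} hostname"
}

for host in $CLIENTS; do
    ssh_setup_commands "$CLUSTER_USER" "$host"
done
```

Once the printed commands look right, they can be run one at a time from the master machine. Each ssh-copy-id call still asks for the client's password once.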
Passwordless sudo
● Make sudo passwordless by changing the sudoers file.
● Open the file with visudo, which checks the syntax before saving:
sudo visudo
● Set the rule for the sudo group to:
%sudo ALL=(ALL:ALL) NOPASSWD:ALL
● Save the file.
● For a cluster, make the same change in the
sudoers file on every client machine.
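A syntax error in the sudoers file can lock sudo out entirely, so it may help to check the rule before installing it. The sketch below assumes the standard sudoers form of the rule, %sudo ALL=(ALL:ALL) NOPASSWD:ALL, and stages it in a temporary file. The variable names are illustrative.

```shell
#!/bin/sh
# Sketch: stage the NOPASSWD rule in a temporary file and syntax-check it
# before it goes anywhere near /etc/sudoers.

RULE='%sudo ALL=(ALL:ALL) NOPASSWD:ALL'   # assumed form of the rule
STAGED="$(mktemp)"                        # temporary staging file
printf '%s\n' "$RULE" > "$STAGED"

# visudo -c -f FILE only checks the syntax of FILE; it changes nothing.
if command -v visudo >/dev/null 2>&1; then
    visudo -c -f "$STAGED" || echo "syntax check failed: fix the rule before installing it"
else
    echo "visudo not found; rule left unchecked in $STAGED"
fi
```

If the check passes, the same line can be entered through sudo visudo on the master and on each client.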
Edit hosts file
● In /etc/hosts, list the IP address and host name
of each machine, one machine per line. Example:
172.20.55.62 ccompl0910
● For a cluster, add every client's IP address and
host name to the master's hosts file, and the
master's IP address and host name to each
client's hosts file.
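Because the same entries have to appear on several machines, generating them from one list keeps the files consistent. The sketch below prints the lines for a three-machine cluster. Only the first address and name come from the example above; the other two, and the function name, are placeholders.

```shell
#!/bin/sh
# Sketch: build the /etc/hosts lines for a small cluster.
# Every address and name except the first pair is a placeholder.

NODES="172.20.55.62:ccompl0910 172.20.55.67:client01 172.20.55.68:client02"

# Turn "ip:name" pairs into "ip name" lines, one per machine.
hosts_entries() {
    for node in "$@"; do
        ip="${node%%:*}"      # text before the first colon
        name="${node#*:}"     # text after the first colon
        printf '%s %s\n' "$ip" "$name"
    done
}

# Review the output, then append it to /etc/hosts on each machine, e.g.
#   hosts_entries $NODES | sudo tee -a /etc/hosts
hosts_entries $NODES
```

Using the full list on every machine is a superset of what the slide asks for, so the master and the clients can all resolve one another.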
Install database MySQL
sudo apt-get install mysql-server-5.5
Login: root
Password: password
Install the JDBC connector and
secure the MySQL installation
sudo apt-get install libmysql-java
sudo /usr/bin/mysql_secure_installation
Enter current password for root (enter for none): password
Change the root password? [Y/n] n
Remove anonymous users? [Y/n] y
Disallow root login remotely? [Y/n] n
Remove test database and access to it? [Y/n] y
Reload privilege tables now? [Y/n] y
Restart the MySQL server
sudo service mysql restart
Create Database
● Log in to MySQL and enter the root password when prompted:
mysql -u root -p
● Create the databases:
create database sttpdatabase;
create database hive;
● A separate database is needed for each of the following services:
Activity Monitor
Service Monitor
Report Manager
Host Monitor
Cloudera Navigator
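Creating one database per service by hand is repetitive, so the statements can be generated from a list. This is a minimal sketch. The database names are assumptions chosen to match the five services listed above; the slides do not prescribe them.

```shell
#!/bin/sh
# Sketch: emit one CREATE DATABASE statement per service.
# The database names are illustrative, not taken from the slides.

DATABASES="activity_monitor service_monitor report_manager host_monitor navigator"

# Print one statement for each name passed in.
create_database_sql() {
    for db in "$@"; do
        printf 'CREATE DATABASE IF NOT EXISTS %s;\n' "$db"
    done
}

# Review the statements, then feed them to the server, e.g.
#   create_database_sql $DATABASES | mysql -u root -p
create_database_sql $DATABASES
```

IF NOT EXISTS makes the script safe to rerun, since a database that is already present is skipped rather than reported as an error.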
Supported OS
● Ubuntu 10.04 (Lucid Lynx), 64-bit
● Ubuntu 12.04 (Precise Pangolin), 64-bit
Supported Browsers
● Firefox 11 or later
● Google Chrome
● Internet Explorer 9
● Safari 5 or later
Supported Databases
● MySQL - 5.0, 5.1, 5.5
● Oracle - 10g Release 2, 11g Release 2
● PostgreSQL - 8.1, 8.3, 8.4, 9.1
Supported JDK
● JDK 1.7 or later
Resources
● Cloudera Manager Server disk space:
5 GB on the partition hosting /var
500 MB on the partition hosting /usr
● RAM - 4 GB is appropriate for most cases, and is
required when using Oracle databases
● Python - Cloudera Manager uses Python.
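The disk and memory minimums above can be checked before the installer is started. The sketch below compares free space on /var and /usr and total RAM against those figures. It assumes a Linux host with /proc/meminfo, and the function names are illustrative.

```shell
#!/bin/sh
# Sketch: compare free disk space and RAM against the minimums listed above.

# Succeeds when the first number is at least the second.
meets_minimum() {
    [ "$1" -ge "$2" ]
}

# Free space, in megabytes, on the partition that holds the given path.
free_mb() {
    df -Pm "$1" | awk 'NR == 2 { print $4 }'
}

# Total RAM in megabytes, read from /proc/meminfo (Linux only).
total_ram_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}

# Print one OK/LOW line: label, measured value, required value.
report() {
    if meets_minimum "$2" "$3"; then
        echo "OK    $1: $2 MB (need $3 MB)"
    else
        echo "LOW   $1: $2 MB (need $3 MB)"
    fi
}

report "/var free" "$(free_mb /var)" 5120
report "/usr free" "$(free_mb /usr)" 500
report "RAM total" "$(total_ram_mb)" 4096
```

MemTotal is usually a little below the installed RAM because the kernel reserves some memory, so a machine with exactly 4 GB may be reported as LOW by a small margin.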
Installation Path
● Path A: Automated Path
● Path B: Your Own Method
PATH A Installation
● Step 1: Download and run the Cloudera Manager installer
● Download cloudera-manager-installer.bin to the single host
that will run Cloudera Manager.
● Make the installer executable:
chmod u+x cloudera-manager-installer.bin
● Run the installer:
sudo ./cloudera-manager-installer.bin
● When the installer finishes, open a browser at
http://localhost:7180
● Login : admin
● Password : admin
rajeswari89780
 
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdffive-year-soluhhhhhhhhhhhhhhhhhtions.pdf
five-year-soluhhhhhhhhhhhhhhhhhtions.pdf
AdityaSharma944496
 

Cloudera Hadoop installation

  • 1. Cloudera Hadoop (CDH 4) Installation on Ubuntu 12.04 LTS
    Sumitra Pundlik, Assistant Professor
    Department of Computer Engineering, MIT College of Engineering
    Kothrud, Pune 411038
    asavari.deshpande@mitcoe.edu.in
  • 2. Agenda
    ● Introduction to Hadoop
    ● Various components of Hadoop
    ● Installation steps for Cloudera Hadoop
  • 3. Introduction to Hadoop
    ● The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.
    ● It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.
    ● The library itself is designed to detect and handle failures at the application layer.
  • 5. The project includes these modules:
    ● Hadoop Common: The common utilities that support the other Hadoop modules.
    ● Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data.
    ● Hadoop YARN: A framework for job scheduling and cluster resource management.
    ● Hadoop MapReduce: A YARN-based system for parallel processing of large data sets.
  • 6. Related projects:
    ● Ambari™: A web-based tool for provisioning, managing, and monitoring Apache Hadoop clusters, which includes support for Hadoop HDFS, Hadoop MapReduce, Hive, HCatalog, HBase, ZooKeeper, Oozie, Pig and Sqoop.
    ● Avro™: A data serialization system.
    ● Cassandra™: A scalable multi-master database with no single points of failure.
    ● Chukwa™: A data collection system for managing large distributed systems.
    ● HBase™: A scalable, distributed database that supports structured data storage for large tables.
    ● Hive™: A data warehouse infrastructure that provides data summarization and ad hoc querying.
    ● Mahout™: A scalable machine learning and data mining library.
  • 7. Related projects (continued):
    ● Pig™: A high-level data-flow language and execution framework for parallel computation.
    ● Spark™: A fast and general compute engine for Hadoop data. Spark provides a simple and expressive programming model that supports a wide range of applications, including ETL, machine learning, stream processing, and graph computation.
    ● Tez™: A generalized data-flow programming framework, built on Hadoop YARN.
    ● ZooKeeper™: A high-performance coordination service for distributed applications.
  • 8. Cloudera Hadoop Installation
    ● What is Cloudera Hadoop?
    ● What is Cloudera Manager?
    ● Prerequisites for installation
    ● Installation steps with screenshots
  • 9. What is Cloudera Hadoop
    ● CDH is the world’s most complete, tested, and popular distribution of Apache Hadoop.
    ● CDH is 100% Apache-licensed open source.
    ● CDH bundles the Hadoop-related projects in one place.
  • 11. What is Cloudera Manager
    ● Cloudera Manager automates the installation and configuration of CDH on an entire cluster.
    ● Prerequisites
        ○ Update your Ubuntu machine
        ○ Passwordless SSH
        ○ Passwordless sudo
        ○ Edit the hosts file
        ○ Install a database (MySQL/PostgreSQL/Oracle)
        ○ Install the JDBC connector for the chosen database
  • 12. Update Your Ubuntu Machine
    ● Run:
        sudo apt-get update
    ● If the update fails, reset the package lists from a root shell and try again:
        sudo -i
        apt-get clean
        cd /var/lib/apt
        mv lists lists.old
        mkdir -p lists/partial
        apt-get clean
        apt-get update
    ● If the problem persists, contact your technical assistant.
  • 13. Passwordless SSH
    ● Secure Shell (SSH) is a cryptographic network protocol for secure data communication, remote command-line login, remote command execution, and other secure network services between two networked computers.
    ● Install OpenSSH:
        sudo apt-get install openssh-server openssh-client
    ● Edit the sshd_config file in /etc/ssh/ and set PubkeyAuthentication to yes:
        sudo gedit /etc/ssh/sshd_config
    ● Reload the SSH service so that the change takes effect:
        sudo /etc/init.d/ssh reload
  • 14. Passwordless SSH
    ● Run the following commands to set up passwordless SSH:
        1. ssh-keygen
        2. ssh-add
        3. ssh-copy-id -i user_name@ip_address
        4. ssh user_name@ip_address
    ● For a cluster, run commands 3 and 4 from the master machine once per client machine, using that client's host name or user_name@ip_address, so that the master can connect to every client.
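For a cluster with several clients, the key-distribution steps above can be sketched as a small script. This is a minimal sketch, not the presenter's procedure: the function name `print_ssh_setup`, the user `hduser`, and the host names `client1` and `client2` are hypothetical placeholders. It only prints the commands, so they can be reviewed before being run on the master machine.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print the passwordless-SSH commands for each client host.
# The user name and host names passed in are hypothetical placeholders.

print_ssh_setup() {
    local user="$1"
    shift
    # Create a key pair once on the master (skipped if one already exists).
    echo '[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa'
    local host
    for host in "$@"; do
        # Copy the public key to the client, then confirm the login works.
        echo "ssh-copy-id ${user}@${host}"
        echo "ssh ${user}@${host} hostname"
    done
}

print_ssh_setup hduser client1 client2
```

The first `ssh-copy-id` to each client still asks for that user's password; later `ssh` logins should not.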
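  • 15. Passwordless sudo
    ● Allow members of the sudo group to run sudo without a password.
    ● Open the sudoers file (or use sudo visudo, which checks the syntax before saving):
        sudo gedit /etc/sudoers
    ● Set the %sudo line as follows, then save the file:
        %sudo ALL=(ALL:ALL) NOPASSWD:ALL
    ● For a cluster, make the same change in the sudoers file of every client machine.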
  • 16. Edit hosts file
    ● In /etc/hosts, list the IP address and host name of the machine, for example:
        172.20.55.62 ccompl0910
    ● For a cluster, list every client's IP address and host name in the master's hosts file, and the master's IP address and host name in each client's hosts file.
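When many machines are involved, the hosts entries can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: the helper `add_host_entry`, the second address `172.20.55.63`, and the host name `ccompl0911` are hypothetical. It writes to a scratch file instead of /etc/hosts, so the result can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch: add "IP hostname" pairs to a hosts-style file, skipping duplicates.
# The second IP address and host name below are hypothetical examples.

add_host_entry() {
    local hosts_file="$1" ip="$2" name="$3"
    # Append only if no line already has this IP with this first host name.
    if ! awk -v ip="$ip" -v name="$name" \
            '$1 == ip && $2 == name { found = 1 } END { exit !found }' \
            "$hosts_file" 2>/dev/null; then
        printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
    fi
}

# Work on a scratch file; review it before merging into /etc/hosts with sudo.
scratch="$(mktemp)"
add_host_entry "$scratch" 172.20.55.62 ccompl0910
add_host_entry "$scratch" 172.20.55.63 ccompl0911
add_host_entry "$scratch" 172.20.55.62 ccompl0910   # already present, skipped
cat "$scratch"
```

After review, the same lines would go into the master's hosts file and, with the master's own entry, into each client's.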
  • 17. Install database MySQL
    ● Install the MySQL server:
        sudo apt-get install mysql-server-5.5
    ● Credentials used in this walkthrough:
        Login: root
        Password: password
  • 18. Install JDBC connector and configure for secure installation
    ● Install the MySQL JDBC connector:
        sudo apt-get install libmysql-java
    ● Run the secure-installation script and answer its prompts as shown:
        sudo /usr/bin/mysql_secure_installation
        Enter current password for root (enter for none): password
        Change the root password? [Y/n] n
        Remove anonymous users? [Y/n] y
        Disallow root login remotely? [Y/n] n
        Remove test database and access to it? [Y/n] y
        Reload privilege tables now? [Y/n] y
    ● Restart the MySQL server:
        sudo service mysql restart
  • 19. Create Database
    ● Log in to MySQL and enter the root password when prompted:
        mysql -u root -p
    ● Create the databases:
        create database sttpdatabase;
        create database hive;
    ● A separate database is needed for each of the following:
        ○ Activity Monitor
        ○ Service Monitor
        ○ Report Manager
        ○ Host Monitor
        ○ Cloudera Navigator
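The per-service databases listed above can be created in one pass by generating the SQL. This is a sketch only: the database names (`amon`, `smon`, `rman`, `hmon`, `nav`), the matching user names, and the password `change_me` are hypothetical placeholders, not names taken from the slides. The `GRANT ... IDENTIFIED BY` form fits the MySQL 5.x versions used here; MySQL 8 requires a separate `CREATE USER` statement instead.

```shell
#!/usr/bin/env bash
# Sketch: emit CREATE DATABASE / GRANT statements, one pair per service.
# Database names, user names, and the password are hypothetical placeholders.

emit_db_sql() {
    local password="$1"
    shift
    local db
    for db in "$@"; do
        echo "CREATE DATABASE IF NOT EXISTS ${db} DEFAULT CHARACTER SET utf8;"
        echo "GRANT ALL ON ${db}.* TO '${db}'@'%' IDENTIFIED BY '${password}';"
    done
}

# One database each for Activity Monitor, Service Monitor, Report Manager,
# Host Monitor, and Cloudera Navigator.
emit_db_sql 'change_me' amon smon rman hmon nav
```

To apply the statements, the output could be piped into the client, for example `... | mysql -u root -p`.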
  • 20. Supported OS
    ● Ubuntu 10.04 (Lucid Lynx), 64-bit
    ● Ubuntu 12.04 (Precise Pangolin), 64-bit
    Supported Browsers
    ● Firefox 11 or later
    ● Google Chrome
    ● Internet Explorer 9
    ● Safari 5 or later
  • 21. Supported Databases
    ● MySQL: 5.0, 5.1, 5.5
    ● Oracle: 10g Release 2, 11g Release 2
    ● PostgreSQL: 8.1, 8.3, 8.4, 9.1
    Supported JDK
    ● JDK 1.7 or later
  • 22. Resources
    ● Cloudera Manager Server disk space: 5 GB on the partition hosting /var, and 500 MB on the partition hosting /usr.
    ● RAM: 4 GB is appropriate for most cases, and is required when using Oracle databases.
    ● Python: Cloudera Manager uses Python.
    Installation Path
    ● Path A: Automated Path
    ● Path B: Your Own Method
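The resource figures above can be checked on a host before installing. The following is a minimal sketch for Linux: the helper names `check_min`, `free_mb`, and `total_ram_mb` are hypothetical, and the thresholds are the slide's figures converted to MB (5 GB = 5120 MB, 4 GB = 4096 MB).

```shell
#!/usr/bin/env bash
# Sketch: compare a host against the Cloudera Manager Server minimums
# stated on the slide (5 GB on /var, 500 MB on /usr, 4 GB RAM).

check_min() {
    local label="$1" actual_mb="$2" required_mb="$3"
    if [ "$actual_mb" -ge "$required_mb" ]; then
        echo "OK   ${label}: ${actual_mb} MB (need ${required_mb} MB)"
    else
        echo "LOW  ${label}: ${actual_mb} MB (need ${required_mb} MB)"
    fi
}

free_mb() {
    # Available space, in MB, on the partition that holds the given path.
    df -Pm "$1" | awk 'NR == 2 { print $4 }'
}

total_ram_mb() {
    # MemTotal is reported in kB in /proc/meminfo (Linux only).
    awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo
}

check_min "/var free" "$(free_mb /var)" 5120
check_min "/usr free" "$(free_mb /usr)" 500
check_min "RAM total" "$(total_ram_mb)" 4096
```

A machine with 4 GB of installed memory usually reports slightly less than 4096 MB, so the RAM line is best read as a guide rather than a strict pass or fail.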
  • 23. PATH A Installation
    ● Step 1: Download and run the Cloudera Manager installer on a single host.
    ● Download cloudera-manager-installer.bin.
    ● Give the installer executable permission:
        chmod u+x cloudera-manager-installer.bin
    ● Run the installer:
        sudo ./cloudera-manager-installer.bin
    ● When the installer finishes, open a browser at http://localhost:7180
    ● Login: admin
    ● Password: admin