The document provides instructions for setting up a Hadoop 3.1.1 and Spark 2.4.0 cluster on Ubuntu 18.04 using one master node and two slave nodes. It includes steps to configure static IP addresses, install Java, download and extract Hadoop and Spark, configure environment variables, copy configuration files to the master node, start Hadoop and Spark services, and verify the cluster is functioning properly. It also provides instructions for cloning the virtual machines to the slave nodes, formatting HDFS, and creating a simple Spark application to test the setup.


# Setup Hadoop-3.1.1 & Spark-2.4.0 cluster
# using Ubuntu 18.04

#1 declare the cluster nodes in /etc/hosts (same entries on every node)
sudo vi /etc/hosts
172.20.10.4 server   # the master node
172.20.10.5 slave1
172.20.10.6 slave2

#2 sudo vi /etc/netplan/50-cloud-init.yaml
# change to a static IP (note: netplan only reads files ending in .yaml)

network:
  version: 2
  ethernets:
    enp0s3:
      dhcp4: no
      addresses: [172.20.10.4/24]
      gateway4: 172.20.10.1
      nameservers:
        addresses: [8.8.8.8,8.8.4.4]

sudo netplan apply
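
# optional check (not in the original steps): confirm the static address is active
# and the gateway answers
ip addr show enp0s3
ping -c 1 172.20.10.1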


#3 change the hostname (edit /etc/hostname or use hostnamectl)
sudo hostnamectl set-hostname master
hostname
# make sure the hadoop user (ziyati) has sudo rights
sudo usermod -aG sudo ziyati

#4 connect to the master and install openssh-server

sudo apt-get remove --purge openssh-server
sudo apt-get install openssh-server
# if the install fails because of an openssh-client version mismatch,
# pin openssh-client to the version required by openssh-server, e.g.:
sudo apt-get install aptitude
sudo aptitude install openssh-client=1:7.6p1-4

# set up passwordless SSH (the cloned slaves will inherit this key pair)
ssh-keygen -t rsa -P ""
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
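
# if a slave is installed separately instead of being cloned later, push the public
# key to it manually (a sketch, assuming the user ziyati exists on the slaves):
ssh-copy-id ziyati@slave1
ssh-copy-id ziyati@slave2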

#5 Test secure connection


ssh ziyati@localhost
logout

#6 install java
sudo apt install openjdk-8-jdk
update-java-alternatives -l
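
# optional check that the JDK is installed and on the PATH
java -version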

# if problems occur during the install, clean the package state and retry:
sudo rm /var/lib/dpkg/updates/000*
sudo apt-get clean
sudo apt-get update
sudo apt-get install ttf-mscorefonts-installer

#7 download hadoop
curl -O http://mirror.cogentco.com/pub/apache/hadoop/common/hadoop-3.1.1/hadoop-3.1.1.tar.gz
tar -xzf hadoop-3.1.1.tar.gz
sudo mv hadoop-3.1.1 /usr/local/hadoop
mkdir -p /home/ziyati/hadoop_tmp/{data,name}
rm hadoop-3.1.1.tar.gz

#7.1 Set up hadoop environment variables.

echo export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64 >> ~/.bashrc


echo export PATH=\$JAVA_HOME/bin:\$PATH >> ~/.bashrc
echo export HADOOP_HOME=/usr/local/hadoop >> ~/.bashrc
echo export PATH=\$HADOOP_HOME/bin:\$HADOOP_HOME/sbin:\$PATH >> ~/.bashrc
echo export HADOOP_CONF_DIR=\$HADOOP_HOME"/etc/hadoop" >> ~/.bashrc

source ~/.bashrc
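
# optional check that the new variables are picked up and the hadoop binary resolves
echo $HADOOP_HOME
hadoop version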

#7.2 Copy the prepared configuration files for the master node into the Hadoop
# config directory (a sketch of what these files typically contain follows below)

cp master/* /usr/local/hadoop/etc/hadoop/
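
# The contents of the master/ directory are not shown in this document. Below is a
# minimal sketch (an assumption, not the author's exact files) of what these Hadoop
# configuration files usually contain, reusing the IP addresses, hostnames and
# hadoop_tmp directories set up above.

# etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://172.20.10.4:9000</value>
  </property>
</configuration>

# etc/hadoop/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/ziyati/hadoop_tmp/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/ziyati/hadoop_tmp/data</value>
  </property>
</configuration>

# etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>172.20.10.4</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

# etc/hadoop/workers   (Hadoop 3 uses "workers" instead of the old "slaves" file)
slave1
slave2

# etc/hadoop/hadoop-env.sh (append)
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64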

#8 download spark
curl -O https://www-eu.apache.org/dist/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.7.tgz
tar -xzf spark-2.4.0-bin-hadoop2.7.tgz
sudo mv spark-2.4.0-bin-hadoop2.7 /usr/local/spark
rm spark-2.4.0-bin-hadoop2.7.tgz

#8.1 Set up spark environment variables.

echo export SPARK_HOME=/usr/local/spark >> ~/.bashrc


echo export PATH=\$SPARK_HOME/bin:\$PATH >> ~/.bashrc
echo export PATH=\$SPARK_HOME/sbin:\$PATH >> ~/.bashrc
source ~/.bashrc
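
# optional check that the spark binaries are on the PATH
spark-submit --version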

#8.2 Set up the Spark worker list (one worker node per line)

vi $SPARK_HOME/conf/slaves
172.20.10.5
172.20.10.6
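
# the document does not show conf/spark-env.sh; a minimal sketch (an assumption, not
# the author's exact file) that pins the standalone master and the JDK:
cp $SPARK_HOME/conf/spark-env.sh.template $SPARK_HOME/conf/spark-env.sh
echo export SPARK_MASTER_HOST=172.20.10.4 >> $SPARK_HOME/conf/spark-env.sh
echo export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64 >> $SPARK_HOME/conf/spark-env.sh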
#9 Clone the VM to slave1 and slave2
# on each clone, change the IP address and the hostname (example below)

# after changing the IP you have to re-apply the network configuration
sudo netplan apply
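
# for example, on slave1 (a sketch; repeat on slave2 with 172.20.10.6 and "slave2"):
sudo hostnamectl set-hostname slave1
sudo vi /etc/netplan/50-cloud-init.yaml   # set addresses: [172.20.10.5/24]
sudo netplan apply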

# Format HDFS (run once on the master, before the first start)
hdfs namenode -format

#10 Start hadoop services


cd /usr/local/hadoop/sbin
./start-dfs.sh
./start-yarn.sh

#11 Connect to the master to verify

http://172.20.10.4:9870   # HDFS NameNode web UI
http://172.20.10.4:8088   # YARN ResourceManager web UI

# start spark
cd /usr/local/spark/sbin

./start-all.sh
http://172.20.10.4:8080   # Spark standalone master web UI
# Spark must be launched from the master node

############
# Thanks !
############

# all services should now be up on the master

# check on slave1
jps
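
# with everything running, jps typically shows the standard daemons:
# on the master: NameNode, SecondaryNameNode, ResourceManager, Master
# on each slave: DataNode, NodeManager, Worker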

# Complete the environment with Anaconda


# Installing Jupyter
curl -O https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh
bash Anaconda3-2019.03-Linux-x86_64.sh

# create a virtual env called jupyter


conda create -n jupyter
# activate it
source activate jupyter
conda install notebook

# start jupyter on the master, bound to its IP


jupyter notebook --ip 172.20.10.4
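
# the notebook server listens on port 8888 by default; open http://172.20.10.4:8888
# in a browser and paste the token printed in the terminal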

# Installing findspark
# findspark is a Python library that makes PySpark importable and usable like any
# other Python library (it adds it to sys.path at runtime).

pip install findspark

# Create your first Spark application (each numbered block below is a separate notebook cell)


# cell 1
import findspark
# cell 2
findspark.init()
# cell 3
import pyspark
# cell 4
sc = pyspark.SparkContext(master='spark://172.20.10.4:7077', appName='myApp')
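
# cell 5 (a quick sanity check, not in the original: run a small job on the cluster)
rdd = sc.parallelize(range(1000))
print(rdd.sum())   # expected: 499500
sc.stop()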
