Dsa Practical File

This document provides steps to install Hadoop on Windows and describes basic commands to add, retrieve, and delete data from HDFS. It outlines 10 steps to install Hadoop, including downloading Java and Hadoop, setting environment variables, editing configuration files, replacing Windows-specific files, and starting processes. It then demonstrates commands such as hdfs dfs -mkdir to create directories, hdfs dfs -put to add files, hdfs dfs -cat to view file contents, and hdfs dfs -rmr to delete files and directories from HDFS.

Uploaded by

Giri Kanchan

Hadoop Installation

Step 1 – Download and Install Java

Step 2 – Download Hadoop and Extract it to a Folder

Step 3 – Setup System Environment Variables


Step 4 – Now we need to edit some files located in the etc/hadoop folder of the Hadoop directory.
List of files that need to be edited:
core-site.xml

mapred-site.xml
Create a folder 'data' with two subfolders 'datanode' and 'namenode' in the Hadoop
directory.

Edit the file hdfs-site.xml and add the property below to its configuration. Note: the
namenode and datanode paths in the property values should be the paths of the
namenode and datanode folders you just created.
hdfs-site.xml

yarn-site.xml

Edit hadoop-env.cmd and replace %JAVA_HOME% with the path of the Java JDK folder.
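As a sketch, typical single-node (pseudo-distributed) contents for the four configuration files might look as follows; the port 9000 and the C:\hadoop-3.1.0\data paths are assumptions and must match the folders you created above:

```xml
<!-- core-site.xml: default file system URI (localhost:9000 is a common
     single-node choice, not a requirement) -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

<!-- mapred-site.xml: run MapReduce on YARN -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

<!-- hdfs-site.xml: replication of 1 for a single machine; the paths
     assume the data\namenode and data\datanode folders created above -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>C:\hadoop-3.1.0\data\namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>C:\hadoop-3.1.0\data\datanode</value>
  </property>
</configuration>

<!-- yarn-site.xml: enable the shuffle service needed by MapReduce -->
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```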

Step 5 – Hadoop needs Windows-specific files which do not come with the default
download of Hadoop.
To include those files, replace the bin folder in the Hadoop directory with the bin folder provided
at this GitHub link:
https://github.com/s911415/apache-hadoop-3.1.0-winutils
Download it as a zip file, extract it, and copy the bin folder inside it. If you want to keep the old bin
folder, rename it to something like bin_old, then paste the copied bin folder into the Hadoop directory.

Step 6 – Check whether Hadoop is successfully installed by running this command in cmd:
hadoop version

If it does not throw an error and shows the Hadoop version, Hadoop is successfully
installed on the system.
Formatting the NameNode is done once, when Hadoop is first installed, and not on a running
Hadoop file system; otherwise it will delete all the data inside HDFS. Run this command:
hdfs namenode -format

Step 7 – Now change the directory in cmd to the sbin folder of the Hadoop directory.
(Note: make sure you write the path as per your system.)

Step 8 – Start the NameNode and DataNode with this command:

start-dfs.cmd

Two more cmd windows will open, for the NameNode and the DataNode. Now start YARN with this
command:
start-yarn.cmd
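Steps 7 and 8 can be sketched as one cmd session; the Hadoop path is an assumption, so substitute your own:

```shell
:: Startup sequence from Steps 7-8, assuming Hadoop was extracted
:: to C:\hadoop-3.1.0 (a placeholder path).
cd C:\hadoop-3.1.0\sbin

:: Starts the NameNode and DataNode in two new cmd windows.
start-dfs.cmd

:: Starts the ResourceManager and NodeManager in two more windows.
start-yarn.cmd

:: Optional check: the JDK's jps tool lists running Java processes,
:: so you can confirm all four daemons are up.
jps
```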

Note: Make sure all 4 Apache Hadoop Distribution windows are up and running. If they are
not running, you will see an error or a shutdown message, and you will need to debug the
error.
Step 9 – To access information about the ResourceManager's current, successful, and failed
jobs, go to this link in a browser:
http://localhost:8088/cluster

Step 10 – To check the details of HDFS (the NameNode and DataNode),
open this link in a browser:
http://localhost:9870/
Working with HDFS
1) We will be using a text file in the local file system.
2) We will create a directory named 'sample' in the Hadoop directory using the command:
hdfs dfs -mkdir /sample

3) To verify that the directory is created in HDFS, use the 'ls' command, which will list the
files present in HDFS.

4) Copy the text file named 'potatoes' from the local file system to the folder we just created
in HDFS using the copyFromLocal command.
5) To verify that the file is copied to the folder, use the 'ls' command with the folder
name, which will list the files in that folder.

6) To view the contents of the file we copied, use the cat command.

7) To copy a file from HDFS to the local directory, use the get command.
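The seven steps above can be sketched as a single command session; the file name potatoes.txt and the paths are assumptions based on the text:

```shell
# Sketch of the HDFS session from steps 1-7, assuming a text file
# potatoes.txt exists in the current local directory.
hdfs dfs -mkdir /sample                       # 2) create the directory
hdfs dfs -ls /                                # 3) verify it exists
hdfs dfs -copyFromLocal potatoes.txt /sample  # 4) copy the file into HDFS
hdfs dfs -ls /sample                          # 5) verify the copy
hdfs dfs -cat /sample/potatoes.txt            # 6) view its contents
hdfs dfs -get /sample/potatoes.txt .          # 7) copy it back locally
```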
Syntax and Commands to Add, Retrieve and Delete Data from HDFS:
Adding Files and Directories to HDFS: Before you can run Hadoop programs on data stored in
HDFS, you need to put the data into HDFS first. Let's create a directory and put a file in it.
HDFS has a default working directory of /user/$USER, where $USER is your login user
name. This directory isn't created for you automatically, though, so let's create it with the
mkdir command. For the purpose of illustration we use chuck; you should substitute your user
name in the example commands.
hadoop fs -mkdir /user/chuck
hadoop fs -put example.txt
hadoop fs -put example.txt /user/chuck
Retrieving Files from HDFS: The Hadoop command get copies files from HDFS back to the
local filesystem. To retrieve example.txt, we can run the following command:
hadoop fs -get example.txt .
To simply display its contents without copying it, use cat instead:
hadoop fs -cat example.txt
Deleting Files from HDFS: hadoop fs -rm example.txt
The command for creating a directory in HDFS is:
hdfs dfs -mkdir /lendicse
Adding a directory is done through the command:
hdfs dfs -put lendi_english /
Copying Data from the Local File System (NFS) to HDFS:
o The copy-from-directory command is: hdfs dfs -copyFromLocal
/home/lendi/Desktop/shakes/glossary /lendicse/
o View the file using the command: hdfs dfs -cat /lendi_english/glossary
o The command for listing items in Hadoop is: hdfs dfs -ls
hdfs://localhost:9000/
o The command for deleting files is: hdfs dfs -rm -r /example

Hadoop Commands
1) ls: This command is used to list all the files. Use lsr for a recursive listing; it is
useful when we need the hierarchy of a folder.
Syntax: bin/hdfs dfs -ls
Example: bin/hdfs dfs -ls /
It will print all the directories present in HDFS. The bin directory contains the executables, so
bin/hdfs means we want the executables of hdfs, in particular the dfs (Distributed
File System) commands.

2) mkdir: To create a directory. In Hadoop DFS there is no home directory by default, so


let's first create it.
Syntax:
bin/hdfs dfs -mkdir
Creating the home directory:
bin/hdfs dfs -mkdir /user
bin/hdfs dfs -mkdir /user/username

Example:
bin/hdfs dfs -mkdir /geeks => '/' means absolute path
bin/hdfs dfs -mkdir geeks2 => Relative path -> the folder will be created
relative to the home directory.

3) touchz: It creates an empty file.


Syntax:
bin/hdfs dfs -touchz
Example:
bin/hdfs dfs -touchz /geeks/myfile.txt

4) copyFromLocal (or) put: To copy files/folders from the local file system to the HDFS
store. This is one of the most important commands. The local file system means the files
present on the OS.
Syntax:
bin/hdfs dfs -copyFromLocal <local src> <dest(on hdfs)>
Example: Let's suppose we have a file AI.txt on the Desktop which we want to copy
to the folder geeks present on HDFS.
bin/hdfs dfs -copyFromLocal ../Desktop/AI.txt /geeks

5) cat: To print file contents.


Syntax:
bin/hdfs dfs -cat <path>
Example:
// print the content of AI.txt present
// inside the geeks folder
bin/hdfs dfs -cat /geeks/AI.txt
6) copyToLocal (or) get: To copy files/folders from the HDFS store to the local file system.
Syntax:
bin/hdfs dfs -copyToLocal <srcfile(on hdfs)> <local file dest>
Example:
bin/hdfs dfs -copyToLocal /geeks ../Desktop/hero
(OR)
bin/hdfs dfs -get /geeks/myfile.txt ../Desktop/hero
myfile.txt from the geeks folder will be copied to the folder hero present on the Desktop.

7) moveFromLocal: This command moves a file from local to HDFS.


Syntax:
bin/hdfs dfs -moveFromLocal <local src> <dest(on hdfs)>
Example:
bin/hdfs dfs -moveFromLocal ../Desktop/cutAndPaste.txt /geeks

8) cp: This command is used to copy files within HDFS. Let's copy the
folder geeks to geeks_copied.
Syntax:
bin/hdfs dfs -cp <src(on hdfs)> <dest(on hdfs)>
Example:
bin/hdfs dfs -cp /geeks /geeks_copied
9) mv: This command is used to move files within HDFS. Let's cut-paste the
file myfile.txt from the geeks folder to geeks_copied.
Syntax:
bin/hdfs dfs -mv <src(on hdfs)> <dest(on hdfs)>
Example:
bin/hdfs dfs -mv /geeks/myfile.txt /geeks_copied

10) rmr: This command deletes a file or directory from HDFS recursively. It is a very
useful command when you want to delete a non-empty directory. (In newer Hadoop
versions, -rmr is deprecated in favor of -rm -r.)
Syntax:
bin/hdfs dfs -rmr <filename/directoryName>
Example:
bin/hdfs dfs -rmr /geeks_copied -> It will delete all the content inside the
directory and then the directory itself.
