CS8711 - Cloud Computing Laboratory Manual
PREPARED BY APPROVED BY
Exercises:
TOTAL: 45 PERIODS
FORMAT NO: LP 01
COURSE OBJECTIVES:
To develop web applications in cloud
To learn the design and development process involved in creating a cloud based application
To learn to implement and use parallel programming using Hadoop
LIST OF EXPERIMENTS:
CLOUD COMPUTING LAB
S.No | Batch | Experiment Name | Allotted hours
1 | I, II | Install VirtualBox/VMware Workstation with different flavours of Linux or Windows OS on top of Windows | 3 Hours
2 | I, II | Install a C compiler in the virtual machine created using VirtualBox and execute simple programs. | 3 Hours
3 | I, II | Install Google App Engine. Create hello world app and other simple web applications using python/java. | 3 Hours
4 | I, II | Use GAE launcher to launch the web applications. | 3 Hours
5 | I, II | Simulate a cloud scenario using CloudSim and run a scheduling algorithm that is not present in CloudSim | 3 Hours
6 | I, II | Find a procedure to transfer the files from one virtual machine to another virtual machine | 3 Hours
7 | I, II | Find a procedure to launch virtual machine using trystack (Online Openstack Demo Version) | 3 Hours
8 | I, II | Install Hadoop single node cluster and run simple applications like wordcount. | 3 Hours
Content beyond syllabus
9 | I, II | Installation of Open Nebula sunstone | 3 Hours
10 | I, II | Creation of Virtual block to a Virtual Machine in Open Nebula | 3 Hours
11 | I, II | Virtual Machine Migration in Open Nebula |
COURSE OUTCOMES:
CO1: Configure various virtualization tools such as VirtualBox, VMware Workstation.
CO2: Design and deploy a web application in a PaaS environment.
CO3: Learn how to simulate a cloud environment to implement new schedulers.
CO4: Install and use a generic cloud environment that can be used as a private cloud.
CO5: Manipulate large data sets in a parallel environment.
PROGRAM OUTCOMES:
PO1: Engineering knowledge: Apply the knowledge of mathematics, science, engineering
fundamentals, and an engineering specialization to the solution of complex engineering
problems.
PO2 : Problem analysis: Identify, formulate, research literature, and analyze complex
engineering problems reaching substantiated conclusions using first principles of mathematics,
natural sciences, and engineering sciences.
PO5 : Modern tool usage: Create, select, and apply appropriate techniques, resources, and
modern engineering and IT tools including prediction and modeling to complex engineering
activities with an understanding of the limitations.
PO6 : The engineer and society: Apply reasoning informed by the contextual knowledge to
assess societal, health, safety, legal and cultural issues and the consequent responsibilities
relevant to the professional engineering practice.
PO7: Environment and sustainability: Understand the impact of the professional engineering
solutions in societal and environmental contexts, and demonstrate the knowledge of, and need for
sustainable development.
PO8 : Ethics: Apply ethical principles and commit to professional ethics and responsibilities and
norms of the engineering practice.
PO9 : Individual and team work: Function effectively as an individual, and as a member or
leader in diverse teams, and in multidisciplinary settings.
PO10 : Communication: Communicate effectively on complex engineering activities with the
engineering community and with society at large, such as, being able to comprehend and write
effective reports and design documentation, make effective presentations, and give and receive
clear instructions.
PO11: Project management and finance: Demonstrate knowledge and understanding of the
engineering and management principles and apply these to one’s own work, as a member and
leader in a team, to manage projects and in multidisciplinary environments.
PO12 : Life-long learning: Recognize the need for, and have the preparation and ability to
engage in independent and life-long learning in the broadest context of technological change.
PROGRAM SPECIFIC OUTCOMES:
PSO1: An ability to analyze a problem, and identify and define the computing requirements
appropriate to its solution.
PSO2: An ability to apply design and development principles in the construction of software systems of
varying complexity.
PSO3: An ability to analyze the impact of computing on individuals, organizations, and society.
PSO4: An ability to use current techniques, skills, and tools necessary for industrial needs.
SLIGHT 1
MODERATE 2
SUBSTANTIAL 3
MAPPING OF COURSE OUTCOMES WITH THE PROGRAM OBJECTIVES:
CO/PO PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
CO1 3 3 3 3 3 2 2 2 3 2 3 3
CO2 3 3 3 3 3 2 2 2 3 2 2 3
CO3 3 3 3 3 3 2 2 2 3 2 2 3
CO4 3 3 3 3 3 2 3 3 3 2 2 3
CO5 3 3 3 3 3 3 3 3 3 3 3 3
3 3 3 3 3 3 3 3 3 3 2 3
CS8711.3 Designing solutions using a cloud toolkit contributes moderately to Engineering
knowledge (PO1). This helps in problem solving (PO2), design and development of
solutions (PO3) and investigation of complex problems (PO4). Developing simple
models helps in learning commercial software (PO5). Demonstrate understanding
of the societal, health, safety, legal and cultural issues and the consequent
responsibilities relevant to engineering practice (PO6). Understand the impact of
professional engineering solutions in a societal and environmental context and
demonstrate the need for sustainable development (PO7). This also helps to solve
problems as a team (PO9) and leads to continuous learning. Communicate effectively with
the engineering community and with society at large, such as being able to
comprehend and write effective reports and design documentation, make effective
presentations, and give/receive clear instructions (PO10). Recognize the need for,
and have the ability to engage in, independent and life-long learning (PO12). It also
helps to create software products (PSO1 & PSO2) and provide solutions for society (PSO3),
which can lead to innovative solutions (PSO4).
CS8711.4 The ability to correlate the MapReduce algorithm with the current trend contributes
strongly to Engineering knowledge (PO1). This strongly helps in problem solving
(PO2), design and development of solutions (PO3) and investigation of complex
problems (PO4). It encourages modern tool usage (PO5). Demonstrate
understanding of the societal, health, safety, legal and cultural issues and the
consequent responsibilities relevant to engineering practice (PO6). Understand the
impact of professional engineering solutions in a societal and environmental context
and demonstrate the need for sustainable development (PO7). Function effectively as
an individual, and as a member in multi-disciplinary settings (PO8). As a result of team
work, multidisciplinary solutions can be given (PO9). Communicate effectively with
the engineering community and with society at large, such as being able to
comprehend and write effective reports and design documentation, make effective
presentations, and give/receive clear instructions (PO10). Demonstrate knowledge
and understanding of the engineering and management principles, and apply these
to one's own work, as a member and leader in a team, to manage projects in
multidisciplinary environments (PO11). Recognize the need for, and have the ability
to engage in, independent and life-long learning (PO12), and to create and analyze the
issues (PSO1 & PSO2). PSO3 and PSO4 are also addressed in this outcome, which leads
to innovative solutions.
CS8711.5 The ability to use different open source tools contributes strongly to
Engineering knowledge (PO1). This strongly helps in problem solving (PO2), design and
development of solutions (PO3) and investigation of complex problems (PO4).
It encourages modern tool usage (PO5). Demonstrate understanding of the societal,
health, safety, legal and cultural issues and the consequent responsibilities relevant to
engineering practice (PO6). Understand the impact of professional engineering
solutions in a societal and environmental context and demonstrate the need for
sustainable development (PO7). Function effectively as an individual, and as a
member in multi-disciplinary settings (PO8). As a result of team work, multidisciplinary
solutions can be given (PO9). Communicate effectively with the engineering
community and with society at large, such as being able to comprehend and write
effective reports and design documentation, make effective presentations, and
give/receive clear instructions (PO10). Demonstrate knowledge and understanding of the
engineering and management principles, and apply these to one's own work, as a
member and leader in a team, to manage projects in multidisciplinary
environments (PO11). Recognize the need for, and have the ability to engage in,
independent and life-long learning (PO12), and to create and analyze the issues (PSO1
& PSO2). PSO3 and PSO4 are also addressed in this outcome, which leads to
innovative solutions.
EXPT.NO:1
Install VirtualBox/VMware Workstation with different flavours of Linux or Windows OS
on top of Windows
AIM:
To Install Virtualbox/VMware Workstation with different flavours of linux or windows OS on
top of windows
PROCEDURE:
STEPS TO INSTALL VIRTUAL BOX:
1. Download VirtualBox installer for windows.
3. Click “Windows host” to download the binary version for windows host.
4. The installer file downloaded will have the file name format like VirtualBox-
VersionNumber-BuildNumber-Win.exe.
Example: VirtualBox-6.1.12-139181-Win.exe.
5. Double click on the installer to launch the setup Wizard. Click on Next to continue.
6. Custom setup dialog box will be opened. Accept the default settings and click next.
7. Select the way you want the features to be installed. You can accept the default
and click next.
8. A dialog box opens with Network Interfaces warning. Click Yes to proceed.
10. When prompted with a message to install (Trust) Oracle Universal Serial Bus,
click Install to continue.
11. After the installation completes, click finish to exit the setup wizard.
3. Enter a name for the new virtual machine. Choose the Type and Version. Note that VirtualBox
automatically changes 'Type' to Linux and 'Version' to 'Ubuntu (64 bit)' if the name is given
as 'Ubuntu'. Click Next.
4. Select the amount of RAM to use. The ideal amount of RAM will automatically be
selected. Do not increase the RAM into the red section of the slider; keep the slider in
the green section.
5. Accept the default 'Create a virtual hard drive now' and click 'Create' button.
6. Choose the hard disk file type as VDI (VirtualBox Disk Image). Click Next.
7. Click Next to accept the default option 'Dynamically allocated' for storage on
physical hard drive.
8. Select the size of the virtual hard disk and click create.
10. Download the ISO file [Ubuntu disk image file]. The latest version of the Ubuntu ISO file
can be downloaded from the link https://ubuntu.com/download/desktop. Click the
Download button.
14. In Attributes section, click the disk image and then "Choose Virtual Optical Disk File".
15. Browse and select the downloaded iso file. Click ok.
16. Select the newly created virtual machine in the dashboard and click start button.
17. In the welcome screen, click the 'Install Ubuntu' button.
19. Make sure 'Erase disk and install Ubuntu' option is selected and click 'Install Now'
button.
22. After installation is complete, click 'Restart Now' button and follow the instructions.
23. The Ubuntu OS is ready to use. Login with the username and password.
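The GUI steps above can also be approximated from the command line. The following is a minimal sketch, not part of the original procedure: it only builds and prints VBoxManage command lines and does not run them. It assumes VBoxManage is on the PATH once VirtualBox is installed; the VM name, ISO path, RAM size, disk size and the controller name "SATA" are illustrative values, and the exact options should be checked against the installed VirtualBox version.

```python
import shlex


def vbox_commands(name, iso_path, ram_mb=2048, disk_mb=10240):
    """Return VBoxManage command lines that roughly mirror the GUI steps.

    All argument values are illustrative assumptions, not values from the manual.
    """
    disk = f"{name}.vdi"
    return [
        # Create and register the VM (name, type and version in the GUI)
        ["VBoxManage", "createvm", "--name", name,
         "--ostype", "Ubuntu_64", "--register"],
        # Select the amount of RAM
        ["VBoxManage", "modifyvm", name, "--memory", str(ram_mb)],
        # Create a VDI virtual hard disk of the chosen size
        ["VBoxManage", "createmedium", "disk", "--filename", disk,
         "--size", str(disk_mb), "--format", "VDI"],
        # Add a storage controller, then attach the disk and the ISO to it
        ["VBoxManage", "storagectl", name, "--name", "SATA", "--add", "sata"],
        ["VBoxManage", "storageattach", name, "--storagectl", "SATA",
         "--port", "0", "--device", "0", "--type", "hdd", "--medium", disk],
        ["VBoxManage", "storageattach", name, "--storagectl", "SATA",
         "--port", "1", "--device", "0", "--type", "dvddrive",
         "--medium", iso_path],
        # Start the VM so the Ubuntu installer can run
        ["VBoxManage", "startvm", name],
    ]


if __name__ == "__main__":
    for command in vbox_commands("Ubuntu", "ubuntu.iso"):
        print(shlex.join(command))
```

Printing the commands first makes it easy to review them before typing them into a terminal on the Windows host.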
OUTPUT 1:
Virtualbox on top of windows.
OUTPUT 2:
Installation of VirtualBox with Linux OS (Guest OS/VM) on top of Windows host.
RESULT:
The Virtual box installation is completed and the Virtual machine is created on top of windows
host operating system.
EXPT.NO:2
Install a C compiler in the virtual machine created using virtual box and execute Simple
Programs.
AIM:
To Install a C compiler in the virtual machine created using virtual box and execute Simple
Programs.
PROCEDURE:
Step 1: Open VMware/VirtualBox and turn on the virtual machine OS (Linux/Ubuntu).
Step 2: Open a terminal.
Step 3: Type "gedit Sample.c" and press Enter. This will open the text editor.
Step 4: Enter the program and save the file with the .c extension.
Step 5: Close the editor; it will take you back to the terminal.
Step 6: Type "sudo apt install gcc". This will install the C compiler package.
Step 7: Type "ls" to view the file which you have made.
Step 8: Type "gcc Sample.c" to compile the program. This creates the executable a.out.
Step 9: Type "./a.out" to view the output.
Or
Installing GCC:
To install GCC on a yum-based distribution, type the command:
yum install gcc
Sample.c:
#include <stdio.h>

/* conio.h is a DOS/Windows header and is not available with gcc on Linux */
int main()
{
    printf("Hello World\n");
    return 0;
}
To Run:
gcc Sample.c
./a.out
OUTPUT: Hello World
EXPT.NO:3
Install Google App Engine and create web applications using python/java
AIM:
To Install Google App Engine. Create hello world app and other simple web applications
using python/java
PROCEDURE:
Step 3:Download the Windows installer – the simplest thing is to download it to your
Desktop or another folder that you remember.
Step4: Double Click on the GoogleApplicationEngine installer.
Step 5:Click through the installation wizard, and it should install the App
Engine. If you do not have Python 2.5, it will install Python 2.5 as well.
Step 6:Once the install is complete you can discard the downloaded installer
Using a text editor such as JEdit (www.jedit.org), create a file called app.yaml in
the ae-01-trivial folder with the following contents:
application: ae-01-trivial
version: 1
runtime: python
api_version: 1
handlers:
- url: /.*
  script: index.py
Note: Please do not copy and paste these lines into your text
editor – you might end up with strange characters – simply
type them into your editor.
Then create a file in the ae-01-trivial folder called index.py with three lines in it:
print 'Content-Type: text/plain'
print ' '
print 'Hello there Chuck'
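The index.py above relies on the legacy Python 2.5 App Engine runtime, where a script writes its HTTP headers and body with print. As a rough illustration of the same request/response idea, here is a minimal sketch in Python 3 that uses only the standard library wsgiref module. It is a stand-in and not the App Engine API; the helper name call_app is hypothetical and exists only to exercise the handler without starting a server.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Answer every request path with the same plain-text greeting."""
    body = b"Hello there Chuck\n"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]


def call_app(path="/"):
    """Hypothetical helper: invoke the WSGI app in-process for one path."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal valid WSGI environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(application(environ, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    status, headers, body = call_app("/anything")
    print(status, headers["Content-Type"], body.decode(), end="")
```

To view the response in a browser instead, the same application object could be passed to wsgiref.simple_server.make_server("localhost", 8080, application) and served with serve_forever().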
Once you have selected your application, press Run. After a
few moments your application will start and the launcher will
show a little green icon next to your application. Then press
Browse to open a browser pointing at your application, which
is running at http://localhost:8080/
Paste http://localhost:8080 into your browser and you
should see your application as follows:
RESULT: Thus Google App Engine was installed and a hello world app was created
successfully.
EXPT.NO:4
AIM:
To Use Google App Engine launcher to launch the web applications.
PROCEDURE:
In this exercise, we are going to create a GAE based Python web project (hello world) using
Eclipse.
Python 2.7
Eclipse 3.7 + PyDev plugin
Google App Engine SDK for Python 1.6.4
PROCEDURE:
Figure 1 – In Eclipse , menu, “Help –> Install New Software..” and put above URL. Select
“PyDev for Eclipse” option, follow steps, and restart Eclipse once Completed.
Step 2. Verify PyDev
After Eclipse is restarted, make sure PyDev’s interpreter is pointed to your “python.exe“.
Figure 2 – Eclipse -> Windows –> Preferences, make sure “Interpreter – Python” is
configured properly
Figure 4.1 – Eclipse menu, File -> New -> Other… , PyDev folder, choose “PyDev Google App Engine
Project“.
Figure 4.2 – Type project name, if the interpreter is not configured yet (in step 2), you can do it now.
And select this option – “Create ‘src’ folder and add it to PYTHONPATH“.
Figure 4.3 – Click “Browse” button and point it to the Google App Engine installed directory (in step
3).
Figure 4.4 – Name your application id in GAE, type anything, you can change it later
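from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

class MainPage(webapp.RequestHandler):
    # Handles GET requests for "/" and returns a plain-text greeting
    def get(self):
        self.response.headers['Content-Type'] = 'text/plain'
        self.response.out.write('Hello, webapp World!')

application = webapp.WSGIApplication([('/', MainPage)], debug=True)

def main():
    run_wsgi_app(application)

if __name__ == "__main__":
    main()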
File : app.yaml – GAE needs this file to run and deploy your Python project; it's quite
self-explanatory.
application: mkyong-python
version: 1
runtime: python
api_version: 1
handlers:
- url: /.*
  script: helloworld.py
Step:5. Run it locally
To run it locally, right click on the helloworld.py, choose “Run As” –> “Run Configuration”, create a
new
Figure 5.1 – In Main tab -> Main module, manually type the directory path of “dev_appserver.py“.
application. Review “app.yaml” again, this web app will be deployed to GAE with application ID
“mkyong-python“.
File : app.yaml
handlers:
- url: /.*
  script: helloworld.py
RESULT:
Thus a hello world web application has been launched using GAE.
EXPT.NO:5
Simulate A Cloud Scenario Using Cloudsim And Run A Scheduling
Algorithm That Is Not Present In Cloudsim
AIM:
To Simulate a cloud scenario using CloudSim and run a scheduling algorithm that is not
present in CloudSim
PROCEDURE:
What is Cloudsim?
CloudSim is a simulation toolkit that supports the modeling and simulation of the
core functionality of cloud, like job/task queue, processing of events, creation of
cloud entities(datacenter, datacenter brokers, etc), communication between
different entities, implementation of broker policies, etc. This toolkit allows to:
CloudSim is written in Java. The knowledge you need to use CloudSim is basic Java
programming and some basics about cloud computing. Knowledge of programming IDEs such
as Eclipse or NetBeans is also helpful. It is a library and, hence, CloudSim does not have to be
installed. Normally, you can unpack the downloaded package in any directory, add it to the
Java classpath and it is ready to be used. Please verify whether Java is available on your
system.
broker.submitVmList(vmlist)
10. Create a cloudlet with length, file size, output size, and utilisation model:
broker.submitCloudletList(cloudletList)
12. Start the simulation:
CloudSim.startSimulation()
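The aim asks for a scheduling algorithm that CloudSim does not ship with. As a language-neutral illustration of one such policy, the sketch below compares first-come-first-served (FCFS) with shortest-job-first (SJF) ordering of cloudlets on a single VM. It is plain Python and does not use the CloudSim API; the Cloudlet dataclass and the function names are hypothetical, and the model assumes cloudlets run one after another with execution time = length / MIPS. In CloudSim itself, a comparable effect could be sketched by ordering the cloudlet list before submitting it, or by writing a custom broker policy.

```python
from dataclasses import dataclass


@dataclass
class Cloudlet:
    cloudlet_id: int
    length: int  # amount of work, in million instructions (MI)


def schedule(cloudlets, mips, policy="fcfs"):
    """Run cloudlets sequentially on one VM; return (id, start, finish) rows."""
    order = list(cloudlets)
    if policy == "sjf":
        # Shortest job first: the smallest cloudlet length runs first
        order.sort(key=lambda c: c.length)
    clock = 0.0
    rows = []
    for cloudlet in order:
        run_time = cloudlet.length / mips  # seconds of simulated time
        rows.append((cloudlet.cloudlet_id, clock, clock + run_time))
        clock += run_time
    return rows


def mean_finish(rows):
    """Average completion time over all scheduled cloudlets."""
    return sum(finish for _, _, finish in rows) / len(rows)


if __name__ == "__main__":
    # Lengths are illustrative; 250 MIPS matches the VM in the program listing
    jobs = [Cloudlet(0, 250000), Cloudlet(1, 50000), Cloudlet(2, 100000)]
    for policy in ("fcfs", "sjf"):
        rows = schedule(jobs, mips=250, policy=policy)
        print(policy, rows, round(mean_finish(rows), 2))
```

With these illustrative lengths, SJF finishes the short cloudlets early and so lowers the average completion time, while the total time to finish all cloudlets stays the same.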
Program:
package org.cloudbus.cloudsim.examples;
import java.text.DecimalFormat;
import java.util.ArrayList;
import java.util.Calendar;
import java.util.LinkedList;
import java.util.List;
import org.cloudbus.cloudsim.Cloudlet;
import org.cloudbus.cloudsim.CloudletSchedulerTimeShared;
import org.cloudbus.cloudsim.Datacenter;
import org.cloudbus.cloudsim.DatacenterBroker;
import org.cloudbus.cloudsim.DatacenterCharacteristics;
import org.cloudbus.cloudsim.Host;
import org.cloudbus.cloudsim.Log;
import org.cloudbus.cloudsim.Pe;
import org.cloudbus.cloudsim.Storage;
import org.cloudbus.cloudsim.UtilizationModel;
import org.cloudbus.cloudsim.UtilizationModelFull;
import org.cloudbus.cloudsim.Vm;
import org.cloudbus.cloudsim.VmAllocationPolicySimple;
import org.cloudbus.cloudsim.VmSchedulerTimeShared;
import org.cloudbus.cloudsim.core.CloudSim;
import org.cloudbus.cloudsim.provisioners.BwProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.PeProvisionerSimple;
import org.cloudbus.cloudsim.provisioners.RamProvisionerSimple;
/**
* A simple example showing how to create
* a datacenter with one host and run two
* cloudlets on it. The cloudlets run in
* VMs with the same MIPS requirements.
* The cloudlets will take the same time to
* complete the execution.
*/
public class CloudSimExample2 {
/**
* Creates main() to run this example
*/
public static void main(String[] args) {
Log.printLine("Starting CloudSimExample2...");
try {
// First step: Initialize the CloudSim package. It should be called
// before creating any entities.
int num_user = 1; // number of cloud users
Calendar calendar = Calendar.getInstance();
boolean trace_flag = false; // mean trace events
//VM description
int vmid = 0;
int mips = 250;
long size = 10000; //image size (MB)
int ram = 512; //vm memory (MB)
long bw = 1000;
int pesNumber = 1; //number of cpus
String vmm = "Xen"; //VMM name
vmid++;
Vm vm2 = new Vm(vmid, brokerId, mips, pesNumber, ram, bw, size, vmm, new CloudletSchedulerTimeShared());
//add the VMs to the vmList
vmlist.add(vm1);
vmlist.add(vm2);
//Cloudlet properties
int id = 0;
pesNumber=1;
long length = 250000;
long fileSize = 300;
long outputSize = 300;
UtilizationModel utilizationModel = new UtilizationModelFull();
id++;
Cloudlet cloudlet2 = new Cloudlet(id, length, pesNumber, fileSize, outputSize,
utilizationModel, utilizationModel, utilizationModel);
cloudlet2.setUserId(brokerId);
CloudSim.stopSimulation();
printCloudletList(newList);
Log.printLine("CloudSimExample 2 finished!");
}
catch (Exception e) {
e.printStackTrace();
Log.printLine("The simulation has been terminated due to an unexpected error");
}
}
//4. Create Host with its id and list of PEs and add them to the list of machines
int hostId=0;
int ram = 2048; //host memory (MB)
long storage = 1000000; //host storage
int bw = 10000;
hostList.add(
new Host(
hostId,
new RamProvisionerSimple(ram),
new BwProvisionerSimple(bw),
storage,
peList,
new VmSchedulerTimeShared(peList)
)
); // This is our machine
return datacenter;
}
//We strongly encourage users to develop their own broker policies, to submit vms and cloudlets
//according to the specific rules of the simulated scenario
private static DatacenterBroker createBroker(){
/**
* Prints the Cloudlet objects
* @param list list of Cloudlets
*/
private static void printCloudletList(List<Cloudlet> list) {
int size = list.size();
Cloudlet cloudlet;
if (cloudlet.getCloudletStatus() == Cloudlet.SUCCESS){
Log.print("SUCCESS");
}
}
OUTPUT:
Starting CloudSimExample2...
Initialising...
Starting CloudSim version 3.0
Datacenter_0 is starting...
Broker is starting...
Entities started.
0.0: Broker: Cloud Resource List received with 1 resource(s)
0.0: Broker: Trying to Create VM #0 in Datacenter_0
0.0: Broker: Trying to Create VM #1 in Datacenter_0
0.1: Broker: VM #0 has been created in Datacenter #2, Host #0
0.1: Broker: VM #1 has been created in Datacenter #2, Host #0
0.1: Broker: Sending cloudlet 0 to VM #0
0.1: Broker: Sending cloudlet 1 to VM #1
1000.1: Broker: Cloudlet 0 received
1000.1: Broker: Cloudlet 1 received
1000.1: Broker: All Cloudlets executed. Finishing...
1000.1: Broker: Destroying VM #0
1000.1: Broker: Destroying VM #1
Broker is shutting down...
Simulation: No more future events
CloudInformationService: Notify all CloudSim entities for shutting down.
Datacenter_0 is shutting down...
Broker is shutting down...
Simulation completed.
Simulation completed.
Result:
Thus a cloud scenario is simulated successfully using CloudSim.
EXP NO:6
MOVING FILES BETWEEN VIRTUAL MACHINES
AIM:
To move files between virtual machines.
PROCEDURE:
You can move files between virtual machines in several ways:
- You can copy files using network utilities as you would between physical
  computers on your network. To do this between two virtual machines:
  - Both virtual machines must be configured to allow access to your network.
    Any of the networking methods (host-only, bridged and NAT) are appropriate.
  - With host-only networking, you copy files from the virtual machines to the
    host and vice versa, since host-only networking only allows the virtual
    machines to see your host computer.
  - With bridged networking or NAT enabled, you can copy files across your
    network between the virtual machines.
- You can create a shared drive, either a virtual disk or a raw
  partition, and mount the drive in each of the virtual machines.
8. Locate and highlight (from the Host OS) the folder that you want to share
between the VirtualBox Guest machine and the Host and click Select Folder.
9. Now, in the 'Add Share' options, type a name (if you want) in the 'Folder Name' box, click the
Auto Mount and the Make Permanent checkboxes and click OK twice to close the Shared
Folder Settings.
10. To access the shared folder from the Guest OS, open Windows Explorer and
under the 'Network locations' you should see a new network drive that corresponds to
the shared folder on the Host OS.
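Once a shared folder is mounted, copying a file into it from one machine makes it visible to the other. The following is a minimal sketch of that step with an integrity check, written in Python using only the standard library. The function names and the paths in the demo are hypothetical; in a real run, share_dir would be the mount point of the shared folder inside the guest.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_to_share(source, share_dir):
    """Copy source into share_dir and verify that the copy is identical."""
    source = Path(source)
    target = Path(share_dir) / source.name
    shutil.copy2(source, target)  # copies contents and file metadata
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"checksum mismatch for {target}")
    return target


if __name__ == "__main__":
    # Demo with temporary directories standing in for the VM and the share
    with tempfile.TemporaryDirectory() as vm_dir, \
            tempfile.TemporaryDirectory() as share_dir:
        sample = Path(vm_dir) / "notes.txt"
        sample.write_text("file moved between virtual machines\n")
        print(copy_to_share(sample, share_dir).name)
```

Comparing checksums on both sides is a simple way to confirm that the transferred file arrived unchanged.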
OUTPUT:
RESULT: Thus the files are shared between two virtual machines successfully.
EXP NO:7
Find a procedure to launch virtual machine using trystack (Online Openstack Demo
Version)
AIM:
To install KVM and OpenStack on Ubuntu 14.04 and create a virtual machine.
PROCEDURE:
1. Add a new user named stack. This stack user is the administrator of the
OpenStack services. To add the new user, run the command as root user.
adduser stack
2. run the command
apt-get install sudo -y || yum install -y sudo
export http_proxy=http://172.16.0.3:8080
sudo apt-get install git
6. Run the following commands. (This clones the updated version of devstack, which is a
binary auto-installer package of OpenStack.)
stack@JBL01:/$ export http_proxy=http://172.16.0.3:8080
stack@JBL01:/$ export https_proxy=http://172.16.0.3:8080
stack@JBL01:/$ git config --global http.proxy $http_proxy
stack@JBL01:/$ git config --global https.proxy $http_proxy
git clone http://git.openstack.org/openstack-dev/devstack
ls (this shows a folder named devstack)
[[local|localrc]]
FLOATING_RANGE=192.168.1.224/27
FIXED_RANGE=10.11.11.0/24
FIXED_NETWORK_SIZE=256
FLAT_INTERFACE=eth0
ADMIN_PASSWORD=root
DATABASE_PASSWORD=root
RABBIT_PASSWORD=root
SERVICE_PASSWORD=root
SERVICE_TOKEN=root
Save this file.
Change File Permission:
stack@JBL01:~$ chown stack * -R
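The address ranges in the configuration file are written in CIDR notation, and FIXED_NETWORK_SIZE should agree with the size of FIXED_RANGE. As a small worked check, the sketch below uses Python's standard ipaddress module to report how many addresses each range contains. The function name describe is hypothetical; the two ranges are the ones from the configuration file.

```python
import ipaddress


def describe(cidr):
    """Return (address count, first address, last address) for a CIDR range."""
    network = ipaddress.ip_network(cidr)
    return (network.num_addresses,
            str(network.network_address),
            str(network.broadcast_address))


if __name__ == "__main__":
    ranges = (("FLOATING_RANGE", "192.168.1.224/27"),
              ("FIXED_RANGE", "10.11.11.0/24"))
    for label, cidr in ranges:
        count, first, last = describe(cidr)
        print(f"{label}={cidr}: {count} addresses, {first} to {last}")
```

A /24 prefix leaves 8 host bits, so it holds 2^8 = 256 addresses, which matches FIXED_NETWORK_SIZE=256; the /27 floating range leaves 5 host bits and holds 2^5 = 32 addresses.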
12. Open the browser and go to http://<IP address of your machine>; you will get the OpenStack portal.
13. If you restart the machine, then to start OpenStack again, open a terminal and run:
su stack
14. You can again access the OpenStack services in the browser at
http://<IP address of your machine>.
OUTPUT
RESULT:
Thus the procedure to launch a virtual machine using trystack (Online Openstack
Demo Version) is successfully completed.
EXP NO:8
Install Hadoop single node cluster and run simple applications like wordcount.
AIM:
To Install Hadoop single node cluster and run simple applications like wordcount.
PROCEDURE
(The above mentioned is the link where you can find the Installing Steps)
Step 1: Click here to download the Java 8 Package. Save this file in your home directory.
Step 5: Add the Hadoop and Java paths in the bash file (.bashrc).
Open the .bashrc file. Now, add the Hadoop and Java paths as shown below.
Command: vi .bashrc
Fig: Hadoop Installation – Setting Environment Variable
For applying all these changes to the current Terminal, execute the source command.
To make sure that Java and Hadoop have been properly installed on your system and can be
accessed through the Terminal, execute the java -version and hadoop version commands.
Command: ls
All the Hadoop configuration files are located in hadoop-2.7.3/etc/hadoop directory as you
can see in the snapshot below:
Step 7: Open core-site.xml and edit the property mentioned below inside configuration tag:
core-site.xml informs Hadoop daemon where NameNode runs in the cluster. It contains
configuration settings of Hadoop core such as I/O settings that are common to HDFS &
MapReduce.
Command: vi core-site.xml
Step 8: Edit hdfs-site.xml and edit the property mentioned below inside configuration tag:
Step 9: Edit the mapred-site.xml file and edit the property mentioned below inside
configuration tag:
In some cases, mapred-site.xml file is not available. So, we have to create the mapred-site.xml
file using mapred-site.xml template.
Command: vi mapred-site.xml.
Fig: Hadoop Installation – Configuring mapred-site.xml
Step 10: Edit yarn-site.xml and edit the property mentioned below inside configuration tag:
Command: vi yarn-site.xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
Step 11: Edit hadoop-env.sh and add the Java Path as mentioned below:
hadoop-env.sh contains the environment variables that are used in the script to run Hadoop like
Java home path, etc.
Command: vi hadoop-env.sh
Command: cd
Command: cd hadoop-2.7.3
This formats the HDFS via NameNode. This command is only executed for the first time.
Formatting the file system means initializing the directory specified by the
dfs.name.dir variable.
Never format an up-and-running Hadoop file system. You will lose all the data stored in
HDFS.
Step 13: Once the NameNode is formatted, go to hadoop-2.7.3/sbin directory and start all the
daemons.
Command: cd hadoop-2.7.3/sbin
Either you can start all daemons with a single command or do it individually.
Command: ./start-all.sh
Start NameNode:
The NameNode is the centerpiece of an HDFS file system. It keeps the directory tree of all files
stored in HDFS and tracks all the files stored across the cluster.
Start DataNode:
On startup, a DataNode connects to the Namenode and it responds to the requests from the
Namenode for different operations.
Start ResourceManager:
ResourceManager is the master that arbitrates all the available cluster resources and thus helps
in managing the distributed applications running on the YARN system. Its work is to manage
each NodeManager and each application's ApplicationMaster.
Start NodeManager:
The NodeManager on each machine is the agent responsible for managing
containers, monitoring their resource usage and reporting the same to the ResourceManager.
Command: ./yarn-daemon.sh start nodemanager
Start JobHistoryServer:
JobHistoryServer is responsible for servicing all job history related requests from client.
Step 14: To check that all the Hadoop services are up and running, run the below command.
Command: jps
Step 15: Now open the Mozilla browser and go to localhost:50070/dfshealth.html to check
the NameNode interface.
Now you Have Successfully Installed Hadoop Single node Cluster.
Step 16: Now you must download Hadoop Jar File and save that file in home directory of
ubuntu/ linux.
Step 17: Write the WordCount program and save the file as WordCount.java
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: emits (word, 1) for every token in an input line
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as combiner): sums the counts of each word
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    // Driver: args[0] is the HDFS input path, args[1] is the HDFS output path
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
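To see what the mapper and reducer compute without a running cluster, the data flow can be imitated in memory. The sketch below is plain Python and does not use Hadoop; the function names mapper, shuffle, reducer and word_count are illustrative. It assumes whitespace-separated tokens, similar to the default behaviour of StringTokenizer.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every token in the line."""
    for token in line.split():
        yield token, 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce phase: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce over an iterable of text lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(pairs)
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    sample = ["hello world", "hello hadoop"]
    for word, count in word_count(sample).items():
        print(word, count)
```

In the Hadoop job, the shuffle step is performed by the framework between the map and reduce phases, and each output line of part-r-00000 holds one word and its count.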
roonakduggad@ubuntu:~/hadoop-2.7.3/sbin$ mkdir /home/roonakduggad/wc
Comment: check that the wc.jar file is created in your home directory
roonakduggad@ubuntu:~/hadoop-2.7.3$ hadoop fs -mkdir /user
roonakduggad@ubuntu:~/hadoop-2.7.3$ hadoop fs -mkdir /user/input
roonakduggad@ubuntu:~/hadoop-2.7.3$ hdfs dfs -copyFromLocal file.txt /user/input/
roonakduggad@ubuntu:~/hadoop-2.7.3$ bin/hadoop jar /home/roonakduggad/wc.jar WordCount /user/input /user/output
roonakduggad@ubuntu:~/hadoop-2.7.3$ bin/hadoop fs -cat /user/output/part-r-00000
RESULT:
Thus the Hadoop single node cluster is installed, a simple application like wordcount is
run successfully and the output is verified.
CONTENT BEYOND SYLLABUS
EXPT NO:9 CREATION OF VIRTUAL MACHINES
AIM:
PREREQUISITES:
Install node
ALGORITHM:
Command:
CSE@root-1758# su
Step 2: The following commands are used to install open nebula on ubuntu 16.04
Command:
3. apt-get update
5. /usr/share/one/install_gem
Command:
1. su oneadmin
Command:
INSTALL NODE:
Step 2: The following commands are used to install the node and they are typed in the
terminal:
Command:
Step1: In DashBoard,
iii. Create.
AIM:
To find procedure to attach virtual block to the virtual machine and check whether it
holds the data even after the release of virtual machine
ALGORITHM:
Step 2: Start the services of open nebula by using the following commands.
Commands:
Step 5: Create a file with the extension c using the vi command and write a small c
program in the created file.
Step 6: Type the ls command and power off in the open nebula and check whether
the instances are available.
AIM:
To develop a virtual machine migration based on certain condition from one node to
another.
ALGORITHM:
Commands:
Computer->menu->isconnect.information->virtual->bridge address.
Step 4: Use the following command to add the bridge address of your system as node.
Command:
Step 5: In Dashboard check the memory and in the VM select the ip address of your
computer as host.
For migration,