A01_MAJOR
Project Report on
BACHELOR OF TECHNOLOGY
in
COMPUTER SCIENCE AND ENGINEERING
by
Dr. M. NARAYANAN
DEPARTMENT OF CSE
JULY - 2022
St. MARTIN'S ENGINEERING COLLEGE
An Autonomous Institute
NBA & NAAC A+ Accredited
Dhulapally, Secunderabad - 500 100
www.smec.ac.in
CERTIFICATE
This is to certify that the project entitled “Computing with nearby server: A work
bonafide work carried out by them. The result embodied in this report have been verified
Date:
Place:
DECLARATION
ACKNOWLEDGEMENT
The satisfaction and euphoria that accompany the successful completion of any task
would be incomplete without the mention of the people who made it possible and whose
encouragement and guidance have crowned our efforts with success.
We extend our deep sense of gratitude to Principal, Dr. P. SANTOSH KUMAR PATRA,
St. Martin’s Engineering College Dhulapally, for permitting us to undertake this project.
We would like to express our sincere gratitude and indebtedness to our project supervisor
Dr. M. NARAYANAN, Professor and Head of the Department, Department of Computer
Science and Engineering, St. Martin’s Engineering College, Dhulapally, for his support and
guidance throughout our project.
Finally, we express thanks to all those who have helped us in successfully completing this
project. Furthermore, we would like to thank our family and friends for their moral support
and encouragement.
ABSTRACT
As mobile devices evolve to be powerful and pervasive computing tools, their usage
also continues to increase rapidly. However, mobile device users frequently experience
problems when running intensive applications on the device itself, or offloading to remote
clouds, due to resource shortage and connectivity issues. Ironically, most users’ environments
are saturated with devices with significant computational resources. This paper argues that
nearby mobile devices can efficiently be utilised as a crowd-powered resource cloud to
complement the remote clouds. Node heterogeneity, unknown worker capability, and
dynamism are identified as essential challenges to be addressed when scheduling work among
nearby mobile devices. We present a work sharing model, called Honeybee, using an
adaptation of the well-known work stealing method to load balance independent jobs among
heterogeneous mobile nodes, able to accommodate nodes randomly leaving and joining the
system. The overall strategy of Honeybee is to focus on short-term goals, taking advantage of
opportunities as they arise, based on the concepts of proactive workers and an opportunistic
delegator. We evaluate our model using a prototype framework built on Android and
implement two applications. We report speedups of up to 4 with seven devices and energy
savings up to 71% with eight devices.
The Mobile Edge Cloud (MEC) is a network architecture concept that offers a cloud-
like capability at the edge of the network. Being close to the end users, MECs decrease the
latency and increase the performance of high-bandwidth applications.
CONTENTS
CERTIFICATE
DECLARATION
ACKNOWLEDGEMENT
ABSTRACT
LIST OF TABLES
CHAPTER 1: INTRODUCTION
4.2 DESIGN
4.2.8 Activity Diagram
4.3 MODULES
4.6 TESTING
4.6.2 Implementation
4.6.3 Testing
7.1 CONCLUSION
CHAPTER 8: REFERENCES
LIST OF FIGURES
FIGURE No. FIGURE TITLE PAGE NO.
LIST OF TABLES
TABLE NO. TABLE NAME PAGE NO.
LIST OF ACRONYMS AND DEFINITIONS
Factorization
CHAPTER 1
INTRODUCTION
Today’s environments are becoming embedded with mobile devices with
augmented capabilities, equipped with various sensors, wireless connectivity as well as
limited computational resources. Whether we are on the move, on a train, at an airport, in
a shopping centre or on a bus, a plethora of mobile devices surround us every day, thus
creating a resource-saturated ecosystem of machine and human intelligence. However,
beyond some traditional web-based applications, current technology does not facilitate
exploiting this resource-rich space of machine and human resources. Collaboration among
such smart mobile devices can pave the way for greater computing opportunities, not just
by creating crowd-sourced computing opportunities needing a human element, but also
by solving the resource limitation problem inherent to mobile devices. While there are
research projects in areas such as mobile grid computing, where mobile work sharing is
centrally coordinated by a remote server (HTC Power To Give), and crowd-powered systems
using mobile devices (Kamino, Parko), a gap exists for supporting collective resource
sharing without relying on a remote entity for connectivity and coordination. However, such
mobile crowds (also referred to as mobile edge clouds) are not meant to replace the remote
cloud computing model, but to complement it as given below:
having them as resource nodes, as has been adopted in research such as the Mobile Device
Cloud, Hyrax, Mobile Edge-Clouds, MClouds, MMPI, and Virtual cloud computing for
mobile devices.
There are several unique features that differentiate mobile crowd environments
from a typical grid/distributed computing cluster, such as less computation power and
limited energy on nodes, node mobility resulting in frequent disconnections, and node
heterogeneity. Hence, solutions from grid/distributed computing cannot be used as they are,
and need to be adapted to suit the requirements of mobile crowd environments.
1.1 OBJECTIVE OF THE PROJECT
CHAPTER 2
LITERATURE SURVEY
MapReduce: Simplified Data Processing on Large Clusters
The MapReduce programming model has been successfully used at Google for
many different purposes. We attribute this success to several reasons. First, the model is
easy to use, even for programmers without experience with parallel and distributed systems,
since it hides the details of parallelization, fault-tolerance, locality optimization, and load
balancing. Second, a large variety of problems are easily expressible as MapReduce
computations. For example, MapReduce is used for the generation of data for Google’s
production web search service, for sorting, for data mining, for machine learning, and many
other systems. Third, we have developed an implementation of MapReduce that scales to
large clusters of machines comprising thousands of machines. The implementation makes
efficient use of these machine resources and therefore is suitable for use on many of the
large computational problems encountered at Google. We have learned several things from
this work. First, restricting the programming model makes it easy to parallelize and
distribute computations and to make such computations fault-tolerant. Second, network
bandwidth is a scarce resource. A number of optimizations in our system are therefore
targeted at reducing the amount of data sent across the network: the locality optimization
allows us to read data from local disks, and writing a single copy of the intermediate data to
local disk saves network bandwidth. Third, redundant execution can be used to reduce the
impact of slow machines, and to handle machine failures and data loss [1].
Map Task Scheduling in MapReduce with Data Locality: Throughput and Heavy-
Traffic Optimality
it interleaves parallel and sequential computation. Past schemes, and especially their
theoretical bounds, on general parallel models are therefore unlikely to apply to
MapReduce directly. There are many recent studies on MapReduce job and task scheduling.
These studies assume that the servers are assigned in advance. In current data centres,
multiple MapReduce jobs of different importance levels run together. In this paper, we
investigate a scheduling problem for MapReduce that takes server assignment into consideration
as well. We formulate a MapReduce server-job organizer problem (MSJO) and show that it
is NP-complete. We develop a 3-approximation algorithm and a fast heuristic. We evaluate
our algorithms through both simulations and experiments on Amazon EC2 with an
implementation in Hadoop. The results confirm the advantage of our algorithms.
The Hadoop Distributed File System (HDFS) is designed to store very large data
sets reliably, and to stream those data sets at high bandwidth to user applications. In a large
cluster, thousands of servers both host directly attached storage and execute user
application tasks. By distributing storage and computation across many servers, the resource
can grow with demand while remaining economical at every size. We describe the
architecture of HDFS and report on experience using HDFS to manage 25 petabytes of
enterprise data at Yahoo.
This section presents some of the future work that the Hadoop team at Yahoo is
considering; Hadoop being an open-source project implies that new features and changes
are decided by the Hadoop development community at large. The Hadoop cluster is
effectively unavailable when its NameNode is down. Given that Hadoop is used primarily
as a batch system, restarting the NameNode has been a satisfactory recovery means.
However, we have taken steps towards automated failover. Currently a BackupNode
receives all transactions from the primary NameNode. This will allow a failover to a warm
or even a hot BackupNode if we send block reports to both the primary NameNode and
BackupNode. A few Hadoop users outside Yahoo! have experimented with manual failover.
Our plan is to use ZooKeeper, Yahoo’s distributed consensus technology, to build an
automated failover solution. Scalability of the NameNode has been a key struggle. Because
the NameNode keeps all the namespace and block locations in memory, the size of the
NameNode heap has limited the number of files and also the number of blocks addressable [4].
matrix computations and, hence, are inefficient to implement with the restrictive
programming and communication interface of such frameworks. In this paper we show that
array-based languages such as R [2] are suitable for implementing complex algorithms and
can outperform current data parallel solutions. Since R is single-threaded and does not scale
to large datasets, we have built Presto, a distributed system that extends R and addresses
many of its limitations. Presto efficiently shares sparse structured data, can leverage
multi-cores, and dynamically partitions data to mitigate load imbalance. Our results show the
promise of this approach: many important machine learning and graph algorithms can be
expressed in a single framework and are substantially faster than those in Hadoop and Spark.
Presto advocates the use of sparse matrix operations to simplify the implementation of
machine learning and graph algorithms in a cluster. Presto uses distributed arrays for
structured processing, efficiently uses multi-cores, and dynamically partitions data to reduce
load imbalance. Our experience shows that Presto is a flexible computation model that can
be used to implement a variety of complex algorithms [6].
Dealing with large genomic data on a limited computing resource has been an
inevitable challenge in life science. Bioinformatics applications have required
high-performance computation capabilities for next-generation sequencing (NGS) data and the
human genome sequencing data with single nucleotide polymorphisms (SNPs). Since 2008,
cloud computing platforms have been widely adopted to deal with the large data sets with
parallel processing tools. The MapReduce parallel programming framework is dominantly used
due to its fast and efficient performance for data processing on cloud clusters. This study
introduces various research projects aimed at reducing data analysis time and
improving usability, along with their approaches. Hadoop implementations and workflow toolkits
are the focus, addressing parallel data processing tools and easy-to-use environments.
These days, an individual research laboratory is able to generate terabytes of data (or
even more), which is no surprise given new sequencing technologies in genomic research.
High-performance computation environments keep improving on processing large-scale data
at low cost. The combination of MapReduce and cloud computing facilitates fast and
efficient parallel processing in a virtual environment for terabyte-scale data analysis in
bioinformatics, if the analysis consists of embarrassingly parallel problems. The MapReduce
framework is suitable for simple and divisible tasks such as read alignment, sequence
search and image recognition. Easy-to-use methods and user-friendly cloud platforms have
been provided to researchers so that they can easily access the cloud with their
large data sets uploaded in a secure manner. Scientific workflows may focus on
improving data transfer and handling tasks regarding these usability problems. More
challenges are expected in data storage and analysis, since the data grows at
unprecedented scales [7].
Compressed Nonnegative Matrix Factorization Is Fast and Accurate
In this work we proposed to use structured random projections for NMF and SNMF.
For NMF, we presented formulations for three popular techniques, namely, multiplicative
updates, active set method for nonnegative least squares and ADMM. For SNMF, we
presented a general technique that can be used with any algorithm. In all cases, we showed
that the resulting compressed techniques are faster than their uncompressed variants and, at
the same time, do not introduce significant errors in the final result. There are in the literature
very efficient SNMF algorithms for tall-and-skinny matrices. Interestingly, the use of
structured random projections allows computing SNMF for arbitrarily large matrices,
granting access to very efficient computations in the general setting. As a by-product, we
also propose an algorithmic solution for computing structured random projections of
extremely large matrices (i.e., matrices so large that even after compression they do not fit
in main memory). This is useful as a general tool for computing many different matrix
decompositions, such as the singular value decomposition, for example. We are currently
investigating the problem of replacing the Frobenius norm with an ℓ1 norm in our compressed
variants of NMF and SNMF. In this setting, the fast Cauchy transform is a suitable
alternative to structured random projections [9].
CHAPTER 3
SYSTEM ANALYSIS AND DESIGN
3.1 EXISTING SYSTEM
Intermediate data are shuffled according to a hash function in Hadoop, which would
lead to large network traffic because it ignores network topology and data size associated
with each key. To tackle this problem incurred by the traffic-oblivious partition scheme, we
take into account both task locations and the data size associated with each key in this paper.
By assigning keys with larger data size to reduce tasks closer to map tasks, network traffic
can be significantly reduced.
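The traffic-oblivious behaviour described above can be made concrete with a minimal sketch. It is not Hadoop's code: the class and method names are hypothetical, and it only assumes the usual hash-modulo rule, in which the reduce task is chosen from the key's hash code alone, with no regard for where the map task ran or how much data the key carries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of hash-based shuffling of intermediate keys.
class HashPartitionSketch {

    // Picks a reduce task from the key's hash code alone.
    // The mask keeps the value non-negative before taking the remainder.
    static int partition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Groups keys by the reduce task each one is sent to.
    static Map<Integer, List<String>> assign(List<String> keys, int numReduceTasks) {
        Map<Integer, List<String>> byReducer = new TreeMap<>();
        for (String key : keys) {
            int reducer = partition(key, numReduceTasks);
            byReducer.computeIfAbsent(reducer, r -> new ArrayList<>()).add(key);
        }
        return byReducer;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("apple", "banana", "cherry", "date");
        System.out.println(assign(keys, 3));
    }
}
```

Because the assignment depends only on the hash, a key with a large amount of data may land on a reduce task far from the map tasks that produced it, which is the source of the extra traffic.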
Disadvantages:
In this paper, we jointly consider data partition and aggregation for a MapReduce
job with the objective of minimizing the total network traffic. In particular, we propose
a distributed algorithm for big data applications by decomposing the original large-scale
problem into several subproblems that can be solved in parallel. Moreover, an online
algorithm is designed to deal with the data partition and aggregation in a dynamic manner.
Finally, extensive simulation results demonstrate that our proposals can significantly reduce
network traffic cost in both offline and online cases.
Advantages:
Each aggregator can reduce merged traffic from multiple map tasks. It is designed
to adjust data partition and aggregation in a dynamic manner.
It can significantly reduce network traffic cost in both offline and online cases.
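The effect of an aggregator can be illustrated with a small sketch. It assumes, purely for illustration, that each map task emits (key, count) pairs and that the aggregation function is a sum; the class and method names are hypothetical and do not come from the proposed system.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical aggregator: merges the outputs of several map tasks so that
// each key is forwarded once instead of once per map task.
class AggregatorSketch {

    // Sums the values of identical keys across all map outputs.
    static Map<String, Integer> aggregate(List<Map<String, Integer>> mapOutputs) {
        Map<String, Integer> merged = new HashMap<>();
        for (Map<String, Integer> output : mapOutputs) {
            for (Map.Entry<String, Integer> entry : output.entrySet()) {
                merged.merge(entry.getKey(), entry.getValue(), Integer::sum);
            }
        }
        return merged;
    }

    // Counts the key-value pairs that would be sent without aggregation.
    static int pairCount(List<Map<String, Integer>> mapOutputs) {
        int total = 0;
        for (Map<String, Integer> output : mapOutputs) {
            total += output.size();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> first = new HashMap<>();
        first.put("a", 2);
        first.put("b", 1);
        Map<String, Integer> second = new HashMap<>();
        second.put("a", 3);
        second.put("c", 4);
        List<Map<String, Integer>> outputs = Arrays.asList(first, second);
        System.out.println("pairs before aggregation: " + pairCount(outputs));
        System.out.println("pairs after aggregation:  " + aggregate(outputs).size());
    }
}
```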
CHAPTER 4
SYSTEM REQUIREMENTS & SPECIFICATIONS
4.1. INTRODUCTION OF TECHNOLOGIES USED
About Java:
Initially the language was called “Oak”, but it was renamed “Java” in 1995. The
primary motivation for this language was the need for a platform-independent (i.e.,
architecture-neutral) language that could be used to create software to be embedded in
various consumer electronic devices. Java is to Internet programming what C was
to systems programming.
Java has had a profound effect on the Internet because it expands the
universe of objects that can move about freely in cyberspace. In a network, two categories
of objects are transmitted between the server and the personal computer: passive
information and dynamic, active programs. Dynamic programs raise serious concerns in the
areas of security and portability. Java addresses these concerns and, by doing so, has
opened the door to an exciting new form of program called the applet.
An application is a program that runs on our computer under the operating system
of that computer. It is more or less like one created using C or C++. Java’s ability to create
applets makes it important. An applet is an application designed to be transmitted over the
Internet and executed by a Java-compatible web browser. An applet is actually a tiny Java
program, dynamically downloaded across the network, just like an image. The difference
is that it is an intelligent program, not just a media file.
Java Architecture
Java architecture provides a portable, robust, high-performing environment for
development. Java provides portability by compiling source code to byte code for the Java Virtual
Machine, which is then interpreted on each platform by the run-time environment. Java is a
dynamic system, able to load code when needed from a machine in the same room or across
the planet.
Compilation of code
When you compile the code, the Java compiler creates machine code (called byte
code) for a hypothetical machine called the Java Virtual Machine (JVM). The JVM is the
component that executes the byte code. The JVM was created to overcome the issue of portability:
the code is written and compiled once and then interpreted on all machines.
[Fig. 4.1: Compilation and interpretation in Java. Source code is compiled by the Java compiler on a PC, Macintosh or SPARC machine into platform-independent byte code, which is then run by the Java interpreter for that platform.]
During run-time the Java interpreter tricks the byte code file into thinking that it is
running on a Java Virtual Machine. Fig. 4.1 depicts compilation and interpretation in Java.
In reality this could be an Intel Pentium running Windows 95, a Sun SPARCstation running Solaris
or an Apple Macintosh running its own system, and all could receive code from any computer through
the Internet and run the applets.
Simple:
Java was designed to be easy for the professional programmer to learn and to use
effectively. If you are an experienced C++ programmer, learning Java will be easier still,
because Java inherits many of the object-oriented features of C++. Most of the confusing
concepts from C++ are either left out of Java or implemented
in a cleaner, more approachable manner. In Java there are a small number of clearly defined
ways to accomplish a given task.
Object oriented
Java was not designed to be source-code compatible with any other language. This
allowed the Java team the freedom to design with a blank slate. One outcome of this was a
clean, usable, pragmatic approach to objects. The object model in Java is simple and easy to
extend, while simple types, such as integers, are kept as high-performance non-objects.
Robust
The multi-platform environment of the web places extraordinary demands on a
program, because the program must execute reliably in a variety of systems. The ability to
create robust programs was given a high priority in the design of Java. Java is a strictly typed
language; it checks your code at compile time and at run time.
Java virtually eliminates the problems of memory management and deallocation,
which is completely automatic. In a well-written Java program, all run-time errors can and
should be managed by your program.
The user interface is that part of a program that interacts with the user of the
program. A GUI is a type of user interface that allows users to interact with electronic devices
through images rather than text commands. The Java programming language provides a class
library known as the Abstract Window Toolkit (AWT) for writing graphical
programs. The AWT contains several graphical widgets which
can be added to and positioned in the display area with a layout manager.
interface elements provided by the AWT is done using every platform's native GUI toolkit.
One significance of the AWT is that the look and feel of each platform can be preserved.
Components:
A graphical user interface is built of graphical elements called components.
A component is an object having a graphical representation that can be displayed on the
screen and that can interact with the user. Components allow the user to interact with the
program and provide the input to the program. In the AWT, all user interface components
are instances of class Component or one of its subtypes. Typical components include such
items as buttons, scrollbars, and text fields.
Types of Components:
Before proceeding, we first need to know what containers are. Fig. 4.2 gives
information about the types of components present in the AWT package and their further
division.
Containers:
Components do not stand alone, but rather are found within containers. In order to
make components visible, we need to add all components to the container. Containers
contain and control the layout of components.
In the AWT, all containers are instances of class Container or one of its subtypes.
Components must fit completely within the container that contains them. To add
components to a container, we use the add() method.
Types of containers:
Different types of containers are shown in Fig. 4.3; each of them is explained
in detail as follows.
Basic GUI Logic: The GUI application or applet is created in three steps. These are:
A new thread is started by the interpreter for user interaction when an AWT GUI is
displayed. When this thread receives an event, such as a mouse click or a key press,
it calls one of the event handlers set up for the GUI. One
important point to note here is that the event handler code is executed within this thread.
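The threading behaviour described above can be observed with a short sketch. It assumes nothing beyond the standard java.awt.EventQueue class; the class and method names are hypothetical. A task queued with invokeAndWait( ) runs on the AWT event dispatch thread, which is the same thread that calls event handlers.

```java
import java.awt.EventQueue;

// Sketch: shows that queued GUI work runs on the event dispatch thread
// rather than on the thread that started the program.
class EventThreadSketch {

    // Queues a task and reports whether it executed on the event thread.
    static boolean runsOnEventThread() {
        final boolean[] onEventThread = new boolean[1];
        try {
            EventQueue.invokeAndWait(
                    () -> onEventThread[0] = EventQueue.isDispatchThread());
        } catch (Exception e) {
            throw new IllegalStateException("could not run task on the event thread", e);
        }
        return onEventThread[0];
    }

    public static void main(String[] args) {
        System.out.println("main() on event thread: " + EventQueue.isDispatchThread());
        System.out.println("queued task on event thread: " + runsOnEventThread());
    }
}
```

Since handlers share this single thread, a long-running handler delays every other event, so lengthy work is usually moved to a separate thread.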
Creating a Frame:
Method1:
In the first method we create a frame by extending the Frame class, which is
defined in the java.awt package. The following program demonstrates the creation of a frame.
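import java.awt.*;

public class FrameDemo1 extends Frame
{
    FrameDemo1()
    {
        setTitle("Label Frame");   // text shown in the title bar
        setSize(500, 500);         // width and height in pixels
        setVisible(true);          // show the frame once it has been sized
    }

    public static void main(String[] args)
    {
        new FrameDemo1();
    }
}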
setTitle: This method sets the title of the frame. It takes a String
argument, which becomes the title.
setVisible: This method makes the frame visible. It
takes a boolean value as an argument. If we pass true the window will be visible;
otherwise it will not be visible.
setSize: This method sets the size of the window. The first
argument is the width of the frame and the second argument is its height.
Method 2:
In this method we create an instance of the Frame class to create the frame
window. The following program demonstrates Method 2.
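import java.awt.*;

// The class name FrameDemo2 is assumed by analogy with FrameDemo1.
public class FrameDemo2
{
    public static void main(String[] args)
    {
        Frame f = new Frame();
        f.setTitle("My first frame");
        f.setSize(500, 500);
        f.setVisible(true);
    }
}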
Types of Components:
1) Labels:
This is the simplest component of the Java Abstract Window Toolkit. This component
is generally used to show text in your application; a label never performs any
type of action.
Three labels, l1, l2 and l3, are created with the text “one”, “two” and “three”.
The constructor of the third label takes two arguments; the second argument is the justification
(alignment) of the label. After creating the components we add them to the container.
add(l1);
add(l2);
add(l3);
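The label declarations themselves are not reproduced in this report, so the following is only a minimal sketch of how they might look. The class name is hypothetical, a Panel stands in for the container, and the choice of Label.CENTER for the third label is an assumption.

```java
import java.awt.GraphicsEnvironment;
import java.awt.Label;
import java.awt.Panel;

// Hypothetical sketch of the three labels l1, l2 and l3.
class LabelSketch {

    static Panel buildPanel() {
        Label l1 = new Label("one");
        Label l2 = new Label("two");
        // The second argument sets the alignment (justification); CENTER is assumed.
        Label l3 = new Label("three", Label.CENTER);

        Panel panel = new Panel();   // a Panel uses FlowLayout by default
        panel.add(l1);
        panel.add(l2);
        panel.add(l3);
        return panel;
    }

    public static void main(String[] args) {
        // AWT components need a display; skip the demo when none is available.
        if (GraphicsEnvironment.isHeadless()) {
            System.out.println("No display available; skipping the label demo.");
            return;
        }
        Label third = (Label) buildPanel().getComponent(2);
        System.out.println(third.getText() + " centred: "
                + (third.getAlignment() == Label.CENTER));
    }
}
```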
We can set or change the text in a label by using the setText( ) method. We can
obtain the current text by calling getText( ). These methods are shown here:
void setText(String str)
String getText( )
2) Buttons:
This is the component of Java Abstract Window Toolkit and is used to trigger
actions and other events required for your application.
The syntax of defining the button is as follows:
We can change the button's label or get the label's text by using the
Button.setLabel(String) and Button.getLabel() methods.
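As the button declaration is missing from this report, the sketch below shows one plausible form. The class name, the label texts and the click( ) helper are hypothetical; the helper only simulates a press by calling the registered listeners directly.

```java
import java.awt.Button;
import java.awt.GraphicsEnvironment;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;

// Hypothetical sketch of a button that reacts to being pressed.
class ButtonSketch {

    // Creates a button whose label changes when its action fires.
    static Button createButton() {
        Button button = new Button("Submit");
        button.addActionListener(e -> button.setLabel("Submitted"));
        return button;
    }

    // Simulates a press by delivering an ActionEvent to each listener.
    static void click(Button button) {
        ActionEvent event = new ActionEvent(
                button, ActionEvent.ACTION_PERFORMED, button.getActionCommand());
        for (ActionListener listener : button.getActionListeners()) {
            listener.actionPerformed(event);
        }
    }

    public static void main(String[] args) {
        if (GraphicsEnvironment.isHeadless()) {
            System.out.println("No display available; skipping the button demo.");
            return;
        }
        Button button = createButton();
        System.out.println("before: " + button.getLabel());
        click(button);
        System.out.println("after:  " + button.getLabel());
    }
}
```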
3) Checkbox:
To retrieve the current state of a check box, call getState( ). To set its state, call
setState( ). You can obtain the current label associated with a check box by calling
getLabel(). To set the label, call setLabel( ). These methods are as follows:
boolean getState( )
void setState(boolean on)
String getLabel( )
void setLabel(String str)
Here, if on is true, the box is checked. If it is false, the box is cleared. The string
passed in str becomes the new label associated with the invoking check box.
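The four methods above can be exercised as in the following sketch. The class name and the label texts are hypothetical and chosen only for illustration.

```java
import java.awt.Checkbox;
import java.awt.GraphicsEnvironment;

// Hypothetical sketch of reading and changing a check box.
class CheckboxSketch {

    static Checkbox createCheckbox() {
        Checkbox box = new Checkbox("Windows 98/XP"); // unchecked by default
        box.setState(true);                           // check the box
        box.setLabel("Windows NT/2000");              // replace the label text
        return box;
    }

    public static void main(String[] args) {
        if (GraphicsEnvironment.isHeadless()) {
            System.out.println("No display available; skipping the check box demo.");
            return;
        }
        Checkbox box = createCheckbox();
        System.out.println(box.getLabel() + " checked: " + box.getState());
    }
}
```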
4) Radio Button:
This is a special case of the Checkbox component of the Java AWT package. It is
used as a group of checkboxes that share the same group. Only one Checkbox from a
CheckboxGroup can be selected at a time. Syntax for creating radio buttons is as follows:
CheckboxGroup cbg = new CheckboxGroup();
Checkbox Win98 = new Checkbox("Windows 98/XP", cbg, true);
For radio buttons we use the Checkbox class. The only difference between
checkboxes and radio buttons is that for checkboxes we specify null for the CheckboxGroup,
whereas for radio buttons we specify the CheckboxGroup object in the second
parameter.
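The mutual exclusion within a group can be seen in a short sketch. The class name and the second option are hypothetical; only the first option comes from the text above.

```java
import java.awt.Checkbox;
import java.awt.CheckboxGroup;
import java.awt.GraphicsEnvironment;

// Hypothetical sketch of two radio buttons sharing one CheckboxGroup.
class RadioButtonSketch {

    static Checkbox[] createRadioButtons() {
        CheckboxGroup cbg = new CheckboxGroup();
        Checkbox win98 = new Checkbox("Windows 98/XP", cbg, true);  // initially selected
        Checkbox solaris = new Checkbox("Solaris", cbg, false);
        return new Checkbox[] { win98, solaris };
    }

    public static void main(String[] args) {
        if (GraphicsEnvironment.isHeadless()) {
            System.out.println("No display available; skipping the radio button demo.");
            return;
        }
        Checkbox[] boxes = createRadioButtons();
        CheckboxGroup group = boxes[0].getCheckboxGroup();
        // Selecting the second button clears the first one automatically.
        group.setSelectedCheckbox(boxes[1]);
        System.out.println("selected: " + group.getSelectedCheckbox().getLabel());
        System.out.println("first still checked: " + boxes[0].getState());
    }
}
```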
5) Choice:
The Choice class is used to create a pop-up list of items from which the user may
choose. Thus, a Choice control is a form of menu. Syntax for creating choice is as follows:
Choice os = new Choice();
os.add("Windows 98/XP");
os.add("Windows NT/2000");
os.add("Solaris");
os.add("MacOS");
We create a choice with the help of the Choice class. The pop-up list is created with
the creation of the object, but it does not have any items. To add items we use the add()
method defined in the Choice class.
To determine which item is currently selected, you may call either getSelectedItem( ) or
getSelectedIndex( ). These methods are shown here:
String getSelectedItem( )
int getSelectedIndex( )
The getSelectedItem( ) method returns a string containing the name of the item.
getSelectedIndex( ) returns the index of the item. The first item is at index 0. By default,
the first item added to the list is selected.
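The selection methods can be tried out with the sketch below, which reuses the four items from the text. The class name is hypothetical, and select( ) is used here only to stand in for a choice made by the user.

```java
import java.awt.Choice;
import java.awt.GraphicsEnvironment;

// Hypothetical sketch of querying the selected item of a Choice.
class ChoiceSketch {

    static Choice createChoice() {
        Choice os = new Choice();
        os.add("Windows 98/XP");
        os.add("Windows NT/2000");
        os.add("Solaris");
        os.add("MacOS");
        return os;
    }

    public static void main(String[] args) {
        if (GraphicsEnvironment.isHeadless()) {
            System.out.println("No display available; skipping the choice demo.");
            return;
        }
        Choice os = createChoice();
        System.out.println("default: " + os.getSelectedItem());
        os.select("Solaris");   // stands in for the user picking an item
        System.out.println("now: " + os.getSelectedItem()
                + " at index " + os.getSelectedIndex());
    }
}
```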
6) List:
The List class is similar to Choice; the only difference is that in a
Choice the user can select only one item, whereas in a List the user can select more than one item.
Syntax for creating list is as follows:
The first argument in the List constructor specifies the number of rows visible in the
list. The second argument specifies whether multiple selections are allowed or not.
os.add("Windows 98/XP");
os.add("Windows NT/2000");
os.add("Solaris");
os.add("MacOS");
In a list we can retrieve the items selected by the user. With multiple
selection the user may select several values; to retrieve all of them we use the method
getSelectedItems(), whose return type is a String array. To retrieve a single value
we can use getSelectedItem(), as with Choice.
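A multiple-selection list might be created and queried as in the sketch below. The class name and the constructor arguments (four visible rows, multiple selection enabled) are assumptions, and select( ) stands in for selections made by the user.

```java
import java.awt.GraphicsEnvironment;
import java.awt.List;

// Hypothetical sketch of a multiple-selection AWT List.
class ListSketch {

    static List createList() {
        List os = new List(4, true);   // 4 visible rows, multiple selection allowed
        os.add("Windows 98/XP");
        os.add("Windows NT/2000");
        os.add("Solaris");
        os.add("MacOS");
        return os;
    }

    public static void main(String[] args) {
        if (GraphicsEnvironment.isHeadless()) {
            System.out.println("No display available; skipping the list demo.");
            return;
        }
        List os = createList();
        os.select(0);   // stands in for the user selecting two rows
        os.select(2);
        for (String item : os.getSelectedItems()) {
            System.out.println("selected: " + item);
        }
    }
}
```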
7) TextField:
Text fields allow the user to enter strings and to edit the text using the arrow keys,
cut and paste keys. TextField is a subclass of TextComponent. Syntax for creating a text field is as
follows:
In the first text field we specify the size of the text field, and the second text
field is created with a default value. TextField (and its superclass TextComponent)
provides several methods that allow you to utilize a text field. To obtain the string currently
contained in the text field, call getText( ). To set the text, call setText( ). These methods are
as follows:
String getText( )
void setText(String str)
We can control whether the contents of a text field may be modified by the user by
calling setEditable( ). You can determine editability by calling isEditable( ). These methods
are shown here:
boolean isEditable( )
void setEditable(boolean canEdit)
isEditable( ) returns true if the text may be changed and false if not. In setEditable( ), if
canEdit is true, the text may be changed. If it is false, the text cannot be altered.
There may be times when we will want the user to enter text that is not displayed,
such as a password. We can disable the echoing of the characters as they are typed by calling
setEchoChar( ).
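The text field methods discussed above can be combined as in the following sketch. The class name, the column count and the sample strings are hypothetical.

```java
import java.awt.GraphicsEnvironment;
import java.awt.TextField;

// Hypothetical sketch of a read-only name field and a masked password field.
class TextFieldSketch {

    static TextField[] createFields() {
        TextField name = new TextField(20);            // 20 columns wide, initially empty
        TextField password = new TextField("secret");  // created with initial text
        name.setText("guest");
        name.setEditable(false);                       // the user can no longer change it
        password.setEchoChar('*');                     // typed characters appear as '*'
        return new TextField[] { name, password };
    }

    public static void main(String[] args) {
        if (GraphicsEnvironment.isHeadless()) {
            System.out.println("No display available; skipping the text field demo.");
            return;
        }
        TextField[] fields = createFields();
        System.out.println("name: " + fields[0].getText()
                + ", editable: " + fields[0].isEditable());
        System.out.println("password echo character: " + fields[1].getEchoChar());
    }
}
```

Setting an echo character only hides the text on screen; getText( ) still returns what was typed.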
8) TextArea:
Above code will create one text area with 20 rows and 30 columns. TextArea is a
subclass of TextComponent. Therefore, it supports the getText( ), setText( ),
getSelectedText( ), select( ), isEditable( ), and setEditable( ) methods described in the
preceding section.
The append( ) method appends the string specified by str to the end of the current
text. insert( ) inserts the string passed in str at the specified index. To replace text, call
replaceRange( ). It replaces the characters from startIndex to endIndex–1, with the
replacement text passed in str.
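The editing methods can be followed step by step in the sketch below. The class name and the sample strings are hypothetical; the dimensions match the 20 rows and 30 columns mentioned above.

```java
import java.awt.GraphicsEnvironment;
import java.awt.TextArea;

// Hypothetical sketch of append(), insert() and replaceRange().
class TextAreaSketch {

    static TextArea createTextArea() {
        TextArea area = new TextArea(20, 30);   // 20 rows, 30 columns
        area.setText("Hello");
        area.append(" World");                  // text is now "Hello World"
        area.insert("Big ", 6);                 // text is now "Hello Big World"
        area.replaceRange("Wide", 6, 9);        // replaces indices 6 to 8 ("Big")
        return area;
    }

    public static void main(String[] args) {
        if (GraphicsEnvironment.isHeadless()) {
            System.out.println("No display available; skipping the text area demo.");
            return;
        }
        System.out.println(createTextArea().getText());
    }
}
```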
Layout Managers:
manager is an instance of any class that implements the LayoutManager interface. The
layout manager is set by the setLayout( ) method. If no call to setLayout( ) is made, then
the default layout manager is used. Whenever a container is resized (or sized for the first
time), the layout manager is used to position each of the components within it. The
setLayout( ) method has the following general form:
void setLayout(LayoutManager layoutObj)
Here, layoutObj is a reference to the desired layout manager. If you wish to disable
the layout manager and position components manually, pass null for layoutObj. If you do
this, you will need to determine the shape and position of each component manually, using
the setBounds( ) method defined by Component:
void setBounds(int x, int y, int width, int height)
The first two arguments are the x and y coordinates of the component. The third argument is the width and
the fourth argument is the height of the component.
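Manual positioning as described above might look like the following sketch. The class name is hypothetical, the Canvas is only a placeholder for any component, and the coordinates are arbitrary.

```java
import java.awt.Canvas;
import java.awt.Component;
import java.awt.Panel;

// Hypothetical sketch of positioning a component without a layout manager.
class ManualLayoutSketch {

    static Panel buildPanel() {
        Panel panel = new Panel();
        panel.setLayout(null);              // disable the layout manager

        Component box = new Canvas();       // placeholder for any component
        box.setBounds(20, 30, 100, 40);     // x, y, width, height
        panel.add(box);
        return panel;
    }

    public static void main(String[] args) {
        Component box = buildPanel().getComponent(0);
        System.out.println(box.getBounds());
    }
}
```

With the layout manager disabled, the bounds are never recalculated, so the component stays where it was placed even when the container is resized.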
FlowLayout:
FlowLayout is the default layout manager for panels and applets; a Frame uses
BorderLayout unless setLayout( ) is called. This is the layout manager that the
preceding examples have used. FlowLayout implements a simple layout style, which is
similar to how words flow in a text editor. Components are laid out from the upper-left
corner, left to right and top to bottom. When no more components fit on a line, the next one
appears on the next line. A small space is left between each component, above and below,
as well as left and right. Here are the constructors for FlowLayout:
FlowLayout( )
FlowLayout(int how)
FlowLayout(int how, int horz, int vert)
The first form creates the default layout, which centers components and leaves five
pixels of space between each component. The second form lets you specify how each line is
aligned. Valid values for how are as follows:
FlowLayout.LEFT
FlowLayout.CENTER
FlowLayout.RIGHT
These values specify left, center, and right alignment, respectively. The third form
allows you to specify the horizontal and vertical space left between components in horz and
vert, respectively.
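The three constructor forms can be compared with the sketch below. The class and method names are hypothetical, and the gap values of 10 and 20 pixels are arbitrary.

```java
import java.awt.FlowLayout;

// Hypothetical sketch of the three FlowLayout constructor forms.
class FlowLayoutSketch {

    // Centered rows with the default five-pixel gaps.
    static FlowLayout defaultForm() {
        return new FlowLayout();
    }

    // Left-aligned rows with the default gaps.
    static FlowLayout alignedForm() {
        return new FlowLayout(FlowLayout.LEFT);
    }

    // Right-aligned rows with 10 pixels between columns and 20 between rows.
    static FlowLayout spacedForm() {
        return new FlowLayout(FlowLayout.RIGHT, 10, 20);
    }

    static String describe(FlowLayout layout) {
        return "alignment=" + layout.getAlignment()
                + " hgap=" + layout.getHgap()
                + " vgap=" + layout.getVgap();
    }

    public static void main(String[] args) {
        System.out.println(describe(defaultForm()));
        System.out.println(describe(alignedForm()));
        System.out.println(describe(spacedForm()));
    }
}
```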
BorderLayout:
The BorderLayout class implements a common layout style for top-level windows.
It has four narrow, fixed-width components at the edges and one large area in the center.
The four sides are referred to as north, south, east, and west. The middle area is called the
center. Here are the constructors defined by BorderLayout:
BorderLayout( )
BorderLayout(int horz, int vert)
The first form creates a default border layout. The second allows you to specify the
horizontal and vertical space left between components in horz and vert, respectively.
BorderLayout defines the following constants that specify the regions:
BorderLayout.CENTER BorderLayout.SOUTH
BorderLayout.EAST BorderLayout.WEST
BorderLayout.NORTH
When adding components, you will use these constants with the following form of add( ),
which is defined by Container:
void add(Component compObj, Object region)
Here, compObj is the component to be added, and region specifies where the component
will be added.
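Adding components to named regions might look like the sketch below. The class name is hypothetical, the Canvas objects are placeholders standing in for real components such as buttons, and the five-pixel gaps are arbitrary.

```java
import java.awt.BorderLayout;
import java.awt.Canvas;
import java.awt.Panel;

// Hypothetical sketch of placing components in BorderLayout regions.
class BorderLayoutSketch {

    static Panel buildPanel() {
        Panel panel = new Panel(new BorderLayout(5, 5)); // 5-pixel gaps
        panel.add(new Canvas(), BorderLayout.NORTH);     // placeholder header
        panel.add(new Canvas(), BorderLayout.SOUTH);     // placeholder footer
        panel.add(new Canvas(), BorderLayout.CENTER);    // placeholder main area
        return panel;
    }

    public static void main(String[] args) {
        Panel panel = buildPanel();
        BorderLayout layout = (BorderLayout) panel.getLayout();
        System.out.println("north filled: "
                + (layout.getLayoutComponent(BorderLayout.NORTH) != null));
        System.out.println("east filled:  "
                + (layout.getLayoutComponent(BorderLayout.EAST) != null));
    }
}
```

Regions that receive no component, such as east and west here, take up no space, and the center expands to fill what remains.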
GridLayout:
The constructors supported by GridLayout are shown here:
GridLayout( )
The first form creates a single-column grid layout. The second form creates a grid
layout with the specified number of rows and columns. The third form allows you to specify
the horizontal and vertical space left between components in horz and vert, respectively.
Either numRows or numColumns can be zero. Specifying numRows as zero allows for
unlimited-length columns. Specifying numColumns as zero allows for unlimited-length
rows.
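The effect of passing zero for numRows can be sketched as follows. The class name GridLayoutSketch, the helper buildKeypad and the button labels are hypothetical; the sketch assumes a three-column grid that grows by as many rows as needed.

```java
import java.awt.GraphicsEnvironment;
import java.awt.GridLayout;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JPanel;

public class GridLayoutSketch {

    // Zero rows and three columns: the grid adds rows as components are added.
    // The last two arguments are the horizontal and vertical gaps in pixels.
    static JPanel buildKeypad() {
        JPanel panel = new JPanel(new GridLayout(0, 3, 4, 4));
        for (int i = 1; i <= 9; i++) {
            panel.add(new JButton(String.valueOf(i))); // placeholder labels
        }
        return panel;
    }

    public static void main(String[] args) {
        JPanel panel = buildKeypad();
        GridLayout layout = (GridLayout) panel.getLayout();
        System.out.println("rows=" + layout.getRows() + ", columns=" + layout.getColumns());

        // Open a window only when a display is available.
        if (!GraphicsEnvironment.isHeadless()) {
            JFrame frame = new JFrame("GridLayout sketch");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.setContentPane(panel);
            frame.pack();
            frame.setVisible(true);
        }
    }
}
```

With nine buttons and three columns, the layout manager arranges the components in three rows of equal-sized cells.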
Swing:
Swing is the main toolkit for developing Java programs with a graphical user interface
(GUI). It provides many components for building a GUI and for adding interactivity to
Java applications. The Swing toolkit includes the following components, among others:
list controls
buttons
labels
tree controls
table controls
Swing provides a counterpart for the AWT components and contains far more
components than the simple AWT component toolkit. It differs from other toolkits in that it
supports integrated internationalization, a highly customizable text package, rich undo
support and so on. You can also create your own look and feel in addition to the ones that
Swing supplies. A customized look and feel can be created using Synth, which is designed
for this purpose. Swing also provides the basic user interface facilities, such as
customizable painting, event handling and drag and drop.
Swing is part of the Java Foundation Classes (JFC), which support many more
features important to a GUI program. These features include the ability to create a program
that can work in different languages and the ability to add rich graphics functionality.
The Swing toolkit contains several components, such as check boxes, buttons, tables
and text components. Even very simple components provide sophisticated functionality.
For instance, text fields provide formatted text input or password field behaviour.
Furthermore, file browsers and dialogs can be used as needed and can even be customized.
Difference between Swings and AWT:
AWT stands for Abstract Window Toolkit. It is a platform-dependent API for
developing GUI (Graphical User Interface) or window-based applications in Java. It was
developed by Sun Microsystems in 1995, and its components are heavyweight. Swing is a
lightweight Java GUI toolkit that is used to create various applications. Swing has
platform-independent components. Table 4.1 compares Swing and AWT.
Swings AWT
Java Swing Class Hierarchy:
Java Swing is a part of the Java Foundation Classes (JFC) and is used to create
window-based applications. Fig. 4.4 depicts its class hierarchy. Swing is built on top of the
AWT (Abstract Window Toolkit) API and is written entirely in Java. Unlike AWT, Java
Swing provides platform-independent and lightweight components.
Swing Components:
All the components supported in AWT are also supported in Swing, with a slight
change in their class names. Table 4.2 lists AWT components alongside their Swing
counterparts; each pair has the same use cases. A dash means that no AWT counterpart is listed.
Label JLabel
TextField JTextField
TextArea JTextArea
Choice JComboBox
Checkbox JCheckBox
List JList
Button JButton
- JRadioButton
- JPasswordField
- JTable
- JTree
- JTabbedPane
MenuBar JMenuBar
Menu JMenu
MenuItem JMenuItem
- JFileChooser
- JOptionPane
JTabbedPane class:
The JTabbedPane container allows many panels to occupy the same area of the
interface, and the user may select which to show by clicking on a tab.
Constructor
JTabbedPane( )
Add tabs to a tabbed pane by calling addTab and passing it a String title and the
component to display when that tab is selected. That component is typically an instance of
a subclass of JPanel.
addTab("title", instance);
Example program:
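import javax.swing.*;
import java.awt.*;
public class TabbedPaneDemo extends JFrame
{
TabbedPaneDemo(){
setLayout(new FlowLayout(FlowLayout.LEFT));
setTitle("Tabbed Demo");
setSize(500,500);
setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
JTabbedPane pane = new JTabbedPane();
pane.addTab("Countries",new Count());
pane.addTab("Cities",new Cit());
add(pane);
setVisible(true); // show the frame after all components are added
}
public static void main(String a[]){
new TabbedPaneDemo();
}
}
// Panel shown on the "Countries" tab; the button labels are placeholders
class Count extends JPanel{
Count(){
JButton b1 = new JButton("Country 1");
JButton b2 = new JButton("Country 2");
JButton b3 = new JButton("Country 3");
add(b1);
add(b2);
add(b3);
}
}
// Panel shown on the "Cities" tab; the check box labels are placeholders
class Cit extends JPanel{
Cit(){
JCheckBox cb1 = new JCheckBox("City 1");
JCheckBox cb2 = new JCheckBox("City 2");
JCheckBox cb3 = new JCheckBox("City 3");
add(cb1);
add(cb2);
add(cb3);
}
}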
A top-level window can have a menu bar associated with it. A menu bar displays a
list of top-level menu choices. Each choice is associated with a drop-down menu. This
concept is implemented in Java by the following classes: JMenuBar, JMenu, and
JMenuItem. In general, a menu bar contains one or more JMenu objects. Each JMenu object
contains a list of JMenuItem objects. Each JMenuItem object represents something that can
be selected by the user. To create a menu bar, first create an instance of JMenuBar. This
class only defines the default constructor. Next, create instances of JMenu that will define
the selections displayed on the bar. The following constructors of JMenu and JMenuItem are used:
JMenu( )
JMenu(String optionName)
JMenuItem( )
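The steps above can be sketched as follows. The class name MenuBarSketch, the helper buildMenuBar and the menu texts ("File", "Open", "Exit") are assumptions made for the illustration; the sketch uses the String forms of the JMenu and JMenuItem constructors.

```java
import java.awt.GraphicsEnvironment;
import javax.swing.JFrame;
import javax.swing.JMenu;
import javax.swing.JMenuBar;
import javax.swing.JMenuItem;

public class MenuBarSketch {

    // Builds a menu bar with one drop-down menu that holds two items.
    static JMenuBar buildMenuBar() {
        JMenuBar bar = new JMenuBar();        // default constructor
        JMenu file = new JMenu("File");       // top-level choice shown on the bar
        file.add(new JMenuItem("Open"));      // selectable entries of the menu
        file.add(new JMenuItem("Exit"));
        bar.add(file);
        return bar;
    }

    public static void main(String[] args) {
        JMenuBar bar = buildMenuBar();
        System.out.println("Menus on the bar: " + bar.getMenuCount());

        // Open a window only when a display is available.
        if (!GraphicsEnvironment.isHeadless()) {
            JFrame frame = new JFrame("Menu sketch");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.setJMenuBar(bar);           // associate the bar with the window
            frame.setSize(300, 200);
            frame.setVisible(true);
        }
    }
}
```

In a real application each JMenuItem would also receive an ActionListener so that selecting it triggers some behaviour.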
4.2 DESIGN
4.2.1 UML Diagrams
The Unified Modeling Language allows the software engineer to express an analysis
model using a modeling notation that is governed by a set of syntactic, semantic and
pragmatic rules.
A UML system is represented using five different views that describe the system
from distinctly different perspectives. Each view is defined by a set of diagrams, as
follows.
The class diagram is the main building block of object-oriented modeling. It is used
both for general conceptual modeling of the structure of the application and for detailed
modeling that translates the models into programming code. Fig. 4.5 (class diagram) gives
information about the classes involved. Class diagrams can also be used for data modeling.
The classes in a class diagram represent both the main objects and interactions in the
application and the classes to be programmed.
In the diagram, classes are represented with boxes which contain three parts: the class
name, its attributes, and its operations (methods).
[Fig. 4.5: Class diagram. Classes shown: Main, DBCon, Chart, Reducer1, Reducer2,
DefineReducer, WordFrequency, WordFrequencyMapper and DocumentReducer, each with its
attributes and operations.]
A use case diagram doesn't go into a lot of detail; for example, don't expect it to model the
order in which steps are performed. Instead, a proper use case diagram depicts a high-level
overview of the relationship between use cases, actors, and systems. Fig. 4.6 shows the use
case diagram for our project.
[Fig. 4.6: Use case diagram, with User as the actor.]
The terms event diagram and event scenario also refer to a sequence diagram. Sequence
diagrams describe how, and in what order, the objects in a system function. These diagrams
are widely used by business professionals and software developers to document and
understand requirements for new and existing systems.
[Sequence diagram. Recoverable steps: define the reducer location; run the reducer
application (details are updated at the reducer node); upload the document (Mapper); start
the aggregation; display the cost graph.]
4.2.7 DEPLOYMENT DIAGRAM
The nodes appear as boxes, and the artifacts allocated to each node appear as
rectangles within the boxes. Nodes may have sub nodes, which appear as nested boxes. A
single node in a deployment diagram may conceptually represent multiple physical nodes,
such as a cluster of database servers.
[Deployment diagram. Nodes: define reducer location; run reducer applications; upload the
document (Mapper); start the aggregation; produce network cost graph.]
So, the control flow is drawn from one operation to another. This flow can be
sequential, branched or concurrent.
[Activity diagram. Recoverable labels: define reducer location details; Yes/No decision.]
Data flow diagrams illustrate how data is processed by a system in terms of inputs
and outputs.
Data flow diagrams can be used to provide a clear representation of any business
function. The technique starts with an overall picture of the business and continues by
analyzing each of the functional areas of interest. This analysis can be carried out in
precisely the level of detail required. The technique exploits a method called top-down
expansion to conduct the analysis in a targeted way.
As the name suggests, Data Flow Diagram (DFD) is an illustration that explicates
the passage of information in a process. A DFD can be easily drawn using simple symbols.
Additionally, complicated processes can be easily automated by creating DFDs using easy-
to-use, free downloadable diagramming tools. A DFD is a model for constructing and
analyzing information processes.
4.3 MODULES
4.3.1 Mapper Module:
In the reducer module, each reduce task fetches its own share of data partitions
from all map tasks to generate the final result. The Reducer takes the grouped key-value
pairs as input and runs a reduce function on each group. Here, the data can be
aggregated, filtered, and combined in a number of ways, which may require a wide range of
processing. Once execution is over, the reducer passes zero or more key-value pairs to the
final step.
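The reduce step described above can be illustrated with a small, self-contained sketch in plain Java. It does not use the Hadoop API and is not the project's DocumentReducer; the class name ReduceSketch, the method reduce and the sample words are assumptions made for the illustration. It shows one possible aggregation: summing the values grouped under each key.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReduceSketch {

    // Runs a simple reduce function on each group: the values collected
    // under a key are summed, giving one output pair per key.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> result = new TreeMap<>(); // sorted by key
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int value : entry.getValue()) {
                sum += value;
            }
            result.put(entry.getKey(), sum);
        }
        return result;
    }

    public static void main(String[] args) {
        // Grouped map output: each word maps to the list of counts emitted for it.
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        grouped.put("data", Arrays.asList(1, 1, 1));
        grouped.put("cloud", Arrays.asList(1, 1));

        System.out.println(reduce(grouped)); // prints {cloud=2, data=3}
    }
}
```

Aggregating in this way before sending results over the network is what reduces the volume of data that has to be transferred between nodes.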
4.4 TEST CASES
Table 4.3 lists the test cases in our project, with the name, description, steps,
status and priority of each.
Test Case Id | Test Case Name | Test Case Desc | Test Step | Expected | Actual | Test Case Status | Test Priority
-------------------------------------------------------------------------------------------------------------
1 | Reducer location details | Defines each reducer's location by providing latitude and longitude values | If we do not provide latitude and longitude values | Location details will not be saved | Reducer details will be saved successfully | High | High
2 | Run reducer | Starts the reducer nodes, and all details will be updated at the reducer node | If we do not run the application | The reducer does not know the updated details | Reducer node will be started | High | High
3 | Upload the input data | Data will be uploaded from the shuffle phase | If we cannot upload the data | We cannot reduce the network traffic | Input data loaded successfully | High | High
4 | Aggregation using MapReduce | Aggregates all the partitioned data | If we do not start the aggregation | We cannot reduce the network traffic | After processing the aggregated data, it displays the count result | High | High
5 | Network traffic cost graph | Displays the graph between processing time and technique | If we cannot do any aggregation | Nothing will be displayed | Graph will be displayed using aggregated/non-aggregated data | High | High
4.5 SYSTEM REQUIREMENTS
4.5.1 Hardware Requirements:
4.6 TESTING
Implementation is one of the most important tasks in the project. It is the phase in
which one has to be cautious, because all the efforts undertaken during the project come
together in it. Implementation is the most crucial stage in achieving a successful system and
in giving the users confidence that the new system is workable and effective. Each program
is tested individually at the time of development using sample data, and it is verified that
the programs link together in the way specified in the program specification. The computer
system and its environment are tested to the satisfaction of the user.
4.6.2 Implementation
4.6.3 Testing
Testing is the process in which test data is prepared and used to test the modules
individually, and later to validate the input fields. System testing then takes place, which
makes sure that all components of the system function properly as a unit. The test data
should be chosen so that it passes through all possible conditions. Testing is the stage of
implementation that aims to ensure that the system works accurately and efficiently before
actual operation commences. The following is a description of the testing strategies that
were carried out during the testing period.
Testing has become an integral part of any system or project, especially in the field
of information technology. Testing is a way of judging whether one is ready to move
further and whether the system can withstand the rigors of a particular situation. Its
importance cannot be underplayed, which is why testing before delivery is so critical. Once
the software is developed, and before it is given to the user, it must be tested to check
whether it serves the purpose for which it was developed. This testing involves various
types through which one can ensure that the software is reliable. The program was tested
logically, and the pattern of execution of the program was repeated for a set of data. Thus
the code was exhaustively checked for all possible correct data, and the outcomes were also
checked.
To locate errors, each module is tested individually. This enables us to detect an
error and correct it without affecting any other modules. Whenever a program does not
satisfy the required function, it must be corrected to get the required result. Thus all the
modules are individually tested from the bottom up, starting with the smallest and
lowest-level modules and proceeding one level at a time. For example, the job classification
module is tested separately. This module is tested with different jobs and their approximate
execution times, and the result of the test is compared with results that are prepared
manually. The comparison shows that the proposed system works more efficiently than the
existing system. In this system the resource classification and job scheduling modules are
also tested separately, and their corresponding results are obtained, which reduces the
process waiting time.
After module testing, integration testing is applied. Errors may occur when the
modules are linked, and this testing is used to find and correct them. In this system all
modules are connected and tested together. The test results are correct. Thus, the mapping
of jobs with resources is done correctly by the system.
When the user finds no major problems with its accuracy, the system passes
through a final acceptance test. This test confirms that the system meets the original goals,
objectives and requirements established during analysis, which eliminates wastage of time
and money. Acceptance tests rest on the shoulders of users and management. Once they are
passed, the system is finally acceptable and ready for operation.
CHAPTER 5
SOURCE CODE
Main.java
package mapreduce;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JButton;
import javax.swing.JPanel;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import javax.swing.UIManager;
import java.awt.BorderLayout;
import java.awt.Dimension;
import java.awt.Color;
import java.awt.Font;
import javax.swing.JOptionPane;
import javax.swing.JFileChooser;
import java.io.File;
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.Socket;
import java.sql.Statement;
import java.sql.ResultSet;
import org.jfree.ui.RefineryUtilities;
import java.io.FileWriter;
public class Main extends JFrame
{
GradientPanel p1;
JPanel p2;
JLabel title;
JButton b1,b2,b3,b4,b5;
Font f1;
JFileChooser chooser;
File file;
static StringBuilder sb = new StringBuilder();
ArrayList<Location> location = new ArrayList<Location>();
public void remove(){
File remove = new File("output");
File list[] = remove.listFiles();
if(list != null){
for(int i=0;i<list.length;i++){
if(list[i] != null)
System.out.println(list[i].delete()+" delete =======");
}
}
if(remove != null)
remove.delete();
}
public Main(){
super("Map Reduce System");
p1 = new GradientPanel(600,200);
p1.setLayout(null);
f1 = new Font("Courier New",Font.BOLD,14);
p2 = new JPanel();
p2.setBackground(new Color(204, 110, 155));
title = new JLabel("<HTML><BODY><CENTER>ON TRAFFIC-AWARE "
+ "PARTITION AND AGGREGATION IN MAPREDUCE<BR/>FOR BIG DATA "
+ "APPLICATIONS</CENTER></BODY></HTML>");
title.setForeground(Color.white);
title.setFont(new Font("Times New ROMAN",Font.PLAIN,17));
p2.add(title);
chooser = new JFileChooser(new File("."));
chooser.setFileSelectionMode(JFileChooser.DIRECTORIES_ONLY);
JPanel pan3 = new JPanel();
b1 = new JButton("Define Reducer Location");
b1.setFont(f1);
b1.setBounds(220,50,250,30);
p1.add(b1);
b1.addActionListener(new ActionListener(){
public void actionPerformed(ActionEvent ae){
DefineReducer dr = new DefineReducer();
dr.setVisible(true);
dr.setSize(600,360);
dr.setLocationRelativeTo(null);
dr.setResizable(false);
}
});
b2 = new JButton("Upload Documents");
b2.setFont(f1);
b2.setBounds(220,100,250,30);
p1.add(b2);
b2.addActionListener(new ActionListener(){
public void actionPerformed(ActionEvent ae){
int option = chooser.showOpenDialog(Main.this);
if(option == JFileChooser.APPROVE_OPTION){
sb.delete(0,sb.length());
file = chooser.getSelectedFile();
JOptionPane.showMessageDialog(Main.this,"Input documents loaded");
}
}
});
b3 = new JButton("Start MapReduce Aggregation");
b3.setFont(f1);
b3.setBounds(220,150,250,30);
p1.add(b3);
b3.addActionListener(new ActionListener(){
public void actionPerformed(ActionEvent ae){
try{
remove();
long start = System.currentTimeMillis();
String a[] = {"test"};
WordFrequency.setInputPath(file.getPath());
WordFrequency.start(a);
Location loc = location.get(0); // first entry after sorting in readReducerLoc()
Socket soc = null;
if(loc.getReducer().equals("Reducer1"))
soc = new Socket("localhost",2222);
else if(loc.getReducer().equals("Reducer2"))
soc = new Socket("localhost",3333);
else
throw new IllegalStateException("Unknown reducer: "+loc.getReducer());
ObjectOutputStream out = new ObjectOutputStream(soc.getOutputStream());
Object req[] = {"input",sb.toString()};
out.writeObject(req);
out.flush();
ObjectInputStream in = new
ObjectInputStream(soc.getInputStream());
Object res[] = (Object[])in.readObject();
String msg = (String)res[0];
long end = System.currentTimeMillis();
ViewResult vr = new ViewResult();
vr.setVisible(true);
vr.setSize(600,400);
if(msg.equals("output")){
String output = (String)res[1];
String arr[] = output.split("\n");
for(int i=0;i<arr.length;i++){
Object ar[] = arr[i].split("\t");
vr.dtm.addRow(ar);
}
}
FileWriter fw = new FileWriter("D:/agg.txt");
fw.write(""+(end-start));
fw.close();
JOptionPane.showMessageDialog(Main.this,"Processing Time "+(end-start));
}catch(Exception e){
e.printStackTrace();
}
}
});
b4 = new JButton("Network Traffic Cost Graph");
b4.setFont(f1);
b4.setBounds(220,200,250,30);
p1.add(b4);
b4.addActionListener(new ActionListener(){
public void actionPerformed(ActionEvent ae){
try{
BufferedReader br = new BufferedReader(new FileReader("D:/no_agg.txt"));
int noagg = Integer.parseInt(br.readLine().trim());
br.close();
br = new BufferedReader(new FileReader("D:/agg.txt"));
int agg = Integer.parseInt(br.readLine().trim());
br.close();
Chart chart1 = new Chart("Processing Time Chart",noagg,agg);
chart1.pack();
RefineryUtilities.centerFrameOnScreen(chart1);
chart1.setVisible(true);
}catch(Exception e){
e.printStackTrace();
}
}
});
b5 = new JButton("Exit");
b5.setFont(f1);
b5.setBounds(220,250,250,30);
p1.add(b5);
b5.addActionListener(new ActionListener(){
public void actionPerformed(ActionEvent ae){
System.exit(0);
}
});
getContentPane().add(p1,BorderLayout.CENTER);
getContentPane().add(p2,BorderLayout.NORTH);
}
public static void main(String a[])throws Exception{
UIManager.setLookAndFeel("javax.swing.plaf.nimbus.NimbusLookAndFeel");
Main main = new Main();
main.setVisible(true);
main.setExtendedState(JFrame.MAXIMIZED_BOTH);
main.readReducerLoc();
}
public double distance(double lat1, double lon1, double lat2, double lon2, char unit){
double theta = lon1 - lon2;
double dist = Math.sin(deg2rad(lat1)) * Math.sin(deg2rad(lat2)) +
Math.cos(deg2rad(lat1)) * Math.cos(deg2rad(lat2)) * Math.cos(deg2rad(theta));
dist = Math.acos(dist);
dist = rad2deg(dist);
dist = dist * 60 * 1.1515;
if (unit == 'K') {
dist = dist * 1.609344;
} else if (unit == 'N') {
dist = dist * 0.8684;
}
return (dist);
}
public double deg2rad(double deg) {
return (deg * Math.PI / 180.0);
}
public double rad2deg(double rad) {
return (rad * 180.0 / Math.PI);
}
public void readReducerLoc(){
try{
location.clear();
Statement stmt = DBCon.getCon().createStatement();
ResultSet rs = stmt.executeQuery("select * from reducer");
while(rs.next()){
String reducer = rs.getString(1);
double lat = rs.getDouble(2);
double lon = rs.getDouble(3);
double dis = distance(17.4359786,78.4481956,lat,lon,'M');
Location loc = new Location();
loc.setReducer(reducer);
loc.setDistance(dis);
location.add(loc);
}
java.util.Collections.sort(location,new Location());
for(int i=0;i<location.size();i++){
Location loc = location.get(i);
System.out.println(loc.getReducer());
}
}catch(Exception e){
e.printStackTrace();
}
}
}
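The distance( ) and readReducerLoc( ) methods in Main.java rank the reducers by their distance from a fixed reference point. The following stand-alone sketch illustrates that idea under stated assumptions: the class name NearestReducerSketch and the methods distanceMiles and nearest are hypothetical, the Location comparator of the project is replaced by a simple minimum search, and a clamp that the original code does not have is added before Math.acos. The coordinates are the ones used elsewhere in this report.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NearestReducerSketch {

    // Great-circle distance in statute miles using the spherical law of cosines,
    // the same formula as Main.distance() with unit 'M'.
    static double distanceMiles(double lat1, double lon1, double lat2, double lon2) {
        double theta = Math.toRadians(lon1 - lon2);
        double cosAngle = Math.sin(Math.toRadians(lat1)) * Math.sin(Math.toRadians(lat2))
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2)) * Math.cos(theta);
        // Assumption: clamp to [-1, 1] so rounding error cannot make acos return NaN.
        cosAngle = Math.max(-1.0, Math.min(1.0, cosAngle));
        // Degrees of arc * 60 nautical miles per degree * 1.1515 (the source's
        // approximate nautical-to-statute-mile factor).
        return Math.toDegrees(Math.acos(cosAngle)) * 60 * 1.1515;
    }

    // Returns the name of the reducer closest to the given point, or null if none.
    static String nearest(double lat, double lon, Map<String, double[]> reducers) {
        String best = null;
        double bestDistance = Double.MAX_VALUE;
        for (Map.Entry<String, double[]> entry : reducers.entrySet()) {
            double[] position = entry.getValue(); // {latitude, longitude}
            double d = distanceMiles(lat, lon, position[0], position[1]);
            if (d < bestDistance) {
                bestDistance = d;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, double[]> reducers = new LinkedHashMap<>();
        reducers.put("Reducer1", new double[] {17.4293823, 78.4507852});
        reducers.put("Reducer2", new double[] {17.4080325, 78.4489493});

        // Reference point taken from readReducerLoc().
        System.out.println("Nearest: " + nearest(17.4359786, 78.4481956, reducers));
        // prints: Nearest: Reducer1
    }
}
```

For very short distances the haversine formula is numerically more stable than the law of cosines, so it may be a reasonable alternative if the reducers are close together.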
DefineReducer.java
package mapreduce;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JTextField;
import javax.swing.JButton;
import javax.swing.JPanel;
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import javax.swing.UIManager;
import java.awt.BorderLayout;
import java.awt.Dimension;
import java.awt.Color;
import java.awt.Font;
import javax.swing.JComboBox;
import javax.swing.JOptionPane;
public class DefineReducer extends JFrame{
GradientPanel p1;
JPanel p2;
JLabel l1,l2,l3,l4;
JTextField tf1,tf2;
JButton b1;
JComboBox c1;
Font f1;
public DefineReducer(){
super("Define Reducer");
p1 = new GradientPanel(600,200);
p1.setLayout(null);
l1 = new JLabel("Define Reducer Location Screen");
l1.setFont(new Font("Courier New",Font.BOLD,18));
l1.setBounds(250,20,400,30);
p1.add(l1);
l2 = new JLabel("Reducer Name");
l2.setFont(f1);
l2.setBounds(200,60,100,30);
p1.add(l2);
c1 = new JComboBox();
c1.addItem("Reducer1");
c1.addItem("Reducer2");
c1.setFont(f1);
c1.setBounds(300,60,130,30);
p1.add(c1);
l3 = new JLabel("Latitude");
l3.setFont(f1);
l3.setBounds(200,110,100,30);
p1.add(l3);
tf1 = new JTextField(15);
tf1.setFont(f1);
tf1.setBounds(300,110,130,30);
p1.add(tf1);
l4 = new JLabel("Longitude");
l4.setFont(f1);
l4.setBounds(200,160,100,30);
p1.add(l4);
tf2 = new JTextField(15);
tf2.setFont(f1);
tf2.setBounds(300,160,130,30);
p1.add(tf2);
b1 = new JButton("Save Reducer");
b1.setFont(f1);
b1.setBounds(220,210,140,30);
p1.add(b1);
b1.addActionListener(new ActionListener(){
public void actionPerformed(ActionEvent ae){
login();
}
});
getContentPane().add(p1,BorderLayout.CENTER);
}
public void clear(){
tf1.setText("");
tf2.setText("");
}
public void login(){
String reducer = c1.getSelectedItem().toString().trim();
String lat = tf1.getText();
String lon = tf2.getText();
if(lat == null || lat.trim().length() <= 0){
JOptionPane.showMessageDialog(this,"Latitude must be enter");
tf1.requestFocus();
return;
}
if(lon == null || lon.trim().length() <= 0){
JOptionPane.showMessageDialog(this,"Longitude must be enter");
tf2.requestFocus();
return;
}
double lat1 = 0;
double lon1 = 0;
try{
lat1 = Double.parseDouble(lat.trim());
}catch(NumberFormatException nfe){
JOptionPane.showMessageDialog(this,"Latitude must be decimal value only");
tf1.requestFocus();
return;
}
try{
lon1 = Double.parseDouble(lon.trim());
}catch(NumberFormatException nfe){
JOptionPane.showMessageDialog(this,"Longitude must be decimal value only");
tf2.requestFocus();
return;
}
try{
String msg = DBCon.addReducer(reducer,lat.trim(),lon.trim());
if(msg.equals("success")){
JOptionPane.showMessageDialog(this,"Reducer details added");
setVisible(false);
}else{
JOptionPane.showMessageDialog(this,"Error in adding reducer details");
}
} catch(Exception e) {
e.printStackTrace();
}
}
}
Chart.java
package mapreduce;
import java.awt.Color;
import java.awt.Dimension;
import java.awt.GradientPaint;
import javax.swing.JPanel;
import org.jfree.chart.ChartFactory;
import org.jfree.chart.ChartPanel;
import org.jfree.chart.JFreeChart;
import org.jfree.chart.axis.CategoryAxis;
import org.jfree.chart.axis.CategoryLabelPositions;
import org.jfree.chart.axis.NumberAxis;
import org.jfree.chart.labels.StandardCategorySeriesLabelGenerator;
import org.jfree.chart.plot.CategoryPlot;
import org.jfree.chart.plot.PlotOrientation;
import org.jfree.chart.renderer.category.BarRenderer;
import org.jfree.data.category.CategoryDataset;
import org.jfree.data.category.DefaultCategoryDataset;
import org.jfree.ui.ApplicationFrame;
import org.jfree.ui.RefineryUtilities;
import java.util.ArrayList;
import java.awt.event.WindowEvent;
import javax.swing.JScrollPane;
import org.jfree.chart.ChartUtilities;
public class Chart extends ApplicationFrame{
static int noagg;
static int agg;
static String title;
public Chart(String paramString,int a1,int a2){
super(paramString);
noagg = a1;
agg = a2;
JPanel localJPanel = createDemoPanel();
localJPanel.setPreferredSize(new Dimension(800, 370));
JScrollPane jsp = new JScrollPane(localJPanel);
setContentPane(localJPanel);
}
private static CategoryDataset createDataset(){
DefaultCategoryDataset localDefaultCategoryDataset = new
DefaultCategoryDataset();
localDefaultCategoryDataset.addValue(noagg,"No Aggregation","No Aggregation");
localDefaultCategoryDataset.addValue(agg,"Aggregation","Aggregation");
return localDefaultCategoryDataset;
}
public void windowClosing(WindowEvent we) {
this.setVisible(false);
}
private static JFreeChart createChart(CategoryDataset paramCategoryDataset){
JFreeChart localJFreeChart = ChartFactory.createBarChart(title,"Technique Name",
"Processing Time", paramCategoryDataset, PlotOrientation.VERTICAL, true, true, false);
CategoryPlot localCategoryPlot = (CategoryPlot)localJFreeChart.getPlot();
localCategoryPlot.setDomainGridlinesVisible(true);
localCategoryPlot.setRangeCrosshairVisible(true);
localCategoryPlot.setRangeCrosshairPaint(Color.blue);
NumberAxis localNumberAxis = (NumberAxis)localCategoryPlot.getRangeAxis();
localNumberAxis.setStandardTickUnits(NumberAxis.createIntegerTickUnits());
BarRenderer localBarRenderer = (BarRenderer)localCategoryPlot.getRenderer();
localBarRenderer.setDrawBarOutline(false);
GradientPaint localGradientPaint1 = new GradientPaint(0.0F, 0.0F, Color.blue, 0.0F,
0.0F, new Color(0, 0, 64));
GradientPaint localGradientPaint2 = new GradientPaint(0.0F, 0.0F, Color.green, 0.0F,
0.0F, new Color(0, 64, 0));
GradientPaint localGradientPaint3 = new GradientPaint(0.0F, 0.0F, Color.red, 0.0F,
0.0F, new Color(64, 0, 0));
localBarRenderer.setSeriesPaint(0, localGradientPaint1);
localBarRenderer.setSeriesPaint(1, localGradientPaint2);
localBarRenderer.setSeriesPaint(2, localGradientPaint3);
localBarRenderer.setLegendItemToolTipGenerator(new
StandardCategorySeriesLabelGenerator("Tooltip: {0}"));
CategoryAxis localCategoryAxis = localCategoryPlot.getDomainAxis();
localCategoryAxis.setCategoryLabelPositions(
CategoryLabelPositions.createUpRotationLabelPositions(0.5235987755982988D));
return localJFreeChart;
}
public static JPanel createDemoPanel(){
JFreeChart localJFreeChart = createChart(createDataset());
return new ChartPanel(localJFreeChart);
}}
DBCon.java
package mapreduce;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.ArrayList;
public class DBCon{
private static Connection con;
public static Connection getCon()throws Exception {
Class.forName("com.mysql.jdbc.Driver");
con = DriverManager.getConnection("jdbc:mysql://localhost/mapreduce","root","root");
return con;
}
public static String addReducer(String reducer,String lat,String lon)throws Exception{
String msg="fail";
con = getCon();
PreparedStatement stmt=con.prepareStatement("delete from reducer where reducer_name=?");
stmt.setString(1,reducer);
stmt.executeUpdate();
PreparedStatement stat=con.prepareStatement("insert into reducer values(?,?,?)");
stat.setString(1,reducer);
stat.setString(2,lat);
stat.setString(3,lon);
int i=stat.executeUpdate();
if(i > 0)
msg = "success";
return msg;
}}
Reducer1.java
package mapreduce;
import java.io.ObjectOutputStream;
import java.io.ObjectInputStream;
import java.net.Socket;
import java.net.ServerSocket;
import java.awt.BorderLayout;
import java.awt.Color;
import java.awt.Container;
import java.awt.Font;
import java.awt.event.WindowAdapter;
import java.awt.event.WindowEvent;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.JButton;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;
import javax.swing.SwingUtilities;
import java.awt.event.ActionListener;
import java.awt.event.ActionEvent;
import javax.swing.UIManager;
public class Reducer1 extends JFrame{
ProcessThread thread;
JPanel p1,p2;
JLabel l1;
JScrollPane jsp;
JTextArea area;
Font f1,f2;
ServerSocket server;
Socket socket;
static StringBuilder sb = new StringBuilder();
public void start(){
try{
area.append("Node1 startup\n");
server = new ServerSocket(2222);
while(true){
socket = server.accept();
socket.setKeepAlive(true);
thread = new ProcessThread(socket,area);
thread.start();
}
}catch(Exception e){
e.printStackTrace();
}
}
public Reducer1(){
setTitle("Reducer1 Node");
p1 = new JPanel();
l1 = new JLabel("<html><body><center><font size=6 color=#f5ea01>Reducer1 Node</font></center></body></html>");
l1.setForeground(Color.white);
p1.add(l1);
p1.setBackground(Color.black);
f2 = new Font("Courier New", 1, 13);
p2 = new JPanel();
p2.setLayout(new BorderLayout());
area = new JTextArea();
area.setFont(f2);
jsp = new JScrollPane(area);
area.setEditable(false);
p2.add(jsp);
getContentPane().add(p1, "North");
getContentPane().add(p2, "Center");
addWindowListener(new WindowAdapter(){
@Override
public void windowClosing(WindowEvent we){
try{
if(socket != null){
socket.close();
}
server.close();
}catch(Exception e){
//e.printStackTrace();
}
}
});
}
public static void main(String a[])throws Exception{
UIManager.setLookAndFeel("javax.swing.plaf.nimbus.NimbusLookAndFeel");
Reducer1 r1 = new Reducer1();
r1.setVisible(true);
r1.setSize(600,400);
new ServerThread(r1);
}
}
Reducer2.java
package mapreduce;
import java.io.ObjectOutputStream;
import java.io.ObjectInputStream;
import java.net.Socket;
import java.net.ServerSocket;
import java.awt.BorderLayout;
import java.awt.Color;
import java.awt.Container;
import java.awt.Font;
import java.awt.event.WindowAdapter;
import java.awt.event.WindowEvent;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JPanel;
import javax.swing.JButton;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;
import javax.swing.SwingUtilities;
import java.awt.event.ActionListener;
import java.awt.event.ActionEvent;
import javax.swing.UIManager;
public class Reducer2 extends JFrame{
ProcessThread thread;
JPanel p1,p2;
JLabel l1;
JScrollPane jsp;
JTextArea area;
Font f1,f2;
ServerSocket server;
Socket socket;
static StringBuilder sb = new StringBuilder();
public void start(){
try{
area.append("Node2 startup\n");
server = new ServerSocket(3333);
while(true){
socket = server.accept();
socket.setKeepAlive(true);
thread = new ProcessThread(socket,area);
thread.start();
}
}catch(Exception e){
e.printStackTrace();
}
}
public Reducer2(){
setTitle("Reducer2 Node");
p1 = new JPanel();
l1 = new JLabel("<html><body><center><font size=6 color=#f5ea01>Reducer2 Node</font></center></body></html>");
l1.setForeground(Color.white);
p1.add(l1);
p1.setBackground(Color.black);
f2 = new Font("Courier New", 1, 13);
p2 = new JPanel();
p2.setLayout(new BorderLayout());
area = new JTextArea();
area.setFont(f2);
jsp = new JScrollPane(area);
area.setEditable(false);
p2.add(jsp);
getContentPane().add(p1, "North");
getContentPane().add(p2, "Center");
addWindowListener(new WindowAdapter(){
@Override
public void windowClosing(WindowEvent we){
try{
if(socket != null){
socket.close();
}
server.close();
}catch(Exception e){
//e.printStackTrace();
}}
});
}
public static void main(String a[])throws Exception{
UIManager.setLookAndFeel("com.sun.java.swing.plaf.nimbus.NimbusLookAndFeel");
Reducer2 r2 = new Reducer2();
r2.setVisible(true);
r2.setSize(600,400);
new ServerThread(r2);
}
}
CHAPTER 6
EXPERIMENTAL RESULTS
A GUI (graphical user interface) is a system of interactive visual components
for computer software. A GUI displays objects that convey information, and represent
actions that can be taken by the user. The objects change color, size, or visibility when the
user interacts with them.
Upon running our code, we get the GUI shown in Fig. 6.1, which contains the different
buttons described below.
Fig. 6.2: Defining Reducer1 location
Click on Define Reducer location and add the two reducer locations. Our project uses two
reducers. First, we define reducer 1 at a particular location by providing its latitude and
longitude. In our project, reducer 1 is placed at latitude 17.4293823 and longitude
78.4507852. On clicking the save reducer button, the reducer 1 location details are saved,
as can be seen in Fig. 6.2.
After the reducer 1 details are added successfully, the screen displays the message reducer
details added. We then click on the okay button so that we can define the next reducer
location.
We need to repeat the same steps for reducer 2 and start both nodes before starting our
application.
Fig. 6.3: Defining Reducer2 location
Now we need to define reducer 2. We define it at a particular location by providing its
latitude and longitude. In our project, reducer 2 is placed at latitude 17.4080325 and
longitude 78.4489493, as shown in Fig. 6.3. On clicking the save reducer button, the
reducer 2 location details are saved.
After the reducer 2 details are added successfully, the screen displays the message reducer
details added, and we click on the okay button.
After adding the locations of reducer 1 and reducer 2, we are required to start the reducer
nodes by following the steps below.
Fig. 6.4: Upload Documents
Now upload the document or dataset on which processing needs to be done. Once the reducer
applications, i.e., reducer 1 and reducer 2, are running, the next step is to carry out the
assigned task. To do so, we need to upload the file that contains the data to be processed
in order to obtain the required results.
The data is present in a file named s1. Hence, we select the s1 data file and upload it.
After the document is uploaded successfully, its contents are loaded as the input data. The
demonstration is shown in Fig. 6.4.
Click on start MapReduce Aggregation
The screen below shows the aggregated data obtained after the uploaded document has been
processed successfully by our code in the previous steps.
Here the processed data is obtained from the reducer nodes, and the node that serves the
request is chosen by calculating which reducer node is nearer to the request. Aggregate
data is high-level data acquired by combining individual-level data. For instance, the
output of an industry is an aggregate of the individual outputs of the firms within that
industry. Aggregate data is applied in statistics, data warehouses, and economics.
After the distance between each node and the query location is calculated, the request is
sent to the nearest node, as shown in the reducer applications below.
Fig. 6.7: Results on Reducer nodes
In the figure above, the request has been processed by reducer 1 because reducer 1 is
nearer to the mapper location. (For the mapper application location details, refer to
Main.java, line number 214.) When both reducers are at the same distance from the query,
the request is sent to either one of the reducer nodes. In practice, the reducers are
placed away from each other in order to control the associated traffic problems.
In our case, the distances between the reducers and the query location are calculated from
their latitude and longitude, and the job is scheduled to the node with the smaller
distance. As we can see in Fig. 6.7, the request has been sent to reducer 1.
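The nearest-node selection described above can be sketched as follows. This is a minimal illustration and not the project's actual code: the report does not state which distance formula is used, so the haversine (great-circle) formula is an assumption, and the class name NearestReducer, its methods, and the query location are hypothetical. Only the two reducer coordinates are taken from the report.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: choose the reducer node closest to a query location. */
final class NearestReducer {
    private static final double EARTH_RADIUS_KM = 6371.0; // mean Earth radius

    /** Great-circle distance in kilometres (haversine formula, an assumed choice). */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Returns the name of the nearest reducer; on a tie the first one registered wins. */
    static String nearest(Map<String, double[]> reducers, double queryLat, double queryLon) {
        String best = null;
        double bestKm = Double.MAX_VALUE;
        for (Map.Entry<String, double[]> e : reducers.entrySet()) {
            double km = haversineKm(queryLat, queryLon, e.getValue()[0], e.getValue()[1]);
            if (km < bestKm) { // strict comparison keeps the earlier node on equal distance
                bestKm = km;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Reducer coordinates as given in the report (latitude, longitude).
        Map<String, double[]> reducers = new LinkedHashMap<>();
        reducers.put("Reducer1", new double[] {17.4293823, 78.4507852});
        reducers.put("Reducer2", new double[] {17.4080325, 78.4489493});

        // Hypothetical query location, chosen only for illustration.
        double queryLat = 17.4350;
        double queryLon = 78.4500;

        System.out.println(nearest(reducers, queryLat, queryLon)); // prints "Reducer1"
    }
}
```

Under this formula the two reducer locations are roughly 2.4 km apart, so a query located north of reducer 1 is scheduled to reducer 1, while one located south of reducer 2 would go to reducer 2.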
Click on network traffic cost graph in order to check the traffic patterns.
Here the function of the aggregator is to reduce the merged traffic from multiple map
tasks.
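To illustrate why aggregation lowers traffic, the sketch below merges the key-value pairs emitted by two map tasks into one record per key before anything is sent on to a reducer. It is a simplified word-count style example under stated assumptions, not the project's aggregator: the class name MapOutputAggregator, the method aggregate, and the sample keys are hypothetical, and the code needs Java 9 or later for List.of and Map.entry.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch: combine the outputs of several map tasks before the shuffle. */
final class MapOutputAggregator {

    /** Merges (key, count) pairs from all map tasks into a single count per key. */
    static Map<String, Integer> aggregate(List<List<Map.Entry<String, Integer>>> mapOutputs) {
        Map<String, Integer> merged = new TreeMap<>(); // sorted keys give stable output
        for (List<Map.Entry<String, Integer>> output : mapOutputs) {
            for (Map.Entry<String, Integer> pair : output) {
                merged.merge(pair.getKey(), pair.getValue(), Integer::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical outputs of two map tasks.
        List<Map.Entry<String, Integer>> task1 =
                List.of(Map.entry("cloud", 1), Map.entry("edge", 1), Map.entry("cloud", 1));
        List<Map.Entry<String, Integer>> task2 =
                List.of(Map.entry("edge", 1), Map.entry("cloud", 1));

        Map<String, Integer> merged = aggregate(List.of(task1, task2));

        System.out.println("records before aggregation: " + (task1.size() + task2.size())); // 5
        System.out.println("records after aggregation: " + merged.size());                  // 2
        System.out.println(merged); // {cloud=3, edge=2}
    }
}
```

In this toy case five intermediate records shrink to two, which is the effect the traffic cost graph is meant to show at a larger scale.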
CHAPTER 7
CONCLUSION AND FUTURE SCOPE
7.1. CONCLUSION
In this project, we study the joint optimization of intermediate data partition and
aggregation in MapReduce to minimize the network traffic cost for big data applications.
We propose a three-layer model for this problem and formulate it as a mixed-integer
nonlinear problem, which is then transformed into a linear form that can be solved by
mathematical tools. We report speedups of up to 4x with seven devices and energy savings
of up to 71% with eight devices. To deal with the large-scale formulation due to big data,
we design a distributed algorithm to solve the problem on multiple machines. Our project
gives better results when compared with other algorithms such as the honeybee model.
Furthermore, we extend our algorithm to handle the MapReduce job in an online manner
when some system parameters are not given. Finally, we conduct extensive simulations to
evaluate our proposed algorithm in both offline and online cases. The simulation
results demonstrate that our proposals can effectively reduce the network traffic cost
under various network settings.
7.2. FUTURE SCOPE
Attaching load balancers and autoscaling groups can increase the accuracy and efficiency
of the results during high traffic.
Distributed scheduling can be improved with the help of the YARN tool.
The ability to calculate the location of the sender can be included.
More sophisticated tools can be built by combining existing ones.
ONE PAGE: STUDENT PROFILE
PRATHYUSHA GADE
CH CHANDANA
BOGARAM AKANKSHA
SYED ABDUL MANNAN H HASHMI
Syed Abdul Mannan H Hashmi is currently pursuing the final year of his Bachelor of
Technology in the Computer Science and Engineering stream at St. Martin's Engineering
College. He completed his schooling at FIITJEE World School with a CGPA of 9.3 and his
intermediate at Sri Chaitanya Narayana Junior Kalasala with an 85% aggregate. He has a fair
understanding of Java and C, basic knowledge of C++ and Python, and hands-on experience
with HTML. He is good at data structures. He has participated in several events, seminars,
and project expos conducted by the college.