
DESIGN AND IMPLEMENTATION OF A CLOUD BASED LECTURE NOTES

(MATERIAL) DISTRIBUTION SYSTEM


ABSTRACT

Peer-to-peer file sharing applications have existed for some time. Familiar applications and architectures such as Napster, Gnutella, and Freenet have large user bases. The main design idea of these systems is that files are distributed across the nodes. This differs from traditional client/server systems, where files reside on one central machine and all transfers occur only between individual clients and that machine. In a peer-to-peer system, file transfers can occur directly between individual nodes.

The proposed system is a peer-to-peer file sharing application. Why build one more file sharing system? The answer is that the proposed system has some unique properties that distinguish it from existing systems: it is developed as an auxiliary tool for distributing academic resources among students. The proposed system is built to provide a maximally user-friendly interface, using the PHP (Hypertext Preprocessor) programming language with XAMPP as the testbed.
CHAPTER ONE

1.0 INTRODUCTION

BACKGROUND OF STUDY

File sharing is the distribution of digital media using peer-to-peer (P2P) networking technology, which is well suited to the distribution of lecture materials. P2P file sharing allows users to access media files such as books, music, movies, and games using a P2P software program that searches for other connected computers on a P2P network to locate the desired content. The nodes (peers) of such networks are end-user computer systems that are interconnected via the Internet.

Peer-to-peer file sharing technology has evolved through several design stages, from early networks like Napster, which popularized the technology, to later models like the BitTorrent protocol.

Several factors contributed to the widespread adoption of peer-to-peer file sharing: increasing Internet bandwidth, the widespread digitization of physical media, and the increasing capabilities of residential personal computers. Users became able to transfer one or more files from one computer to another across the Internet through various file transfer systems and other file-sharing networks.

1.1 HISTORICAL BACKGROUND

Peer-to-peer file sharing became popular in 1999 with the introduction of Napster, a file sharing application backed by a set of central servers that linked people who had files with those who requested them. The central index server indexed the users and their shared content. When someone searched for a file, the server searched all available copies of that file and presented them to the user. The files were then transferred directly between the two private computers. A limitation was that only music files could be shared. Because this process occurred on a central server, however, Napster was held liable for copyright infringement and was shut down in July 2001. It later reopened as a pay service.

After Napster was shut down, the most popular peer-to-peer services were Gnutella and Kazaa. These services also allowed users to download files other than music, such as movies and games.

1.2 STATEMENT OF RESEARCH PROBLEM

The existing system is a semi-automated one: information must be saved in Excel sheets or on disk drives. No sharing is possible when the data is held on disk drives. This arrangement provides very little security for saved data, and data may be lost through mismanagement. It is a limited and not very user-friendly system. Searching for a particular piece of information is cumbersome and time-consuming. Users cannot restrict the file sharing options, and each user can see only his own information, not that of others. Sharing public information with all users is therefore very difficult.

1.3 AIM AND OBJECTIVES OF THE STUDY

The aim of this system is to design and implement a cloud-based lecture resource (material) distribution system.

The project objectives are:


I. To develop an algorithm for the cloud-based lecture material distribution system, based on an already instigated design.

II. Based on the algorithm developed, to develop and realize the source code for the proposed system.

III. To deploy the individual modules generated from the source code into a functional software prototype.

IV. To test the system to ensure it is free from bugs, data leaks, and other software-related vulnerabilities.

Also, the project will be concerned with the following features: the ability to share files via a routed IP address, the creation of share groups within the network, and the handling of security issues (file protection before reaching the destination).

1.4 SIGNIFICANCE OF THE STUDY

This work will be of great importance to communication networks. The project will reduce network traffic by enabling file synchronization and FTP file resume, and it will promote communication through digital media, since users can transfer information in digitized form to one another.

1.5 SCOPE AND LIMITATION OF THE STUDY

This work is limited by the following factors, without which the new system cannot function.

Network: the system is based on and operates over a wireless network; without network availability the system cannot function.

Power: the system depends solely on a power supply for its functionality; without a power source the system likewise cannot function.

Platform: the system is developed to run only on the Windows operating system and therefore cannot run on any other platform.

These are the main limitations of the system. Beyond them, the system was tested, run, and confirmed to function correctly.

1.6 DEFINITION OF TERMS

Platform: a computer environment that runs and manages the operation of all other programs and software within a system, e.g. Windows, Linux, Mac.

System: an electronic device capable of receiving, processing, and producing output; also, a collection of related machines that work together to achieve one goal.

Peer: "peer" or "peer-to-peer" describes two or more systems connected to one another within a network.

IP: Internet Protocol address, which is unique to each user connecting a system over the network/Internet.

Route: in computing, a route is the path by which a system is linked to the network.
CHAPTER TWO

LITERATURE REVIEW

ARCHITECTURE OF EXISTING PEER-TO-PEER FILE SHARING SYSTEMS

Peer-to-Peer (P2P) systems and applications are distributed systems without any

centralized control or hierarchical organization [1]. In a pure P2P system, the

software running at each node is equivalent in functionality. P2P is not a new

concept. However, it has caught the public eye only recently with the familiar

peer-to-peer file-sharing network applications such as Napster, Gnutella, Freenet,

Morpheus, etc. [2, 3]. Until recently, peer-to-peer file sharing applications have

followed one of two main models: the hybrid peer-to-peer system with centralized

servers such as that used by Napster, and the pure decentralized peer-to-peer

system such as Gnutella and Freenet [4]. A third model that harnesses the benefits

of the centralized model and the decentralized model has emerged recently. This

model is a hybrid where a central server facilitates the peer discovery process and

super-peers proxy the requests for local peers and unloads the searching burden on

the central server [5]. Applications such as Morpheus and KazaA use the third
model. The new versions of some Gnutella applications such as BearShare have

also applied the super-peer concept in their file discovery algorithms.

In searching for an appropriate architecture for our proposed project, we surveyed the architecture of three models used in peer-to-peer file sharing systems. In the next part of this chapter, we first discuss the different systems in terms of their architecture and then compare them in terms of performance, resource requirements, fault tolerance, and scalability. The last part of this review concentrates on search algorithms in two decentralized peer-to-peer file-sharing systems: Gnutella and Freenet.

THE CENTRALIZED MODEL OF P2P FILE-SHARING

In this model, a central server or a cluster of central servers directs the traffic

between individual registered peers [4]. This model is also referred to as a hybrid

file-sharing system because both pure P2P and client-server systems are present

[6]. The file transfer is pure P2P while the file search is client-server. Both Napster

and OpenNap use this model. The central servers in Napster and OpenNap maintain

directories of the shared files stored at each registered peer of the current network.

Every time a user logs on or off the Napster network, the directories at the central

servers are updated to include or remove the files shared by the user. A user logs on

the Napster network by connecting with one of the central servers. Each time a user

wants a particular file, it sends a request to the central server to which it is

connected. The central server will search its database of files shared by peers who are currently connected to the network and create a list of files matching the search criteria. The resulting list is sent to the user. The user can then select the desired

file from the list and open a direct HTTP link with the peer who possesses that file.
The file is directly transferred from one peer to another peer. The actual MP3 file is

never stored in the central server. The central server only holds the directory

information of the shared file but not the file itself.

In the centralized model used by Napster, information about all files shared in the

system is kept in the central server. There is no need to query individual users to

discover a file. The central index in Napster can locate files in the system quickly and

efficiently. Every user has to register with the central server to be on the network.

Thus the central index will include all files shared in the system, and a search

request at the central server will be matched with all files shared by all logged-on

users. This guarantees that all searches are as comprehensive as possible. Fig. 1

shows the architecture of the Napster network.

Napster’s Architecture.
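To make the centralized lookup concrete, the following minimal PHP sketch models such a central index. The CentralIndex class, peer addresses, and file names are illustrative assumptions, not part of Napster or of the proposed system.

<?php
// Minimal sketch of a Napster-style central index (illustrative only).
class CentralIndex
{
    /** @var array<string, string[]> peer address => names of shared files */
    private array $directory = [];

    // A peer logging on: its shared files enter the central directory.
    public function register(string $peer, array $files): void
    {
        $this->directory[$peer] = $files;
    }

    // A peer logging off: its files leave the directory.
    public function unregister(string $peer): void
    {
        unset($this->directory[$peer]);
    }

    // Match the search against every logged-on peer's list. Only directory
    // information is returned; the file itself moves peer-to-peer over HTTP.
    public function search(string $keyword): array
    {
        $hits = [];
        foreach ($this->directory as $peer => $files) {
            foreach ($files as $file) {
                if (stripos($file, $keyword) !== false) {
                    $hits[] = ['peer' => $peer, 'file' => $file];
                }
            }
        }
        return $hits;
    }
}

$index = new CentralIndex();
$index->register('10.0.0.5:6699', ['algebra-notes.pdf', 'physics-lab.mp4']);
$index->register('10.0.0.9:6699', ['algebra-exercises.pdf']);
print_r($index->search('algebra')); // two hits, each naming the owning peer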

THE DECENTRALIZED MODEL OF P2P FILE-SHARING

In a decentralized P2P file-sharing model, peers have the same capability and

responsibility. The communication between peers is symmetric in the sense that


each peer acts both as a client and a server and there is no master-slave

relationship among peers. At any given point in time, a node can act as a server to

the nodes that are downloading files from it and as a client to the nodes that it is

downloading files from. The software running at each node includes both the server

and client functionality.

Unlike a centralized P2P file sharing system, the decentralized network does not use

a central server to keep track of all shared files in the network. Indexes of the metadata of shared files are stored locally among the peers. To find a shared file in a decentralized file sharing network, a user asks its friends (nodes to which it is connected), who, in turn, ask their friends for directory information. File discovery in

a decentralized system is much more complicated than in a centralized system.

Different systems have applied different file discovery mechanisms.

The success of a decentralized file sharing system largely depends on the success

of the file discovery mechanisms used in the system. Applications implementing this

model include Gnutella, Freenet, etc. We will discuss two file discovery mechanisms

used in Gnutella network and Freenet later. Fig 2 shows the architecture of the

decentralized model.
Architecture of a Decentralized P2P File Sharing System
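To illustrate this friend-of-a-friend discovery, the PHP sketch below floods a request through a small hard-coded topology under a hop budget. The function name, the $neighbours map, and the TTL value are assumptions for illustration only.

<?php
// Minimal sketch of flooding-style file discovery (illustrative only).
function flood_search(array $neighbours, array $shared, string $start,
                      string $file, int $ttl, array &$seen = []): array
{
    if ($ttl < 0 || isset($seen[$start])) {
        return [];                       // hop budget spent, or loop detected
    }
    $seen[$start] = true;
    $hits = in_array($file, $shared[$start] ?? [], true) ? [$start] : [];
    foreach ($neighbours[$start] ?? [] as $next) {
        // Each node asks its own "friends", with one less hop remaining.
        $hits = array_merge($hits, flood_search($neighbours, $shared, $next,
                                                $file, $ttl - 1, $seen));
    }
    return $hits;
}

$neighbours = ['A' => ['B', 'C'], 'B' => ['D'], 'C' => ['D'], 'D' => []];
$shared     = ['D' => ['lecture1.pdf']];
print_r(flood_search($neighbours, $shared, 'A', 'lecture1.pdf', 3)); // ["D"]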

COMPARISON OF CENTRALIZED AND DECENTRALIZED P2P SYSTEMS

In a centralized model, file discovery can be done efficiently and comprehensively

with the file indexing information kept at central server. At the same time, the central

server is the only point of entry for peers in the system. The central server itself can

become a bottleneck to the system. A central server failure can lead to the collapse

of the whole network. Information provided by the central server might also be out of

date because the database at the central server is only updated periodically. To store
information about all files shared in the system also requires a significant amount of

storage space and a powerful processor at the central server to handle all the file

search requests in the system. The scalability of the system depends on the

processing power at the central server. When Napster central servers are

overloaded, file search in the system can become very slow and some peers might

not be able to connect to the network due to the central server’s limited capability [7].

In a decentralized model, the responsibility for file discovery is shared among peers.

If one or more peers go down, search requests can still be passed along other peers.

There is no single point of failure that can bring the network down [8]. Therefore, a

decentralized P2P system is more robust compared with a centralized P2P system.

At the same time, since indexing information is kept locally on individual users' computers, file discovery requires searching through the network. File discovery in a

decentralized P2P system can become inefficient and produce a large amount of

network traffic.

Inspired by the efficient file discovery advantage in a centralized P2P system and the

scalability and fault tolerance advantages in a decentralized P2P system, Morpheus

and KazaA have implemented pseudo-centralized system architectures to take

advantage of the strengths of both the centralized and decentralized models.

PARTIALLY CENTRALIZED SYSTEM WITH SUPER NODES

Both Morpheus and KazaA implemented a partially centralized architecture based on

technology developed by FastTrack, an Amsterdam-based startup company. Like

Gnutella but unlike Napster, Morpheus and KazaA do not maintain central file

directories. Like Napster but unlike Gnutella, Morpheus and KazaA are formally

closed systems, requiring centralized user registration and logon [5].


Both Morpheus and KazaA implement the FastTrack P2P Stack protocol, a C++-based protocol stack licensed from FastTrack. Although Morpheus claims to be

a “distributed, self-organizing network”, there is still a central server in the network

that is responsible for maintaining user registrations, logging users into the system

(in order to maintain active user statistics, etc.), and bootstrapping the peer

discovery process.

The registered user can look for super nodes through http://supernode.kazaa.com [9].

A user needs to provide user name and password information to get access to super

node information. After a Morpheus peer is authenticated to the server, the server

provides it with the IP address and port (always 1214) of one or more “SuperNodes”

to which the peer then connects. Once the new node receives its list of super nodes,

little communication between the central server and the new node is needed for file

discovery and file transfer. Upon receiving the IP address and port of super nodes,

the new node opens a direct connection with one super node. A SuperNode acts like

a local search hub that maintains the index of the media files being shared by each

peer connected to it and proxies search requests on behalf of these peers. A super

node connects to other super nodes in the network to proxy search requests on

behalf of local peers. Queries are sent only to super nodes, not to other peers. A

SuperNode will process the query received and send the search results back to the requester directly if the data is found in the SuperNode's database. Otherwise, the

query is sent to other super nodes for search through the network. Search results in

Morpheus contain the IP addresses of peers sharing the files that match the search

criteria, and file downloads are purely peer-to-peer. Like Gnutella, files are

transferred with HTTP protocol in Morpheus.


Such a scheme greatly reduces search times in comparison to a broadcast query

algorithm like that employed on the Gnutella network. Fig 3 shows the architecture of

Morpheus network.

Architecture of Morpheus Network

A Morpheus peer is automatically elected to become a SuperNode if it has sufficient bandwidth and processing power. Peers can choose, via a configuration parameter, not to run their computer as a SuperNode. Since the FastTrack P2P

stack protocol used by Morpheus is proprietary, no documentation regarding the

SuperNode election process is available. Clip2, a firm that provided network

statistics and information about P2P networks before it ceased operations,

independently developed the equivalent of a SuperNode for Gnutella, a product

called the Clip2 Reflector and generically designated a “super peer.” A prototype

future version of BearShare (code-named Defender), a popular Gnutella application,

implements the superpeer concept in the same integrated manner as Morpheus.


Since the application of the SuperNode concept in Morpheus uses proprietary algorithms and protocols, we were unable to find the protocol itself, but we found a paper that proposes a scheme to implement a dynamic SuperNode selection mechanism in the Gnutella network [10]. The approach discussed in that paper may or may not be identical to the proprietary one used in Morpheus; we have no way to compare them. The following discussion of SuperNode selection is based solely on the scheme proposed for the Gnutella network. It should be noted that a

SuperNode is simply a traditional peer, but with superpowers. This is due to the

machine properties on which the application is running.

There are two types of nodes in the proposed new version of Gnutella network

protocol: super node and shielded node. A super node is just like any other node

except that it has better networking capabilities and processing power. These

capabilities are not necessary but are the desired ones. A super node also needs to

implement additional functionality in order to proxy other nodes. A shielded node is a

node that maintains only one connection and that connection is to a super node. A

super node acts as a proxy for the shielded node, and shields it from all incoming

Gnutella traffic.

When a shielded node A first joins the Gnutella network, it connects to Node B in the

normal fashion, but includes the following header in the connection handshake:

GNUTELLA CONNECT/0.6 <cr><lf>
Supernode: False <cr><lf>

which states that Node A is not a super node. If node B is a shielded node, it will not

be able to accept any connection. Node B will then reject the connection using a suitable HTTP error code. However, Node B can provide the IP addresses and ports of a
few super nodes that it knows, including the one it is connected to, to Node A. Node

A can try to establish a connection with one of the super nodes it received from Node

B. If Node B is a super node with available client connection slots, Node B will

accept the connection from Node A. Node B may also optionally provide some other

Supernode IP addresses in the HTTP header sent to Node A for it to cache. These

can be used by Node A in the future in case the connection with this super node

breaks. If Node B is a super node with no client connection slot available, it will reject

the connection but provide Node A with IP addresses and port number of other super

nodes it knows so that Node A can connect to the network through other super

nodes.
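Node B's side of this handshake decision can be sketched as follows in PHP; the Node class, the slot count, and the address list are assumptions based on the behaviour described above.

<?php
// Sketch of how a node answers "GNUTELLA CONNECT/0.6" + "Supernode: False".
class Node
{
    public function __construct(
        private bool $isSuperNode,
        private int $freeClientSlots,
        private array $knownSuperNodes   // e.g. ['10.0.0.7:6346', ...]
    ) {}

    public function handleShieldedConnect(): array
    {
        if (!$this->isSuperNode) {
            // A shielded node accepts no connections; it points the newcomer
            // at the super nodes it knows about instead.
            return ['status' => '503 Shielded', 'try' => $this->knownSuperNodes];
        }
        if ($this->freeClientSlots > 0) {
            $this->freeClientSlots--;
            // Accept, optionally handing over extra super-node addresses for
            // the client to cache in case this connection later breaks.
            return ['status' => '200 OK', 'cache' => $this->knownSuperNodes];
        }
        // A full super node rejects but still supplies alternatives.
        return ['status' => '503 Busy', 'try' => $this->knownSuperNodes];
    }
}

$b = new Node(isSuperNode: true, freeClientSlots: 1,
              knownSuperNodes: ['10.0.0.7:6346']);
print_r($b->handleShieldedConnect()); // accepted; one client slot consumed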

A super node joins the network with connections to other super nodes. It can obtain

the addresses and ports of other super nodes either through a well-known host cache (such as connect1.gnutellahosts.com:6346) or IRC channels, or through its own

cached addresses from previous connections. When a new super node C tries to

establish a super node connection to another super node D, node D will accept the

connection if it has super node connection slot available. Otherwise, node D will

provide it with the addresses and port numbers of other super nodes so that node C

can establish connections with other super nodes.

Two schemes were proposed for super nodes to handle the query from shielded

nodes. In the first scheme, a super node keeps index information about all files

shared by all shielded nodes that connect to it. When a shielded node first connects

to a super node, the super node sends an indexing query with TTL=0, HOPS=0 to

the client. Upon receiving the indexing query, a shielded node will send in all the files

that it is sharing wrapped in Query Replies. The super node indexes the information

it received from all shielded nodes. When receiving a query, the super node searches through the index in its database to find a match and creates query replies using the IP address and port of the corresponding file owner (itself, or a shielded node). In the second scheme, a super node does not keep index information for all files shared by shielded nodes, but instead keeps query routing information received from shielded nodes. When receiving a query, the super node routes it to shielded

nodes selectively based on query routing information. A super node can occasionally

update the shielded nodes it connects to with the IP addresses and port of other

super nodes. The shielded nodes will cache this information. In case this super node

goes down, the shielded nodes can use the cached super node information to

establish connection with other super nodes.

The following protocol can be used in the Gnutella network to assign a node the role of shielded node or super node. If a node has a slow CPU or a slow network connection, it should choose to be a shielded node and try to open a connection to a super node. If there is no super node available to accept its connection, it will act as a super node but accept no shielded client node connections from other shielded nodes.

When a node that has enough CPU processing power and network capability joins

the network, it acts as a super node and establishes the configured number of super node connections. At configuration time, a node also sets the minimum number of shielded nodes needed for it to remain a super node (MIN_CLIENTS) and the time period allowed to reach that number (PROBATION_TIME). The new super node is on probation during the PROBATION_TIME.

If it receives at least MIN_CLIENTS shielded node connection requests during the PROBATION_TIME, it continues to behave as a super node. If the super node fails to receive MIN_CLIENTS shielded node connection requests during the PROBATION_TIME, it becomes a shielded client node and tries to connect to other super nodes as a shielded node. Whenever the number of shielded node connections drops below MIN_CLIENTS, a super node goes on probation for PROBATION_TIME. If the number of shielded node connections fails to reach MIN_CLIENTS when PROBATION_TIME is over, the super node becomes a shielded node and establishes a shielded node connection with other super nodes.
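The MIN_CLIENTS / PROBATION_TIME rule reduces to a single comparison once the probation period expires; the PHP sketch below is illustrative and not part of any real client.

<?php
// Decide a node's role when its probation period ends (illustrative).
function role_after_probation(int $shieldedConnections, int $minClients): string
{
    // At least MIN_CLIENTS shielded connections arrived within
    // PROBATION_TIME: stay a super node. Otherwise demote to a
    // shielded node and reconnect through other super nodes.
    return $shieldedConnections >= $minClients ? 'super node' : 'shielded node';
}

echo role_after_probation(12, 10), "\n"; // super node
echo role_after_probation(3, 10), "\n";  // shielded node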

When a super node goes down, the shielded nodes connecting to it can behave as a

super node or a shielded node according to the node’s processing power and

network capability. If a shielded node chooses to behave as a super node when its connection to the super node breaks, it goes on probation using the mechanisms described above. Consistent with the autonomous nature of the Gnutella network, a

node chooses to behave as a shielded node or a super node without the interference

from a central server.

Besides the rules described above, a new node can get guidance from the existing

super node about whether it should be a super node or a shielded client when it

establishes a connection with a super node.

With the proposed protocol described above, the new version of the Gnutella

network self-organizes into an interconnection of super nodes and shielded nodes

automatically. The super node scheme will reduce the traffic in the Gnutella network

since only super nodes participate in message routing. At the same time, more

nodes will be searched because each super node may proxy many shielded nodes.

Moreover, many queries can be satisfied locally within the shielded nodes

connecting to the same super node without routing the query to other super nodes.

This will also reduce the network traffic related to query.


Since the protocol does not require a central server to assign super nodes, there is no single point of failure that can bring the network down. This partially centralized P2P network is more robust and scalable than centralized P2P systems such as

Napster and OpenNap. Even in Morpheus and KazaA where a central server is

used, the central server only keeps information about super nodes in the system but

not indexing information about all files shared in the system. This reduces the

workload on central servers in comparison with fully centralized indexing system

such as Napster and OpenNap. In the case where the central server of Morpheus or

KazaA breaks down, the nodes that have been in the network before can use super

node information cached from previous connections to establish a connection to

super nodes they knew. However, new users cannot join the network without getting

super node information from the central server.

The super nodes in Morpheus and KazaA function differently from the central server

in Napster. The central server in Napster just keeps the index of the files shared in

the system. The central server itself does not share any file with peers in the system

or download files from other peers. In Morpheus and KazaA, a super node itself is a

peer. It shares files with other peers in the system. Napster will collapse if the central server goes down. If one or several super nodes go down, the peers connected to those super nodes can open connections with other super nodes in the system, and the network will still function. If all super nodes go down, the existing peers can become super nodes themselves.

Since super nodes keep indexing or routing information about files shared in the local area, searches in these systems are more efficient than in completely decentralized systems such as the original Gnutella network and Freenet. The new version of the Gnutella protocol proposed in [10] and the FastTrack P2P stack used by Morpheus and KazaA reduce discovery time in comparison with purely decentralized indexing systems such as the original Gnutella network and Freenet. While

Morpheus is largely a decentralized system, the speed of its query engine rivals that

of centralized systems like Napster because of its SuperNode.

FILE DISCOVERY IN DECENTRALIZED P2P FILE SHARING SYSTEMS

Since there is no central directory service available in a decentralized P2P file sharing system, a peer who wants a file must search the network to locate

the file provider. The success of a decentralized P2P file sharing system largely

depends on the success of its file discovery mechanisms. Both Freenet and Gnutella

networks are decentralized P2P file-sharing systems, but their file search algorithms

are not the same. The next part of this survey will concentrate on the query

mechanisms in Freenet and Gnutella network.

File Discovery Mechanisms in Freenet: Chain Mode

Freenet is an adaptive peer-to-peer network of nodes that query one another for file

sharing [11,12]. Files in Freenet are identified by binary file keys [11]. There are three

types of file keys in Freenet: keyword-signed key, signed-subspace key, and

content-hash key. The key for a file is obtained by applying a hash function. Each

node in Freenet maintains its own local files that it makes available to the network as

well as a dynamic routing table containing the addresses of other nodes associated

with the keys that they are thought to hold. When a user in Freenet wants a file, it

initiates a request specifying the key and a hops-to-live value. A request for keys is

passed along from node to node through a chain where each node makes a local

decision about where to forward the request next depending on the key requested.
To keep the requester and the data provider anonymous, each node in Freenet only

knows their immediate upstream and downstream neighbors in the chain.

The hops-to-live value of the request, analogous to IP’s time-to-live, is decremented

at each node to prevent an infinite chain. Each request in Freenet also has a unique

ID for node to keep track of the requests. A node will reject a request with the ID it

saw previously to prevent loops in the network. Messages in Freenet contain a randomly generated 64-bit ID, a hops-to-live value, and a depth counter [11]. The depth counter is incremented at each hop and is used to set the hops-to-live value when

a reply message is created so that the reply will reach the original requester. If a

downstream neighbor rejects a request, the sender will choose a different node to

forward to. The chain continues until either the requested data is found or the

hops-to-live value of the request is exceeded. If found, the requested data is passed

back through the chain to the original requester. If the request times out, a failure

result will be passed back to the requester through the same chain that routes the

request. The following routing algorithm is used to search and transfer files in

Freenet.

After receiving a request, a node first searches its own files and returns the data if

found with a note saying that it was the source of the data. If a node cannot satisfy

the request with its own files, it looks up its routing table for the key closest to the key

requested in terms of lexicographic distance and forwards the request to the node

that holds the closest key. If a node cannot forward the request to the best node in

the routing table because the preferred node is down or a loop would form, it will

forward the request to its second-best, then third-best node in the routing table, and

so on. When running out of candidates to try, a node will send a backtracking failure

message to its upstream requester, which, in turn, will try its second, then third
candidate. If all nodes in the network have been explored in this way or the request TTL reaches 0, a failure message will be sent back through the chain to the node

that sends the original request. Nodes store the ID and other information of the Data

Request message it has seen for routing Data Reply message and Request Failed

message. When receiving a Data Request with ID that has been seen before, the

node will send a backtracking Request Failed message to its upstream requester,

which may try other candidates in its routing table. Data Reply message in Freenet

will only be passed back through the nodes that route the Data Request message

previously. Upon receiving a Data Reply message, a node will forward it to the node

that the corresponding Data Request message was received from so that the Data

Reply will eventually be sent to the node that initiated the Data Request. If a node

receives a Data Reply message without seeing the corresponding Data Request

message, the Data Reply message will be ignored.
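The closest-key forwarding step can be sketched as below in PHP. Freenet compares binary keys lexicographically; the short hex keys and the numeric distance stand-in here are simplifying assumptions for illustration.

<?php
// Rank routing-table entries by closeness to the requested key (illustrative).
function rank_candidates(array $routingTable, string $wanted): array
{
    // $routingTable maps key => node address. The requester forwards to the
    // best-ranked node first, falling back to the second- and third-best if
    // a node is down or forwarding to it would form a loop.
    uksort($routingTable, fn (string $a, string $b): int =>
        abs(hexdec($a) - hexdec($wanted)) <=> abs(hexdec($b) - hexdec($wanted)));
    return $routingTable;
}

$table = ['2f1a' => 'node-x', '9c04' => 'node-y', '31bb' => 'node-z'];
print_r(rank_candidates($table, '30aa')); // node-z holds the closest key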

If a request is successful, the requested data will be returned in a Data Reply

message that inherits the ID of the Data Request message [12]. The TTL of the Data

Reply should be set equal to the depth counter of the Data Request message. The

Data Reply will be passed back through the chain that forwarded the Data Request

Message. Each node along the way will cache the file in its own database for future

requests, and create a new entry in its routing table associating the actual data

source with the requested key for future routing. A subsequent request to the same

key will be served immediately with the cached data. A request to a “similar” key will

be forwarded to the node that provided the data previously. This scheme allows the

node to learn about other nodes in the network over time so that the routing decision

can be improved over time and an adaptive network will evolve. There are two consequences of this scheme. First, nodes in Freenet will specialize in locating sets of similar keys, because if a node is associated with a particular key in the routing table, it is more likely to receive requests for keys similar to that key. Hence the node gains more experience in answering those queries and becomes more knowledgeable about other nodes carrying similar keys, letting it make better routing decisions in the future. This in

turn will make it a better candidate in the routing table of other nodes for those keys.

Second, nodes will specialize in storing files with similar keys in the same manner

because successfully forwarding a request will gain a copy of the requested file for

the node. Since most requests forwarded to a node will be for similar keys, the node

will obtain files with similar keys in this process. In addition, this scheme allows

popular data to be duplicated by the system automatically and closer to requesters.

To keep the actual data source anonymous, any node along the way can decide to

change the reply message to claim itself or another arbitrarily chosen node as the

data source. Since the data are cached along the way, the node that claimed to be

the data source will actually be able to serve future request to the same data.

File Discovery in Gnutella network: Broadcast

Gnutella is a protocol for distributed file search in peer-to-peer file sharing systems

[13]. Applications that implemented the Gnutella protocol form a completely

decentralized network. Gnutella was a protocol originally designed by Nullsoft, a

subsidiary of America Online. The current Gnutella protocol is version 0.4 and can be found at [13]. Many applications have implemented the Gnutella protocol: BearShare, LimeWire, ToadNode, and NapShare are just a few.


Unlike Freenet, a node in the Gnutella network broadcasts to all its neighbors when it

requests a file search. There are five types of messages in the Gnutella network:

Ping, Pong, Query, QueryHit and Push. Each message in Gnutella contains a

Descriptor Header with a Descriptor ID uniquely identifying the message. A TTL field

in the Descriptor Header specifies how many times the message should be

forwarded. A "Hops" field in the Descriptor Header indicates how many times the

message has been forwarded. At any given node z, the "TTL" and "Hops" fields must

satisfy the following condition:

TTL(0) = TTL(z) + Hops(z)

where TTL(0) is the TTL at the node that initiates the message. A node decrements

a descriptor header’s TTL field and increments its Hops field before forwarding it to

any node [13].
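The bookkeeping behind this invariant is one decrement and one increment per forward; the short PHP sketch below (with an illustrative message-array shape) checks that TTL(0) = TTL(z) + Hops(z) survives two forwards.

<?php
// TTL/Hops bookkeeping performed before forwarding a descriptor (sketch).
function forward(array $msg): array
{
    $msg['ttl']  -= 1;   // one fewer hop remaining
    $msg['hops'] += 1;   // one more hop taken
    return $msg;
}

$msg  = ['id' => bin2hex(random_bytes(16)), 'ttl' => 7, 'hops' => 0];
$ttl0 = $msg['ttl'] + $msg['hops'];              // TTL(0) at the originator
$msg  = forward(forward($msg));                  // two hops later, at node z
var_dump($ttl0 === $msg['ttl'] + $msg['hops']);  // bool(true): invariant holds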

When a node joins the Gnutella network, it connects to an existing node and

announces it is alive by sending a Ping message to the existing node. After receiving a Ping message, a node will send a Pong message back towards the originator of the Ping, with the ID inherited from the Ping message, and also forward the Ping message to all its neighbors except the one from which it received it. Nodes in the Gnutella network keep information about the Ping messages that they see

in their routing table. When receiving a pong message, a node looks up its routing

table to find the connection that the corresponding ping message came from and

routes the pong message backwards through the same connection. A pong message

takes the corresponding ping message’s route backwards. A pong may only be sent

along the same path that carried the incoming ping. Only those nodes that routed the

ping will see the pong in response. If a node receives a pong with descriptor ID = n,
but has not seen a ping descriptor with the same ID, it should remove the pong from

the network. The TTL field of the ping message ensures that the ping message will

not propagate infinitely.
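This reverse-path rule can be sketched in PHP as follows; the routing-table shape and connection names are illustrative assumptions. The same back-routing applies to QueryHit messages, as described next.

<?php
// Reverse-path routing of Pongs via remembered Ping descriptor IDs (sketch).
$pingRoutes = [];   // descriptor ID => connection the Ping arrived on

function on_ping(array &$routes, string $id, string $fromConn): void
{
    $routes[$id] = $fromConn;   // remember where this Ping entered
}

function on_pong(array $routes, string $id): ?string
{
    // A Pong may only travel back along the path that carried the Ping;
    // with no matching Ping on record, the Pong is removed from the network.
    return $routes[$id] ?? null;
}

on_ping($pingRoutes, 'abc123', 'conn-to-node-B');
var_dump(on_pong($pingRoutes, 'abc123'));   // "conn-to-node-B"
var_dump(on_pong($pingRoutes, 'zzz999'));   // NULL: unseen ID, drop the Pong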

Queries are routed in the same way as Pings. A node posts a Query to its

neighbors. When a node sees a Query message, it forwards it to its neighbors and

also searches its own files and sends back a QueryHit message to the node that

originates the Query if a match is found. A QueryHit message takes the Query’s

route backwards.

A node forwards incoming Ping and Query messages to all of its directly connected

neighbors except the one that sent the incoming ping or Query. If a node receives

the same type of message with the ID that it saw previously, it will not forward the

message to any of its directly connected neighbors to avoid loops.

The QueryHit message sent back to the Query originator contains the following

fields: Number of Hits, Port, IP Address, Speed, Result Set, Servent ID. The

QueryHit message does not contain the actual files. After receiving the QueryHit

message, a node could download the selected files from the nodes that provide the

files directly using HTTP protocol. The actual files are transferred off the Gnutella

network.

The Gnutella protocol allows a node behind a firewall to share files using Push requests

when a direct connection cannot be established between the source and the target

node. If a direct connection to the source node cannot be established, the node that

requests the file can send a Push request to the node that shares the file. A Push

request contains the following fields: Servent ID, File Index, IP Address, and Port.

After receiving a Push request, the node that shares the file, identified by the Servent
ID field of the Push request, will attempt to establish a TCP/IP connection to the

requesting node identified by the IP address and Port fields of the Push request. If a

direct connection cannot be established from the source node to the requesting

node, it is most likely that the requesting node itself is also behind a firewall. The file

transfer cannot be accomplished in this case. If a direct connection to the requesting

node is established, the source node behind the firewall will send the following GIV

request header to the requesting node:

GIV <File Index>:<Servent Identifier>/<File Name>\n\n

where <File Index> and <Servent Identifier> are the corresponding values from the Push request. After receiving the above GIV request header, the requesting node

constructs an HTTP GET request with <File Index> and <File Name> extracted from

the header and sends it to the source node behind the firewall. Upon receiving the

GET request, the source node sends the file data preceded by the following HTTP

compliant header:

HTTP 200 OK\r\n

Server: Gnutella\r\n

Content-type: application/binary\r\n

Content-length: xxxx\r\n

\r\n

where “xxxx” is the actual file size.
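The two headers of this exchange can be assembled as in the PHP sketch below. The servent ID, file index, and the exact GET path shape are illustrative values following the quoted v0.4 format.

<?php
// Build the GIV header and the answering HTTP GET (illustrative values).
function giv_header(int $fileIndex, string $serventId, string $fileName): string
{
    // Sent by the firewalled source once its outbound connection to the
    // requesting node is open.
    return sprintf("GIV %d:%s/%s\n\n", $fileIndex, $serventId, $fileName);
}

function get_request(int $fileIndex, string $fileName): string
{
    // Built by the requester from the <File Index> and <File Name> it
    // extracted from the GIV header.
    return "GET /get/$fileIndex/$fileName HTTP/1.0\r\n"
         . "Connection: Keep-Alive\r\nRange: bytes=0-\r\n\r\n";
}

echo giv_header(42, 'a1b2c3d4e5f60718293a4b5c6d7e8f90', 'lecture1.pdf');
echo get_request(42, 'lecture1.pdf');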


The routing algorithm described above is based on Gnutella Protocol Specification

Version 0.4, which is currently used by most Gnutella applications. Several schemes

were proposed to extend the Gnutella Protocol Version 0.4 in order to reduce the

Query traffic in the network and improve the network’s scalability (see references

[14, 15, 16, 17] for detailed information about the extension to the current Gnutella

Protocol).

SEARCH PERFORMANCE COMPARISON OF GNUTELLA AND FREENET

The search performance in Freenet follows the Small-World Effect found by Stanley

Milgram [1]. In 1967, Stanley Milgram, a Harvard professor, mailed 60 letters to a

group of randomly chosen people in Omaha, Nebraska and asked them to pass

these letters to a target person in Boston as a part of a social experiment. Those

people did not know the target person and they were asked to pass the letters only

using intermediaries known to one another on a first-name basis. Each person would

pass the letter to one of his or her friends who were assumed to bring the letter close

to the target person in Boston. The friend would pass the letter to his or her friend,

and so on until the letter reached the target person. Surprisingly, 42 out of 60 letters

reached the target person through just 5.5 intermediaries on average. This famous

phenomenon was called the Small-World Effect.

The file search process in a decentralized P2P network such as Freenet resembles

the social experiment described above. The question is finding the node that holds

the file. Each node passes the search request to its neighbor that is thought to most

likely hold the file. The search request will eventually reach the node that holds the

file through a small number of intermediate nodes because of the Small-World Effect.
This can be seen as a graph problem where people or nodes are vertices and the

relationship between people or the connections between nodes are edges. The

question is to find a shortest path between two people or two nodes. In a random graph where each of the N vertices connects to K random vertices, the path length is approximately log N / log K, which is much better than N/(2K), the path length in a regular graph where each node connects to its nearest K vertices. The random connections among people or nodes are what yield the Small-World Effect.
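For a feel of the difference, the short PHP computation below evaluates both estimates for an assumed network of N = 10,000 nodes with K = 10 connections per node.

<?php
// Compare the two path-length estimates quoted above (assumed N and K).
$n = 10000;
$k = 10;

$randomGraph  = log($n) / log($k);   // random graph: log N / log K
$regularGraph = $n / (2 * $k);       // regular graph: N / (2K)

printf("random:  %.1f hops\n", $randomGraph);   // random:  4.0 hops
printf("regular: %.1f hops\n", $regularGraph);  // regular: 500.0 hops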

The search performance in Freenet also has the Small-World Effect, which renders a

good average path length. However, in the worst case, the search can result in unnecessary failure due to poor local routing decisions. This is especially true at the

beginning when nodes do not have enough knowledge about other nodes. As the

nodes in the network gain more knowledge about other nodes, the routing decision

improves.

With broadcasting, Gnutella queries are guaranteed to find the optimal path in any

case. The price Gnutella pays for the optimal path length is a large amount of query

traffic over the network.

PROBLEMS WITH EXISTING P2P FILE SHARING SYSTEMS

Peer-to-peer file sharing has become popular recently, but there is a flip side to the

story. The decentralization and user autonomy that makes P2P appealing in the first

place also poses some potential problems.

The content of the files shared by individual users could be spurious. However, there

is no authority in the P2P system that can remove objectionable or invalid content.
The quality of the download service can vary due to heterogeneous connection

qualities.

According to one study, 35% of Gnutella users have upstream bottleneck bandwidth of at least 100 Kbps, but only 8% have at least 10 Mbps, while another 22% have 100 Kbps or less [18].

Another problem with current P2P file sharing is free riding. According to one study, approximately 66% of Gnutella peers share no files, 73% share ten or fewer files, and nearly 50% of all responses are returned by the top 1% of sharing hosts [19]. The designer of a P2P system needs to think of ways to regulate the growth of the network, compensate information providers, and discourage free riders.

Most files shared in popular P2P file sharing systems are audio or video files. Most of them involve some kind of copyright infringement and intellectual piracy. The Napster lawsuit drew the public's attention to this issue. In a decentralized P2P system,

there is nobody to sue even though there is copyright infringement. The popularity of

the P2P file sharing system has posed a potential threat to the music industry.

LESSONS LEARNED FROM SURVEYING EXISTING P2P FILE SHARING SYSTEMS

The first lesson we learned from existing P2P file sharing systems is that there are

always tradeoffs in designing a P2P network protocol. Freenet trades off the optimal

path length for less query traffic in the network. On the other hand, Gnutella trades

off the query traffic for the optimal path length. A centralized P2P system trades off

fault tolerance and scalability for quick file discovery. A decentralized P2P system

trades off quick file discovery for fault tolerance and scalability. In a partially
centralized indexing P2P system, the protocol designer tries to get both scalability

and search efficiency. In designing the Proposed system, we made a trade-off of

storing a complete catalogue in return for no search time.

The second lesson we learned from surveying the existing P2P network is that the

users of the P2P file sharing network are heterogeneous in terms of many

characteristics: network connection speed, online time, the amount of data shared,

etc. [18, 20]. The designer of a future P2P system must take peers' heterogeneity into consideration when implementing routing algorithms. The proposed application we implemented is tailored to a particular user population. Peers in the proposed network are much more homogeneous than peers in other P2P file-sharing systems. Several design decisions were made based on the assumption that the system is used

only by a small group of peers who are geographically close to each other and share

a small number of large files among them. These decisions were made to best serve

the need of this particular group. If the system were to be used by a large group of

people who share an enormous amount of files with various sizes, some design

choices would be different. As we saw earlier, P2P system design involves many

tradeoffs. We have to make decisions based on our knowledge about our user

population.

CHAPTER THREE

MATERIALS AND METHODS


Materials

The following tools were used during the design phase of this project.

XAMPP: XAMPP is a free and open source cross-platform web server solution stack

package developed by Apache Friends, consisting mainly of the Apache HTTP

Server, MySQL database, and interpreters for scripts written in the PHP and Perl

programming languages. XAMPP stands for Cross-Platform (X), Apache (A), MySQL

(M), PHP (P) and Perl (P). It is a simple, lightweight Apache distribution that makes it

extremely easy for developers to create a local web server for testing and

deployment purposes. Everything needed to set up a web server – server application

(Apache), database (MariaDB), and scripting language (PHP) – is included in an

extractable file. XAMPP is also cross-platform, which means it works equally well on

Linux, Mac and Windows.

APTANA STUDIO: Aptana Studio is an open source integrated development

environment (IDE) for building web applications. Based on Eclipse, it supports

JavaScript, HTML, DOM and CSS with code-completion, outlining, JavaScript

debugging, error and warning notifications and integrated documentation.

BOOTSTRAP 3: Bootstrap is a free and open-source front-end web framework for

designing websites and web applications. It contains HTML- and CSS-based design

templates for typography, forms, buttons, navigation and other interface components,

as well as optional JavaScript extensions.

JQUERY: jQuery is a cross-platform JavaScript library designed to simplify the

client-side scripting of HTML. jQuery is the most popular JavaScript library in use today. jQuery's syntax is designed to make it easier to navigate a document, select DOM elements, create animations, handle events, and develop Ajax applications. jQuery also provides capabilities for developers to create plug-ins on top of the

JavaScript library. This enables developers to create abstractions for low-level

interaction and animation, advanced effects and high-level, theme-able widgets. The

modular approach to the jQuery library allows the creation of powerful dynamic web

pages and Web applications.

SYSTEM REQUIREMENTS

Hardware Requirement

Processor: Intel(R) Core or higher

Installed Memory: 4.00GB or higher

Speed: 1.40GHz or faster

Operating System: 32/64-Bit operating system, x86/x64-based processor

Software Requirement

Operating System: Windows 7/8/8.1/10

Data Base: MySQL Server Version 5.3 and above

Web Server: Apache

Web Technologies: HTML, CSS, JQuery/Ajax and PHP

IDE & Tools: Aptana Studio, Phpmyadmin

FEASIBILITY STUDY

Preliminary investigation examines project feasibility: the likelihood that the system will be useful to the organization. The main objective of the feasibility study is to test the technical, operational, and economic feasibility of adding new modules and debugging the old running system. Any system is feasible given unlimited resources and infinite time. There are three aspects in the feasibility study portion of the preliminary investigation:

● Technical Feasibility
● Operational Feasibility
● Economic Feasibility

TECHNICAL FEASIBILITY

The technical issues usually raised during the feasibility stage of the investigation include the following:

● Does the necessary technology exist to do what is suggested?
● Does the proposed equipment have the technical capacity to hold the data required to use the new system?
● Will the proposed system provide adequate responses to inquiries, regardless of the number or location of users?
● Can the system be upgraded after development?
● Are there technical guarantees of accuracy, reliability, ease of access, and data security?

No earlier system existed to cater to these needs. The current system developed is technically feasible: it is a web-based user interface that provides easy access to users. The database's purpose is to create, establish, and maintain a workflow among the various entities in order to facilitate all concerned users in their various capacities or roles. Permissions are granted to users based on the roles specified. The system therefore provides a technical guarantee of accuracy, reliability, and security. The software and hardware requirements for the development of this project are modest and are already available in-house or available free as open source. The work for the project is done with the current equipment and existing software technology. The necessary bandwidth exists to provide fast feedback to users irrespective of the number of users using the system.

OPERATIONAL FEASIBILITY

Proposed projects are beneficial only if they can be turned into information systems that meet the organization's operating requirements. Operational feasibility is an important part of project implementation. Some of the important issues raised to test the operational feasibility of a project include the following:

● Is there sufficient support for the project from management and from users?
● Will the system be used and work properly once it is developed and implemented?
● Will there be any resistance from users that will undermine the possible application benefits?

This system is targeted to be in accordance with the above-mentioned issues.

Beforehand, the management issues and user requirements have been taken into

consideration. So there is no question of resistance from the users that can

undermine the possible application benefits.


The well-planned design would ensure the optimal utilization of the computer

resources and would help in the improvement of performance status.

ECONOMIC FEASIBILITY
A system that can be developed technically, and that will be used if installed, must still be a good investment for the organization. In the economic feasibility study, the development cost of creating the system is evaluated against the ultimate benefit derived from the new system; financial benefits must equal or exceed the costs. The system is economically feasible: it does not require any additional hardware or software, since the interface for this system is developed using existing resources and technologies. The expenditure is nominal, and economic feasibility is certain.

Methodology

A software development methodology is a framework used to structure, plan, and control the process of developing an information system. This includes the pre-definition of specific deliverables and artefacts that are created and completed by a project team to develop or maintain an application. A wide variety of such frameworks have evolved over the years, each with its own recognized strengths and weaknesses. No single software development methodology framework is suitable for all projects; each of the available frameworks is best suited to specific kinds of projects, based on various technical, organizational, project, and team considerations. These frameworks are often bound to some kind of organization, which further develops, supports the use of, and promotes the methodology framework.

System Development Life Cycle

System development life cycle is a process of developing software on the basis of

the requirement of the end user to develop efficient and good quality software. It is

necessary to follow a particular procedure. The sequence of phases that must be


followed to develop good quality software is known as SDLC (System Development

Life Cycle).

The software is said to have a life cycle composed of several phases. Each of these

phases results in the development of either a part of the system or something

associated with the system, such as a test plan or a user manual. In the life cycle

model, called the “spiral model,” each phase has well-defined starting and ending

points, with clearly identifiable deliverables to the next phase.

As with most undertakings, planning is an important factor in determining the

success or failure of any software project. Essentially, good project planning will

eliminate many of the mistakes that would otherwise be made, and reduce the

overall time required to complete the project. As a rule of thumb, the more complex the problem is, the more thorough the planning process must be. Most

professional software developers plan a software project using a series of steps

generally referred to as the software development life cycle. The following example

is a generic model that should give you some idea of the steps involved in a typical

software project.

Figure 6: Diagram illustrating the core processes involved in SDLC


CHAPTER FOUR

IMPLEMENTATION

This system was implemented with two interfaces and two domains: one for lecturers and one for students, with online access. Lecturers use their interface to enter questions and alternative answers; it is also used to receive answers to those questions as well as questions from students. The students' interface is used by students to answer questions, to pose their own questions to the lecturers, and to receive answers in real time. Each lecturer and student first obtains a login name and a password; lecturers can then create their courses, and any number of students can register to communicate with the lecturer. It is possible to upload, edit, and delete presentations, documents, and other media materials, including videos and PDF files. The framework thus creates real-time interactivity between students and lecturers to enhance learning.
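As an illustration of the material-upload path described above, here is a minimal PHP sketch of an upload endpoint. The form-field name, the uploads/ directory, and the allowed-extension list are assumptions; a real deployment would add authentication and stricter validation on top of this.

<?php
// Minimal sketch of a lecture-material upload endpoint (illustrative).
$allowed = ['pdf', 'doc', 'docx', 'ppt', 'pptx', 'mp4', 'mp3'];

if (($_SERVER['REQUEST_METHOD'] ?? '') === 'POST' && isset($_FILES['material'])) {
    $file = $_FILES['material'];
    $ext  = strtolower(pathinfo($file['name'], PATHINFO_EXTENSION));

    if ($file['error'] !== UPLOAD_ERR_OK) {
        exit('Upload failed.');
    }
    if (!in_array($ext, $allowed, true)) {
        exit('File type not permitted.');
    }

    // Store under a generated name (in an existing uploads/ directory) so
    // uploads cannot overwrite one another.
    $target = __DIR__ . '/uploads/' . uniqid('material_', true) . '.' . $ext;
    if (move_uploaded_file($file['tmp_name'], $target)) {
        echo 'Material uploaded and available to registered students.';
    }
}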
Testing (Overview)

Testing is the process of executing a program with the intent of finding an error. Debugging is the process of locating the exact cause of an error and removing that cause. The system was tested using XAMPP as the test bed. Software testing is a critical element of software quality assurance and represents the ultimate review of specification, design, and code generation. Testing techniques provide systematic guidance for designing tests that exercise the internal logic of software components and exercise the input and output domains of the program, to uncover errors in program function, behaviour, and performance.

FIGURE 18.0 SYSTEM TESTING MODEL

The testing methods used were:

1. Black Box Testing
2. White Box Testing
3. Unit Testing
4. Interface Testing
5. Integration Testing
6. Performance Testing

1. Black Box Testing: we supplied many different types of inputs and checked the outputs against expectations.

2. White Box Testing: we checked all the loops and control structures, supplying inputs chosen to exercise each loop and structure and checking the output.

3. Unit Testing: whenever a module was finished we checked it individually, meaning all its functions were tested one by one.

4. Interface Testing: we checked whether all the interactions between the applications execute properly and whether errors are handled properly; if the database returns an error message for any query issued by the application, it should be caught and displayed appropriately to users.

5. Integration Testing: when unit testing was finished, we integrated the functions and then checked that they work together properly.

6. Performance Testing: we tested the application on different Internet connection speeds; load testing checks what happens when a user performs many functions at the same time, with large input data from users, simultaneous connections to the database, and heavy load on specific pages.


4.7.1 Scope of Testing

In this project, we first adopted a unit testing strategy, in which we tested the functionality of each function; after that we performed integration testing, where we integrated all the functions and tested them together.

4.7.2 Test Plan:

We used unit testing and integration testing. We initially concentrated on unit testing, spending some time on it whenever we developed a new function; this was done during coding as well as after the design, whenever the functions were used. After the completion of unit testing, we moved to integration testing and completed it in one day.
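As an illustration of the per-function unit tests described above, the sketch below uses PHPUnit (an assumption; the report does not name a test framework) against a hypothetical helper function.

<?php
// Illustrative unit test; is_allowed_material() is a hypothetical helper.
use PHPUnit\Framework\TestCase;

// Decides whether an uploaded file's extension is one the system accepts.
function is_allowed_material(string $name): bool
{
    $allowed = ['pdf', 'doc', 'docx', 'mp4'];
    return in_array(strtolower(pathinfo($name, PATHINFO_EXTENSION)), $allowed, true);
}

class MaterialUploadTest extends TestCase
{
    public function testAcceptsLectureNotes(): void
    {
        $this->assertTrue(is_allowed_material('week1-notes.PDF'));
    }

    public function testRejectsExecutables(): void
    {
        $this->assertFalse(is_allowed_material('malware.exe'));
    }
}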


CHAPTER FIVE

5.0 CONCLUSION, SUMMARY AND RECOMMENDATION

5.1 CONCLUSION

A student-lecturer real-time file distribution system has been implemented. It enables students and lecturers to engage in class participation, check students' attendance, share lecture materials (document files, PDFs, and audio and video media files), and clarify misconceptions in real time. The application of this system is expected to draw positive feedback from students, with no negative effects on student learning.

However, a good implementation can be time consuming, and responses to questions need to be designed carefully to avoid misleading students. Overall, it is beneficial to integrate the system into the school system to enhance class interaction and participation.

5.2 SUMMARY OF THE FINDING

Chapter one of this work introduces real-time interactive systems, with definitions and a breakdown of the topic. Chapter two is the literature review, providing more detail on interactivity and a deeper understanding of a student-lecturer real-time interactive system. Chapter three covers the various materials and methods used in designing the work. Chapter four covers the implementation, result analysis, and testing of the prototype. Chapter five concludes the work with the conclusion, summary, and recommendations.


5.3 RECOMMENDATION

I would like to conclude this work with the following recommendations:

1. I humbly recommend that the student-lecturer real-time interactive system be incorporated into our educational system.

2. The area of real-time interactivity can be fully harnessed to reach its full potential by the Nigerian educational authorities, to help enhance learning at various levels of education in Nigeria.

3. I will also like to recommend that


REFERENCES

1. Yima. The survey of the technologies of peer-to-peer.

2. Fox, G. Peer to Peer Networks.

3. Parameswarn, M.; Susarla, A. & Whinston, A. P2P Networking: An Information-Sharing Alternative.

4. Modern peer-to-peer file-sharing over the internet. http://www.limewire.com/index.jsp/p2p

5. http://www.openp2p.com/pub/a/p2p/2001/07/02/morpheus.html

6. Yang, B. & Garcia-Molina, H. Comparing Hybrid Peer-to-Peer Systems.

7. http://www.napster.com/help/win/faq/#x-2

8. What is Gnutella? http://www.gnutellanews.com/information/what_is_gnutella.shtml

9. Sander, S. Investigating one incidence of anomalous network traffic.

10. Supernode specification. http://groups.yahoo.com/group/the_gdf/files/Supernodes.html

11. Clarke, I.; Sandberg, O.; Wiley, B. & Hong, T.W. Freenet: a distributed anonymous information storage and retrieval system. In Federrath, H. (ed.), Designing Privacy Enhancing Technologies: International Workshop on Design Issues in Anonymity and Unobservability, Berkeley, CA, USA, 25-26 July 2000. Lecture Notes in Computer Science Vol. 2009. Berlin, Germany: Springer-Verlag, 2001, pp. 46-66.

12. Clarke, I. A distributed decentralized information storage and retrieval system. Unpublished dissertation, University of Edinburgh, 1999.

13. The Gnutella protocol specification v0.4. Clip2 distributed search services. http://www.limewire.com/index.jsp/developer

14. Gnutella 0.6 Protocol Extension: Handshaking Protocol (also called the LimeWire Connection Proposal). http://groups.yahoo.com/group/the_gdf/message/2010

15. Proposed Gnutella Protocol Extensions: Ping/Pong Scheme. http://www.limewire.com/index.jsp/pingpong

16. MetaData Proposal. http://www.limewire.com/index.jsp/metainfo_searches, http://www.limewire.com/developer/MetaProposal2.htm

17. Query Routing. http://www.limewire.com/developer/query_routing/keyword routing.htm

18. Saroiu, S.; Gummadi, P.K. & Gribble, S.D. A measurement study of peer-to-peer file sharing systems.

