CN Unit 5
The World Wide Web (WWW), often called the Web, is a system of interconnected webpages
and information that you can access using the Internet. It was created to help people share and
find information easily, using links that connect different pages together. The Web allows us to
browse websites, watch videos, shop online, and connect with others around the world through
our computers and phones.
All public websites or web pages that people may access on their local computers and other
devices through the internet are collectively known as the World Wide Web or W3. Users can
get further information by navigating to links interconnecting these pages and documents. This
data may be presented in text, picture, audio, or video formats on the internet.
WWW stands for World Wide Web and is commonly known as the Web. The WWW was started
by CERN in 1989. WWW is defined as the collection of different websites around the world,
containing different information shared via local servers (or computers).
Web pages are linked together using hyperlinks, which are written in HTML. Text that contains
such links is called hypertext. Hypertext documents are the fundamental units of the Web and are
accessed through the Hypertext Transfer Protocol (HTTP). Such digital connections, or links, allow
users to easily access desired information by connecting relevant pieces of information. The
benefit of hypertext is that it allows you to pick a word or phrase in the text and jump to other
pages that have more information about it.
The basic model of how the web works is shown in the figure below. Here the browser is
displaying a web page on the client machine. When the user clicks on a line of text that is linked
to a page on the abd.com server, the browser follows the hyperlink by sending a message to the
abd.com server asking it for the page.
Working of WWW
A Web browser is used to access web pages. Web browsers can be defined as programs which
display text, data, pictures, animation and video on the Internet. Hyperlinked resources on the
World Wide Web can be accessed using software interfaces provided by Web browsers. Initially,
Web browsers were used only for surfing the Web but now they have become more universal.
The diagram below shows that the Web operates on the client-server architecture of the
internet. When a user requests a web page or other information, the web browser on the user's
system sends a request to the server. The web server then returns the requested content to the
browser, and the browser finally presents it to the user who made the request.
Web browsers can be used for several tasks including conducting searches, mailing, transferring
files, and much more. Some of the commonly used browsers are Internet Explorer, Opera Mini,
and Google Chrome.
Features of WWW
WWW is open source.
It is a distributed system spread across various websites.
It is Cross-Platform.
Components of the Web
There are 3 components of the web:
Uniform Resource Locator (URL): A URL serves as the addressing system for resources on the
web.
Hyper Text Transfer Protocol (HTTP): HTTP specifies how the browser and the server
communicate.
Hyper Text Markup Language (HTML): HTML defines the structure,
organization and content of a web page.
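As a rough illustration of how a URL addresses a resource, the sketch below splits a URL into its parts using Python's standard `urllib.parse` module. The URL itself is a made-up example and does not come from these notes.

```python
from urllib.parse import urlsplit

# Hypothetical URL, used only for illustration.
url = "https://www.example.com:8080/docs/index.html?lang=en#intro"

parts = urlsplit(url)

print("scheme  :", parts.scheme)    # protocol the browser should use
print("host    :", parts.hostname)  # server that holds the resource
print("port    :", parts.port)      # TCP port on that server
print("path    :", parts.path)      # location of the resource on the server
print("query   :", parts.query)     # extra parameters sent to the server
print("fragment:", parts.fragment)  # position inside the returned page
```

The scheme tells the browser which protocol to speak (here HTTP over TLS), while the path usually identifies an HTML document on the server.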
Difference Between WWW and Internet
WWW Internet
HTTP:
HTTP stands for Hypertext Transfer Protocol. It is the main way web browsers and servers
communicate to share information on the internet. It was invented by Tim Berners-Lee. Hypertext is
text that is specially coded using a standard language called HyperText
Markup Language (HTML). HTTP/2 is a newer version of HTTP, and HTTP/3 is the latest
version, published in 2022.
When you visit a website, HTTP helps your browser request and receive the data needed to
display the web pages you see. It is a fundamental part of how the internet works, making it
possible for us to browse and interact with websites.
HTTP stands for “Hypertext Transfer Protocol.” It is a set of rules for sharing data on the World
Wide Web (WWW). HTTP helps web browsers and servers communicate, allowing people to
access and share information over the internet.
Key Points
Basic Structure: HTTP forms the foundation of the web, enabling data
communication and file sharing.
Web Browsing: Most websites use HTTP, so when you click on a link or download
a file, HTTP is at work.
Client-Server Model: HTTP works on a request-response system. Your browser
(client) asks for information, and the website’s server responds with the data.
Application Layer Protocol: HTTP operates within the Internet Protocol Suite,
managing how data is transmitted and received.
Hyper Text:
The protocol used to transfer hypertext between two computers is known as HyperText Transfer
Protocol. HTTP provides a standard between a web browser and a web server to establish
communication. It is a set of rules for transferring data from one computer to another. Data such
as text, images, and other multimedia files are shared on the World Wide Web. Whenever a web
user opens their web browser, the user indirectly uses HTTP. It is an application protocol that is
used for distributed, collaborative, hypermedia information systems.
Whenever we want to open a website, we first open a web browser and then type the URL of that
website (e.g., www.facebook.com ).
The host name in this URL is then sent to a Domain Name System (DNS) server. The DNS server looks
up its records for this name and returns the corresponding IP address to the web browser. Now the
browser is able to send requests to the actual server.
After the server sends data to the client, the connection will be closed. If we want something else
from the server, we have to re-establish the connection between the client and the server.
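The request and response exchange described above can be sketched without a network connection. The example below builds the text of a minimal HTTP/1.1 GET request and parses a canned response. The host name, path, and response bytes are assumptions made up for illustration; a real browser sends many more headers.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, path, version
        f"Host: {host}",          # which site on the server is wanted
        "Connection: close",      # ask the server to close after responding
        "",                       # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes):
    """Split a raw HTTP response into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body


# Hypothetical host and path.
request = build_get_request("www.example.com", "/index.html")

# Canned response standing in for what a server might send back.
sample_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>Hello</p>\n"
)
status, headers, body = parse_response(sample_response)
print(request.decode("ascii"))
print(status, headers["content-type"], body)
```

In practice, a library such as Python's `http.client` handles these details; the sketch only shows what travels over the connection.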
Characteristics of HTTP
HTTP is a TCP/IP-based communication protocol that is used to deliver data from server to client or
vice versa.
The server processes a request raised by the client, and the server and client
know each other only during the current request and response period.
Once data is exchanged, servers and clients are no longer connected.
It is a request and response protocol based on client and server requirements.
It is a connection-less protocol because after the connection is closed, the server does
not remember anything about the client and the client does not remember anything
about the server.
It is a stateless protocol because neither the client nor the server keeps information about
earlier requests; each request is handled on its own, yet they are still able to communicate.
Cookies in HTTP
An HTTP cookie (web cookie, browser cookie) is a little piece of data that a server transmits to
a user’s web browser.
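Because HTTP itself is stateless, cookies are a common way to carry a small piece of state between requests. The sketch below uses Python's standard `http.cookies` module to show the header a server might send and how the value comes back on a later request. The cookie name and value are hypothetical.

```python
from http.cookies import SimpleCookie

# Server side: create a cookie and render it as a response header.
outgoing = SimpleCookie()
outgoing["session_id"] = "abc123"       # hypothetical name and value
outgoing["session_id"]["path"] = "/"    # cookie applies to the whole site
set_cookie_header = outgoing.output()   # text of the Set-Cookie header
print(set_cookie_header)

# Client side: on a later request the browser sends the value back
# in a Cookie header, which the server parses again.
incoming = SimpleCookie()
incoming.load("session_id=abc123")
session = incoming["session_id"].value
print(session)
```

The server can use the returned value to recognize the same browser across otherwise independent requests.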
Advantages of HTTP
Memory usage and CPU usage are low because of fewer simultaneous connections.
Since there are few TCP connections, network congestion is lower.
The error can be reported without closing the connection.
Disadvantages of HTTP
HTTP requires high power to establish communication and transfer data.
HTTP is less secure because it does not use any encryption, unlike HTTPS, which
uses TLS to encrypt regular HTTP requests and responses.
Electronic Mail
Electronic mail, commonly known as email, is a method of exchanging messages over the
internet. Here are the basics of email:
1. An email address: This is a unique identifier for each user, typically in the format
username@domain (for example, user@example.com).
2. An email client: This is a software program used to send, receive and manage emails,
such as Gmail, Outlook, or Apple Mail.
3. An email server: This is a computer system responsible for storing and forwarding
emails to their intended recipients.
To send an email:
4. Spool file : This file contains mails that are to be sent. User agent appends outgoing
mails in this file using SMTP.
Services provided by E-mail system:
Composition – Composition refers to the process of creating messages and replies.
Any kind of text editor can be used for composition.
Transfer – Transfer means the procedure of sending mail from the sender to the
recipient.
Reporting – Reporting refers to confirmation of mail delivery. It helps users
check whether their mail was delivered, lost, or rejected.
Displaying – It refers to presenting mail in a form that is understood by the user.
Disposition – This step concerns what the recipient does after
receiving mail, i.e., save the mail, delete it before reading, or delete it after reading.
Advantages of email:
1. Convenient and fast communication with individuals or groups globally.
2. Easy to store and search for past messages.
3. Ability to send and receive attachments such as documents, images, and videos.
4. Available 24/7.
Disadvantages of email:
enabled device with a browser, making it a convenient and widely-used option. For example,
Gmail, Yahoo Mail, Outlook.com.
2. Client-Based Email: Client-based email requires dedicated email client software installed on
the user's device for access. These applications offer a more feature-rich and often customizable
experience compared to web-based email clients. For example, Microsoft Outlook, Mozilla
Thunderbird.
3. Secure Email Services: Secure email services prioritize end-to-end encryption and advanced
security features to protect user privacy and sensitive information.
4. Business or Corporate Email: Tailored for business use, corporate email services often include
collaboration tools, shared calendars, and enhanced security features to meet the specific needs of
organizations. For example, Microsoft Exchange, Google Workspace.
5. Disposable Email Services: Disposable email services provide temporary email addresses for
short-term use. Users often utilize them for activities like online registrations or verifications,
maintaining privacy. For example, Guerrilla Mail, 10 Minute Mail.
6. Encrypted Email Protocols: Encrypted email protocols focus on securing the content of
emails. Technologies like PGP and S/MIME employ encryption techniques to ensure
confidentiality in email communication. For example, PGP (Pretty Good Privacy), S/MIME
(Secure/Multipurpose Internet Mail Extensions).
7. POP3 (Post Office Protocol 3): POP3 retrieves emails from the server to the local device,
typically deleting them from the server. It is commonly used when users want to download and
store emails locally.
Email Protocols
Email protocols are a collection of protocols that are used to send and receive emails properly.
They allow the client to transmit mail to or from the intended
mail server. Each email protocol is a set of commands for sharing mail between two computers.
The three basic types of email protocols involved in sending and receiving mail are:
SMTP
POP3
IMAP
Simple Mail Transfer Protocol is used to send mails over the internet. SMTP is an application
layer and connection-oriented protocol. SMTP is efficient and reliable for sending emails. SMTP
uses TCP as the transport layer protocol.
It handles the sending and receiving of messages between email servers over a TCP/IP network.
This protocol along with sending emails also provides the feature of notification for incoming
mails. When a sender sends an email then the sender’s mail client sends it to the sender’s mail
server and then it is sent to the receiver mail server through SMTP. SMTP commands are used
to identify the sender and receiver email addresses along with the message to be sent.
Some of the SMTP commands are HELO, MAIL FROM, RCPT TO, DATA, QUIT, VRFY,
SIZE, etc. SMTP sends an error message if the mail is not delivered to the receiver; hence it is
considered a reliable protocol.
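To make the command sequence concrete, the sketch below lists the commands a client would send during one basic SMTP session. It is a simplified model that ignores server replies and extensions; the host name and addresses are hypothetical, and real programs would normally use a library such as Python's `smtplib` instead.

```python
def smtp_dialogue(client_host, sender, recipients, message_text):
    """Return the client-side commands of a basic SMTP session, in order."""
    commands = [
        f"HELO {client_host}",        # client identifies itself
        f"MAIL FROM:<{sender}>",      # envelope sender
    ]
    # One RCPT TO command per recipient.
    commands += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    commands.append("DATA")           # announces that the message follows
    # The message is terminated by a line containing a single dot.
    commands.append(message_text + "\r\n.")
    commands.append("QUIT")           # ends the session
    return commands


# Hypothetical host name and addresses.
commands = smtp_dialogue(
    "client.example.org",
    "alice@example.com",
    ["bob@example.com", "carol@example.com"],
    "Subject: Hello\r\n\r\nThis is a test message.",
)
for command in commands:
    print(repr(command))
```

After each command the server answers with a numeric reply code, which is how delivery errors are reported back to the sender.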
Post Office Protocol is used to retrieve email for a single client. POP3 is the
version of POP currently in use. It is an application layer protocol. It allows mail to be read
offline and thus needs less internet time. To access a message, it has to be downloaded. POP allows
only a single mailbox to be created on the mail server. POP does not provide search facilities.
Some of the POP commands are USER, PASS, STAT, LIST, RETR, DELE, RSET, and QUIT.
Internet Message Access Protocol is used to retrieve mail for multiple clients. There are several
IMAP versions: IMAP, IMAP2, IMAP3, IMAP4, etc. IMAP is an application layer protocol.
IMAP allows emails to be accessed without downloading them and also supports email download. The
emails are maintained by the remote server. It enables all email operations, such as creating,
manipulating, and deleting an email without reading it. IMAP allows you to search emails. Some of
the IMAP commands are: LOGIN, CREATE, DELETE, RENAME, SELECT,
EXAMINE, and LOGOUT.
Multipurpose Internet Mail Extensions (MIME) is an additional email protocol that allows
non-ASCII data to be sent through SMTP. It allows users to send and receive different types of data
such as audio, images, videos, and application programs over the Internet. It allows
multiple attachments to be sent with a single message. It allows messages of unlimited length to be sent.
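The way MIME wraps non-ASCII data can be seen with Python's standard `email` package. The sketch below builds a message with a text body and one binary attachment. The addresses, file name, and attachment bytes are placeholders invented for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.com"      # hypothetical addresses
msg["To"] = "bob@example.com"
msg["Subject"] = "Report"
msg.set_content("Plain-text body.")

# Placeholder bytes standing in for a real image file.
png_bytes = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16
msg.add_attachment(
    png_bytes, maintype="image", subtype="png", filename="chart.png"
)

# Adding an attachment turns the message into a multipart container
# that holds the text part and the image part.
content_types = [part.get_content_type() for part in msg.walk()]
print(content_types)

# The binary part is encoded as ASCII text so it can travel over SMTP.
wire_text = msg.as_string()
print(wire_text)
```

Printing `wire_text` shows the boundary markers between parts and the encoded attachment, which is the form an SMTP server actually relays.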
Functions of Email
1. Message Sending and Receiving: The core function of email is to send and
receive messages between users.
2. Attachment Support: Users can attach files, such as documents, images, and
videos, to their emails, facilitating the sharing of information.
3. Organization: Email clients often provide tools for organizing messages into
folders, tagging, and prioritizing to help users manage their communications
effectively.
4. Spam Filtering: Most email services include spam filters to reduce unwanted or
malicious emails, enhancing user experience and security.
5. Integration with Other Tools: Many email services integrate with other
applications (calendars, task managers, CRM systems) to streamline workflows
and improve productivity.
6. Marketing and Newsletters: Businesses use email for marketing purposes,
sending newsletters, promotions, and updates to customers.
7. Collaboration: Email facilitates collaboration through group emails and the
ability to share documents and feedback among teams.
DNS
An application layer protocol defines how application processes running on different systems
pass messages to each other.
DNS (Domain Name System) is a TCP/IP application protocol used on different platforms. The domain
name space is divided into three different sections: generic domains, country domains, and the
inverse domain.
Generic Domains
o It defines the registered hosts according to their generic behavior.
o Each node in a tree defines the domain name, which is an index to the DNS database.
o It uses three-character labels, and these labels describe the organization type.
Label Description
Country Domain
The format of country domain is same as a generic domain, but it uses two-character country
abbreviations (e.g., us for the United States) in place of three-character organizational
abbreviations.
Inverse Domain
The inverse domain is used for mapping an address to a name. For example, suppose a server has
received a request from a client, and the server holds files for authorized clients only. To
determine whether the client is on the authorized list or not, the server sends a query to the DNS
server and asks it to map the client's address to a name.
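For IPv4, an address-to-name query is sent for a special name under the `in-addr.arpa` domain, formed by reversing the octets of the address. The sketch below uses Python's standard `ipaddress` module to build that name. The address is taken from a range reserved for documentation, and no actual DNS query is sent.

```python
import ipaddress

# Address from a range reserved for documentation examples.
address = ipaddress.ip_address("192.0.2.5")

# Name that a reverse (address-to-name) DNS query would ask about.
ptr_name = address.reverse_pointer
print(ptr_name)
```

A resolver would then ask the DNS for the record stored under this name in order to obtain the host name.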
Working of DNS
o DNS is a client/server network communication protocol. DNS clients send requests to the
server, while DNS servers send responses to the client.
o A client request that contains a name to be converted into an IP address is known as a forward
DNS lookup, while a request that contains an IP address to be converted into a name is known
as a reverse DNS lookup.
o DNS implements a distributed database to store the names of all the hosts available on the
internet.
o If a client such as a web browser sends a request containing a hostname, then a piece of
software called a DNS resolver sends a request to a DNS server to obtain the IP address
of that hostname. If the DNS server does not have the IP address associated with the hostname,
it forwards the request to another DNS server. Once the IP address has arrived at the resolver,
the resolver completes the request over the Internet Protocol.
The above representation is showing the DNS Message format in which some fields are set to 0s
for query messages.
Identification: The identification field is made up of 16 bits which are used to match
the response with the request sent from the client-side. The matching is carried out
by this field as the server copies the 16-bit value of identification in the response
message so the client device can match the queries with the corresponding response
received from the server-side.
Flags: It is 16 bits and is divided into the following Fields:
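As a rough sketch of how the identification and flags fields are used, the code below packs a 12-byte DNS header with Python's `struct` module and checks whether a response belongs to a query. The bit positions used (QR as the top bit of the flags, RD as 0x0100) and the four count fields follow the standard DNS header layout; the identification value and the sample response are made up.

```python
import struct

def build_query_header(query_id: int, recursion_desired: bool = True) -> bytes:
    """Pack a 12-byte DNS header for a query carrying one question."""
    # QR bit (0x8000) stays 0 for a query; RD bit (0x0100) asks for recursion.
    flags = 0x0100 if recursion_desired else 0x0000
    # Fields: identification, flags, question count,
    # answer count, authority count, additional count.
    return struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)


def response_matches(query_header: bytes, response_header: bytes) -> bool:
    """True if the response is marked as a response and copies the query ID."""
    (query_id,) = struct.unpack("!H", query_header[:2])
    response_id, response_flags = struct.unpack("!HH", response_header[:4])
    is_response = bool(response_flags & 0x8000)   # QR bit set by the server
    return is_response and query_id == response_id


query = build_query_header(0x1A2B)                # hypothetical ID
# Made-up response header: same ID, QR/RD/RA bits set, one answer.
response = struct.pack("!HHHHHH", 0x1A2B, 0x8180, 1, 1, 0, 0)
unrelated = struct.pack("!HHHHHH", 0x9999, 0x8180, 1, 1, 0, 0)

print(len(query), response_matches(query, response))
print(response_matches(query, unrelated))
```

The answer, authority, and additional counts are zero in the query, which is one way to read the remark that some fields are set to 0s for query messages.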
CDN
A CDN-
1. Manages servers that are geographically distributed over different locations.
2. Stores the web content in its servers.
3. Attempts to direct each user to a server that is part of the CDN so as to deliver content
quickly.
The CDN is a collection of servers or a network of all the servers that deliver data all over the
world to the web user. It has three main components; each component has its own value and role
to play.
The first one is the Origin server which stores all the data spread all over the world. It is the
main server that handles every delivery and also it maintains the updated version of data.
The second one is the Edge server, which stores a temporary copy of the
original data. It is also the one that delivers the data to the web user. There are many
Edge servers, and the nearest Edge server delivers the data to the web user so that there
is as little delay as possible in loading the page.
The third one is a DNS server that keeps track of the IP addresses. Whenever a user sends a
request through the web browser, the DNS server responds with
the IP address of a suitable server. By accessing that IP address, the user gets the data.
The following image depicts the difference between how a request is handled with and without a
CDN, respectively: with a CDN (2 seconds) and without a CDN (5 seconds).
Benefits of CDN
● Security improvement- A CDN improves security through DDoS mitigation, security
certificates, and other optimizations.
● Increase in content availability and redundancy- Hardware failures and heavy traffic can lead
to a website malfunctioning. A CDN can handle more traffic and can withstand hardware failure
better than many origin servers.
● Better load times- The visitor gets fast page loading because a nearby CDN server is used
whenever a client requests a webpage. Faster loading also reduces
bounce rates and increases the amount of time people spend on the site.
● Low bandwidth cost- The direct cost for hosting a website is bandwidth consumption cost.
With the help of caching and other optimizations, it minimizes the amount of data an origin
server must provide, thus reducing the hosting costs.
A peer-to-peer (P2P) network is a simple network of computers. It first came into existence in the late
1970s. Here each computer acts as a node for file sharing within the formed network. Here each
node acts as a server and thus there is no central server in the network. This allows the sharing
of a huge amount of data. The tasks are equally divided amongst the nodes. Each node connected
in the network shares an equal workload. For the network to stop working, all the nodes need to
individually stop working. This is because each node works independently.
Before the development of P2P, USENET came into existence in 1979. The network enabled
users to read and post messages. Unlike the forums we use today, it did not have a central server.
Instead, new messages were copied to all the servers in the network.
In the 1980s the first use of P2P networks occurred after personal computers were
introduced.
In August 1988, the internet relay chat was the first P2P network built to share text
and chat.
In June 1999, Napster was developed which was a file-sharing P2P software. It could
be used to share audio files as well. This software was shut down due to the illegal
sharing of files. But the concept of network sharing i.e P2P became popular.
In June 2000, Gnutella was the first decentralized P2P file sharing network. This
allowed users to access files on other users’ computers via a designated folder.
Types of P2P networks
1. Unstructured P2P networks: In this type of P2P network, each device is able to
make an equal contribution. This network is easy to build as devices can be connected
randomly in the network. But being unstructured, it becomes difficult to find content.
For example, Napster, Gnutella, etc.
2. Structured P2P networks: It is designed using software that creates a virtual layer
in order to put the nodes in a specific structure. These are not easy to set up but can
give easy access to users to the content. For example, P-Grid, Kademlia, etc.
3. Hybrid P2P networks: It combines the features of both P2P networks and
client-server architecture. For example, a central server may be used to find a node, after
which the nodes exchange data directly.
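To give a feel for how a structured network makes content easy to find, the sketch below uses the XOR distance that Kademlia applies between identifiers: a key is looked up at the nodes whose IDs are closest to it. The 4-bit IDs are made up for illustration, and a real system would use much longer IDs and a routing table rather than a full list of nodes.

```python
def xor_distance(a: int, b: int) -> int:
    """Distance between two identifiers, as defined in Kademlia."""
    return a ^ b


def closest_nodes(target_id: int, node_ids, count: int = 2):
    """Return the `count` node IDs nearest to `target_id` by XOR distance."""
    return sorted(node_ids, key=lambda node: xor_distance(node, target_id))[:count]


# Hypothetical 4-bit node IDs and content key.
node_ids = [0b0001, 0b0100, 0b0111, 0b1010, 0b1101]
key = 0b0110

nearest = closest_nodes(key, node_ids)
print([format(node, "04b") for node in nearest])
```

Because every peer applies the same rule, any peer can work out which nodes should hold a given key, which is what an unstructured network lacks.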
These networks do not involve a large number of nodes, usually less than 12. All the
computers in the network store their own data but this data is accessible by the group.
Unlike client-server networks, P2P uses resources and also provides them. This
results in additional resources if the number of nodes increases. It requires specialized
software. It allows resource sharing among the network.
Since the nodes act as clients and servers, there is a constant threat of attack.
Almost all OS today support P2P networks.
In the P2P network architecture, the computers connect with each other in a workgroup to share
files, and access to internet and printers.
Each computer in the network has the same set of responsibilities and capabilities.
Each device in the network serves as both a client and server.
The architecture is useful in residential areas, small offices, or small companies where
each computer acts as an independent workstation and stores the data on its hard drive.
Each computer in the network has the ability to share data with other computers in
the network.
The architecture is usually composed of workgroups of 12 or fewer computers.
Let’s understand the working of the Peer-to-Peer network through an example. Suppose, the user
wants to download a file through the peer-to-peer network then the download will be handled in
this way:
If the peer-to-peer software is not already installed, then the user first has to install
the peer-to-peer software on his computer.
This creates a virtual network of peer-to-peer application users.
The user then downloads the file, which is received in pieces that come from multiple
computers in the network that already have that file.
Data is also sent from the user's computer to other computers in the network that
ask for data that exists on the user's computer.
Thus, it can be said that in the peer-to-peer network the file transfer load is distributed among
the peer computers.
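The download steps above can be sketched as a small simulation in which a file is split into pieces, different peers hold different pieces, and the downloader checks each piece against a known hash before assembling the file. The peer names, chunk size, and file contents are all made up; real P2P software adds networking, piece scheduling, and uploading to other peers.

```python
import hashlib

CHUNK_SIZE = 4  # tiny chunk size, chosen only to keep the example short


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE):
    """Cut the data into consecutive pieces of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def download(expected_hashes, peers):
    """Fetch every chunk from the first peer holding a valid copy of it."""
    pieces, sources = [], []
    for index, expected in enumerate(expected_hashes):
        for name, held in peers.items():
            piece = held.get(index)
            # Accept the piece only if its hash matches the published one.
            if piece is not None and hashlib.sha256(piece).hexdigest() == expected:
                pieces.append(piece)
                sources.append(name)
                break
        else:
            raise LookupError(f"no peer has a valid copy of chunk {index}")
    return b"".join(pieces), sources


original = b"peer-to-peer file"
chunks = split_into_chunks(original)
expected_hashes = [hashlib.sha256(chunk).hexdigest() for chunk in chunks]

# Hypothetical peers, each holding only some chunks (index -> bytes).
peers = {
    "peer_a": {0: chunks[0], 1: chunks[1]},
    "peer_b": {2: chunks[2], 3: chunks[3]},
    "peer_c": {1: chunks[1], 4: chunks[4]},
}

assembled, sources = download(expected_hashes, peers)
print(assembled, sources)
```

No single peer holds the whole file here, yet the downloader can rebuild it, which is the sense in which the transfer load is shared.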
File sharing: P2P network is the most convenient, cost-efficient method for file
sharing for businesses. Using this type of network there is no need for intermediate
servers to transfer the file.
Blockchain: The P2P architecture is based on the concept of decentralization. When
a peer-to-peer network is enabled on the blockchain it helps in the maintenance of a
complete replica of the records ensuring the accuracy of the data at the same time. At
the same time, peer-to-peer networks ensure security also.
Direct messaging: P2P network provides a secure, quick, and efficient way to
communicate. This is possible due to the use of encryption at both the peers and
access to easy messaging tools.
Collaboration: The easy file sharing also helps to build collaboration among other
peers in the network.
File sharing networks: Many P2P file sharing networks like G2, and eDonkey have
popularized peer-to-peer technologies.
Content distribution: In a P2P network, unlike the client-server system, the clients
can both provide and use resources. Thus, the content serving capacity of P2P
networks can actually increase as more users begin to access the content.
IP Telephony: Skype is one good example of a P2P application in VoIP.
Easy to maintain: The network is easy to maintain because each node is independent
of the other.
Less costly: Since each node acts as a server, the cost of a central server
is saved. Thus, there is no need to buy an expensive server.
No network manager: In a P2P network each user manages his or her own
computer, so there is no need for a network manager.
Adding nodes is easy: Adding, deleting, and repairing nodes in this network is easy.
Less network traffic: In a P2P network, there is less network traffic than in a client/
server network.
Video CDN
A video CDN is a CDN that has been designed to support video stream delivery. The use of a CDN
for streaming video helps a stream reach viewers around the world,
minimizes latency and buffering time, and ensures that the stream's source or origin server is not
overwhelmed with requests.
While most CDNs are able to cache and deliver video content alongside HTML, images,
JavaScript, CSS style sheets, and other web content, video CDNs can be constructed exclusively
for streaming video. For instance, Netflix built out their own distributed network called Open
Connect to more efficiently deliver their video content.
CDN:
A content delivery network (CDN) is a group of connected servers that cache and deliver content
over the Internet. CDNs are spread out all over the world, enabling them to deliver content more
efficiently to a wider range of people than an origin server or a single data center can. A CDN
caches content whenever a user requests the content from a website that uses that CDN; to "cache"
means to temporarily store a file.
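A minimal sketch of this caching behavior is shown below: an edge server keeps a copy of each file for a limited time and contacts the origin only on a miss or after the copy expires. The class name, the 60-second lifetime, and the `origin` function are assumptions made for the example, not part of any particular CDN.

```python
import time


class EdgeCache:
    """Toy edge server that caches origin content for a fixed lifetime."""

    def __init__(self, fetch_from_origin, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch_from_origin   # function standing in for the origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}                  # path -> (expiry time, content)
        self.origin_requests = 0          # how often the origin was contacted

    def get(self, path):
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]               # cache hit: served from the edge
        content = self._fetch(path)       # miss or expired: go to the origin
        self.origin_requests += 1
        self._store[path] = (now + self._ttl, content)
        return content


def origin(path):
    """Hypothetical origin server."""
    return f"<html>content of {path}</html>"


edge = EdgeCache(origin, ttl_seconds=60.0)
first = edge.get("/index.html")    # fetched from the origin and cached
second = edge.get("/index.html")   # served from the cached copy
print(edge.origin_requests)
```

The second request never reaches the origin, which is how caching reduces both load time and origin bandwidth.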
Suppose Bob hosts a website, bobisgreat.example.com, on a server in New York City, New York.
When Alice in Albany, New York (about 250 kilometers away), visits the website, it loads quickly,
since the website content has to travel only 250 kilometers. However, when Carlos tries to load
bobisgreat.example.com from his house in Los Angeles, California (about 4,800 kilometers away),
he has to wait a lot longer for the website to load.
If Bob uses a CDN service, the CDN can cache his website's content at locations close to both
Alice and Carlos. Suppose Bob's CDN caches his website at data centers in Albany and Los
Angeles, in addition to New York City. Now both Alice and Carlos hardly have to wait any time
at all for bobisgreat.example.com to load in their browsers.
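The effect of distance in this example can be estimated with simple arithmetic. The sketch below computes only the round-trip propagation delay, assuming signals travel through fiber at roughly 200,000 km per second (an assumed round figure, about two thirds of the speed of light). Real load times are much larger because a page needs several round trips and passes through routers and servers.

```python
# Assumption: signal speed in optical fiber, in kilometers per second.
SPEED_IN_FIBER_KM_PER_S = 200_000


def round_trip_ms(distance_km: float) -> float:
    """Propagation delay for one request and its response, in milliseconds."""
    return 2 * distance_km * 1000 / SPEED_IN_FIBER_KM_PER_S


alice_rtt = round_trip_ms(250)     # Albany to New York City
carlos_rtt = round_trip_ms(4800)   # Los Angeles to New York City

print(f"Alice:  {alice_rtt:.1f} ms per round trip")
print(f"Carlos: {carlos_rtt:.1f} ms per round trip")
```

Under these assumptions, each round trip costs Carlos roughly nineteen times as much as Alice, and that difference is paid again on every round trip the page requires. Serving him from a Los Angeles cache removes most of it.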