4 - Internet Protocols and Standards - HTTP, HTTPS, FTP, SMTP, TCP - IP, URI & URL
• Internet Protocol (IP) – a set of rules that dictate how data should be
delivered over the public network (Internet)
• IP is a set of rules for routing and addressing packets of data so that they
can travel across networks and arrive at the correct destination.
• Data traversing the Internet is divided into smaller pieces, called packets.
IP information is attached to each packet, and this information
helps routers to send packets to the right place.
• Every device or domain that connects to the Internet is assigned an IP
address.
• Once the packets arrive at their destination, they are handled differently
depending on which transport protocol is used in combination with IP. The
most common transport protocols are TCP and UDP (User Datagram
Protocol).
Internet Standard
• The Physical Layer defines the physical characteristics of the network such
as connections, voltage levels and timing.
Hypertext Transfer Protocol
Development of HTTP
• The user-agent is any tool that acts on the behalf of the user. This role is
primarily performed by the Web browser.
• The browser always initiates the request; it is never the server (though
some mechanisms have been added over the years to simulate server-initiated
messages).
• To present a Web page, the browser sends an original request to fetch the
HTML document that represents the page. It then analyzes this file, making
additional requests corresponding to execution scripts, layout information
(CSS) to display, and sub-resources contained within the page (usually images
and videos).
• The Web browser then mixes these resources to present to the user a complete
document, the Web page. Scripts executed by the browser can fetch more
resources in later phases and the browser updates the Web page accordingly.
• A Web page is a hypertext document. This means some parts of displayed text
are links which can be activated (usually by a click of the mouse) to fetch a
new Web page, allowing the user to direct their user-agent and navigate through
the Web.
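The request a browser sends to fetch an HTML document is plain text, and the server's answer begins with a status line. The following is a minimal sketch of both, not a full HTTP client; the helper names `build_get_request` and `parse_status_line` and the host `example.com` are assumptions made for illustration.

```python
from typing import Tuple


def build_get_request(host: str, path: str = "/") -> bytes:
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, target, version
        f"Host: {host}",         # the Host header is mandatory in HTTP/1.1
        "Connection: close",     # ask the server to close after responding
        "",                      # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(line: str) -> Tuple[str, int, str]:
    """Split a response status line into version, status code, and reason."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


# Hypothetical host and path, used only to show the message layout.
request = build_get_request("example.com", "/index.html")
status = parse_status_line("HTTP/1.1 200 OK")
print(request.decode("ascii"))
print(status)
```

A browser would send one such request for the HTML document and then further requests for each script, style sheet, and image the page refers to.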
The Web server
• HTTPS adds
1. Encryption
2. Authentication
3. Integrity to the HTTP protocol
How is HTTPS different from HTTP?
1. Encryption
– HTTP was originally designed as a clear text protocol, so it is vulnerable
to eavesdropping (spying) and man-in-the-middle attacks.
– By including SSL/TLS encryption, HTTPS prevents data sent over the
internet from being intercepted and read by a third party.
– Through public-key cryptography and the SSL/TLS handshake, an
encrypted communication session can be securely set up between two
parties who have never met in person (e.g. a web server and browser)
via the creation of a shared secret key.
2. Authentication
• Unlike HTTP, HTTPS includes robust authentication via the SSL/TLS
protocol.
• A website’s SSL/TLS certificate includes a public key and is signed by a
certificate authority. During the handshake, the web browser uses it to
confirm that the server is in possession of the corresponding private key
and that the certificate was issued for the site's domain name.
• HTTPS websites can also be configured for mutual authentication, in
which a web browser presents a client certificate identifying the user.
• Mutual authentication is useful for situations such as remote work, where it
is desirable to include multi-factor authentication, reducing the risk of
phishing or other attacks involving credential theft.
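The authentication checks described above are what a TLS client library enables by default. As a small sketch, Python's standard `ssl` module can show which checks a default client context performs; the variable names are illustrative, and no network connection is made.

```python
import ssl

# A default client context follows the usual browser-like policy:
# the server must present a certificate that chains to a trusted
# authority, and the certificate must match the requested host name.
context = ssl.create_default_context()

certificate_required = context.verify_mode == ssl.CERT_REQUIRED
hostname_checked = context.check_hostname

print("server certificate required:", certificate_required)
print("host name checked:", hostname_checked)

# For mutual authentication, a client certificate would be loaded with
# context.load_cert_chain(certfile=..., keyfile=...); the file paths are
# deployment specific and are deliberately left out of this sketch.
```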
3. Integrity
• Data (such as a web page, image, or JavaScript file) sent to a browser by
an HTTPS web server is protected by an authentication tag on each encrypted
record, which the browser uses to determine that the data has not been
altered by a third party or otherwise corrupted while in transit.
• The sender computes this tag from the record's contents and the shared
secret key, and the receiver independently recomputes it; a mismatch shows
that the record was modified.
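The idea that the receiver recomputes a value and compares it with the one it was sent can be illustrated with a keyed hash (HMAC). This is a conceptual sketch only: real TLS record protection differs in detail, and the key, document, and function names below are made up for the example.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute a keyed SHA-256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def is_intact(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


# Hypothetical shared secret; in HTTPS it would come from the handshake.
shared_key = b"hypothetical-session-key"
document = b"<html>hello</html>"
tag = make_tag(shared_key, document)

untouched = is_intact(shared_key, document, tag)
tampered = is_intact(shared_key, document + b"<script>", tag)
print(untouched, tampered)
```

Without the shared key, a third party that alters the document cannot produce a matching tag, so the change is detected.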
Why use HTTPS?
Privacy
• HTTPS plays an important role in preventing:
– Intruders scooping up credit card numbers and passwords (when you shop
or bank online)
– Snooping (including by governments, employers, or someone
building a profile to de-anonymize your online activities)
User Experience
• Provide secured website
• Great user experience
Compatibility
• Current browser changes are pushing HTTP ever closer to incompatibility.
• Mozilla Firefox recently announced an optional HTTPS-only mode, while
Google Chrome is steadily moving to block mixed content (HTTP resources
linked from HTTPS pages).
SEO
• Search engines (including Google) use HTTPS as a ranking signal when
generating search results.
• Website owners can get an easy SEO boost just by configuring web servers to
use HTTPS rather than HTTP.
What happens if my website doesn’t use HTTPS?
• As of 2020, websites that do not use HTTPS or that serve mixed content
(resources like images loaded via HTTP from HTTPS pages) are subject to
browser security warnings and errors.
• Furthermore, these websites unnecessarily compromise their users’ privacy
and security, and are not preferred by search engine algorithms.
• Therefore, HTTP and mixed-content websites can expect more browser
warnings and errors, lower user trust, and poorer SEO than if they had
enabled HTTPS.
How do I know if a website uses HTTPS?
• In modern browsers like Chrome, Firefox, and Safari, an HTTPS website is
marked with a lock icon in the address bar. Users can click the lock to see
whether the website’s digital certificate includes identifying information
about its owner.
FTP
File Transfer Protocol
• FTP
– File transfer protocol.
– Network protocol for transmitting files between computers over
TCP/IP connections.
• FTP is an application layer protocol.
• In an FTP transaction
– The end user's computer is called the local host.
– The second computer involved in FTP is a remote host, which is
usually a server.
• Both computers need to be connected via a network and configured
properly to transfer files via FTP. Servers must be set up to run FTP
services, and the client must have FTP software (FileZilla, Progress
MOVEit, WinSCP, etc.) installed to access these services.
• Used for transferring the web page files from server to client.
• Used for downloading the files to computer from other servers.
How does FTP work?
• Control Connection
– Uses very simple rules for communication.
– Can transfer a line of command or line of response
at a time.
– Is made between the control processes.
– Remains connected during the entire interactive FTP
session.
• Data Connection
– Uses very complex rules as data types may vary.
– Is made between data transfer processes.
– The data connection opens when a command comes
for transferring the files and closes when the file is
transferred.
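The split between the two connections can be seen in how a client learns where to open the data connection. In passive mode (a standard FTP feature not covered in the bullets above), the server answers on the control connection with a one-line `227` reply that encodes an address and port. The sketch below parses such a line; the function name and the sample reply are assumptions made for illustration.

```python
import re
from typing import Tuple


def parse_pasv_reply(reply: str) -> Tuple[str, int]:
    """Extract the data-connection host and port from a 227 reply line."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if not reply.startswith("227") or match is None:
        raise ValueError(f"not a passive-mode reply: {reply!r}")
    h1, h2, h3, h4, p1, p2 = (int(group) for group in match.groups())
    # The port is sent as two bytes: high byte first, then low byte.
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2


# Hypothetical reply line as it would arrive on the control connection.
host, port = parse_pasv_reply("227 Entering Passive Mode (192,0,2,10,195,80)")
print(host, port)
```

The client would then open a second TCP connection to that host and port for the file itself, while the control connection stays open for further commands.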
Uses of FTP
• FTP is used for file transfers between one system and another, and it has
several common use cases, including the following:
• Backup
– FTP can be used by backup services or individual users to back up data
from one location to a secured backup server running FTP services.
• Replication
– Similar to backup, replication involves duplication of data from one
system to another but takes a more comprehensive approach to provide
higher availability and resilience.
• Access and data loading
– FTP is also commonly used to access shared web hosting and cloud
services as a mechanism to load data onto a remote system.
Advantages of FTP
• Speed
– One of the biggest advantages of FTP is speed.
– FTP is one of the fastest ways to transfer files from one computer to
another.
• Efficient
– It is efficient because we do not need to complete all the operations to
get the entire file.
• Security
– To access the FTP server, we need to log in with a username and
password, which provides basic access control. Plain FTP does not encrypt
these credentials, however.
• Back & forth movement
– FTP allows us to transfer the files back and forth. Suppose you are a
manager of the company, you send some information to all the
employees, and they all send information back on the same server.
Disadvantages of FTP
• The industry standard requirement is that all FTP transmissions should be
encrypted. However, not all FTP providers offer encryption, so we have to
look for providers that do.
• FTP serves two operations, i.e., sending and receiving large files over a
network. However, some implementations limit the size of a file that can be
sent to 2 GB. It also does not allow simultaneous transfers to multiple
receivers.
• Passwords and file contents are sent in clear text, which allows
eavesdropping. Attackers can also carry out brute-force attacks by trying
to guess the FTP password.
• It is not compatible with every system.
SMTP
(Simple Mail Transfer Protocol)
• Using a process called “store and forward,” SMTP moves email on and
across networks.
• It works closely with the Mail Transfer Agent (MTA) to send
communication to the right computer and email inbox.
• The main purpose of SMTP is to set up communication rules between
servers. The servers have a way of identifying themselves and announcing
what kind of communication they are trying to perform. They also have a
way of handling errors such as an incorrect email address.
• For example, if the recipient address is wrong, the receiving server replies
with an error message.
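SMTP servers report success and errors with three-digit reply codes, where the first digit gives the general outcome. The sketch below classifies a reply line by that first digit; the function name and the sample reply lines are illustrative assumptions, not output from a real server.

```python
def classify_reply(line: str) -> str:
    """Classify an SMTP reply line by the first digit of its code."""
    code = int(line[:3])
    if 200 <= code < 300:
        return "success"
    if 300 <= code < 400:
        return "intermediate"       # server is waiting for more input
    if 400 <= code < 500:
        return "transient failure"  # the sender may retry later
    if 500 <= code < 600:
        return "permanent failure"  # e.g. the mailbox does not exist
    raise ValueError(f"unexpected reply code: {code}")


# Hypothetical reply lines for illustration.
accepted = classify_reply("250 OK")
waiting = classify_reply("354 Start mail input")
rejected = classify_reply("550 5.1.1 User unknown")
print(accepted, waiting, rejected, sep=" | ")
```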
Model of SMTP system
• In the SMTP model, the user deals with the user agent (UA), for example
Microsoft Outlook, Netscape, Mozilla, etc.
• The MTA is used to exchange the mail over TCP. The users sending the mail
do not have to deal with the MTA; it is the responsibility of the system
admin to set up the local MTA.
• The MTA maintains a small queue of messages so that it can schedule repeat
delivery of mail in case the receiver is not available.
• The MTA delivers the mail to the mailboxes, and the information can later
be downloaded by the user agents.
Connection-oriented
• TCP is a connection-oriented service, which means that data exchange
occurs only after the connection is established. When the data transfer is
complete, the connection is terminated.
Full duplex
• TCP is full duplex, which means that data can flow in both directions at
the same time.
Stream-oriented
• TCP is a stream-oriented protocol as it allows the sender to send the data in
the form of a stream of bytes and also allows the receiver to accept the data
in the form of a stream of bytes.
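These three properties of TCP can be observed with a small loopback experiment. The sketch below assumes that opening a TCP socket on 127.0.0.1 is permitted on the machine running it; the helper `recv_exact` is a name invented for this example.

```python
import socket


def recv_exact(conn: socket.socket, count: int) -> bytes:
    """Read exactly `count` bytes from a stream socket."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = conn.recv(remaining)
        if not chunk:
            raise ConnectionError("connection closed early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    # Connection-oriented: data flows only after connect/accept succeed.
    with socket.create_connection(server.getsockname()) as client:
        conn, _ = server.accept()
        with conn:
            # Stream-oriented: two sends arrive as one sequence of bytes,
            # with no message boundaries preserved.
            client.sendall(b"hello ")
            client.sendall(b"world")
            received = recv_exact(conn, 11)
            # Full duplex: the same connection carries data back.
            conn.sendall(b"ack")
            reply = recv_exact(client, 3)

print(received, reply)
```

Because TCP does not preserve message boundaries, the receiver has to know how many bytes to expect, which is why `recv_exact` loops until the requested count has arrived.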
Working of TCP
• IP stands for Internet Protocol.
• It is a protocol used for sending packets from source to destination.
• The main task of IP is to deliver the packets from source to destination
based on the IP addresses available in the packet headers.
• IP defines the packet structure that encapsulates the data to be delivered,
as well as the addressing method that labels the datagram with source and
destination information.
• IP provides a connectionless service and is used together with a transport
protocol, TCP or UDP, so the combination is also referred to as TCP/IP or
UDP/IP.
Function of Internet Protocol (IP)
• Before an IP packet is sent over the network, it is assembled from two
major components, i.e., a header and a payload.
• An IP header contains information about the IP packet, which includes -
– Source IP address: The address of the host that is sending the data.
– Destination IP address: The address of the host that receives the data
from the sender.
– Header length: Length of the IP header in 4-byte (32-bit) units known as
“words”; it includes any option fields present and the padding needed to
align the header on a 32-bit boundary.
– Packet length: The total length of the packet in bytes, including both
the header and the payload.
– TTL (Time to Live): The number of hops (routers) the packet may pass
through before it is discarded.
– Transport protocol: The transport protocol carried in the payload, for
example TCP or UDP.
• Payload: The payload is the data that is to be transported.
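The header fields listed above sit at fixed positions in the first 20 bytes of an IPv4 packet, so they can be read with a fixed-layout unpack. The sketch below builds a sample packet by hand and parses it back; the addresses are documentation addresses, the checksum is left at zero, and the function name is an assumption, so this illustrates the layout rather than producing a valid packet for the wire.

```python
import socket
import struct

# Layout of the fixed 20-byte IPv4 header in network byte order:
# version/IHL, TOS, total length, identification, flags/fragment offset,
# TTL, protocol, header checksum, source address, destination address.
IPV4_HEADER = "!BBHHHBBH4s4s"


def parse_ipv4_header(packet: bytes) -> dict:
    """Decode the fixed part of an IPv4 header into named fields."""
    (version_ihl, _tos, total_length, _ident, _flags_frag,
     ttl, protocol, _checksum, src, dst) = struct.unpack(IPV4_HEADER, packet[:20])
    return {
        "version": version_ihl >> 4,
        "header_length_bytes": (version_ihl & 0x0F) * 4,  # IHL is in 4-byte words
        "total_length": total_length,                      # header plus payload
        "ttl": ttl,
        "protocol": {6: "TCP", 17: "UDP"}.get(protocol, str(protocol)),
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


# Hand-built sample packet (checksum left as 0 for simplicity).
payload = b"hello"
sample = struct.pack(
    IPV4_HEADER, 0x45, 0, 20 + len(payload), 0, 0, 64, 6, 0,
    socket.inet_aton("192.0.2.1"), socket.inet_aton("198.51.100.7"),
) + payload

fields = parse_ipv4_header(sample)
print(fields)
```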
2) IP Addressing
a) Public address
• The public address is also known as an external address, as it is grouped
under the WAN addresses.
• We can define the public address as a way to communicate outside the
network. This address is used to access the internet.
• The public address of our computer allows remote access to it. With the
help of a public address, we can set up a home server that can be reached
from the internet.
b) Private address
• A private address is also known as an internal address, as it is grouped
under the LAN addresses.
• It is used to communicate within the network. These addresses are not
routed on the internet, so no traffic can come directly from the internet to
a private address.
• Private addresses are mainly assigned to the computers, printers, and
smartphones that are kept inside a home or within an organization.
• Example - a private address is assigned to a printer kept inside our home,
so that family members can print from it.
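Whether an address belongs to a private range can be checked with Python's standard `ipaddress` module. This is a small sketch; the function name and the sample addresses are chosen for illustration, and labelling every non-private address as "public" is a simplification.

```python
import ipaddress


def address_kind(text: str) -> str:
    """Return 'private' for addresses in reserved private ranges, else 'public'."""
    return "private" if ipaddress.ip_address(text).is_private else "public"


# Hypothetical devices: a home printer, an office PC, and an internet host.
printer = address_kind("192.168.1.20")
office_pc = address_kind("10.0.0.5")
internet_host = address_kind("8.8.8.8")
print(printer, office_pc, internet_host)
```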
Uniform Resource Identifier
(URI)
• Scheme
– The first component of a URI is the scheme, a sequence of characters
that begins with a letter and can contain any combination of letters,
digits, plus signs (+), periods (.), or hyphens (-), followed by a
colon (:). Popular schemes are http, file, ftp, data, and irc. Schemes
should be registered with IANA (Internet Assigned Numbers Authority).
• Authority
– The authority component is optional and preceded by two slashes (//).
It contains three sub-components:
• userinfo: It may contain a username and an optional password
separated by a colon. The sub-component is followed by the @
symbol.
• host: It contains either a registered name or an IP address. An
IPv6 address must be enclosed within [] brackets.
• port: An optional port number, separated from the host by a colon.
Syntax of URI
• Path
– It consists of a sequence of path segments separated by a slash (/). A
URI always has a path component; however, the path may be empty (of
zero length).
• Query
– It is an optional component, which is preceded by a question mark(?).
It contains a query string of non-hierarchical data.
• Fragment
– It is also an optional component, preceded by a hash(#) symbol. It
consists of a fragment identifier that provides direction to a secondary
resource.
Some examples of URI
1. mailto:[email protected]
2. news:comp.infosystems.www.servers.unix
3. urn:oasis:names:specification:docbook:dtd:xml:4.1.2
4. foo://example.com:8042/over/there?name=ferret#nose
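The components described above can be pulled out of the last example with Python's standard `urllib.parse` module. This is a short sketch; the dictionary keys are just labels chosen for this example.

```python
from urllib.parse import urlsplit

parts = urlsplit("foo://example.com:8042/over/there?name=ferret#nose")

components = {
    "scheme": parts.scheme,      # text before the first colon
    "authority": parts.netloc,   # text between "//" and the path
    "host": parts.hostname,
    "port": parts.port,
    "path": parts.path,
    "query": parts.query,        # text after "?"
    "fragment": parts.fragment,  # text after "#"
}
for name, value in components.items():
    print(f"{name}: {value}")
```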
Uniform Resource Locator (URL)
• URL
– It is mainly referred to as the address of the website, which a user can
find in their address bars.
Syntax of URL
– Path
• The path indicates the complete path to the resource on the
webserver. It can be like /software/htp/index.html.
– Query String
• It is the string that contains the name and value pair. If it is used in
a URL, it follows the path component and gives the information.
Such as "?key1=value1&key2=value2".
– Fragment
• It is also an optional component, preceded by a hash(#) symbol. It
consists of a fragment identifier that provides direction to a
secondary resource.
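The name and value pairs in a query string can be decoded with `parse_qs` from Python's standard library. In the sketch below the URL is a hypothetical one assembled from the pieces described above.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical URL combining a path, a query string, and a fragment.
url = "https://www.example.com/software/index.html?key1=value1&key2=value2#top"

parts = urlsplit(url)
params = parse_qs(parts.query)  # each name maps to a list of values

print(parts.path)
print(params)
print(parts.fragment)
```

Each value is a list because the same name may appear more than once in a query string.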
Difference between URI and URL
• URI: An acronym for Uniform Resource Identifier.
– URL: An acronym for Uniform Resource Locator.
• URI: Has two subsets: URN, which gives the name, and URL, which gives
the location.
– URL: A subset of URI that gives only the location of the resource.
• URI: Not all URIs are URLs, as a URI can give either a name or a location.
– URL: All URLs are URIs, as every URL contains only the location.
• URI: Aims to identify a resource and differentiate it from other resources
by using the name of the resource or the location of the resource.
– URL: Aims to find the location or address of a resource on the web.
• URI: An example is ISBN 0-486-35557-4.
– URL: An example is https://www.javatpoint.com.
• URI: Commonly used in XML and tag library files such as JSTL and XSLT to
identify resources and binaries.
– URL: Mainly used to locate web pages on the internet.
• URI: The scheme can be a protocol, designation, specification, or anything
else.
– URL: The scheme is usually a protocol such as HTTP, HTTPS, FTP, etc.
Conclusions