15-441 Computer Networking
Lecture 25 – The Web
Lecture 19: 2006-11-02
Outline
• HTTP review and details (more in notes)
• Persistent HTTP review
• HTTP caching
• Content distribution networks
HTTP Basics (Review)
• HTTP layered over bidirectional byte stream
• Almost always TCP
• Interaction
• Client sends request to server, followed by
response from server to client
• Requests/responses are encoded in text
• Stateless
• Server maintains no information about past
client requests
How to Mark End of Message? (Review)
• Size of message → Content-Length
• Must know size of transfer in advance
• Delimiter → MIME-style Content-Type
• Server must “escape” delimiter in content
• Close connection
• Only server can do this
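A minimal sketch of the first and third options above (known length versus close-to-end). It assumes the headers are already parsed into a dict with lower-case names and that `stream` is a buffered binary file object (for a socket, something like `sock.makefile("rb")`). `read_body` is a hypothetical helper name, not a library function.

```python
import io

def read_body(stream, headers):
    """Return the message body from a stream positioned just after the blank line."""
    length = headers.get("content-length")
    if length is not None:
        # Size known in advance: read exactly that many bytes and stop,
        # leaving the connection usable for the next message.
        return stream.read(int(length))
    # No length given: the body runs until the sender closes the connection (EOF).
    return stream.read()

# In-memory streams stand in for a connection here.
with_length = read_body(io.BytesIO(b"helloNEXT-MESSAGE"), {"content-length": "5"})
until_close = read_body(io.BytesIO(b"all of it"), {})
```

With Content-Length the reader can stop before the next message begins, which is what makes connection reuse possible.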
HTTP Request (review)
• Request line
• Method
• GET – return the resource identified by the URI
• HEAD – return headers only of GET response
• POST – send data to the server (forms, etc.)
• URL (relative)
• E.g., /index.html
• HTTP version
HTTP Request (cont.) (review)
• Request headers
• Authorization – authentication info
• Accept, Accept-Encoding – acceptable document types/encodings
• From – user email
• If-Modified-Since
• Referer – what caused this page to be requested (the header name is spelled "Referer" on the wire)
• User-Agent – client software
• Blank-line
• Body
HTTP Request Example (review)
GET / HTTP/1.1
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
Host: www.intel-iris.net
Connection: Keep-Alive
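As an illustration of the request layout (request line, headers, blank line, body), here is a small parsing sketch. `parse_request` is a hypothetical helper, and it assumes CRLF line endings and no folded header lines; `EXAMPLE` is an abbreviated copy of the request above.

```python
def parse_request(raw):
    """Split a raw HTTP request into (method, target, version, headers, body)."""
    head, _, body = raw.partition("\r\n\r\n")      # blank line ends the headers
    lines = head.split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()   # header names are case-insensitive
    return method, target, version, headers, body

EXAMPLE = (
    "GET / HTTP/1.1\r\n"
    "Accept: */*\r\n"
    "Host: www.intel-iris.net\r\n"
    "Connection: Keep-Alive\r\n"
    "\r\n"
)
method, target, version, headers, body = parse_request(EXAMPLE)
```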
HTTP Response (review)
• Status-line
• HTTP version
• 3 digit response code
• 1XX – informational
• 2XX – success
• 200 OK
• 3XX – redirection
• 301 Moved Permanently
• 302 Found (called "Moved Temporarily" in HTTP/1.0)
• 304 Not Modified
• 4XX – client error
• 404 Not Found
• 5XX – server error
• 505 HTTP Version Not Supported
• Reason phrase
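The status-line structure can be sketched as follows. `parse_status_line` and `STATUS_CLASSES` are hypothetical names for this illustration; the class of a response is taken from the first digit of the code.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def parse_status_line(line):
    """Return (version, code, reason, class) for a status line."""
    version, code, reason = line.split(" ", 2)   # the reason phrase may contain spaces
    code = int(code)
    return version, code, reason, STATUS_CLASSES[code // 100]

not_modified = parse_status_line("HTTP/1.1 304 Not Modified")
ok = parse_status_line("HTTP/1.1 200 OK")
```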
HTTP Response (cont.) (review)
• Headers
• Location – for redirection
• Server – server software
• WWW-Authenticate – request for authentication
• Allow – list of methods supported (GET, HEAD, etc.)
• Content-Encoding – e.g., x-gzip
• Content-Length
• Content-Type
• Expires
• Last-Modified
• Blank-line
• Body
HTTP Response Example (review)
HTTP/1.1 200 OK
Date: Tue, 27 Mar 2001 03:49:38 GMT
Server: Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_ssl/2.7.1 OpenSSL/0.9.5a DAV/1.0.2 PHP/4.0.1pl2 mod_perl/1.24
Last-Modified: Mon, 29 Jan 2001 17:54:18 GMT
ETag: "7a11f-10ed-3a75ae4a"
Accept-Ranges: bytes
Content-Length: 4333
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html
…..
Outline
• HTTP intro and details
• Persistent HTTP
• HTTP caching
• Content distribution networks
Typical Workload (Web Pages)
• Multiple (typically small) objects per page
• File sizes
• Heavy-tailed
• Pareto distribution for tail
• Lognormal for body of distribution
-- For reference/interest only --
• Embedded references
• Number of embedded objects follows a Pareto distribution – $p(x) = a k^a x^{-(a+1)}$
HTTP 0.9/1.0 (mostly review)
• One request/response per TCP connection
• Simple to implement
• Disadvantages
• Multiple connection setups → three-way handshake each time
• Several extra round trips added to transfer
• Multiple slow starts
Single Transfer Example
[Figure: client/server timeline from 0 RTT to 4 RTT showing SYN, ACK, DAT, and FIN segments. Annotations, in order: client opens TCP connection; client sends HTTP request for HTML; server reads from disk; client parses HTML; client opens TCP connection; client sends HTTP request for image; server reads from disk; image begins to arrive.]
More Problems
• Short transfers are hard on TCP
• Stuck in slow start
• Loss recovery is poor when windows are small
• Lots of extra connections
• Increases server state/processing
• Server also forced to keep TIME_WAIT
connection state
-- Things to think about --
• Why must the server keep these?
• The number of TIME_WAIT entries tends to be an order of magnitude greater than the # of active connections; why?
Persistent Connection Solution (review)
• Multiplex multiple transfers onto one TCP connection
• How to identify requests/responses
• Delimiter → Server must examine response for delimiter string
• Content-length and delimiter → Must know size of transfer in advance
• Block-based transmission → send in multiple length-delimited blocks
• Store-and-forward → wait for entire response and then use content-length
• Solution → use existing methods and close connection otherwise
Persistent Connection Example (review)
[Figure: client/server timeline from 0 RTT to 2 RTT on a single persistent connection, showing ACK and DAT segments. Annotations, in order: client sends HTTP request for HTML; server reads from disk; client parses HTML; client sends HTTP request for image; server reads from disk; image begins to arrive.]
Persistent HTTP (review)
Nonpersistent HTTP issues:
• Requires 2 RTTs per object
• OS must work and allocate
host resources for each TCP
connection
• But browsers often open
parallel TCP connections to
fetch referenced objects
Persistent HTTP
• Server leaves connection
open after sending response
• Subsequent HTTP messages
between same client/server
are sent over connection
Persistent without pipelining:
• Client issues new request
only when previous
response has been received
• One RTT for each
referenced object
Persistent with pipelining:
• Allowed in HTTP/1.1 (persistent connections are the default; pipelining itself is optional)
• Client sends requests as
soon as it encounters a
referenced object
• As little as one RTT for all
the referenced objects
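The round-trip counts on this slide can be turned into a rough back-of-the-envelope model. This is only a sketch under simplifying assumptions: one HTML page plus `n_objects` referenced objects, objects fetched one after another (no parallel connections), and transmission time, slow start, and server disk time all ignored. `page_rtts` is a hypothetical helper.

```python
def page_rtts(n_objects, mode):
    """Approximate RTTs to load one HTML page plus n_objects referenced objects."""
    base = 2  # TCP handshake (1 RTT) + request/response for the HTML (1 RTT)
    if mode == "nonpersistent":
        return base + 2 * n_objects               # new connection + request per object
    if mode == "persistent":
        return base + n_objects                   # one request/response RTT per object
    if mode == "pipelined":
        return base + (1 if n_objects > 0 else 0) # all requests sent back to back
    raise ValueError(f"unknown mode: {mode}")

for mode in ("nonpersistent", "persistent", "pipelined"):
    print(mode, page_rtts(10, mode))
```

For a page with 10 referenced objects the model gives 22, 12, and 3 RTTs respectively.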
Outline
• HTTP intro and details
• Persistent HTTP
-- new stuff --
• HTTP caching
• Content distribution networks
HTTP Caching
• Clients often cache documents
• Challenge: update of documents
• If-Modified-Since requests to check
• HTTP 0.9/1.0 used just date
• HTTP 1.1 has an opaque “entity tag” (could be a file signature,
etc.) as well
• When/how often should the original be checked
for changes?
• Check every time?
• Check each session? Day? Etc?
• Use Expires header
• If no Expires, often use Last-Modified as estimate
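One way a cache might decide how long a copy stays fresh, sketched under assumptions: `freshness_lifetime` is a hypothetical helper, and the 10% fraction of the document's age is a commonly used heuristic chosen here for illustration, not something this lecture specifies.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

def freshness_lifetime(headers, fraction=0.1):
    """Return how long a cached response may be reused without revalidation."""
    date = parsedate_to_datetime(headers["Date"])
    if "Expires" in headers:
        # Explicit expiry from the server wins.
        return max(parsedate_to_datetime(headers["Expires"]) - date, timedelta(0))
    if "Last-Modified" in headers:
        # Estimate: a document unchanged for a long time is likely to stay unchanged.
        age_at_fetch = date - parsedate_to_datetime(headers["Last-Modified"])
        return max(age_at_fetch * fraction, timedelta(0))
    return timedelta(0)  # nothing to go on: revalidate every time

lifetime = freshness_lifetime({
    "Date": "Tue, 27 Mar 2001 03:49:38 GMT",
    "Last-Modified": "Mon, 29 Jan 2001 17:54:18 GMT",
})
```

With the dates from the response example earlier in the deck, the document was about 56 days old when fetched, so the estimate is between five and six days.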
Example Cache Check Request
GET / HTTP/1.1
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
If-Modified-Since: Mon, 29 Jan 2001 17:54:18 GMT
If-None-Match: "7a11f-10ed-3a75ae4a"
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
Host: www.intel-iris.net
Connection: Keep-Alive
Example Cache Check Response
HTTP/1.1 304 Not Modified
Date: Tue, 27 Mar 2001 03:50:51 GMT
Server: Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_ssl/2.7.1 OpenSSL/0.9.5a DAV/1.0.2 PHP/4.0.1pl2 mod_perl/1.24
Connection: Keep-Alive
Keep-Alive: timeout=15, max=100
ETag: "7a11f-10ed-3a75ae4a"
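A sketch of how a server might evaluate the cache-check headers to choose between 304 and 200. `check_conditional` is a hypothetical helper; it assumes strong entity tags only (no `W/` prefix handling) and gives If-None-Match precedence over If-Modified-Since when both are present.

```python
from email.utils import parsedate_to_datetime

def check_conditional(request_headers, current_etag, last_modified):
    """Return 304 if the client's cached copy is still valid, else 200."""
    inm = request_headers.get("If-None-Match")
    if inm is not None:
        # The entity tag is opaque: compare it as a string, never interpret it.
        tags = [t.strip() for t in inm.split(",")]
        return 304 if (current_etag in tags or "*" in tags) else 200
    ims = request_headers.get("If-Modified-Since")
    if ims is not None:
        if parsedate_to_datetime(last_modified) <= parsedate_to_datetime(ims):
            return 304
    return 200

request = {
    "If-Modified-Since": "Mon, 29 Jan 2001 17:54:18 GMT",
    "If-None-Match": '"7a11f-10ed-3a75ae4a"',
}
status = check_conditional(request, '"7a11f-10ed-3a75ae4a"',
                           "Mon, 29 Jan 2001 17:54:18 GMT")
```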
Ways to cache
Client-directed caching: Web Proxies
Server-directed caching: Content Delivery Networks
(CDNs)
Web Proxy Caches
• User configures browser:
Web accesses via cache
• Browser sends all HTTP
requests to cache
• Object in cache: cache
returns object
• Else cache requests object
from origin server, then
returns object to client
[Figure: two clients exchange HTTP requests and responses with a proxy server; the proxy server exchanges HTTP requests and responses with origin servers.]
Caching Example (1)
Assumptions
• Average object size = 100,000
bits
• Avg. request rate from the institution’s browsers to origin
servers = 15/sec
• Delay from institutional router to
any origin server and back to
router = 2 sec
Consequences
• Utilization on LAN = 15%
• Utilization on access link = 100%
• Total delay = Internet delay +
access delay + LAN delay
= 2 sec + minutes + milliseconds
[Figure: institutional network (10 Mbps LAN) connected through a 1.5 Mbps access link to the public Internet and the origin servers.]
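The utilization figures follow from dividing offered load by link capacity. A short sketch of that arithmetic, using the numbers from this slide (`utilization` is a hypothetical helper name):

```python
def utilization(requests_per_sec, object_bits, link_bps):
    """Fraction of link capacity consumed by the offered load."""
    return requests_per_sec * object_bits / link_bps

offered_bps = 15 * 100_000                        # 1.5 Mbps of object traffic
lan = utilization(15, 100_000, 10_000_000)        # 10 Mbps LAN
access = utilization(15, 100_000, 1_500_000)      # 1.5 Mbps access link

print(f"LAN: {lan:.0%}, access link: {access:.0%}")
```

As utilization on the access link approaches 100%, queueing delay grows without bound, which is why the access delay is quoted in minutes.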
Caching Example (2)
Possible solution
• Increase bandwidth of access
link to, say, 10 Mbps
• Often a costly upgrade
Consequences
• Utilization on LAN = 15%
• Utilization on access link = 15%
• Total delay = Internet delay +
access delay + LAN delay
= 2 sec + msecs + msecs
[Figure: institutional network (10 Mbps LAN) connected through a 10 Mbps access link to the public Internet and the origin servers.]
Caching Example (3)
Install cache
• Suppose hit rate is 0.4
Consequence
• 40% requests will be satisfied almost
immediately (say 10 msec)
• 60% requests satisfied by origin
server
• Utilization of access link reduced to
60%, resulting in negligible delays
• Weighted average of delays
= .6*2 sec + .4*10msecs < 1.3 secs
[Figure: institutional network (10 Mbps LAN) with an institutional cache, connected through a 1.5 Mbps access link to the public Internet and the origin servers.]
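The weighted average can be checked with a few lines. This sketch assumes, as the slide does, that a miss costs only the 2-second Internet delay because the access link is no longer saturated; `average_delay` is a hypothetical helper name.

```python
def average_delay(hit_rate, hit_delay_s, miss_delay_s):
    """Expected per-request delay given a cache hit rate."""
    return hit_rate * hit_delay_s + (1 - hit_rate) * miss_delay_s

avg = average_delay(0.4, 0.010, 2.0)              # 0.4 * 10 msec + 0.6 * 2 sec

# Only misses cross the access link, so its load drops to 60% of 1.5 Mbps.
access_util = (1 - 0.4) * 15 * 100_000 / 1_500_000

print(f"average delay: {avg:.3f} s, access utilization: {access_util:.0%}")
```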
Problems
• Over 50% of all HTTP objects are uncacheable – why?
• Not easily solvable
• Dynamic data → stock prices, scores, web cams
• CGI scripts → results based on passed parameters
• Obvious fixes
• SSL → encrypted data is not cacheable
• Most web clients don’t handle mixed pages well → many generic
objects transferred with SSL
• Cookies → results may be based on passed data
• Hit metering → owner wants to measure # of hits for revenue, etc.
• What will be the end result?
Content Distribution Networks (CDNs)
• The content providers are the
CDN customers.
Content replication
• CDN company installs hundreds
of CDN servers throughout
Internet
• Close to users
• CDN replicates its customers’
content in CDN servers. When
provider updates content, CDN
updates servers
[Figure: origin server in North America, a CDN distribution node, and CDN servers in S. America, Europe, and Asia.]
Outline
• HTTP intro and details
• Persistent HTTP
• HTTP caching
• Content distribution networks
Content Distribution Networks &
Server Selection
• Replicate content on many servers
• Challenges
• How to replicate content
• Where to replicate content
• How to find replicated content
• How to choose among known replicas
• How to direct clients towards replica
Server Selection
• Which server?
• Lowest load → to balance load on servers
• Best performance → to improve client performance
• Based on Geography? RTT? Throughput? Load?
• Any alive node → to provide fault tolerance
• How to direct clients to a particular server?
• As part of routing → anycast, cluster load balancing
• Not covered
• As part of application → HTTP redirect
• As part of naming → DNS
Application Based
• HTTP supports simple way to indicate that Web page has
moved (30X responses)
• Server receives GET request from client
• Decides which server is best suited for particular client and object
• Returns HTTP redirect to that server
• Can make informed application specific decision
• May introduce additional overhead → multiple connection
setup, name lookups, etc.
• OK solution in general, but…
• HTTP Redirect has some flaws – especially with current browsers
• Incurs many delays, which operators may really care about
Naming Based
• Client does DNS name lookup for service
• Name server chooses appropriate server address
• A-record returned is “best” one for the client
• What information can name server base decision
on?
• Server load/location → must be collected
• Information in the name lookup request
• Name service client → typically the local name server for client
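A toy sketch of naming-based selection. Everything here is hypothetical: the prefix-to-region table, the replica addresses (taken from documentation address ranges), and the load figures are made up for illustration. Note that the only location hint available is the address of the client's local name server, not of the client itself.

```python
import ipaddress

# Hypothetical: region of each resolver prefix, and current load per replica.
REGION_BY_PREFIX = {
    ipaddress.ip_network("192.0.2.0/24"): "europe",
    ipaddress.ip_network("198.51.100.0/24"): "asia",
}
REPLICAS = {
    "europe": {"203.0.113.10": 0.70, "203.0.113.11": 0.20},
    "asia": {"203.0.113.20": 0.50},
}
DEFAULT_REGION = "europe"

def choose_a_record(resolver_ip):
    """Pick the address to return in the A-record for this lookup."""
    addr = ipaddress.ip_address(resolver_ip)
    region = DEFAULT_REGION
    for prefix, name in REGION_BY_PREFIX.items():
        if addr in prefix:
            region = name
            break
    servers = REPLICAS[region]
    return min(servers, key=servers.get)   # least-loaded replica in that region

answer = choose_a_record("192.0.2.53")
```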
How Akamai Works
• Clients fetch HTML document from primary server
• E.g. fetch index.html from cnn.com
• URLs for replicated content are replaced in HTML
• E.g. <img src="http://cnn.com/af/x.gif"> replaced with
<img src="http://a73.g.akamaitech.net/7/23/cnn.com/af/x.gif">
• Client is forced to resolve aXYZ.g.akamaitech.net
hostname
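The rewrite on this slide can be mimicked with a small function. This is only a sketch: `akamaize` is a hypothetical name, and the `serial` (the number in `a73`) and `prefix` (`7/23`) values are copied from the slide's example; how they are chosen is not described here.

```python
from urllib.parse import urlsplit

def akamaize(url, serial=73, prefix="7/23"):
    """Rewrite an origin URL so the original host and path follow the CDN hostname."""
    parts = urlsplit(url)
    # The modified name still contains the original host and file name.
    return f"http://a{serial}.g.akamaitech.net/{prefix}/{parts.netloc}{parts.path}"

rewritten = akamaize("http://cnn.com/af/x.gif")
```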
How Akamai Works
• How is content replicated?
• Akamai only replicates static content (*)
• Modified name contains original file name
• Akamai server is asked for content
• First checks local cache
• If not in cache, requests file from primary server and
caches file
* (At least, the version we’re talking about today. Akamai actually lets
sites write code that can run on Akamai’s servers, but that’s a pretty
different beast)
How Akamai Works
• Root server gives NS record for akamai.net
• Akamai.net name server returns NS record for
g.akamaitech.net
• Name server chosen to be in region of client’s name
server
• TTL is large
• g.akamaitech.net name server chooses server in
region
• Should try to choose server that has file in cache; how
to choose?
• Uses aXYZ name and hash
• TTL is small → why?
Simple Hashing
• Given document XYZ, we need to choose a
server to use
• Suppose we use modulo
• Number servers from 0…n-1
• Place document XYZ on server (XYZ mod n)
• What happens when a server fails? n → n-1
• Same if different people have different measures of n
• Why might this be bad?
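One way to see the problem is to count how many documents change servers when n drops by one. A minimal sketch, assuming document IDs are the integers 0 to `num_docs - 1` (`moved_fraction` is a hypothetical helper):

```python
def moved_fraction(num_docs, n_before, n_after):
    """Fraction of documents whose server changes when the server count changes."""
    moved = sum(1 for doc in range(num_docs) if doc % n_before != doc % n_after)
    return moved / num_docs

frac = moved_fraction(9000, 10, 9)   # one of ten servers fails
print(f"{frac:.0%} of documents map to a different server")
```

With modulo placement, losing one server out of ten remaps about 90% of the documents, so nearly every cached copy ends up on the wrong server.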
Consistent Hash
• “view” = subset of all hash buckets that are visible
• Desired features
• Balanced – in any one view, load is equal across
buckets
• Smoothness – little impact on hash bucket contents
when buckets are added/removed
• Spread – small set of hash buckets that may hold an
object regardless of views
• Load – across all views # of objects assigned to hash
bucket is small
Consistent Hash – Example
• Smoothness → addition of bucket does not cause
movement between existing buckets
• Spread & Load → small set of buckets that lie near object
• Balance → no bucket is responsible for large number of
objects
• Construction
• Assign each of C hash buckets to
random points on a mod $2^n$ circle,
where hash key size = n
• Map object to random position on
circle
• Hash of object = closest
clockwise bucket
[Figure: hash circle with labeled points 0, 4, 8, 12, and 14; buckets are marked on the circle.]
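The construction above can be sketched in a few lines. Assumptions in this illustration: SHA-1 of the name stands in for a "random" position, the circle has $2^{32}$ points, and "closest clockwise bucket" is taken to mean the first bucket at or after the object's position, wrapping around. The class and bucket names are hypothetical.

```python
import bisect
import hashlib

def _point(key, bits=32):
    """Map a name to a pseudo-random position on the mod 2**bits circle."""
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest, "big") % (1 << bits)

class ConsistentHash:
    def __init__(self, buckets):
        self._ring = sorted((_point(b), b) for b in buckets)
        self._points = [p for p, _ in self._ring]

    def lookup(self, obj):
        """Return the closest bucket clockwise from the object's position."""
        i = bisect.bisect_left(self._points, _point(obj))
        return self._ring[i % len(self._ring)][1]   # wrap past the top of the circle

ring_a = ConsistentHash(["s1", "s2", "s3"])
ring_b = ConsistentHash(["s1", "s2", "s3", "s4"])   # same ring plus one new bucket
```

Smoothness shows up directly: an object either keeps its bucket or moves to the newly added one, never between two existing buckets.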
How Akamai Works
[Figure: numbered message flow (steps 1 to 12) among the end user, cnn.com (content provider), the DNS root server, the Akamai high-level DNS server, the Akamai low-level DNS server, and a nearby matching Akamai server. Labeled messages: "Get index.html", "Get /cnn.com/foo.jpg", "Get foo.jpg".]
Akamai – Subsequent Requests
[Figure: the same participants as the previous figure, with only steps 1, 2, and 7 to 10 remaining. Labeled messages: "Get index.html", "Get /cnn.com/foo.jpg" to the nearby matching Akamai server.]
Impact on DNS Usage
• DNS is used for server selection more and more
• What are reasonable DNS TTLs for this type of use?
• Typically want to adapt to load changes
• Low TTL for A-records → what about NS records?
• How does this affect caching?
• What do the first and subsequent lookups do?
HTTP (Summary)
• Simple text-based file exchange protocol
• Support for status/error responses, authentication, client-side state
maintenance, cache maintenance
• Workloads
• Typical document structure, popularity
• Server workload
• Interactions with TCP
• Connection setup, reliability, state maintenance
• Persistent connections
• How to improve performance
• Persistent connections
• Caching
• Replication
Typical Workload (Server)
• Popularity
• Zipf distribution ($P = k r^{-1}$) → surprisingly common
• Obvious optimization → caching
• Request sizes
• In one measurement paper → median 1946 bytes, mean 13767
bytes
• Why such a difference? Heavy-tailed distribution
• Pareto – $p(x) = a k^a x^{-(a+1)}$
• Temporal locality
• Modeled as distance into push-down stack
• Lognormal distribution of stack distances
• Request interarrival
• Bursty request patterns
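To see why a heavy tail pulls the mean far above the median, here is a sketch using the closed forms for the Pareto density $p(x) = a k^a x^{-(a+1)}$ on $x \ge k$: the median is $k \cdot 2^{1/a}$ and the mean is $ak/(a-1)$ when $a > 1$. The parameter values below are illustrative only and are not fitted to the measurements quoted on this slide.

```python
def pareto_median(a, k):
    """Median of a Pareto distribution with shape a and minimum value k."""
    return k * 2 ** (1 / a)

def pareto_mean(a, k):
    """Mean of the same distribution; infinite when a <= 1."""
    if a <= 1:
        return float("inf")
    return a * k / (a - 1)

a, k = 1.2, 1000            # illustrative shape and minimum size in bytes
print(f"median = {pareto_median(a, k):.0f} bytes, mean = {pareto_mean(a, k):.0f} bytes")
```

As the shape parameter a approaches 1 the mean grows without bound while the median stays close to 2k, which is the same qualitative gap as in the measured request sizes.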
Caching Proxies – Sources for Misses
• Capacity
• How large a cache is necessary or equivalent to infinite
• On disk vs. in memory → typically on disk
• Compulsory
• First time access to document
• Non-cacheable documents
• CGI-scripts
• Personalized documents (cookies, etc)
• Encrypted data (SSL)
• Consistency
• Document has been updated/expired before reuse
• Conflict
• No such misses

More Related Content

What's hot (20)

PDF
HTTP - The Protocol of Our Lives
Brent Shaffer
 
PDF
HA Deployment Architecture with HAProxy and Keepalived
Ganapathi Kandaswamy
 
PDF
Adding Support for Networking and Web Technologies to an Embedded System
John Efstathiades
 
PDF
HTTP
Daniel Kummer
 
PPTX
Grpc present
Phạm Hải Anh
 
PPTX
Introducing HTTP/2
Ido Flatow
 
PDF
Codefest2015
Denis Kolegov
 
PDF
HAProxy 1.9
HAProxy Technologies
 
PDF
O'Reilly Fluent Conference: HTTP/1.1 vs. HTTP/2
Load Impact
 
PDF
HAProxy tech talk
icebourg
 
PPTX
Message Queuing on a Large Scale: IMVUs stateful real-time message queue for ...
Jon Watte
 
PPTX
gRPC on .NET Core - NDC Sydney 2019
James Newton-King
 
PPTX
Modern Distributed Messaging and RPC
Max Alexejev
 
PDF
Understanding the Web through HTTP
Olivia Brundage
 
PPTX
Introduction to HTTP
Yihua Huang
 
PDF
SPDY and HTTP/2
Fabian Frank
 
PPT
Web Server Load Balancer
MobME Technical
 
PPTX
Http/2
Adrian Cardenas
 
ODP
ChinaNetCloud Training - HAProxy Intro
ChinaNetCloud
 
PDF
Tomcatx performance-tuning
Vladimir Khokhryakov
 
HTTP - The Protocol of Our Lives
Brent Shaffer
 
HA Deployment Architecture with HAProxy and Keepalived
Ganapathi Kandaswamy
 
Adding Support for Networking and Web Technologies to an Embedded System
John Efstathiades
 
Grpc present
Phạm Hải Anh
 
Introducing HTTP/2
Ido Flatow
 
Codefest2015
Denis Kolegov
 
O'Reilly Fluent Conference: HTTP/1.1 vs. HTTP/2
Load Impact
 
HAProxy tech talk
icebourg
 
Message Queuing on a Large Scale: IMVUs stateful real-time message queue for ...
Jon Watte
 
gRPC on .NET Core - NDC Sydney 2019
James Newton-King
 
Modern Distributed Messaging and RPC
Max Alexejev
 
Understanding the Web through HTTP
Olivia Brundage
 
Introduction to HTTP
Yihua Huang
 
SPDY and HTTP/2
Fabian Frank
 
Web Server Load Balancer
MobME Technical
 
ChinaNetCloud Training - HAProxy Intro
ChinaNetCloud
 
Tomcatx performance-tuning
Vladimir Khokhryakov
 

Viewers also liked (20)

PPT
Becoming w2 lecture_and_seminar_2013-14
Sandra Sinfield
 
PDF
CS5229 09/10 Lecture 6: Simulation
Wei Tsang Ooi
 
PPT
Cs854 lecturenotes01
Mehmet Çelik
 
PDF
Lecture 0: Introduction to CS5229
Wei Tsang Ooi
 
PDF
Lecture 7: How to STUDY the Social Web? (2014)
Lora Aroyo
 
PDF
Lecture 5: Personalization on the Social Web (2014)
Lora Aroyo
 
PPT
[1]اساسيات تفاعل الانسان مع الحاسوب
تامنيت نلالوت
 
PDF
Lecture 3: Human-Computer Interaction: HCI Design (2014)
Lora Aroyo
 
PDF
Lecture 2: Human-Computer Interaction: Conceptual Design (2014)
Lora Aroyo
 
PDF
Lecture 4: Human-Computer Interaction: Prototyping (2014)
Lora Aroyo
 
PPT
Lecture 1: Human-Computer Interaction Introduction (2014)
Lora Aroyo
 
PPTX
Human computer interaction
Ayusha Patnaik
 
PPTX
human computer interface
Santosh Kumar
 
PPT
Artificial inteligence
Intekhab Alam Khan
 
ODP
Artificial Intelligence
Girish Naik
 
PPT
artificial intelligence
vallibhargavi
 
PPTX
Simulation UNIT-I
deganagarajulc
 
PPT
Artificial Intelligence
u053675
 
PPT
Simulation Powerpoint- Lecture Notes
Kesavartinii Bala Krisnain
 
PDF
Fifty Features of Java EE 7 in 50 Minutes
glassfish
 
Becoming w2 lecture_and_seminar_2013-14
Sandra Sinfield
 
CS5229 09/10 Lecture 6: Simulation
Wei Tsang Ooi
 
Cs854 lecturenotes01
Mehmet Çelik
 
Lecture 0: Introduction to CS5229
Wei Tsang Ooi
 
Lecture 7: How to STUDY the Social Web? (2014)
Lora Aroyo
 
Lecture 5: Personalization on the Social Web (2014)
Lora Aroyo
 
[1]اساسيات تفاعل الانسان مع الحاسوب
تامنيت نلالوت
 
Lecture 3: Human-Computer Interaction: HCI Design (2014)
Lora Aroyo
 
Lecture 2: Human-Computer Interaction: Conceptual Design (2014)
Lora Aroyo
 
Lecture 4: Human-Computer Interaction: Prototyping (2014)
Lora Aroyo
 
Lecture 1: Human-Computer Interaction Introduction (2014)
Lora Aroyo
 
Human computer interaction
Ayusha Patnaik
 
human computer interface
Santosh Kumar
 
Artificial inteligence
Intekhab Alam Khan
 
Artificial Intelligence
Girish Naik
 
artificial intelligence
vallibhargavi
 
Simulation UNIT-I
deganagarajulc
 
Artificial Intelligence
u053675
 
Simulation Powerpoint- Lecture Notes
Kesavartinii Bala Krisnain
 
Fifty Features of Java EE 7 in 50 Minutes
glassfish
 
Ad

Similar to computer networking (20)

PPTX
application of http.pptx
ssuseraf60311
 
PDF
02 - Asassssssspplication Layer (HTTP).pdf
HasibTurjo
 
PDF
HTTP colon slash slash: the end of the road?
Alessandro Nadalin
 
PPT
21 Www Web Services
royans
 
PPTX
Http-protocol
Toushik Paul
 
PDF
Lec 7(HTTP Protocol)
maamir farooq
 
PPTX
computer network introduction. psc notes . Assisant professor in cse.
bushraphd2022
 
PPTX
Http protocol
Arpita Naik
 
PDF
Unit v
APARNA P
 
PDF
Hypertexttransferprotocolhttp 131012171813-phpapp02
Nidhitransport
 
PPT
World Wide Web(WWW)
Pratik Tambekar
 
PDF
HTTP colon slash slash: end of the road? @ CakeFest 2013 in San Francisco
Alessandro Nadalin
 
PPT
introduction to Web system
hashim102
 
PPTX
HyperText Transfer Protocol (HTTP)
Gurjot Singh
 
PDF
Web Architectures - Web Technologies (1019888BNR)
Beat Signer
 
PDF
Ch2 the application layer protocols_http_3
Syed Ariful Islam Emon
 
PPT
HTTP.ppt
NapoMosola
 
PPT
Hypertext Transfer Protocol Hypertext Transfer Protocol
sambreaker1
 
PPT
HTTP_2.ppt
Ankit Mune
 
PPT
HTTP.ppt
Jagdeep Singh
 
application of http.pptx
ssuseraf60311
 
02 - Asassssssspplication Layer (HTTP).pdf
HasibTurjo
 
HTTP colon slash slash: the end of the road?
Alessandro Nadalin
 
21 Www Web Services
royans
 
Http-protocol
Toushik Paul
 
Lec 7(HTTP Protocol)
maamir farooq
 
computer network introduction. psc notes . Assisant professor in cse.
bushraphd2022
 
Http protocol
Arpita Naik
 
Unit v
APARNA P
 
Hypertexttransferprotocolhttp 131012171813-phpapp02
Nidhitransport
 
World Wide Web(WWW)
Pratik Tambekar
 
HTTP colon slash slash: end of the road? @ CakeFest 2013 in San Francisco
Alessandro Nadalin
 
introduction to Web system
hashim102
 
HyperText Transfer Protocol (HTTP)
Gurjot Singh
 
Web Architectures - Web Technologies (1019888BNR)
Beat Signer
 
Ch2 the application layer protocols_http_3
Syed Ariful Islam Emon
 
HTTP.ppt
NapoMosola
 
Hypertext Transfer Protocol Hypertext Transfer Protocol
sambreaker1
 
HTTP_2.ppt
Ankit Mune
 
HTTP.ppt
Jagdeep Singh
 
Ad

Recently uploaded (20)

DOCX
Python coding for beginners !! Start now!#
Rajni Bhardwaj Grover
 
PDF
Kit-Works Team Study_20250627_한달만에만든사내서비스키링(양다윗).pdf
Wonjun Hwang
 
PPTX
CapCut Pro PC Crack Latest Version Free Free
josanj305
 
PDF
Automating Feature Enrichment and Station Creation in Natural Gas Utility Net...
Safe Software
 
PDF
Linux schedulers for fun and profit with SchedKit
Alessio Biancalana
 
PDF
Bitkom eIDAS Summit | European Business Wallet: Use Cases, Macroeconomics, an...
Carsten Stoecker
 
PPTX
Essential Content-centric Plugins for your Website
Laura Byrne
 
PDF
NLJUG Speaker academy 2025 - first session
Bert Jan Schrijver
 
PDF
How do you fast track Agentic automation use cases discovery?
DianaGray10
 
PPTX
Securing Model Context Protocol with Keycloak: AuthN/AuthZ for MCP Servers
Hitachi, Ltd. OSS Solution Center.
 
PDF
[GDGoC FPTU] Spring 2025 Summary Slidess
minhtrietgect
 
PPTX
Talbott's brief History of Computers for CollabDays Hamburg 2025
Talbott Crowell
 
PDF
Bharatiya Antariksh Hackathon 2025 Idea Submission PPT.pdf
ghjghvhjgc
 
PDF
“Squinting Vision Pipelines: Detecting and Correcting Errors in Vision Models...
Edge AI and Vision Alliance
 
PDF
Transforming Utility Networks: Large-scale Data Migrations with FME
Safe Software
 
PPTX
New ThousandEyes Product Innovations: Cisco Live June 2025
ThousandEyes
 
PPTX
Agentforce World Tour Toronto '25 - MCP with MuleSoft
Alexandra N. Martinez
 
PDF
UiPath DevConnect 2025: Agentic Automation Community User Group Meeting
DianaGray10
 
PDF
Software Development Company Keene Systems, Inc (1).pdf
Custom Software Development Company | Keene Systems, Inc.
 
PDF
“Computer Vision at Sea: Automated Fish Tracking for Sustainable Fishing,” a ...
Edge AI and Vision Alliance
 
Python coding for beginners !! Start now!#
Rajni Bhardwaj Grover
 
Kit-Works Team Study_20250627_한달만에만든사내서비스키링(양다윗).pdf
Wonjun Hwang
 
CapCut Pro PC Crack Latest Version Free Free
josanj305
 
Automating Feature Enrichment and Station Creation in Natural Gas Utility Net...
Safe Software
 
Linux schedulers for fun and profit with SchedKit
Alessio Biancalana
 
Bitkom eIDAS Summit | European Business Wallet: Use Cases, Macroeconomics, an...
Carsten Stoecker
 
Essential Content-centric Plugins for your Website
Laura Byrne
 
NLJUG Speaker academy 2025 - first session
Bert Jan Schrijver
 
How do you fast track Agentic automation use cases discovery?
DianaGray10
 
Securing Model Context Protocol with Keycloak: AuthN/AuthZ for MCP Servers
Hitachi, Ltd. OSS Solution Center.
 
[GDGoC FPTU] Spring 2025 Summary Slidess
minhtrietgect
 
Talbott's brief History of Computers for CollabDays Hamburg 2025
Talbott Crowell
 
Bharatiya Antariksh Hackathon 2025 Idea Submission PPT.pdf
ghjghvhjgc
 
“Squinting Vision Pipelines: Detecting and Correcting Errors in Vision Models...
Edge AI and Vision Alliance
 
Transforming Utility Networks: Large-scale Data Migrations with FME
Safe Software
 
New ThousandEyes Product Innovations: Cisco Live June 2025
ThousandEyes
 
Agentforce World Tour Toronto '25 - MCP with MuleSoft
Alexandra N. Martinez
 
UiPath DevConnect 2025: Agentic Automation Community User Group Meeting
DianaGray10
 
Software Development Company Keene Systems, Inc (1).pdf
Custom Software Development Company | Keene Systems, Inc.
 
“Computer Vision at Sea: Automated Fish Tracking for Sustainable Fishing,” a ...
Edge AI and Vision Alliance
 

computer networking

  • 2. Lecture 19: 2006-11- 02 2 Outline • HTTP review and details (more in notes) • Persistent HTTP review • HTTP caching • Content distribution networks
  • 3. Lecture 19: 2006-11- 02 3 HTTP Basics (Review) • HTTP layered over bidirectional byte stream • Almost always TCP • Interaction • Client sends request to server, followed by response from server to client • Requests/responses are encoded in text • Stateless • Server maintains no information about past client requests
  • 4. Lecture 19: 2006-11- 02 4 How to Mark End of Message? (Review) • Size of message  Content-Length • Must know size of transfer in advance • Delimiter  MIME-style Content-Type • Server must “escape” delimiter in content • Close connection • Only server can do this
  • 5. Lecture 19: 2006-11- 02 5 HTTP Request (review) • Request line • Method • GET – return URI • HEAD – return headers only of GET response • POST – send data to the server (forms, etc.) • URL (relative) • E.g., /index.html • HTTP version
  • 6. Lecture 19: 2006-11- 02 6 HTTP Request (cont.) (review) • Request headers • Authorization – authentication info • Acceptable document types/encodings • From – user email • If-Modified-Since • Referrer – what caused this page to be requested • User-Agent – client software • Blank-line • Body
  • 7. Lecture 19: 2006-11- 02 7 HTTP Request (review)
  • 8. Lecture 19: 2006-11- 02 8 HTTP Request Example (review) GET / HTTP/1.1 Accept: */* Accept-Language: en-us Accept-Encoding: gzip, deflate User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0) Host: www.intel-iris.net Connection: Keep-Alive
  • 9. Lecture 19: 2006-11- 02 9 HTTP Response (review) • Status-line • HTTP version • 3 digit response code • 1XX – informational • 2XX – success • 200 OK • 3XX – redirection • 301 Moved Permanently • 303 Moved Temporarily • 304 Not Modified • 4XX – client error • 404 Not Found • 5XX – server error • 505 HTTP Version Not Supported • Reason phrase
  • 10. Lecture 19: 2006-11- 02 10 HTTP Response (cont.) (review) • Headers • Location – for redirection • Server – server software • WWW-Authenticate – request for authentication • Allow – list of methods supported (get, head, etc) • Content-Encoding – E.g x-gzip • Content-Length • Content-Type • Expires • Last-Modified • Blank-line • Body
  • 11. Lecture 19: 2006-11- 02 11 HTTP Response Example (review) HTTP/1.1 200 OK Date: Tue, 27 Mar 2001 03:49:38 GMT Server: Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_ssl/2.7.1 OpenSSL/0.9.5a DAV/1.0.2 PHP/4.0.1pl2 mod_perl/1.24 Last-Modified: Mon, 29 Jan 2001 17:54:18 GMT ETag: "7a11f-10ed-3a75ae4a" Accept-Ranges: bytes Content-Length: 4333 Keep-Alive: timeout=15, max=100 Connection: Keep-Alive Content-Type: text/html …..
  • 12. Lecture 19: 2006-11- 02 12 Outline • HTTP intro and details • Persistent HTTP • HTTP caching • Content distribution networks
  • 13. Lecture 19: 2006-11- 02 13 Typical Workload (Web Pages) • Multiple (typically small) objects per page • File sizes • Heavy-tailed • Pareto distribution for tail • Lognormal for body of distribution -- For reference/interest only -- • Embedded references • Number of embedded objects = pareto – p(x) = aka x-(a+1)
  • 14. Lecture 19: 2006-11- 02 14 HTTP 0.9/1.0 (mostly review) • One request/response per TCP connection • Simple to implement • Disadvantages • Multiple connection setups  three-way handshake each time • Several extra round trips added to transfer • Multiple slow starts
  • 15. Lecture 19: 2006-11- 02 15 Single Transfer Example Client Server SYN SYN SYN SYN ACK ACK ACK ACK ACK DAT DAT DAT DAT FIN ACK 0 RTT 1 RTT 2 RTT 3 RTT 4 RTT Server reads from disk FIN Server reads from disk Client opens TCP connection Client sends HTTP request for HTML Client parses HTML Client opens TCP connection Client sends HTTP request for image Image begins to arrive
  • 16. Lecture 19: 2006-11- 02 16 More Problems • Short transfers are hard on TCP • Stuck in slow start • Loss recovery is poor when windows are small • Lots of extra connections • Increases server state/processing • Server also forced to keep TIME_WAIT connection state -- Things to think about -- • Why must server keep these? • Tends to be an order of magnitude greater than # of active connections, why?
  • 17. Lecture 19: 2006-11- 02 17 Persistent Connection Solution (review) • Multiplex multiple transfers onto one TCP connection • How to identify requests/responses • Delimiter  Server must examine response for delimiter string • Content-length and delimiter  Must know size of transfer in advance • Block-based transmission  send in multiple length delimited blocks • Store-and-forward  wait for entire response and then use content-length • Solution  use existing methods and close connection otherwise
  • 18. Lecture 19: 2006-11- 02 18 Persistent Connection Example (review) Client Server ACK ACK DAT DAT ACK 0 RTT 1 RTT 2 RTT Server reads from disk Client sends HTTP request for HTML Client parses HTML Client sends HTTP request for image Image begins to arrive DAT Server reads from disk DAT
  • 19. Lecture 19: 2006-11- 02 19 Persistent HTTP (review) Nonpersistent HTTP issues: • Requires 2 RTTs per object • OS must work and allocate host resources for each TCP connection • But browsers often open parallel TCP connections to fetch referenced objects Persistent HTTP • Server leaves connection open after sending response • Subsequent HTTP messages between same client/server are sent over connection Persistent without pipelining: • Client issues new request only when previous response has been received • One RTT for each referenced object Persistent with pipelining: • Default in HTTP/1.1 • Client sends requests as soon as it encounters a referenced object • As little as one RTT for all the referenced objects
  • 20. Lecture 19: 2006-11- 02 20 Outline • HTTP intro and details • Persistent HTTP -- new stuff -- • HTTP caching • Content distribution networks
  • 21. Lecture 19: 2006-11- 02 21 HTTP Caching • Clients often cache documents • Challenge: update of documents • If-Modified-Since requests to check • HTTP 0.9/1.0 used just date • HTTP 1.1 has an opaque “entity tag” (could be a file signature, etc.) as well • When/how often should the original be checked for changes? • Check every time? • Check each session? Day? Etc? • Use Expires header • If no Expires, often use Last-Modified as estimate
  • 22. Lecture 19: 2006-11- 02 22 Example Cache Check Request GET / HTTP/1.1 Accept: */* Accept-Language: en-us Accept-Encoding: gzip, deflate If-Modified-Since: Mon, 29 Jan 2001 17:54:18 GMT If-None-Match: "7a11f-10ed-3a75ae4a" User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0) Host: www.intel-iris.net Connection: Keep-Alive
  • 23. Lecture 19: 2006-11- 02 23 Example Cache Check Response HTTP/1.1 304 Not Modified Date: Tue, 27 Mar 2001 03:50:51 GMT Server: Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_ssl/2.7.1 OpenSSL/0.9.5a DAV/1.0.2 PHP/4.0.1pl2 mod_perl/1.24 Connection: Keep-Alive Keep-Alive: timeout=15, max=100 ETag: "7a11f-10ed-3a75ae4a"
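The decision behind the 304 response can be sketched as a server-side check. This is a simplified illustration, not any particular server's logic: `cache_check_status` is a hypothetical name, headers are passed as a plain dict, and weak entity tags (`W/` prefix) are not handled. When both validators are present, the entity tag takes precedence over the date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Last-modified time matching the example request's If-Modified-Since header.
EXAMPLE_LAST_MODIFIED = datetime(2001, 1, 29, 17, 54, 18, tzinfo=timezone.utc)


def cache_check_status(request_headers, current_etag, last_modified):
    """Return 304 if the client's cached copy is still valid, else 200."""
    inm = request_headers.get("If-None-Match")
    if inm is not None:
        # Entity-tag check: opaque string comparison against the current tag.
        tags = [t.strip() for t in inm.split(",")]
        return 304 if current_etag in tags or "*" in tags else 200
    ims = request_headers.get("If-Modified-Since")
    if ims is not None:
        # Date check: unchanged if not modified after the client's timestamp.
        return 304 if last_modified <= parsedate_to_datetime(ims) else 200
    return 200  # unconditional request: send the full object
```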
  • 24. Lecture 19: 2006-11- 02 24 Ways to cache Client-directed caching: Web Proxies Server-directed caching: Content Delivery Networks (CDNs)
  • 25. Lecture 19: 2006-11- 02 25 Web Proxy Caches • User configures browser: Web accesses via cache • Browser sends all HTTP requests to cache • Object in cache: cache returns object • Else cache requests object from origin server, then returns object to client [Figure: two clients exchange HTTP requests and responses with a proxy server, which in turn exchanges HTTP requests and responses with the origin servers]
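The hit/miss behavior described on this slide can be sketched in a few lines. This is an illustrative toy, assuming an unbounded in-memory store and no expiry or validation; `ProxyCache` and `fetch_from_origin` are hypothetical names.

```python
class ProxyCache:
    """Toy proxy cache: serve from the store on a hit, fetch from origin on a miss."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: url -> response body
        self._store = {}                 # url -> cached body (never evicted here)
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:
            self.hits += 1               # object in cache: return it directly
            return self._store[url]
        self.misses += 1                 # otherwise ask the origin server
        body = self._fetch(url)
        self._store[url] = body          # keep a copy for later clients
        return body
```

A real proxy would also have to honor expiry times, revalidate stale objects, and skip responses that are not cacheable.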
  • 26. Lecture 19: 2006-11- 02 26 Caching Example (1) Assumptions • Average object size = 100,000 bits • Avg. request rate from institution’s browsers to origin servers = 15/sec • Delay from institutional router to any origin server and back to router = 2 sec Consequences • Utilization on LAN = 15% • Utilization on access link = 100% • Total delay = Internet delay + access delay + LAN delay = 2 sec + minutes + milliseconds [Figure: origin servers on the public Internet; institutional network with a 10 Mbps LAN and a 1.5 Mbps access link]
  • 27. Lecture 19: 2006-11- 02 27 Caching Example (2) Possible solution • Increase bandwidth of access link to, say, 10 Mbps • Often a costly upgrade Consequences • Utilization on LAN = 15% • Utilization on access link = 15% • Total delay = Internet delay + access delay + LAN delay = 2 sec + msecs + msecs [Figure: origin servers on the public Internet; institutional network with a 10 Mbps LAN and a 10 Mbps access link]
  • 28. Lecture 19: 2006-11- 02 28 Caching Example (3) Install cache • Suppose hit rate is .4 Consequence • 40% requests will be satisfied almost immediately (say 10 msec) • 60% requests satisfied by origin server • Utilization of access link reduced to 60%, resulting in negligible delays • Weighted average of delays = .6*2 sec + .4*10 msecs < 1.3 secs [Figure: origin servers on the public Internet; institutional network with a 10 Mbps LAN, a 1.5 Mbps access link, and an institutional cache]
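The arithmetic in the three caching examples can be reproduced directly from the stated assumptions. The constants below are the slide values; the function names are illustrative, and queueing delay on the access link is ignored, as in the slide's "negligible delays" case.

```python
OBJECT_BITS = 100_000        # average object size
REQUESTS_PER_SEC = 15        # request rate from the institution
LAN_BPS = 10_000_000         # 10 Mbps LAN
ACCESS_BPS = 1_500_000       # 1.5 Mbps access link
INTERNET_DELAY_S = 2.0       # router to origin server and back
CACHE_DELAY_S = 0.010        # "almost immediately (say 10 msec)"


def utilization(request_rate, link_bps, object_bits=OBJECT_BITS):
    """Fraction of link capacity consumed by the offered traffic."""
    return request_rate * object_bits / link_bps


def average_delay(hit_rate):
    """Weighted average of cache-hit delay and origin-server delay."""
    return (1 - hit_rate) * INTERNET_DELAY_S + hit_rate * CACHE_DELAY_S


if __name__ == "__main__":
    print(utilization(REQUESTS_PER_SEC, LAN_BPS))             # LAN, no cache
    print(utilization(REQUESTS_PER_SEC, ACCESS_BPS))          # access link, no cache
    print(utilization(REQUESTS_PER_SEC * 0.6, ACCESS_BPS))    # access link, 40% hit rate
    print(average_delay(0.4))                                 # seconds
```

With a hit rate of 0.4 only 9 requests per second cross the access link, so its utilization falls from 100% to 60% without any bandwidth upgrade.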
  • 29. Lecture 19: 2006-11- 02 29 Problems • Over 50% of all HTTP objects are uncacheable – why? • Not easily solvable • Dynamic data → stock prices, scores, web cams • CGI scripts → results based on passed parameters • Obvious fixes • SSL → encrypted data is not cacheable • Most web clients don’t handle mixed pages well → many generic objects transferred with SSL • Cookies → results may be based on passed data • Hit metering → owner wants to measure # of hits for revenue, etc. • What will be the end result?
  • 30. Lecture 19: 2006-11- 02 30 Content Distribution Networks (CDNs) • The content providers are the CDN customers. Content replication • CDN company installs hundreds of CDN servers throughout Internet • Close to users • CDN replicates its customers’ content in CDN servers. When provider updates content, CDN updates servers origin server in North America CDN distribution node CDN server in S. America CDN server in Europe CDN server in Asia
  • 31. Lecture 19: 2006-11- 02 31 Outline • HTTP intro and details • Persistent HTTP • HTTP caching • Content distribution networks
  • 32. Lecture 19: 2006-11- 02 32 Content Distribution Networks & Server Selection • Replicate content on many servers • Challenges • How to replicate content • Where to replicate content • How to find replicated content • How to choose among known replicas • How to direct clients towards replica
  • 33. Lecture 19: 2006-11- 02 33 Server Selection • Which server? • Lowest load → to balance load on servers • Best performance → to improve client performance • Based on Geography? RTT? Throughput? Load? • Any alive node → to provide fault tolerance • How to direct clients to a particular server? • As part of routing → anycast, cluster load balancing • Not covered • As part of application → HTTP redirect • As part of naming → DNS
  • 34. Lecture 19: 2006-11- 02 34 Application Based • HTTP supports simple way to indicate that Web page has moved (30X responses) • Server receives GET request from client • Decides which server is best suited for particular client and object • Returns HTTP redirect to that server • Can make informed application-specific decision • May introduce additional overhead → multiple connection setup, name lookups, etc. • OK solution in general, but… • HTTP Redirect has some flaws – especially with current browsers • Incurs many delays, which operators may really care about
  • 35. Lecture 19: 2006-11- 02 35 Naming Based • Client does DNS name lookup for service • Name server chooses appropriate server address • A-record returned is “best” one for the client • What information can name server base decision on? • Server load/location → must be collected • Information in the name lookup request • Name service client → typically the local name server for client
  • 36. Lecture 19: 2006-11- 02 36 How Akamai Works • Clients fetch html document from primary server • E.g. fetch index.html from cnn.com • URLs for replicated content are replaced in html • E.g. <img src="http://cnn.com/af/x.gif"> replaced with <img src="http://a73.g.akamaitech.net/7/23/cnn.com/af/x.gif"> • Client is forced to resolve aXYZ.g.akamaitech.net hostname
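The URL replacement in the slide's example can be mimicked with a small helper. This is only an illustration of the string transformation: the function name `akamaize` is hypothetical, and the `a73` label and the `7/23` path prefix are copied from the example as opaque placeholders, since the slide does not explain how they are derived.

```python
from urllib.parse import urlsplit


def akamaize(url, serial="a73", prefix="7/23"):
    """Rewrite an origin URL so the hostname resolves inside the CDN's domain.

    The original host and path are kept inside the new path, so the CDN
    server can still work out which origin object is being asked for.
    """
    parts = urlsplit(url)
    return f"http://{serial}.g.akamaitech.net/{prefix}/{parts.netloc}{parts.path}"


if __name__ == "__main__":
    print(akamaize("http://cnn.com/af/x.gif"))
```

Because the rewritten name lives under the CDN's domain, every lookup for it goes through the CDN's own name servers, which is where server selection happens.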
  • 37. Lecture 19: 2006-11- 02 37 How Akamai Works • How is content replicated? • Akamai only replicates static content (*) • Modified name contains original file name • Akamai server is asked for content • First checks local cache • If not in cache, requests file from primary server and caches file * (At least, the version we’re talking about today. Akamai actually lets sites write code that can run on Akamai’s servers, but that’s a pretty different beast)
  • 38. Lecture 19: 2006-11- 02 38 How Akamai Works • Root server gives NS record for akamai.net • Akamai.net name server returns NS record for g.akamaitech.net • Name server chosen to be in region of client’s name server • TTL is large • G.akamaitech.net nameserver chooses server in region • Should try to choose server that has file in cache - How to choose? • Uses aXYZ name and hash • TTL is small → why?
  • 39. Lecture 19: 2006-11- 02 39 Simple Hashing • Given document XYZ, we need to choose a server to use • Suppose we use modulo • Number servers from 1…n • Place document XYZ on server (XYZ mod n) • What happens when a server fails? n → n-1 • Same if different people have different measures of n • Why might this be bad?
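A quick experiment shows why modulo placement is a problem when n changes. The sketch below assumes documents are identified by the integers 0 to `num_docs - 1`; `moved_fraction` is an illustrative name.

```python
def moved_fraction(num_docs, n_before, n_after):
    """Fraction of documents whose server changes when n_before becomes n_after."""
    moved = sum(1 for doc in range(num_docs)
                if doc % n_before != doc % n_after)
    return moved / num_docs


if __name__ == "__main__":
    # One of 10 servers fails, so every client now computes (doc mod 9).
    print(moved_fraction(9000, 10, 9))
```

Going from 10 servers to 9 remaps 90% of the documents, so almost every cached copy ends up on the wrong server, even though only one server was lost.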
  • 40. Lecture 19: 2006-11- 02 40 Consistent Hash • “view” = subset of all hash buckets that are visible • Desired features • Balanced – in any one view, load is equal across buckets • Smoothness – little impact on hash bucket contents when buckets are added/removed • Spread – small set of hash buckets that may hold an object regardless of views • Load – across all views # of objects assigned to hash bucket is small
  • 41. Lecture 19: 2006-11- 02 41 Consistent Hash – Example • Smoothness → addition of bucket does not cause movement between existing buckets • Spread & Load → small set of buckets that lie near object • Balance → no bucket is responsible for large number of objects • Construction • Assign each of C hash buckets to random points on mod $2^n$ circle, where hash key size = n • Map object to random position on circle • Hash of object = closest clockwise bucket [Figure: hash circle with points marked 0, 4, 8, 12 and a bucket at position 14]
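The construction on this slide can be sketched as a sorted ring of bucket positions. This is a minimal illustration under stated assumptions: SHA-1 stands in for the "random" placement, each bucket gets a single point on a 32-bit circle (practical systems usually place several points per bucket to improve balance), and `ConsistentHashRing` and `ring_position` are hypothetical names.

```python
import bisect
import hashlib


def ring_position(key, bits=32):
    """Map a string to a pseudo-random point on the mod 2^bits circle."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class ConsistentHashRing:
    def __init__(self, buckets=()):
        self._points = []   # sorted bucket positions on the circle
        self._owner = {}    # position -> bucket name
        for bucket in buckets:
            self.add_bucket(bucket)

    def add_bucket(self, bucket):
        pos = ring_position(bucket)
        if pos in self._owner:
            raise ValueError("position collision on the ring")
        bisect.insort(self._points, pos)
        self._owner[pos] = bucket

    def remove_bucket(self, bucket):
        pos = ring_position(bucket)
        self._points.remove(pos)
        del self._owner[pos]

    def lookup(self, key):
        """Return the closest bucket clockwise from the object's position."""
        if not self._points:
            raise LookupError("ring has no buckets")
        i = bisect.bisect_left(self._points, ring_position(key))
        return self._owner[self._points[i % len(self._points)]]  # wrap around
```

Adding a bucket only takes over the arc just before its own point, so objects either stay where they were or move to the new bucket; nothing moves between existing buckets.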
  • 42. Lecture 19: 2006-11- 02 42 How Akamai Works [Figure: message flow numbered 1 to 12 among end-user, cnn.com (content provider), DNS root server, Akamai server, Akamai high-level DNS server, Akamai low-level DNS server, and nearby matching Akamai server. Labeled requests: Get index.html; Get /cnn.com/foo.jpg; Get foo.jpg]
  • 43. Lecture 19: 2006-11- 02 43 Akamai – Subsequent Requests [Figure: same topology as the previous slide (end-user, cnn.com content provider, DNS root server, Akamai server, Akamai high-level DNS server, Akamai low-level DNS server, nearby matching Akamai server), showing only steps 1, 2 and 7 to 10. Labeled requests: Get index.html; Get /cnn.com/foo.jpg]
  • 44. Lecture 19: 2006-11- 02 44 Impact on DNS Usage • DNS is used for server selection more and more • What are reasonable DNS TTLs for this type of use? • Typically want to adapt to load changes • Low TTL for A-records → what about NS records? • How does this affect caching? • What do the first and subsequent lookups do?
  • 45. Lecture 19: 2006-11- 02 45 HTTP (Summary) • Simple text-based file exchange protocol • Support for status/error responses, authentication, client-side state maintenance, cache maintenance • Workloads • Typical document structure, popularity • Server workload • Interactions with TCP • Connection setup, reliability, state maintenance • Persistent connections • How to improve performance • Persistent connections • Caching • Replication
  • 46. Lecture 19: 2006-11- 02 47 Typical Workload (Server) • Popularity • Zipf distribution ($P = k r^{-1}$) → surprisingly common • Obvious optimization → caching • Request sizes • In one measurement paper → median 1946 bytes, mean 13767 bytes • Why such a difference? Heavy-tailed distribution • Pareto – $p(x) = a k^{a} x^{-(a+1)}$ • Temporal locality • Modeled as distance into push-down stack • Lognormal distribution of stack distances • Request interarrival • Bursty request patterns
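Two consequences of these distributions are easy to check numerically. The sketch below uses the Zipf form $P = k r^{-1}$ to compute the share of requests going to the most popular documents, and the standard Pareto median and mean to show how a heavy tail pulls the mean above the median. The parameter values in the example are illustrative and are not fitted to the measurement quoted on the slide.

```python
import math


def zipf_top_share(top, total):
    """Share of requests that go to the `top` most popular of `total` documents."""
    def harmonic(m):
        return sum(1.0 / rank for rank in range(1, m + 1))
    return harmonic(top) / harmonic(total)


def pareto_median(a, k):
    """Median of a Pareto distribution with shape a and minimum value k."""
    return k * 2 ** (1.0 / a)


def pareto_mean(a, k):
    """Mean of the same distribution; infinite when a <= 1."""
    return math.inf if a <= 1 else a * k / (a - 1)


if __name__ == "__main__":
    print(zipf_top_share(100, 10_000))   # share captured by caching the top 1%
    print(pareto_median(1.2, 1000), pareto_mean(1.2, 1000))
```

Under Zipf popularity a small cache captures a disproportionate share of requests, which is why caching is the "obvious optimization". With a shape parameter close to 1 the Pareto mean is several times the median, the same qualitative gap as in the measured request sizes.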
  • 47. Lecture 19: 2006-11- 02 48 Caching Proxies – Sources for Misses • Capacity • How large a cache is necessary or equivalent to infinite • On disk vs. in memory → typically on disk • Compulsory • First time access to document • Non-cacheable documents • CGI-scripts • Personalized documents (cookies, etc) • Encrypted data (SSL) • Consistency • Document has been updated/expired before reuse • Conflict • No such misses

Editor's Notes

  • #2: Blast through HTTP review slides on the projector. Swap back to chalkboard for new stuff.
  • #15: Now that you’ve seen TCP in the previous lectures, the overhead of one TCP connection per object is more clear.
  • #22: Ask: What’s the problem with just using date for IMS (If-Modified-Since) queries? Web servers may generate content per client. Examples: CGI scripts, things like Google customized home. Content can vary with cookies, client IP, referrer, accept-encoding, etc.
  • #34: Ask: What things might you want to pick the server using? How do you direct clients to a server? Anyone know how Akamai works?
  • #35: Draw diagram: Client ---> Server <---redir---- -------> Server 2 <----answer----