
The Association of System

Performance Professionals

The Computer Measurement Group, commonly called CMG, is a not-for-profit, worldwide organization of data processing professionals committed to the measurement and management of computer systems. CMG members are primarily concerned with performance evaluation of existing systems to maximize performance (e.g. response time, throughput) and with capacity management, where planned enhancements to existing systems or the design of new systems are evaluated to find the necessary resources required to provide adequate performance at a reasonable cost.

This paper was originally published in the Proceedings of the Computer Measurement Group’s 2001 International Conference.

For more information on CMG please visit http://www.cmg.org

Copyright Notice and License

Copyright 2001 by The Computer Measurement Group, Inc. All Rights Reserved. Published by The Computer Measurement Group, Inc. (CMG), a non-profit
Illinois membership corporation. Permission to reprint in whole or in any part may be granted for educational and scientific purposes upon written application to
the Editor, CMG Headquarters, 151 Fries Mill Road, Suite 104, Turnersville, NJ 08012.

BY DOWNLOADING THIS PUBLICATION, YOU ACKNOWLEDGE THAT YOU HAVE READ, UNDERSTOOD AND AGREE TO BE BOUND BY THE
FOLLOWING TERMS AND CONDITIONS:

License: CMG hereby grants you a nonexclusive, nontransferable right to download this publication from the CMG Web site for personal use on a single
computer owned, leased or otherwise controlled by you. In the event that the computer becomes dysfunctional, such that you are unable to access the
publication, you may transfer the publication to another single computer, provided that it is removed from the computer from which it is transferred and its use
on the replacement computer otherwise complies with the terms of this Copyright Notice and License.

Concurrent use on two or more computers or on a network is not allowed.

Copyright: No part of this publication or electronic file may be reproduced or transmitted in any form to anyone else, including transmittal by e-mail, by file
transfer protocol (FTP), or by being made part of a network-accessible system, without the prior written permission of CMG. You may not merge, adapt,
translate, modify, rent, lease, sell, sublicense, assign or otherwise transfer the publication, or remove any proprietary notice or label appearing on the
publication.

Disclaimer; Limitation of Liability: The ideas and concepts set forth in this publication are solely those of the respective authors, and not of CMG, and CMG
does not endorse, approve, guarantee or otherwise certify any such ideas or concepts in any application or usage. CMG assumes no responsibility or liability
in connection with the use or misuse of the publication or electronic file. CMG makes no warranty or representation that the electronic file will be free from
errors, viruses, worms or other elements or codes that manifest contaminating or destructive properties, and it expressly disclaims liability arising from such
errors, elements or codes.

General: CMG reserves the right to terminate this Agreement immediately upon discovery of violation of any of its terms.

PAGE ANALYZER: A TOOL FOR WEB PERFORMANCE MODELING

Richard A. Mushlin and W. Nathaniel Mills III,
IBM T. J. Watson Research Center, Yorktown Heights, NY 10598

Web page retrieval time depends on many factors. Measuring total retrieval time only shows if and when a problem exists, not where or why. Tools providing detailed breakdowns of page retrieval activities help to highlight performance problems. By modeling these activities under different assumptions, we have built tools to analyze and predict performance. By comparing the "as-is" and "what-if" performance results, we can assess the effects of various changes to web content design and deployment, making informed, cost effective decisions resulting in improved customer browsing experience.

Introduction

What is web performance, and why model it?

In this paper we describe a new tool for modeling web page retrieval performance to better understand the efficiency of retrieving data over the Internet using standard protocols and client applications. Recently Mills et al. [1] have described a set of performance metrics which take into account the differences between overhead and content when analyzing the durations and sizes of web transactions. For the purposes of this paper, however, we focus on total page retrieval time, and how that time is decomposed into schedules of activities to retrieve the page components. Anyone who has "surfed the net" knows that web performance can vary tremendously - from site to site, from page to page, from day to day, from minute to minute. As the web becomes more and more a part of routine business, entertainment, productivity, and education, the end user wants and expects the interactions to be as fast as possible. Service and content providers compete to meet these expectations, and IT vendors and consultants compete to sell hardware and software solutions to enhance the performance of customer browsing experiences. Directly gauging the effectiveness of these solutions means installing them, often at substantial cost, and then measuring to glean anticipated performance improvements. If the solution turns out to have less impact than anticipated, the measurements often give no clue as to why. Clearly, it would be an advantage to be able to estimate the impact of a solution before buying and implementing it. Performance modeling provides a way to make these estimates based on actual "as-is" measurements and well defined transformation and scheduling procedures. As the modeling acquires a track record for accurate prediction, it becomes more and more valuable as a guide for selecting and applying performance solutions. Although this paper deals with modeling of an isolated web page retrieval, there is motive for expanding the scope to logically connected groups of web pages. The use of modeling to analyze and predict resource utilization when web pages are linked by e-business processes is discussed by Menasce et al. [2]. Some of the grand challenges of Internet performance modeling are reviewed by Crovella et al. [3].

What do we measure and what do we model?

The starting point for modeling "what-if" scenarios is an understanding of the elements that make up the process under study and their contributions to the metrics being evaluated. For the web page retrieval process, we have based our modeling on a representation that corresponds to what we can measure. When a browser goes after a web page, a series of operations is set in motion which has a logical sequence. At a high level, the sequence can be described as:

1. Look up the address of the server where the document is located.
2. Open a connection to that server.
3. Request the document.
4. Receive the document.
5. Interpret the document.
6. Render the document.

In some situations, steps 2 and 5 are more complex. For step 2, firewall and/or security protocols may make the connection a multi-step operation, such as:
• 2a. The client opens a socket connection to a firewall and requests a connection to the document server.
• 2b. The firewall (socks server) intercepts this request, checks if the transaction is allowed, and if so, opens a socket connection on the client's behalf.
• 2c. If encryption is required, the client and server must set up a secure connection, typically using SSL protocols. This SSL connection is then used to carry the encrypted request and receipt of the document.

Note that if these three steps are present, they must be carried out in the specified order and without overlap.

For step 5, interpreting the document almost always reveals embedded documents which must be retrieved, interpreted, and rendered in a similar fashion. The browser typically handles this by using several concurrently active sockets, allowing the embedded documents to be retrieved somewhat in parallel. However, there are some choices the browser must make, and some constraints which limit these choices. For example, if the embedded document is located on a server whose address is already known, step 1 can be skipped, otherwise it must be looked up. If there is already an open connection to that server and it is not in use, then step 2 can be skipped, otherwise a new connection must be established. This process of interpreting, requesting, and retrieving continues until there are no more embedded documents and the rendering is complete.

Sometimes the server cannot provide the document requested, but it does provide another address for the client to try (like a forwarding address at the post office). These redirections are wasteful because as many as five steps must be completed before the client learns it must look elsewhere, usually at a cost of repeating these steps with the new address. Sometimes attempts to connect fail, or requests for documents are denied. These errors may also force retries, which take time.

Using a tool developed in our lab called Page Detailer [1] [4], we are able to measure the durations of these individual steps for each item, be they address lookup, connection, redirection, document retrieval, or error. We also measure the amount of data transferred during these steps. The measured data set thus is capable of providing a very detailed picture of the web page retrieval process - every begin and end time stamp and every data size for every step of every item on the page, plus: what kinds of steps were used to retrieve each item, what kind of item was processed, the sockets, IP addresses, and port numbers used for each connection, and the URL of each document. Fortunately, the Page Detailer measurement tool provides a graphical interface for visualizing this data. Since the Page Analyzer modeling tool uses a similar interface, it is useful at this point to look at some data and become familiar with the representation and terminology.

[Figure 1: Sample view of measured performance data prior to a modeling session, with a 1 second scale bar.]

Figure 1 shows an example of one view of the measured data for retrieval of a web page. The total time for retrieving the page, the content size, the number of items, and the date are shown near the top. The purple bar at the top represents the retrieval time of the entire page. The total time was 3.911 seconds to get 83871 bytes of content. Each multicolored bar represents one item. The first multicolored bar is an HTML item, and all the rest are embedded GIF images. Only about half of the 67 items on this web page are visible in the figure. Within each item, each step is a different color:

- Yellow = Connect (step 2a)
- Red = Socks connect (step 2b)
- Blue = Server response (step 3)
- Green = Content delivery (step 4)

In addition to the timing measurements, the data sizes (sent and received) for each step are also captured (but not shown in figure 1). Based on this picture, some typical observations about the page retrieval performance include:

• The socks connection consistently takes longer than the initial socket connection.
• There is a wide variation in server response times.
• There are a lot of items, some of which have very small durations.
• Many of the items have their own separate connections.

These observations might lead us to speculate on what the performance would look like if we made a few changes, such as:

• Eliminate the use of the socks firewall. (This is often possible when employees have misconfigured their browsers to use the firewall to access intranet sites.)
• Increase server capacity in the hope of avoiding long response times.
• Combine small images, if possible, into fewer, larger images, reducing the number of interactions with the web site.
• Keep connections open whenever possible, and reuse them when available.

The purpose of the Page Analyzer modeling tool is to allow the effects of these kinds of scenarios to be calculated, based on existing measurements and an understanding of how web page retrieval works. After the calculation is done, the modeled data, which has the same form as the measured data, can be visualized using the same interface, for comparison with the original unmodeled image.

Note that this type of modeling is not the only kind of performance analysis available to improve performance. Killelea [5] describes web performance tuning at length. His book is filled with practical guidelines for improving performance, and is illustrated with sample calculations using "typical" numbers. Menasce [6] uses more theory to calculate queue lengths, throughput, latencies, availability, and other metrics. Both of these authors are targeting the average, overall performance, qualitative or quantitative, for entire networks or web sites. Our goal is much more focused on the characterization of particular web pages and the factors that come into play for those pages. We are trying to estimate the impact of changes that Killelea or Menasce might suggest, by doing virtual experiments. By focusing on content design and deployment issues, we address many performance improvements that are less time-consuming and costly to implement.

Methods

Modeling approach

In developing a tool for modeling web page retrieval performance at the level of granularity discussed above, we have taken a scheduling approach. Scheduling is a well studied, and often very complex, discipline. One way to describe a schedule is that it shows when tasks may be performed by (or using) resources, subject to constraints. Scheduling gets complex when the possible combinations of tasks, resources, and constraints get numerous. For our scheduling problem, the complexity is limited by the well defined sequence of web transaction steps, the design of popular browser software, and the nature of TCP/IP socket communications, which together play the role of constraints. The resources for our problem are the connections to servers, represented as a socket id, IP address/port pair, and the interpreted HTML source document, which drives item retrieval. The tasks are the items and the steps that go into retrieving them. Thus, running a model consists of making changes to the task attributes, reallocating resources, modifying and applying constraints, and recalculating the schedule under the new conditions.

For our modeling tool, we have organized these scheduling operations into 3 computation layers. Each model is implemented on this blueprint, performing the appropriate calculations in each layer. The first, or innermost, layer involves the modification of size and/or timing attributes of individual steps (the colored segments in figure 1). Depending on the model, the changes may be applied to one kind of step for all items, or certain steps of certain items. As an example, suppose a model requires reducing the duration of one step to 50% of its original value, and eliminating another step. This is shown schematically in figure 2 for one item.

[Figure 2: Example of First and Second Layer Modeling Operations. An item with steps a, b, c, d is shown before any modeling; after the first layer of modeling (step b shortened, step c eliminated); and after the second layer of modeling (step d shifted left).]

The top row shows the steps before any modeling takes place. The middle row shows the 50% reduction of step b and the elimination of step c. Note that after the first layer of modeling, there has been no change in the overall duration of the item, because we have not changed the start times of any of the steps.

In the second modeling layer, we remove the "gaps" within each item by shifting the start time of each step so that it matches the end time of the previous step. This represents applying the constraints derived from our knowledge of the web transaction sequence. The bottom row of figure 2 shows the item after shifting step d. (The fate of step c depends on the semantics
of "eliminate". If "eliminate" means "set duration and size to zero", then a zero-duration step c would be shifted to the end of step b, and step d would be shifted to the end of step c. For scheduling purposes, this gives the same result as if "eliminate" meant "remove from the data set".) Note that the item's overall duration is now shorter. If this were the end of the modeling operations, the result would be some shorter items, but the total time to retrieve the page, which ends when the last step of the last item ends, might not change much. To see why this might be the case, consider a page with items initially of all the same duration, as shown in figure 3a. If we then apply a model which eliminates one step from every item, shortening every item by 25%, we get the situation in figure 3b.

[Figure 3: Page with Two Items, Modeling Layers 1 and 2. (a) Before modeling, items 1 and 2 overlap. (b) After modeling layers 1 and 2, both items are shorter; the savings comes only from the shortening of item 2.]

Note that in this scenario the savings comes only from the shortening of item 2. Clearly, we would like to shift item 2 to start earlier. The question is, of course, how early can an item start?

[Figure 4: Page with Two Items, Modeling Layer 3 with Proportional Interpretation Time. (a) Before modeling, item 2 begins 80% into the content delivery of item 1. (b) After modeling layer 3, using proportional interpretation time, item 2 starts at the same 80% point of item 1's now shorter content delivery, increasing the savings.]

Modeling layer 3 addresses this issue. There are two major factors which limit an item's start time: the URL of the item must be known to the client before it can be requested, and a socket must be available for connecting to the item's server. First consider the issue of how soon the client can know the item's URL. Again using our simple page with two items, assume that item 1 is an HTML document containing a reference to a GIF image, item 2. Before modeling, item 2 starts about 80% into the content delivery (HTML source) of item 1, as shown in figure 4a. After modeling layers 1 and 2, the situation is as we saw in figure 3b. We now apply a proportional HTML interpretation time model in layer 3. This model assumes that items referenced in an HTML document can never start any sooner than their original proportion into the HTML source. In figure 4b we show item 2 shifted to start at 80% into the content delivery of item 1, the same fraction as before any modeling. In general, one HTML document may reference many other items. Clearly, these references are sequential in the source, and the browser takes some time to read and interpret the source, and to become aware of each reference in turn. The proportional interpretation model maintains the sequence and relative timing at the end of the computation. As a result of shifting item 2 in layer 3 (figure 4b), the total savings is greater than could be obtained with layers 1 and 2 alone (figure 3b).

The second factor affecting an item's start time is the need for a socket connection to the item's server. The browser has two ways to obtain a socket connection: use an existing one, or open a new one. To use an existing socket connection to retrieve an item, that connection must be with the server for that item's URL, the connection cannot be in use, and the connection must remain open (be "kept alive") until needed. To open a new connection, the number of simultaneous socket connections available to the browser (typically 2 or 4 per frame) must not all be in use. Using an existing connection, if it is available, saves the time of opening a new one. However, at some point, waiting for an existing connection to become available can take longer than making a new connection. On the other hand, if all the allowed sockets are in use, there is no choice but to wait. How do these tradeoffs affect the start time of an item? If there is a live connection to the right server, then the item can start as soon as the previous item on that connection is finished. If there is no live connection to the right server, then the item can start, by making a new connection, as soon as one of the allowed number of connections is not in use. For the purpose of performance modeling, we focus on the potential to reuse live sockets whenever possible. A detailed example of this type of model is described in the results section.
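To make the three layers concrete, here is a minimal Python sketch of the operations on a toy two-item page. This is our illustration only, not the Page Analyzer source; the Step and Item classes and the function names are invented for this example, and the real scheduler handles many constraints (socket availability among them) that this sketch ignores.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Step:
    kind: str          # "connect", "socks", "server_response", "content", ...
    start: float       # seconds from the start of the page retrieval
    duration: float

    @property
    def end(self) -> float:
        return self.start + self.duration

@dataclass
class Item:
    url: str
    steps: List[Step] = field(default_factory=list)

    @property
    def start(self) -> float:
        return self.steps[0].start

    @property
    def end(self) -> float:
        return max(s.end for s in self.steps)

def layer1_modify(item: Item, shorten: str, factor: float, eliminate: str) -> None:
    # Layer 1: change step attributes only; start times are left alone.
    for s in item.steps:
        if s.kind == shorten:
            s.duration *= factor
        elif s.kind == eliminate:
            s.duration = 0.0   # "eliminate" = set duration (and size) to zero

def layer2_remove_gaps(item: Item) -> None:
    # Layer 2: each step starts when the previous step ends.
    for prev, cur in zip(item.steps, item.steps[1:]):
        cur.start = prev.end

def layer3_html_shift(item: Item, html_item: Item, proportion: float) -> None:
    # Layer 3 (proportional interpretation): shift the whole item left, but no
    # earlier than its original fraction into the HTML content delivery.
    content = next(s for s in html_item.steps if s.kind == "content")
    earliest = content.start + proportion * content.duration
    delta = item.start - earliest
    if delta > 0:                      # only shift left, never right
        for s in item.steps:
            s.start -= delta

# Toy page: item 1 is the HTML, item 2 a GIF referenced 80% into the source.
html = Item("index.html", [Step("server_response", 0.0, 0.5), Step("content", 0.5, 1.0)])
gif = Item("logo.gif", [Step("connect", 1.5, 0.2), Step("socks", 1.7, 0.3),
                        Step("server_response", 2.0, 0.3), Step("content", 2.3, 0.5)])
layer1_modify(gif, shorten="server_response", factor=0.5, eliminate="socks")
layer2_remove_gaps(gif)
layer3_html_shift(gif, html, proportion=0.8)
print(f"modeled page time: {max(html.end, gif.end):.2f} s")   # toy page drops from 2.8 s to 2.15 s

The toy numbers only mirror figures 2 through 4: layer 1 leaves the item's span unchanged, layer 2 shortens the item, and layer 3 recovers additional savings by moving its start.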


Architecture overview

Our modeling tool consists of a framework and a set of specific models which can be run against the performance data. Figure 5 shows the relationships between the high level functional components.

[Figure 5: Block Diagram of Functional Components. The performance data, the modeling parameters, and the individual models (m1, m2, m3, m4) are attached to the model controller, which drives the scheduler.]

The model controller contains utilities which are needed by all models, such as reading and writing performance data and modeling parameters, while the specific models themselves are attached to the controller. A modeling run consists of an initialization phase and an execution phase. During the initialization phase, the controller gives each model access to the existing data for the page being modeled and to the modeling parameters. The user may select or deselect each model to be employed, and modify the parameters of the selected models. During the run phase, the controller specifies the appropriate input data and starts the calculation procedure for each model. Each calculation procedure has code specific to the model logic being used, but the overall flow of the procedure is common across all models:

1. Create 2 schedules based on the input data, a "before" and an "after" schedule.
2. Modify data in the "after" schedule according to the specific model logic (layer 1).
3. Re-flow the "after" schedule to conform to the constraints on web transactions (layers 2 and 3).
4. Compare the "before" and "after" schedules.
5. Append the data from the re-flowed "after" schedule to the original input data.

This flow is illustrated in figure 6.

[Figure 6: Flow of a Generic Modeling Run. The input data yields a "before" schedule; the model modifies values to produce an "after" schedule, which is re-flowed, compared with the "before" schedule, and written to the output data. A modeling mode selects whether several models run in parallel or in series.]

Every modeling run produces one or more output data sets to be appended to the original and stored in a repository of performance data. This structure was designed so that the data for each web page is kept together - original measurements plus any modeled data - as "versions" of the same page under different conditions. For example, version 1 of a page is, by convention, the actual measured data. Version 2 may be the modeled data for eliminating socks connections; version 3 the modeled data for combining small images, etc. Since the versions all have the same structure, just different values, any version can serve as the "before" schedule for another modeling run. Thus, a modeling run can be set up to apply the selected models in parallel to the same input data, or in series, with each model taking as input the output of the previous model. For example, running 3 models in parallel would produce 3 new output versions, but in series would produce 1 new version containing the cumulative effect of all 3 models.
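As a rough sketch of this flow (our own illustration with invented names, treating a schedule simply as a list of Item objects like those in the earlier sketch), the common procedure and the series/parallel composition might look like:

import copy

def page_time(schedule):
    # Total retrieval time: the page ends when the last step of the last item ends.
    return max(step.end for item in schedule for step in item.steps)

def run_model(modify, reflow, input_schedule, params):
    # The flow common to all models (steps 1-5 above).
    before = copy.deepcopy(input_schedule)           # 1. "before" and "after" schedules
    after = copy.deepcopy(input_schedule)
    modify(after, params)                            # 2. layer 1, model-specific changes
    reflow(after, params)                            # 3. layers 2 and 3 in the scheduler
    savings = page_time(before) - page_time(after)   # 4. compare
    return after, savings                            # 5. caller appends "after" as a new version

def run_in_parallel(models, measured, reflow, params):
    # Every selected model is applied to the same measured data: one new version each.
    return [run_model(m, reflow, measured, params) for m in models]

def run_in_series(models, measured, reflow, params):
    # Each model takes the previous model's output: one cumulative version.
    current = measured
    for m in models:
        current, _ = run_model(m, reflow, current, params)
    return current, page_time(measured) - page_time(current)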


Model implementation

We have implemented a suite of models and run them against pages from various sources. In this section we describe the models and how they work. Our models are organized into a shallow hierarchy as shown in figure 7.

The model class manages the model name and id, the input page and version, the two schedules, and the select and enable switches. "Selected" simply means the user wants to run the model. "Enabled", however, depends on both user-specified parameters and the situation present in the input data. For example, suppose the user wants to run the model to eliminate socks connections. This only makes sense if there are any socks connections in the data, AND the use of the socks firewall is negotiable. That is, the user need not model those situations that may not be practical to implement. Each model has its own enabling logic. If a model is both selected and enabled, it runs.

[Figure 7: Hierarchy of Models. Activity-eliminating models: eliminate socks connections, eliminate ssl connections, eliminate redirections. Duration-limiting models: limit connect time, limit socks connect time, limit ssl connect time, limit server response time, limit content delivery time. Size-limiting models: limit content delivery size at constant rate. Resource allocation models: eliminate duplicate images, consolidate small images, all content on one server, html-constrained shift.]

The next level is implemented as classes only for the top three types - the resource allocation type is a conceptual type only. The activity eliminating class manages the two attributes which enable the model - activity present and activity required. The activity must be present and not required to enable the model. The duration- and size-limiting classes manage the limit mechanism for these types of models. To illustrate the limit mechanism, consider a model which limits the duration of server response. The group of all server response durations for the page has a distribution, with a minimum and a maximum. These values serve as initial floor and ceiling limits on the duration. The model then calculates a new floor and ceiling based on 3 parameters supplied to the model. The first parameter controls whether the new values should be set to absolute specified values, or to values relative to the initial floor and ceiling. If absolute settings are specified, then the two parameters required are the new floor and ceiling. If relative settings are specified, then the two parameters required are the percentages of the range by which the floor should be raised and the ceiling should be lowered. Figure 8 shows an example of the effect of relative settings.

[Figure 8: Example of Relative Limit Settings. An initial floor of 100 and ceiling of 900 (range 800), with the floor raised by 25% of the range and the ceiling lowered by 50% of the range, gives a final floor of 300 and ceiling of 500 (range 200).]

A common scenario involves limiting all durations to the minimum observed duration. This is done by leaving the floor alone, and adjusting the ceiling down by 100% of the range. Another interesting scenario involves seeing what happens if the durations are increased, as might be the case when saving cost by using fewer servers or restricted bandwidth. All the duration-limiting models use this mechanism, applied to the durations of the appropriate step in each case.
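In code, the relative-limit arithmetic amounts to a few lines. The sketch below is only an illustration (the function names are ours, not the tool's); it reproduces the numbers of figure 8 and the minimum-duration scenario.

def relative_limits(floor, ceiling, raise_floor_pct, lower_ceiling_pct):
    # New floor and ceiling expressed as percentages of the observed range.
    rng = ceiling - floor
    return (floor + raise_floor_pct / 100.0 * rng,
            ceiling - lower_ceiling_pct / 100.0 * rng)

def clamp(durations, floor, ceiling):
    # Apply the limits to every duration of the chosen step type.
    return [min(max(d, floor), ceiling) for d in durations]

# Figure 8: floor 100, ceiling 900 (range 800), floor up 25%, ceiling down 50%.
print(relative_limits(100, 900, 25, 50))                # -> (300.0, 500.0)

# "Limit to the minimum observed duration": leave the floor, ceiling down 100%.
server_response = [120, 480, 900, 250]                  # milliseconds, made-up values
lo, hi = relative_limits(min(server_response), max(server_response), 0, 100)
print(clamp(server_response, lo, hi))                   # -> every duration clamps to 120.0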
The size-limiting models are similar to the duration-limiting ones, in that they also use the floor/ceiling mechanism applied this time to the data sizes of the steps. The difference is that the size-limiting models assume the original data transfer rate. In terms of performance, a comparison of modeled and measured data sizes only makes sense if the transfer rates are assumed to remain consistent. Seen another way, a model which halves the data size at constant transfer rate is equivalent to, from a timing perspective, one that doubles the transfer rate at constant size. Therefore, this type of model can also be used to get a rough estimate of the relative performance of a web page on a modem- versus LAN-connected client.

The resource allocation models need individual explanations. Eliminating duplicate items would seem to be an obvious way to improve performance. We restrict this model to image files, as those are the most frequent duplicates. The model takes a parameter specifying what constitutes a duplicate - equal URLs, equal filenames on same server, equal filenames on any server. An item is eliminated by setting all the attributes of all its steps to zero.

Placing all the content on one server is a scenario which addresses the issue of too many connections. When content is distributed across many servers, the server response times may go down, but a new
connection is needed for each server, and the overall performance may suffer. Placing all content on one server may become advantageous if the connections to that server are kept alive and reused on a first-available basis. (Network sprayers or load balancers can be used to distribute the back-end load to multiple servers while allowing the socket connection to the client to be kept alive.) This model takes a parameter specifying the number of concurrent sockets the browser can have open, typically 2 or 4 per frame. The model allocates these N sockets to the first N content-bearing items it encounters. The model then adjusts the start time of the next item to equal the earliest end time of the previous N items. In addition, any connection steps within the adjusted item are eliminated, leaving only server response and content delivery for all but the first N content-bearing items.

For all the models, layers 2 and 3 (remove gaps within items and shift whole items) are implemented in the scheduler. Removing gaps between steps is straightforward, and is not parameterized in our models. The item shift process, however, is subject to some constraints which are supplied as parameters. For example, if HTML interpretation time is to be considered in limiting an item's shift to an earlier start, then a parameter indicating how to scale that time is used. The scaling can be "observed proportional" (as discussed previously and depicted in figure 4), or "fixed proportional", where the best case has proportion = 0%, and the worst case has proportion = 100%.

Results

To illustrate the utility of our modeling tools, we present some performance data for a variety of models. Each model was run against the same input data, with "observed proportional" HTML interpretation time turned on. Several examples are run in series to see the cumulative effect of the selected models. The starting point of our modeling session is the measured performance displayed in figure 1. For this web page, the HTML source is in the first item; all the rest of the items are images. Of the 67 items in the data set, the first 3 are shown. The total page retrieval time before modeling is 3.911 seconds. In the figures that follow, the width of the figure is held constant at the pre-modeling retrieval time, except figure 12, where the width is the modeled retrieval time. The total page retrieval time is always shown at the top of the figure.

Eliminate socks connections

Looking at the data in figure 1, we see that the socks connections seem to take up a large amount of time for many of the items. Perhaps if we did not have to go through a socks firewall the performance would improve. Figure 9 shows the results of eliminating all socks connection steps from all items.

[Figure 9: Modeled performance after eliminating socks connections, with the savings relative to the measured page indicated.]

There was some improvement, about 499 milliseconds for the whole page. Why did removing so many major steps have only a minor effect? Recall the layered scheduling approach: The model removes the socks step from each item (layer 1), closes the gap by left-shifting all later steps in the item (layer 2), then adjusts the start of each item (layer 3). (For this data set, the HTML interpretation adjustment, although enabled, did not come into play because only the first item has HTML source, and there was no attempt to start another item until almost all the HTML was delivered.) Almost all the layer 3 adjustment comes from starting an item no later than the end of the previous item. So if two adjacent items overlap, and shortening the first one does not remove the overlap, there is no effect on the start time of the second one. Since there is considerable overlap to begin with, and since the socks steps were not long enough to remove much of that overlap, their elimination had little effect on the overall page retrieval time.

Eliminate socks, plus limit connect and server response times to observed minimum values

If eliminating socks by itself did not have a pronounced effect, would shortening the connection and server response times remove enough item overlap to make a difference? What would be a best case scenario? We combined the previous model in series with the relative limits models for connect time and server response time, with the ceilings set 100% down (i.e., limited to the minimum observed duration). What is left is a bare minimum of network and server activity, plus the original content untouched. The result is shown in figure 10.
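The socks-eliminating step at the heart of both of these runs is a small layer 1 operation. A rough sketch against the Step and Item classes of the earlier example (invented names, and far simpler than the tool's real enabling logic):

def socks_present(items):
    return any(s.kind == "socks" for item in items for s in item.steps)

def eliminate_socks_enabled(items, socks_required=False):
    # Enabled only if socks steps exist AND the socks firewall is negotiable.
    return socks_present(items) and not socks_required

def eliminate_socks(items):
    # Layer 1 only: zero out every socks connect step.
    # Layers 2 and 3 in the scheduler then close the gaps and shift the items.
    for item in items:
        for s in item.steps:
            if s.kind == "socks":
                s.duration = 0.0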


The savings was about 682 milliseconds, only slightly more than the savings from eliminating socks alone. The model shortened many items enough to force their successors to be "dragged left". This model suggests that the socket scheduling strategy of the client and server dominates the retrieval performance of this web page. For the example shown here, that strategy results in almost complete sequential retrieval of the content, the only significant parallel retrieval being for item #3. Since we have already removed almost all of the network and server time, any further improvement must come from changes to the socket scheduling or the content itself.

[Figure 10: Modeled performance after eliminating socks and limiting connect and server response times to their observed minimums, with the savings indicated.]

Eliminate socks, plus limit connect and server response times to observed minimum values, plus assume all content on one server and use 4 browser sockets on a first-available basis

In this model we modify the socket scheduling strategy. To simplify the allocation of sockets to servers, we pretend that all items reside on the same server as the first item. Next, we assign the 4 available sockets to the first 4 items without shifting them at all. Then we proceed to assign one of the 4 sockets to each item in turn. The assignment is made on a first-available basis. That is, the 4 items currently using the 4 sockets are examined for their end times. The socket from the item with the earliest end time of the 4 is assigned to item 5, and item 5 is shifted to start when the "donor" item ends (subject to the HTML interpretation constraint). The end time for item 5 then replaces that of the donor item in the socket pool. The remainder of the items are processed in this way. The result is shown in figure 11.
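In code, the first-available assignment can be sketched as follows (our simplified illustration, reusing the Item and Step classes from the earlier sketch; the real model also applies the HTML interpretation constraint when it shifts an item):

import heapq

def first_available_allocation(items, n_sockets=4):
    # All content on one server, N keepalive sockets, first-available assignment.
    # Items 1..N keep their measured start times; each later item loses its
    # connection steps and is shifted to start when the earliest-finishing
    # item currently holding a socket ends.
    pool = [(item.end, sock) for sock, item in enumerate(items[:n_sockets])]
    heapq.heapify(pool)
    for item in items[n_sockets:]:
        donor_end, sock = heapq.heappop(pool)      # socket whose item ends first
        item.steps = [s for s in item.steps        # keep only server response
                      if s.kind in ("server_response", "content")]   # and content delivery
        shift = item.start - donor_end
        for s in item.steps:
            s.start -= shift                       # item now starts when the donor ends
        heapq.heappush(pool, (item.end, sock))     # its end time replaces the donor's
    return items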
[Figure 11: Modeled performance after eliminating socks, limiting connect and server response times to their minimums, placing all content on one server, and using 4 keepalive sockets, with the savings indicated.]

The same data is shown with an expanded timescale in figure 12.

[Figure 12: Expanded time scale view of the same model (eliminate socks, minimum connect time, minimum server response time, all content on one server, 4 keepalive sockets), showing how items are threaded onto sockets 1 through 4, with item 14 marked.]

Recall that the HTML constraint is turned on, so before item 14, which starts after all HTML has been interpreted, an item may not be able to start as early as a socket becomes available. This can be seen in item 5, where socket 2 is available early, but item 5 has to wait until its HTML constraint is satisfied before it can take advantage of the available socket. Beginning with item 14, every item is free to start when there is an available socket. The path of one of the sockets through several items is traced with the dotted line. Note that once the 4 sockets have been opened, and connections established to the (single) server, no more connections need be made. Thus even if we had not previously eliminated the socks connections, and even if this web page used SSL connections, none of these connections would be required from item 5 on. The result of this keepalive strategy is a significant improvement in performance. Compared
with the "almost pure content" model in figure 10, adding the single server keepalive feature saves about 3/4 of the retrieval time. The results of the three scenarios are summarized in figure 13.

Figure 13: Summary of modeled performance

                              measured   + no socks   + pure content   + keepalive
  total retrieval time (s)     3.911       3.412        3.229            0.894
  total savings (s)              -          .499         .682            3.017
  % total savings                -          13%          17%              77%
  incremental savings (s)        -          .499         .183            2.335
  % incremental savings          -          13%           5%              72%

Discussion

The examples presented in this paper demonstrate the value of modeling web page retrieval performance at the level of component retrieval activity steps. If the analysis were performed only at the level of whole item retrieval there would be no way of gauging the relative contributions of the firewall, network, server, or other activity or resource components. This knowledge is valuable because it suggests the potential impact of changes. In our example, the socks firewall came into play because the target web page was being accessed from a node on a corporate network (my office workstation). The cost of 8% in this case seems well spent in order to make the intranet more secure. However, it is possible to implement a socks firewall poorly, using under powered servers and creating bottlenecks. If socks bottlenecks were suspected, one could compare the measured performance with the no-socks model to get an idea of the maximum benefit achievable by eliminating the firewall altogether. If improvements were then made, such as rotating traffic through multiple servers, or using a non-socks firewall, the fractional improvement could be estimated. The model helps decide if this is even worth trying. There is work in progress in our lab to validate the model predictions against an alternative gateway which would not involve the additional communication volley associated with the socks gateway.

Making very fast connections and getting very fast turnaround from the content servers resulted in a significant performance improvement in our models. What determines how long these steps take? Each depends on several factors. For connections, there is the routing speed (network latency) which determines how fast the "connect" message gets to the server. When the server gets the message, it may be busy making other connections, or the maximum number of connections for that server may already be in use. The ability to handle incoming connections is clearly a key factor, and has been the focus of recent work by van der Mei et al. [7].

The response of the content server to the request for data also has sub-steps, any of which can cause delays. First is the same network delay and service queue that may come into play during the connection step. Second, the content may have to be dynamically assembled, or it may reside on a back-end server such as a database. It would be nice to know a more detailed breakdown of the overall server response time. There is work in progress in our lab to allow the proportion of actual service time to total response time to be measured. This would allow the modeling to be extended to account for routing and queuing delays separately from the time it takes the server (and downstream servers) to actually perform the service request. One modeling scenario might be to link the connect time to the network/queuing portion of the server response time, on the grounds that getting a connection and getting a service request started ought to take about the same time. Another way to say this is that if, for a particular server, the server response time is significantly longer than the connect time, then that difference probably represents actual server work, and making the network faster will not help much.

The most significant effect as seen in our modeling example comes from a more efficient use of available sockets. For the single server case, the model we have chosen suggests that this could far outweigh tinkering with gateways and routers in improving the user experience. The implementation would require the use of keepalive sockets, supported by HTTP 1.1. The client has to request the feature, and the server has to agree, for each transaction. Even if there is agreement, either side may close the socket for its own reasons and without notice. For the benefits predicted by the model to be realized, the keepalive requests have to be honored. One reason this may not happen is that if the server is already at its maximum number of simultaneous connections, then every connection kept open for one customer may be another connection refused for another customer accessing the site. Strategies for optimizing traffic by honoring some fraction of keepalive requests are beyond the scope of this paper. However, a policy of refusing all keepalive requests (or not upgrading from HTTP 1.0) can clearly hurt performance.
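On the client side, the keepalive strategy is simply a persistent HTTP/1.1 connection. A minimal sketch with Python's standard http.client module (the host and paths are placeholders, and the sketch assumes the server honors keep-alive):

import http.client

# One socket, many requests: HTTP/1.1 connections stay open by default, so the
# connect (and any socks/SSL handshake) cost is paid only once for the page.
conn = http.client.HTTPConnection("www.example.com")
for path in ("/index.html", "/images/logo.gif", "/images/banner.gif"):
    conn.request("GET", path, headers={"Connection": "keep-alive"})
    resp = conn.getresponse()
    body = resp.read()    # the response must be fully read before the socket is reused
    print(path, resp.status, len(body))
conn.close()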
An interesting set of questions arises whenever multiple models are combined in some way to give a composite prediction: How does the composite depend on the way the models are combined? Does the order matter? What, if any, are the interactions between models? First consider the real world that the models are targeting. It seems clear that the performance of a real system depends on the
comprehensive state of the system and its evolution rules, not on how the state or the rules came to be. If models were totally realistic simulators then the same independence would apply to the order in which the models were composed. In practical terms, however, the modeling software we have described is not a simulator. It does not duplicate the state of the target system, change parameters, fire the rules, and see what happens. It is an analysis tool which operates on high level constructs which represent, but are not identical to, the features of the real system. We are modeling the performance metrics of the system, not the system itself.

So, what kind of independence between our models should we expect? Recalling the discussion of the functional flow and the three modeling layers, the answer hinges on the implementation of "in series" and the partition of function across the layers. For series operation, we take as input the previous model's output. Some models, such as those in the limiting group, can require reference values derived from the data, such as the basis for setting the relative floor and ceiling. If these reference values are derived directly from the output of the previous model then clearly the order of composition will matter. If, however, we are careful to always use the unmodeled data for deriving reference values, then these models will commute, more closely approximating the real world scenarios.

The effect of the functional layering on the independence of the models is more complicated. Every model has its own layer 1 (modify steps) function, and so is immune to interactions (subject to the reference value caveat above). Layers 2 and 3 are shared by all models, and so have the potential to cause interactions. As an example of a layer 3 effect, consider the HTML-constrained shift. This model has no layer 1 or 2 function. It controls the way in which layer 3 is performed by limiting how far left we can shift an item, based on the state of some prior item (the HTML source). Since layer 3 is used by all models, its position in the series does matter. In the results presented here, the HTML-constrained shift model was always run first, and therefore was part of every model. The single server keepalive model is another example of model-specific layer 3 function. Recall that this model sets the start time for an item based on the end time of a prior item that has an available open connection. If that prior item is not immediately preceding the affected item, then the link between them will not be maintained across subsequent models. In the results presented here, the keepalive model was run last. These dependencies are a consequence of the way we organize our software, not basic properties of the model logic. We could choose to cast all layer 2 and 3 control features as parameters of every model, so that their effects would always be local.

The examples presented here represent single measurements of a single web site, primarily for the purpose of describing the modeling operations and illustrating their results. In practice, measurements of a single web site exhibit variations in performance at all levels of granularity. There is work in progress in our lab to automate the collection of these samples, and expand the modeling framework to create, analyze, and manipulate the statistical representations of these measurements.

Conclusion

Measurement and modeling complement each other in evaluating the performance characteristics of the web page retrieval process. Measurements at the component retrieval activity level provide information on how the time is being spent. A scheduling model which accounts for the tasks, resources and constraints allows these measurements to be analyzed and scenarios constructed in which task attributes are altered, resources reallocated, constraints modified, and the schedule recomputed. Examples of several scenarios have been run through this modeling process, and the resulting schedules compared to the original. Reductions in the duration of connection, socks connection, and server response steps have a modest effect on the overall schedule. The maximum impact was obtained by reallocating the socket and server resources, and using a strict keepalive socket strategy.

The future of this type of modeling looks promising. Enhancements to measurement techniques provide more attributes to consider. Changes to real network configurations provide opportunities to validate the accuracy of the models. Expansion of the modeling capabilities to analyze statistical samples of repeated measurements will increase the precision of the predictions and enhance the value of the tools.

References

[1] Mills, Chiu, Halim, Hellerstein, Squillante, "Metrics for Performance Tuning of Web-based Applications", Proceedings of the Computer Measurement Group, v.2, pp.783-790 (2000)

[2] Menasce, Almeida, Fonseca, Mendes, "Resource Management Policies for E-Commerce Servers", ACM Sigmetrics, v.27(4), p.27 (2000)


[3] Crovella, Lindemann, Reiser, "Internet Performance Modeling - The State of the Art at the Turn of the Century", Performance Evaluation, v.42, pp.91-108 (2000)

[4] Hellerstein, Maccabee, Mills, Turek, "ETE: A Customizable Approach to Measuring End-to-End Response Times and Their Components in Distributed Systems", International Conference on Distributed Computer Systems (1999)

[5] Killelea, "Web Performance Tuning", O'Reilly (1998)

[6] Menasce, Almeida, "Scaling for E-Business: Technologies, Models, Performance, and Capacity Planning", Prentice Hall (2000)

[7] Van der Mei, Erlich, Reeser, Francisco, "A Decision Support System for Tuning Web Servers in Distributed Object Oriented Network Architectures", ACM Sigmetrics, v.27(4), p.57 (2000)
