CMG - How to Validate Performance and Scalability
Performance Professionals
The Computer Measurement Group, commonly called CMG, is a not-for-profit, worldwide organization of data processing professionals committed to the
measurement and management of computer systems. CMG members are primarily concerned with performance evaluation of existing systems to maximize
performance (e.g., response time, throughput, etc.) and with capacity management where planned enhancements to existing systems or the design of new
systems are evaluated to find the necessary resources required to provide adequate performance at a reasonable cost.
This paper was originally published in the Proceedings of the Computer Measurement Group’s 2001 International Conference.
Copyright 2001 by The Computer Measurement Group, Inc. All Rights Reserved. Published by The Computer Measurement Group, Inc. (CMG), a non-profit
Illinois membership corporation. Permission to reprint in whole or in any part may be granted for educational and scientific purposes upon written application to
the Editor, CMG Headquarters, 151 Fries Mill Road, Suite 104, Turnersville, NJ 08012.
BY DOWNLOADING THIS PUBLICATION, YOU ACKNOWLEDGE THAT YOU HAVE READ, UNDERSTOOD AND AGREE TO BE BOUND BY THE
FOLLOWING TERMS AND CONDITIONS:
License: CMG hereby grants you a nonexclusive, nontransferable right to download this publication from the CMG Web site for personal use on a single
computer owned, leased or otherwise controlled by you. In the event that the computer becomes dysfunctional, such that you are unable to access the
publication, you may transfer the publication to another single computer, provided that it is removed from the computer from which it is transferred and its use
on the replacement computer otherwise complies with the terms of this Copyright Notice and License.
Copyright: No part of this publication or electronic file may be reproduced or transmitted in any form to anyone else, including transmittal by e-mail, by file
transfer protocol (FTP), or by being made part of a network-accessible system, without the prior written permission of CMG. You may not merge, adapt,
translate, modify, rent, lease, sell, sublicense, assign or otherwise transfer the publication, or remove any proprietary notice or label appearing on the
publication.
Disclaimer; Limitation of Liability: The ideas and concepts set forth in this publication are solely those of the respective authors, and not of CMG, and CMG
does not endorse, approve, guarantee or otherwise certify any such ideas or concepts in any application or usage. CMG assumes no responsibility or liability
in connection with the use or misuse of the publication or electronic file. CMG makes no warranty or representation that the electronic file will be free from
errors, viruses, worms or other elements or codes that manifest contaminating or destructive properties, and it expressly disclaims liability arising from such
errors, elements or codes.
General: CMG reserves the right to terminate this Agreement immediately upon discovery of violation of any of its terms.
Learn the basics and latest aspects of IT Service Management at CMG's Annual Conference - www.cmg.org/conference
Buy the Latest Conference Proceedings and Find Latest Computer Performance Management 'How To' for All Platforms at www.cmg.org
Prepared by: Jack Woolley, Kaiser Permanente
This member paper presents a time-proven methodology for validating an application’s availability, performance and
scalability. By the end of the paper you will know how to apply this methodology yourself.
Join over 14,000 peers - subscribe to free CMG publication, MeasureIT(tm), at www.cmg.org/subscribe
INTRODUCTION
Why Validate Application Quality?
Our management recognized that many of their CPU utilization, response time and availability issues were a result of a “conflict of objectives”. This conflict is between the application development groups and the system software/capacity groups.

At the root of this conflict is, on the one hand, the development group’s focus on “implementing on time and within budget”. (Obviously a proper goal for their role in data processing.) Frequently their employees’ performance reviews and/or bonuses rely significantly on these two criteria. Unfortunately, meeting this objective often means doing things the “easy way”, which is not necessarily the most efficient way in terms of CPU resource utilization and response times. Sometimes meeting the “on time and within budget” objective means cutting corners and not coding fault tolerance and recovery procedures into the application. The lack of fault tolerance and recovery procedures can cause many availability issues.

All too often the rush is to get the application implemented. The mindset of the application developers is often “implement now and fix it later”. Unfortunately, ‘later’ frequently never comes…

On the other hand, the system software and capacity groups maintain and execute applications on a daily basis. Because of this, their primary objectives are different from those of the application developers. Their focus is on application efficiency, application performance and application availability.

In an attempt to balance these disparate objectives, a Production Certification testing methodology was created. In essence, Production Certification testing is meant to be a final (and overall) application quality check before the application implementation is accepted for deployment into the production environment.

History
Our company didn’t begin with a Production Certification testing program. It evolved slowly over a period of more than nine years. The initial objective of the Production Certification testing program was to ensure that applications that used a “new” database (named DB/2) didn’t overutilize mainframe CPU resource and cause CICS outages. At that point, this testing was simply named “Stress Testing”.

Stress Testing used a mainframe product called TeleProcessing Network Simulator (TPNS). It was used to simulate CICS production transaction loads in order to verify that the new application performed well and consumed a reasonable amount of CPU resource.

We quickly found that this stress testing allowed us to pre-tune the CICS environments so that new applications could be implemented without causing outages on the first day of production implementation. With this in mind, all new applications (or major application modifications) were required to “pass” a stress test. At this time the pass/fail criterion was based solely on whether or not the application caused a CICS outage.

Later, when availability was under control, our management began to focus on application performance. Soon a “one size fits all” response time objective was created. This objective consisted of two parts: an “internal” CICS response time and an “external” response time. An internal response time objective of less than two seconds 95%-ile and an external response time objective of less than five seconds 95%-ile were instituted. The Stress Testing pass/fail criteria were expanded to include these two additional requirements.

Things went along pretty smoothly for several years and then Client Server (C/S) applications started
appearing. Obviously the mainframe product (TPNS) was not going to be suitable for this new application category. A frantic search ensued for a C/S stress test tool. We tried several and selected one very expensive tool that “recorded and then simulated” multiple-workstation C/S network traffic to stress the C/S application server. Unfortunately, we discovered that testing in this manner consistently gave us misleading and inaccurate results, and we abandoned the use of this tool.

About the same time, some “rules” for Production Certification were identified:

· Production Certification should take less than two to four weeks.
· The Production Certification team does not resolve the issues/problems that are discovered. They are turned over to the developers/vendors for resolution. However, the Production Certification team will make itself available to assist in resolving issues.
· Application functional testing is to be completed before Production Certification starts. All the significant issues need to be completely resolved before testing begins.
· Application user acceptance is complete before

Frequently we have to point out that Production Certification does not necessarily start with the date on some application development project plan. We find ourselves reminding people that Production Certification begins when these requirements are met, not on a calendar date.
Stable Environment
A stable environment is one that closely resembles the production implementation environment. For CICS based applications: a quality assurance environment. For GUI applications: network, workstations and servers in the expected production hardware and software implementation configurations. The term “stable” not only refers to the availability of the environment, but also mandates a

Stable Application
A stable application is one that closely resembles the expected production implementation. The application should have completed integration testing and have had all significant problems completely resolved. The application also should have user acceptance testing completed and all the resulting application modifications complete. The term “stable” not only refers to the availability of the application, but also mandates a “frozen” application.

Workflows/screenflows of the three to five application tasks that are most frequently performed by the clients. Please note that the “most frequent” functions are often not the “most critical” functions. We assume that the most critical functions have been very well tested in integration testing and are therefore not included in the Production Certification effort. Production Certification focuses only on the most frequently used functions.

Input Data
A list of valid input data to be used in the above workflows/screenflows is also needed. For example, if the most frequently used function is to search for a customer, a list of several customer numbers is required. (Keep in mind that these workflows/screenflows sometimes need to be executed thousands of times.)

Total System Volumetric
Again, in order to accurately simulate the future production environment, we need an indication of total system activity (including the three to five functions listed above). And once again, these estimates should reflect the activity after two years of full production implementation.

· Workflow Response time (the time to fulfill the client request) must not inhibit the client from proceeding with their workflow for more than five seconds, 95%-ile.
· The total number of network packets that traverse the network for any single client interaction must not exceed 375 packets.
· No “memory leak” is acceptable on the server.
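As a rough illustration, the three pass/fail criteria listed above could be checked against collected measurements along the following lines. This is only a sketch under stated assumptions: the function names, the input lists and the nearest-rank definition of the 95th percentile are ours for illustration, not something the methodology prescribes.

```python
RESPONSE_LIMIT_S = 5.0  # workflow response time limit, 95th percentile (from the criteria)
PACKET_LIMIT = 375      # maximum packets for any single client interaction (from the criteria)


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile (pct is an integer) by the nearest-rank method.

    Nearest-rank is an assumption; other percentile definitions exist.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 gives the 1-based rank.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def certify(response_times_s, packets_per_interaction, leak_detected):
    """Hypothetical helper: evaluate each criterion and return pass/fail flags."""
    p95 = percentile_nearest_rank(response_times_s, 95)
    return {
        "response_time": p95 <= RESPONSE_LIMIT_S,
        "network_packets": max(packets_per_interaction) <= PACKET_LIMIT,
        "memory_leak": not leak_detected,
    }
```

For example, `certify([1.0] * 19 + [9.0], [120, 3200], False)` passes the response time and memory leak checks but fails the network packet check, because one interaction exceeded 375 packets.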
Workstations For Client Server Production Certification

The Production Certification testing methodology exercises the complete application. As a result, we use tools that create client activity using the “real” client interface. For Client Server applications this means that several client workstations are required to perform the tests. By cutting down the client “think time” we have found that each workstation can normally generate the activity of 50 to 100 clients.

As stated previously in this paper, we’ve decided not to maintain a dedicated “testing lab”. The application development group normally provides the client workstations to perform the Production Certification testing.

Time On The Implementation Schedule

Normally, two weeks are adequate for 3270 (character based) applications which have completed functionality testing and user acceptance testing. Four weeks are normally adequate for GUI Client/Server applications. The testing itself requires much less time. Most of this scheduled time is meant for the development team/vendor to resolve the issues that are discovered as a result of the Production Certification effort.

This scheduled time period is “fixed”. For example, if the user acceptance testing is a week behind schedule, the full Production Certification timeframe is still required. If the project is behind schedule, the implementation date needs to be adjusted accordingly before Production Certification begins.

network analyzer monitors the network traffic. This testing is designed to identify application network behavior and measure the utilization of the network resources. Again, our pass/fail criterion is that fewer than 375 packets traverse the network for any client interaction.

There are two basic reasons for this test. We have found that the application network utilization normally has a significant impact on the application being able to meet the client response time expectations. The other reason for this testing is to make sure that no single application monopolizes the network resources in a way that would impact other applications.

Network Resource Example:

Using the Network Usage test, we discovered an application that caused 3,200 network packets to traverse the network as the client opened an empty “search” window. Opening this empty search window caused over 1.5 megabytes of data to traverse the network. In talking to the developers about this network traffic, they indicated that all the possible search criteria were being loaded on the local workstation “just in case” the client needed the selection. One of these pre-loaded selection criteria had a drop-down list that was over 500 entries long.

We discovered this issue and had the developers alter the “search” window logic to populate the search criteria only when the client selects the particular criterion. Although this logic is slightly more complex, it significantly decreased the number of network packets. After this was done the application met our pass criteria for the Network Usage test.

Functional Contention Test
sequence of the logic. (It lost track of the client’s selections and the flow of the screens was not consistent.) The developers were notified and their investigation discovered that several Java “global” variables should have been defined as “local” variables. This was fixed and the application passed the Functional Contention test.

In another example, we were testing a 3270 character based CICS/DB2 application. The Functional

When we were performing a Longevity test on a two-tier Client Server application we noticed a significant memory leak in an “automatic update” feature. We found that the application (without keyboard or mouse activity) “leaked” over 300 K of memory every two minutes. We notified the developers and at first they disputed our findings. Later they attempted to resolve the issue. Unfortunately, they were unable to resolve the issue and the application implementation was permanently canceled.
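A leak like the one in the Longevity example above (over 300 K every two minutes) shows up as a steady upward slope when a measurement is plotted against elapsed time. As a minimal sketch, a least-squares slope over periodic samples is one way to quantify such a trend. The function names and the threshold parameter are hypothetical, and least squares is our choice of statistic for illustration; the paper does not state which analysis was used.

```python
def least_squares_slope(minutes, values):
    """Return the least-squares slope of values against elapsed minutes."""
    n = len(minutes)
    if n < 2 or n != len(values):
        raise ValueError("need two or more samples, with matching lengths")
    mean_x = sum(minutes) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in minutes)
    if sxx == 0:
        raise ValueError("all samples were taken at the same time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(minutes, values))
    return sxy / sxx


def shows_upward_trend(minutes, values, threshold_per_minute):
    """Hypothetical check: flag growth faster than an analyst-chosen threshold."""
    return least_squares_slope(minutes, values) > threshold_per_minute
```

For memory samples of 1000, 1300, 1600 and 1900 K taken at minutes 0, 2, 4 and 6, the slope is 150 K per minute. The same function can be applied to response times collected at a constant load, where a persistent positive slope over the length of the test is the warning sign.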
[Figure: chart plotted against Functions Per Minute (x-axis, 3 to 30); plot not recoverable]

The resulting response times are also carefully collected and statistically analyzed for indications of server “memory leaks”. This is indicated by the same functional activity (at the same application load) slowing over the time of the Longevity test. Ultimately, when a large amount of server memory has “leaked”, the server stops processing completely. The following graph shows a gentle upward trend.
[Figure: graph with a vertical scale of 0 to 10; plot not recoverable]

large (and waste money on a machine which will be obsolete within two years).

We can help by executing the Production Simulation test for an extended time and have our capacity group measure the server CPU consumption and memory
Failure Under Load Example:
CONCLUSION