CMG1991: Performance Engineer's View of System Development
Performance Professionals
The Computer Measurement Group, commonly called CMG, is a not for profit, worldwide organization of data processing professionals committed to the
measurement and management of computer systems. CMG members are primarily concerned with performance evaluation of existing systems to maximize
performance (eg. response time, throughput, etc.) and with capacity management where planned enhancements to existing systems or the design of new
systems are evaluated to find the necessary resources required to provide adequate performance at a reasonable cost.
This paper was originally published in the Proceedings of the Computer Measurement Group’s 1991 International Conference.
Copyright 1991 by The Computer Measurement Group, Inc. All Rights Reserved. Published by The Computer Measurement Group, Inc. (CMG), a non-profit
Illinois membership corporation. Permission to reprint in whole or in any part may be granted for educational and scientific purposes upon written application to
the Editor, CMG Headquarters, 151 Fries Mill Road, Suite 104, Turnersville, NJ 08012.
DOCUMENT TYPE: 1991 PROCEEDINGS
Abstract
This paper describes how to apply software performance engineering techniques in performance trials for
new applications. It describes assignments in which the likely impact on resource usage of new
applications developed using advanced software engineering tools and techniques was determined.
Approaches involved were structured or object-oriented methods, IPSE and CASE tools, 4GL and RDBMS,
and automatic code generation. The results demonstrate the value of assessing performance risks and
establishing objective performance measures early in the development life-cycle.
2. OUTLINE OF PAPER
[...] development can not only lead to "paralysis by analysis" but also to over-complicated systems, defined to meet a perception of what DP technology can now provide rather than what is needed. Adoption of GUIs etc. leads to more "user-friendly" applications, but may incur such an increase in the computing resources required that the business case is invalidated.

Traditionally, it has been difficult to derive the application volumetrics from the system designers. The result has been the launching of new applications on existing configurations with minimal preliminary sizing other than that available from pilot testing. The business traffic, once it is applied, can lead to major impacts on other systems as well as a poor service to the new users themselves. This is compounded if the system is initially a successful one, where users try out the new facilities, create excessive traffic, become dissatisfied with the responses and then reject the system. The best known UK examples of this lie in the Stock Exchange and a major retailing credit system.

Performance engineering can be applied at all stages in the development life-cycle, as shown in figure 3, where the Performance Engineering aspects are shown (using "tx" as an abbreviation for a transaction).

[Figure 3 (artwork not reproduced): the life-cycle stages ANALYSIS, CODE & TEST, IMPLEMENT and MAINTAIN, each annotated with a Performance Engineering activity such as "Resource demand", "Check key tx" and "Track & forecast".]

Fig 3 Performance Engineering Stages

2.4 Project Planning

The early planning stages of a major new application are dominated by the definition of the User Requirement, the logical definition of the system and the various stages of procurement. Most projects adopt some form of scoring against given criteria, with "mandatory" and "desirable" requirements and a final selection based on price amongst those solutions which meet the criteria, score well and are politically acceptable in that they meet the strategic plan of the organisation. The complexity of the exercise is affected by the nature of the constraints imposed. A major new application added to an existing environment clearly has fewer options than the "green field" situation where the project is part of a major procurement exercise. In most cases, the elimination of any large number of options is done by various factors, leaving typically three options for evaluation.

The difficulty lies in gaining objective measures for the allocation of scores. The majority of assessments of functional benefits and supplier's credibility are subjective, despite endeavours to use objective measures such as "number of installed sites with this version of the software".
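To make the scoring scheme described above concrete, the following is a minimal sketch of a weighted evaluation with mandatory and desirable criteria. The criteria names, weights and scores are invented for illustration; the paper does not prescribe a particular scheme.

```python
# Hypothetical weighted scoring of supplier options: options failing any
# "mandatory" criterion are eliminated; the rest are ranked by the
# weighted sum of their scores on the remaining criteria.

CRITERIA = {
    # name: (weight, mandatory?)
    "meets_user_requirement": (10, True),
    "enquiry_response_time":  (8,  False),
    "installed_site_count":   (5,  False),
    "fits_strategic_plan":    (7,  False),
}

def rank_options(option_scores):
    ranked = []
    for option, scores in option_scores.items():
        if any(mandatory and scores.get(name, 0) == 0
               for name, (_, mandatory) in CRITERIA.items()):
            continue  # fails a mandatory requirement
        total = sum(weight * scores.get(name, 0)
                    for name, (weight, _) in CRITERIA.items())
        ranked.append((total, option))
    return sorted(ranked, reverse=True)

print(rank_options({
    "Supplier A": {"meets_user_requirement": 1, "enquiry_response_time": 4,
                   "installed_site_count": 3, "fits_strategic_plan": 5},
    "Supplier B": {"meets_user_requirement": 0},  # eliminated
}))
```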
[...] specification of the system, and particularly its metrics, tend to be highly suspect. The result is that by the time the system is live, the supplier can sensibly argue that the specification was wrong in a number of critical areas and so the commitment is invalid.

[...] facto standard for measuring transaction-based real-time commercial systems) can be used for raw comparisons, but need validation in the light of the particular workload concerned.

After a number of technical reviews of alternative suppliers' configurations in major procurement and upgrade exercises, a procedure has been developed for effective measurement trials with the leading suppliers. To create such a specific definition of a workload and undertake a performance trial is a non-trivial undertaking. This clearly represents a significant workload on the user and the supplier and is only practical for major projects. However, the effort involved can be minimised by the use of modelling techniques on a sub-set of the application, which are then extrapolated to yield predictions for the total system. Further, it yields major benefits in evaluating suppliers' proposals for such projects, as it imposes a formal sizing approach and contractual commitments to application sizing and capacity planning. It also has the added benefits of trapping the logical definition at an early stage and testing a possible physical solution based on a simple sub-set of the total application. This clarifies the ideas of the designers and introduces a realistic approach to actual system development.

The resource demands of a new application are defined. The information on the files, records, transactions and dialogues is maintained in a database of related dictionaries for data, processes and layouts. This is then used to define the transaction characteristics and aggregate them into workload components for modelling.

The manufacturer's sizing of the proposed equipment is used to build a baseline model. The supplier also uses the same methodology to size an application sub-set, which is then used in measurement trials and the model calibrated. Performance at reference sites is also used to test the model further. With confidence in the calibration thus assured, the model is used to assess the likely performance of any proposed alternatives or potential upgrades.

3. PERFORMANCE TRIALS

The new application has to be characterised and the major transactions identified. Any database has to be simplified enough for quick implementation by automatic generation, and algorithms defined to populate a pseudo-database. Workloads have to be set up which can be used to increase the machine utilisation from being empty to being saturated. This workload in turn has to reflect the likely population of terminals, communications traffic and batch users.

The criteria for measuring the results of a performance trial have to reflect the measures which are most important to the application. Thus if the enquiry response time is vital, but an update less so, then that is reflected not only by the different service response times defined, but also by the weighting applied to that score.

3.1 Specification method

The facilities of a workstation are exploited to maintain the specification of a system as it develops from the initial logical requirement through database definition and normalisation to the physical design. It relies on structured procedures and forms for data collection, and that information is held in four dictionaries for Data, Processes, Applications and Management. Up to six levels of detail can be held and modified as the design evolves; however, in a simple trial only two or three are necessary.

[Figure 4 (artwork not reproduced): the specification data model relating the four dictionaries.]

Fig 4 Specification Data Model
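As a hypothetical illustration of the four-dictionary structure (the paper's workstation tool is not described at code level), the entries and their trial volumetrics might be represented as follows; all field names here are assumptions.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the four-dictionary specification store.

@dataclass
class Entity:                  # Data Dictionary: entity with logical
    name: str                  # attributes and relationships
    attributes: list
    relationships: list = field(default_factory=list)

@dataclass
class Process:                 # Process Dictionary: function definition
    name: str                  # e.g. "Record Displayer"
    description: str

@dataclass
class BusinessTransaction:     # Application Dictionary: user view with
    name: str                  # message-pair volumetrics
    message_pairs: int
    peak_rate_per_hour: float

specification = {
    "data": [Entity("Customer", ["id", "name", "credit_limit"])],
    "processes": [Process("Record Displayer", "direct by unique key")],
    "application": [BusinessTransaction("Order enquiry", 2, 1200.0)],
    "management": {"control": "project plan and schedule"},
}
print(specification["application"][0])
```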
The Data Dictionary is used to define the entities and elements involved. It comprises both the data model and an element dictionary, and defines both entities, with their logical attributes and relationships, and also elements, with their physical definitions.

The Process Dictionary is used to define the functions and processes involved. Typical major functions are:-

• Record Displayer: direct by unique key or via list of non-unique keys
• Record Selector: matching combinations of field values for extracts
• Record Updater: update data fields subject to a particular check
• Record Editor: to allow the user to change values from a display
• File Transferrer: to another system, to demonstrate communications
• Script/timer job: tuned to reflect the workload and take significant time
• Scenario job: run in multiple versions for saturation tests
• Database Generator: generation of test data to establish the pseudo-database
• Main Menu: list of all the available functions

The User view of the system is defined in terms of business transactions and message pair volumetrics in the Application Dictionary. The traffic and functions performed are chosen to represent the actual workload with the minimum implementation effort. To this end, the actual paths to be coded are minimised to a few fixed message pairs for each transaction.

The Management Dictionary caters for information as below, which is of less import in a trial situation:

• Control: Project Plan and Schedule
• Validation: QA, Test Plan and Test Data
• Issue: creation, control and release
• Amendments: errors, Mods and wish-lists.

3.2 Specification

The specification is a structured definition of a database (with a program to populate it with dispersed records of dummy data) and of programs to build up a workload of scripted interactive jobs, with a master scenario program to alter the workload and a scripted timer program to provide comparative measures. The aim of the trial is not merely to measure performance, but also to identify the resources demanded to drive that level of service.

However, the overall view of the performance has also to include some measure of the development effort involved at all stages of the development life-cycle. Thus the logical design, physical specification, code generation and testing have all to be taken into account to some degree by staff recording activities on time-sheets throughout the project.

The specification is designed to permit its implementation in reasonable time-scales (days rather than weeks).

The measurements taken are at two levels. Firstly, the job level information reflecting the throughput achieved during the performance trial has to be logged. This is available from metrics incorporated within the trial programs themselves. Secondly, the system level information on the resource usage in terms of CPU utilisation and disk usage has to be recorded. Ideally this is related to the job level data to provide resource demands by workload. This in turn enables the building of a simple model of the trial and thus using it to predict the impact of variations.

Further, as much as possible is done to relate the results of this pseudo-workload to reality, based on monitoring results of the eventual production system as it evolves. The evaluation should be seen as setting the basis for the development of a capacity planning and monitoring approach for the application.
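Relating job-level throughput to system-level utilisation, as described above, is essentially the utilisation law (U = X * D); a minimal sketch with invented figures shows how resource demands by workload can be derived.

```python
# Derive per-transaction resource demands from job-level throughput and
# system-level utilisation via the utilisation law: D = U / X.
# All figures are invented for illustration.

measurements = {
    # workload: (throughput tx/sec, CPU utilisation, disk utilisation)
    "online_enquiry": (12.0, 0.30, 0.18),
    "online_update":  (3.0,  0.15, 0.12),
}

for workload, (tx_rate, cpu_util, disk_util) in measurements.items():
    cpu_demand = cpu_util / tx_rate     # CPU-seconds per transaction
    disk_demand = disk_util / tx_rate   # disk-busy seconds per transaction
    print(f"{workload}: {cpu_demand * 1000:.1f} ms CPU/tx, "
          f"{disk_demand * 1000:.1f} ms disk/tx")
```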
At system level, the monitor provides totals over periods of typically 5 minutes. For each job & the operating system itself, the monitor provides:-

• Elapsed time & CPU secs used
• No. of disk transfers by device
• Memory used under operating system parameters
• Virtual store interrupts / page faults
• No. of Virtual Memory pages transferred

Report prints are taken, including:-

• Throughput and response times for every task
• Resource demands per transaction

3.3 Performance Trial

The first step is to create a pseudo-database, based on the chosen design, of typically one million records by running a simple 'generator' program, which is specified deliberately to incorporate bad loading of indexes, and extremes of variability and occurrence in accordance with the new application's expected data.

The second step is that of running a series of programs reflecting typical core tasks. The programs run in a pre-defined manner in a 'script', with an increasing load being placed on the configuration via a 'scenario' command file. It is not necessary to code all paths through these transactions but only those in the script.

The third step is to produce the results. A systems resource utilisation report is run first on an empty machine to establish the operating system and monitoring overhead. A number of timings of the system are then taken, both for the batch and interactive tasks. The transactions are timed by the job itself, to avoid fast and difficult stopwatch timings, and logged to a spool file for later analysis.

The programs are shown in figure 5, where copies of the On-line and Batch enquiry and update functions are driven by the Scenario controller.

[Figure 5 (artwork not reproduced): a Scenario Controller generates the workload, driving On-line and Batch enquiry and update jobs, while a Script Controller times a predefined path.]

Fig 5 Trial Process Structure

The first timing of a 'script' is with the machine empty other than for the script itself. The timings are then repeated with an increasing load of competing tasks from 'scenario' (a minimal sketch of this driver follows the sizing phases below). The number of tasks required to cause saturation will vary according to the machine and the software implemented.

Sizing is carried out to cross-check the findings of the previous two phases. The work entails four phases:

• Setting the logical baseline from the design information
• Modelling the performance demonstration
• Calibrating against the suppliers' sizing of the demonstration
• Predicting the performance of the implemented production system.
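The script/scenario ramp referred to above is not given at code level in the paper; the following is a minimal driver sketch of that description, in which time_script() and start_scenario_tasks() are hypothetical placeholders for the real scripted timer job and 'scenario' command file.

```python
# Minimal sketch of the script/scenario ramp: time the scripted path on
# an otherwise empty machine, then re-time it under increasing background
# load until responses degrade past the saturation threshold.

SATURATION_FACTOR = 5  # response degradation treated as saturation

def run_ramp(time_script, start_scenario_tasks, max_tasks=100, step=10):
    """Return {background tasks: script response time} for one trial."""
    results = {0: time_script()}          # empty-machine baseline
    baseline = results[0]
    for n_tasks in range(step, max_tasks + 1, step):
        start_scenario_tasks(n_tasks)     # competing workload mix
        results[n_tasks] = time_script()
        if results[n_tasks] > SATURATION_FACTOR * baseline:
            break                         # well into the "elbow"
    return results
```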
3.4 Measurements

A number of timings of the performance of the system have to be taken for the interactive tasks. All interactive timings are performed at least 20 times, to be statistically significant, using the Operating System clock as a stop-watch and measuring from SEND to the start of the screen being filled in response. This is done with the machine empty except for the one user being timed. This is then repeated with other jobs initiated via SCENARIO with different numbers of tasks, based on a prescribed ratio of jobs in the mix.

An arbitrary measure of saturation is based on a predefined degradation in response time compared with that on an empty machine, found in practice on most configurations to be a factor of 5. This tends to show up on response time curves as being well into the "elbow" of degradation, although the actual measures of utilisation of CPU, disk controllers and the disks themselves clearly vary from machine to machine and workload to workload.

The number of jobs causing saturation is found, with the saturation % on each device (CPU / disk) depending on the system concerned. The job mix in SCENARIO is typically a lot of interactive enquiries, some interactive updates, a data collection job, batch enquiry and batch update, with a monitor in the background (taking 30 second snapshots) and the SCRIPT running (for approx 20 minutes with 20 loops of say 60 seconds).

The number of jobs causing saturation varies with each configuration, but is typically 70% CPU loading and 30% disk I/O. Then, as well as timing the SCRIPT job with 0 & 1 background jobs, it is done with a range of increasing workloads to yield some points on the initially linear section of the response-time curve and some more points "on the elbow".

Note that throughout the demonstration, tasks are started with variable pauses of say 10 +/- 5 seconds, with the use of a pseudo-random number generator to control the actual pause. Equally, any initiation of multiple tasks ensures that there is no undue use of cache or other optimisation by virtue of addressing the same areas of disk. This is achieved by use of random accessing, eg of the keys presented for a search. There is a tendency for systems to synchronise on pseudo-workloads and this must be avoided.

The throughput on every job is monitored. The control of the start and end of each monitored and timed run of the SCRIPT ensures that the workload is stable and does not include the impact on the system of loading programs. Equally, the pattern of disk accesses must not be constrained so as to achieve unduly high disk cache hit rates. Similarly, the priorities of background tasks must be maintained at the same level as that of the SCRIPT.

The throughput and resource utilisation for each job in the workload is logged on the monitor file report. This includes the disk accesses (both logical and physical), and CPU and memory activity statistics at system and job level (including accumulations by job type if possible, ie. aggregate statistics for all the tasks running in a particular workload mix).

The performance measures for each job are added to a report file and include the number of loops, the response time average and standard deviation for SCRIPT and each job spawned by SCENARIO, and the number of logical/physical transfers in each batch and communications job.

3.5 Analysis of results of trial

The results of each trial can be summarised with curves such as those shown in figure 6. Different configurations saturate (either in measured reality, or in prediction) at different workloads. Note that the response time is typically for a script containing a number of message pairs, and the workload represents the number of parallel active jobs, reflecting traffic which may correspond to an individual user or, more typically, a group of workers.

[Figure 6 (artwork not reproduced): response time in seconds (vertical axis, roughly 2 to 5) against the number of jobs in the workload (horizontal axis, 20 to 100), with one curve per configuration.]

Fig 6 Trial results for different configurations
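As a worked illustration of summarising such curves, the sketch below takes invented timings (at least 20 per point, as above), reports the mean and standard deviation at each workload, and applies the factor-of-5 saturation criterion; the numbers are not from any real trial.

```python
import statistics

# Invented trial output: jobs in workload -> script response times (secs).
trial = {
    0:  [1.0, 1.1, 0.9] * 7,   # 21 timings per point, per the 20+ rule
    20: [1.2, 1.3, 1.1] * 7,
    40: [1.8, 2.0, 1.7] * 7,
    60: [3.5, 4.2, 3.8] * 7,
    80: [5.5, 6.0, 5.8] * 7,
}

baseline = statistics.mean(trial[0])
for jobs in sorted(trial):
    mean = statistics.mean(trial[jobs])
    sd = statistics.stdev(trial[jobs])
    print(f"{jobs:3d} jobs: mean {mean:.2f}s, sd {sd:.2f}s")
    if mean > 5 * baseline:               # factor-of-5 saturation criterion
        print(f"Saturation reached at {jobs} jobs in workload")
        break
```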
Each performance trial at each supplier site tends to have its own saga of minor problems and variations from the optimum specification and implementation. Physical constraints, such as the difficulty of having exactly the same version of software and hardware configuration, have to be catered for, as well as restrictions on the number of terminals (or pseudo-terminals) that can be addressed in the trial. It is important at this stage to use analytical modelling techniques to build a model of each trial so that the results can be extended and normalised across trials (a minimal modelling sketch follows the list below). A modelling package which can run on a number of operating systems is essential to this approach.

[...] workload is different. However, some results of general interest do show that any of the following can incur extra resource demands:

• Structured methods can lead to data models that are normalised to such an extent that I/O activity becomes excessive due to long navigation routes

• Formal analysis can lead to over-rationalised process definitions, so that excessive message pairs are necessary to perform a transaction
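The paper does not name its modelling package or technique; as one hedged illustration of the kind of analytical model that lets trial results be extended, a single-server open queue (M/M/1) can extrapolate a measured per-transaction service demand to response times at higher arrival rates.

```python
# Illustrative only: a simple M/M/1 approximation, not the paper's
# (unnamed) modelling package. R = D / (1 - U), with utilisation
# U = arrival_rate * service_demand.

def mm1_response_time(service_demand, arrival_rate):
    utilisation = arrival_rate * service_demand
    if utilisation >= 1.0:
        raise ValueError("saturated: utilisation >= 100%")
    return service_demand / (1.0 - utilisation)

# Example: 50 ms of CPU demand per transaction, measured at low load in
# a trial, extrapolated across increasing transaction rates.
for rate in (2, 6, 10, 14, 18):
    print(f"{rate:2d} tx/s -> {mm1_response_time(0.05, rate) * 1000:.0f} ms")
```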
5. BENEFITS OF THE PERFORMANCE TRIAL

The problem lies in discovering, for each application, the degree to which the above factors apply, and the cost impact. This can only be done at each site, for its own workload and against its own criteria for significance. The results can be dramatic, with factors of times 4 to times 40 in terms of the resources required (and hence cost). With such financial leverage, it is a sensible precaution to undertake the sort of performance trial described.