CPE Module IV
Hardware Performance
When judging the suitability of a system for an application, or choosing the best system at the least cost, system performance is the key consideration. As we have consistently mentioned, system performance is a complex issue that involves the capabilities of the hardware, the operating system and the application software. Simple hardware performance measures are based on the performance parameters of its major components, e.g., the main memory, the CPU, the storage subsystem, etc. Two parameters that readily come to mind are main-memory bandwidth and CPU instruction execution speed. Main-memory bandwidth is the rate at which information can be transferred to and from the main memory.
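As a rough illustration, the short Python sketch below estimates memory bandwidth by timing one large in-memory copy. The buffer size is an assumed example value, and an interpreted language can only approximate what a dedicated benchmark would measure.

import time

# Rough main-memory bandwidth estimate: time one large in-memory copy.
# The 256 MiB size is an illustrative assumption; caches, the allocator
# and the interpreter all interfere, so treat the result as a rough figure.
N = 256 * 1024 * 1024          # 256 MiB buffer
src = bytearray(N)

start = time.perf_counter()
dst = bytes(src)               # one full read + write pass over the buffer
elapsed = time.perf_counter() - start

# Each byte is read once and written once, so about 2*N bytes move through memory.
print(f"Approximate bandwidth: {2 * N / elapsed / 1e9:.2f} GB/s")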
Hardware performance refers to how well the components of a system work together to execute tasks, such as processing data, rendering graphics, or transferring files: the quantifiable speed and efficiency of components such as processors, memory, and cache, which determine the overall performance of the device. Verifying hardware performance means testing and measuring the speed, efficiency, reliability, and compatibility of your hardware, and performance benchmarks are commonly used to measure and compare the performance of hardware components.
One of the easiest ways to assess hardware performance is to use benchmarking tools. These are software applications that run various tests on hardware components, such as the CPU, GPU, RAM, disk, and network, and compare the results with standard or average scores. Hardware components may be grouped into processor, memory, and disk, and an average performance index obtained for each group. When the results are displayed as a bar chart, the longest bar indicates the best performance, as in the sketch below. This type of test is carried out with the actual operating system and application software environment. Let us, however, consider those factors that contribute to system performance.
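A minimal Python sketch of this idea, using made-up component scores, might look like this:

# Illustrative sketch: combine per-component scores into a group index and
# display each group's index as a horizontal text bar (longest bar = best).
# The component names and score values below are assumed examples.
scores = {
    "processor": [112, 98, 105],   # e.g. integer, float, branch benchmark scores
    "memory":    [88, 92],         # e.g. bandwidth and latency scores
    "disk":      [70, 75, 68],     # e.g. sequential, random, mixed I/O scores
}

for group, values in scores.items():
    index = sum(values) / len(values)          # average performance index
    print(f"{group:<9} {'#' * round(index / 5)} {index:.1f}")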
Storage Performance
The performance of a storage subsystem is affected by two major components: the storage drive medium and the adapter, that is, the interface between the storage and the CPU bus. The CD-ROM, hard disk and floppy drives are a few examples of storage media.
The hard disk is considered here for performance analysis because of its capacity, its read-write ability and its use in both network and stand-alone environments. Data on a disk is arranged in a series of concentric circles, or tracks. To read data, a read head first has to move to the appropriate track on the disk. The time this takes is a significant part of the overall read and write time of the hard disk. Therefore, most attempts at improving disk performance are aimed at minimizing the amount of head movement involved.
Some of the parameters of the drive and drive adapters that can affect drive performance are:
1. Data transfer rate
2. Sector buffer and size
3. File caching capabilities
4. Data encoding method
5. Access time
6. Average seek time
7. Track-to-track seek time
8. Capability, etc.
Access Time
This is a measure of the time required for a drive to find data on the media, starting when a seek
instruction is issued by the controller and ending when the beginning of the requested data is
under the head.
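A worked example in Python, assuming a 7200 RPM drive with a 9 ms average seek time (both illustrative values), shows how access time combines seek time and rotational latency:

# Worked example with assumed drive parameters: average access time is
# roughly the average seek time plus the average rotational latency,
# which is half a revolution at the drive's spindle speed.
rpm = 7200                    # assumed spindle speed
avg_seek_ms = 9.0             # assumed average seek time in milliseconds

rotation_ms = 60_000 / rpm            # one full revolution: about 8.33 ms at 7200 RPM
avg_latency_ms = rotation_ms / 2      # on average, half a revolution: about 4.17 ms

print(f"Average access time: approx {avg_seek_ms + avg_latency_ms:.2f} ms")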
Capability
The capacity of a system, or the amount of work a program or software can effectively handle, is another important consideration in a storage system. No one is happy when a system or program becomes very slow while handling a task. This metric depends on many factors, ranging from the program's capability and the system memory capacity to the CPU capability and even the system configuration. How fast a storage medium is, is not sufficient; how much it can store is of vital importance.
Memory Hierarchy
1. Internal Memory (Registers)
A register is a small amount of storage available as part of a CPU. Almost all computers load data from a larger memory into registers, where it is used for arithmetic, manipulated, or tested by machine instructions. Processor registers are normally at the top of the memory hierarchy and provide the fastest way to access data. They are built with the same technology as the main processor and on the same chip, and they are the smallest, fastest and most expensive storage. Register allocation is performed either by a compiler, in the code generation phase, or manually by an assembly language programmer.
Categories of registers
Registers are normally measured by the number of bits they can hold, for example, an "8-bit register" or a "32-bit register". Registers are classified according to their content or the instructions that operate on them. We have:
User-accessible Registers: These are divided into data registers and address registers. Data registers can hold numeric values such as integer and floating-point values, as well as characters, small bit arrays and other data. In some older and low-end CPUs, a special data register, known as the accumulator, is used implicitly for many operations. Address registers, on the other hand, hold addresses and are used by instructions that indirectly access primary memory.
Conditional Registers: They hold truth values often used to determine whether
some instruction should or should not be executed.
General Purpose Registers (GPRs): They can store both data and addresses, i.e., they are combined data/address registers.
Constant Registers: They hold read-only values such as zero, one, or pi.
Special Purpose Registers (SPRs): They hold program state; they usually include the program counter and status register. The instruction pointer register holds the address of the instruction currently being executed, etc.
2. Cache Memory
Cache memory is an area of memory that holds frequently accessed data or program instructions for the purpose of speeding up a computer system's performance. It is a small, fast memory placed between a processor and the main memory. The data stored within a cache might be values that have been computed earlier or duplicates of original values that are stored elsewhere. If requested data is contained in the cache (a cache hit), the request can be served by simply reading the cache, which is comparatively fast. Otherwise (a cache miss), the data has to be recomputed or fetched from its original storage location, which is comparatively slow. Hence, the greater the number of requests that can be served from the cache, the faster the overall system performance becomes. The goal of including a cache is to improve the average memory access time.
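The effect of the hit rate can be made concrete with the standard average-memory-access-time (AMAT) formula; the timings in this Python sketch are assumed example values:

# Average memory access time (AMAT) shows why a high hit rate matters.
# AMAT = hit time + miss rate * miss penalty.
# The timings below are illustrative assumptions, not measurements.
hit_time_ns = 2        # time to serve a request from the cache
miss_penalty_ns = 100  # extra time to fetch from main memory on a miss

for hit_rate in (0.80, 0.95, 0.99):
    amat = hit_time_ns + (1 - hit_rate) * miss_penalty_ns
    print(f"hit rate {hit_rate:.0%}: AMAT = {amat:.1f} ns")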
3. Main Memory
The main memory, constructed of dynamic RAM chips, is large and reasonably fast. It is used for program and data storage during program execution, and its locations can be accessed by the CPU instruction set. The secondary memory, which is relatively the largest and slowest level of the hierarchy, consists of devices such as high-capacity magnetic disks, read/write optical disks, etc. It is clear that the farther away a memory is from the CPU, the higher its capacity, the less expensive it is per stored bit, and the slower its access time.
CPU Performance
Improvements in CPU performance may be classified into two groups:
(i) Technological improvement (increasing clock rates)
(ii) Degree of parallelism in internal operations: sequential instruction issue and sequential execution; sequential instruction issue and parallel execution (pipelining); and parallel instruction issue and parallel execution.
The maximum bandwidth of a microprocessor is a good indication of its performance. Each new generation of microprocessors also introduces new instructions and data types, providing greater software capability and performance. Deterministic microprocessor parameters/metrics often listed by manufacturers are, among others:
1) Instruction set size
2) Average clock cycles to execute an instruction
3) Register-register Add instruction time
4) Primary memory access time, cycle time and size
5) Cache access time, cycle time and size
6) Secondary memory access time, transfer rate and size.
The execution speed of a microprocessor can be measured using either the kernel program method or the instruction mix method. The kernel method involves writing a small piece of code, like the Livermore loops, that is judged to represent typical CPU activity. The time to execute the kernel program can be determined, and a comparative evaluation against other machines and other features can also be made.
Instruction Mix
It is no over-emphasis to say that many performance measures are based on CPU behaviour, i.e., the time to execute an instruction. The set of instruction types in the test workload is called an instruction mix (see the sketch after this list). Such an instruction mix can be made up of:
Fixed-point Arithmetic
Floating-Point Arithmetic
Indexing
Branching
Shifting
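The following Python sketch computes an average CPI (cycles per instruction) from an instruction mix; the mix fractions and per-class cycle counts are assumed example values, not measurements of any real machine:

# Sketch: average CPI as the frequency-weighted sum over instruction classes.
# Mix fractions and cycle counts are assumed example values.
mix = {
    "fixed-point arithmetic":    (0.40, 1),   # (fraction of mix, cycles)
    "floating-point arithmetic": (0.15, 4),
    "indexing":                  (0.20, 2),
    "branching":                 (0.15, 2),
    "shifting":                  (0.10, 1),
}

avg_cpi = sum(frac * cycles for frac, cycles in mix.values())
print(f"Average CPI for this mix: {avg_cpi:.2f}")

# With an assumed 2 GHz clock, average instruction time = CPI / clock rate.
print(f"Average instruction time: {avg_cpi / 2e9 * 1e9:.2f} ns")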
Pipeline Effect
Pipelining involves breaking a process down into small steps and then executing the steps in parallel on different data, as in a production line. A pipeline processor contains a sequence of processing elements (PEs) through which a data stream passes. Each element in the sequence performs its own assigned step on the data as it passes through.
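The classic k-stage pipeline model makes the benefit concrete: n data items take k + n - 1 stage times instead of n * k, so speedup approaches k for large n. A short Python illustration:

# Classic k-stage pipeline model: n items finish after (k + n - 1) stage
# times instead of n * k, so speedup tends toward k as n grows.
k = 5                       # assumed number of pipeline stages
for n in (1, 10, 100, 1000):
    speedup = (n * k) / (k + n - 1)
    print(f"n = {n:>4}: speedup = {speedup:.2f} (limit is k = {k})")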
Scheduler
Process scheduling for CPU service may be done statically or dynamically. Static scheduling is done during program design, while dynamic scheduling is done during program execution. A cyclostatic schedule is a typical static schedule in which the scheduler is called periodically by a timer to execute a set of tasks in a pre-arranged order. A dynamic schedule may be either preemptive or non-preemptive. Preemptive scheduling is based on the idea that one process can take over the CPU before another process finishes its computation. This is achieved by a timing process which periodically returns control to the kernel and hence allows the next process of highest priority to run. A non-preemptive process simply gives up control to the next process: once the CPU has been allocated to a process, the process keeps the CPU until it releases it, either by terminating or by switching to the waiting state. This is done by system calls placed at irregular points in the program.
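The Python sketch below contrasts the two dynamic approaches with assumed burst times: non-preemptive first-come-first-served (FCFS) versus a preemptive round-robin schedule with an assumed 4 ms quantum.

# Sketch contrasting non-preemptive FCFS with preemptive round-robin.
# Burst times (in ms) are assumed example values; all processes arrive at once.
bursts = [24, 3, 3]

# Non-preemptive FCFS: each process keeps the CPU until it finishes.
wait, fcfs_waits = 0, []
for b in bursts:
    fcfs_waits.append(wait)
    wait += b

# Preemptive round-robin: a timer returns control after each quantum.
quantum = 4
remaining = bursts[:]
clock, finish = 0, [0] * len(bursts)
while any(r > 0 for r in remaining):
    for i, r in enumerate(remaining):
        if r > 0:
            run = min(quantum, r)
            clock += run
            remaining[i] -= run
            if remaining[i] == 0:
                finish[i] = clock
rr_waits = [finish[i] - bursts[i] for i in range(len(bursts))]

print("FCFS waiting times (ms):       ", fcfs_waits)   # [0, 24, 27]
print("Round-robin waiting times (ms):", rr_waits)     # [6, 4, 7]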
Performance testing
Performance testing evaluates the speed, responsiveness and stability of a computer, network, software program or device under a workload. Organizations run performance tests to identify performance-related bottlenecks.
The goal of performance testing is to identify and eliminate the performance bottlenecks in software applications, helping to ensure software quality. Without some form of performance testing in place, a system may ship with slow response times and an inconsistent experience for its users.
Performance testing helps determine whether a developed system meets speed, responsiveness and stability requirements under workload, helping to ensure a more positive user experience (UX).
Performance tests may be written by developers and can also be a part of code review processes.
Performance test case scenarios can be transported between environments, for example between development teams testing in a live environment and environments that operations teams monitor. Performance testing can involve quantitative tests done in a lab or in production environments.
As an example, an organization can measure the response time of a program when a user requests an action, and the same can be done at scale. If response times are slow, developers should test further to find the location of the bottleneck.
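A minimal sketch of such a measurement in Python, assuming a hypothetical local endpoint, could be:

import time
import urllib.request

# Minimal sketch of measuring response time for a user-requested action.
# The URL is a placeholder assumption; point it at the system under test.
URL = "http://localhost:8080/checkout"   # hypothetical endpoint

def timed_request(url: str) -> float:
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=10) as resp:
        resp.read()
    return time.perf_counter() - start

samples = [timed_request(URL) for _ in range(20)]
print(f"mean = {sum(samples) / len(samples) * 1000:.1f} ms, "
      f"worst = {max(samples) * 1000:.1f} ms")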
There are several reasons an organization may want to use performance testing. Some of the reasons are:
i. For preparing for major events, as organizations can use this form of testing to ensure they are ready for a predictable spike in demand.
ii. For testing vendor claims, to verify that a system meets the specifications claimed by its manufacturer or vendor. The process can compare two or more devices or programs.
iii. For providing information to stakeholders, informing project stakeholders about application performance updates surrounding speed, stability and scalability.
iv. For avoiding a bad reputation, as an application released without performance testing may run poorly, which can lead to negative word of mouth.
v. For comparing two or more systems, enabling an organization to compare software speed, responsiveness and stability.
How to conduct performance testing
Because testers can conduct performance testing with different types of metrics, the process can vary greatly. However, a generic process may look like this:
1. Identify the testing environment. This includes test and production environments, as well
as testing tools. Understanding the details of the hardware, software and network
configurations helps find possible performance issues, as well as aid in creating better
tests.
2. Identify and define acceptable performance criteria. This should include performance
goals and constraints for metrics. For example, defined performance criteria could be
response time, throughput and resource allocation.
3. Plan the performance test. Test all possible use cases. Build test cases and test scripts
around performance metrics.
4. Configure and implement the test design. Arrange resources to prepare the test environment, and then implement the test design.
5. Run the test. While testing, developers should also monitor the test.
6. Analyze and retest. Look over the resulting test data, and share it with the project team. After any fine-tuning, retest to see if there is an increase or decrease in performance (see the short pass/fail sketch after this list).
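As a minimal sketch of steps 2 and 6, the following Python fragment compares measured metrics against previously defined acceptance criteria; the thresholds and measured values are assumed examples:

# Sketch of steps 2 and 6: compare measured metrics against the acceptance
# criteria defined before the test. All numbers here are assumed examples.
criteria = {"response_ms": 500, "throughput_rps": 100}
measured = {"response_ms": 420, "throughput_rps": 130}   # from a test run

ok_response = measured["response_ms"] <= criteria["response_ms"]
ok_throughput = measured["throughput_rps"] >= criteria["throughput_rps"]
print("PASS" if ok_response and ok_throughput else "FAIL: retest after tuning")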
Organizations should find testing tools that can best automate their performance testing process.
In addition, do not make changes to the testing environments between tests.
There are two main performance testing methods: load testing and stress testing. However, there
are numerous other types of testing methods developers can use to determine performance. Some
performance test types are the following:
i. Load testing helps developers understand the behavior of a system under a specific load value. In the load testing process, an organization simulates the expected number of concurrent users and transactions over a period of time to verify expected response times and locate bottlenecks (a minimal concurrent-user sketch follows this list). This type of test helps developers determine how many users an application or system can handle before that app or system goes live. Additionally, a developer can load-test specific functionalities of an application, such as a checkout cart on a webpage. A team can include load testing as part of a continuous integration process, in which they immediately test changes to a codebase through the use of automation tools, such as Jenkins.
ii. Stress testing places a system under higher-than-expected traffic loads so developers
can see how well the system works above its expected capacity limits. Stress tests
have two subcategories: soak testing and spike testing. Stress tests enable software
teams to understand a workload's scalability. Stress tests put a strain on hardware
resources to determine the potential breaking point of an application based on
resource usage. Resources could include CPUs, memory and hard disks, as well as
solid-state drives. System strain can also lead to slow data exchanges, memory
shortages, data corruption and security issues. Stress tests can also show how long
KPIs take to return to normal operational levels after an event. Stress tests can occur
before or after a system goes live. An example of a stress test is chaos engineering,
which is a kind of production-environment stress test with specialized tools. An
organization might also perform a stress test before a predictable major event, such as
Black Friday on an e-commerce application, approximating the expected load using
the same tools as load tests.
iii. Soak testing, also called endurance testing, simulates a steady increase of end users
over time to test a system's long-term sustainability. During the test, the test engineer
monitors KPIs, such as memory usage, and checks for failures, like memory
shortages. Soak tests also analyze throughput and response times after sustained use
to show if these metrics are consistent with their status at the beginning of a test.
iv. Spike testing, another subset of stress testing, assesses the performance of a system
under a sudden and significant increase of simulated end users. Spike tests help
determine if a system can handle an abrupt, drastic workload increase over a short
period of time, repeatedly. Similar to stress tests, an IT team typically performs spike
tests before a large event in which a system will likely undergo higher-than-normal
traffic volumes.
v. Scalability testing measures performance based on the software's ability to scale its performance-measure attributes up or down. For example, testers could perform a scalability test based on the number of user requests.
vi. Capacity testing is similar to stress testing in that it tests traffic loads based on the
number of users but differs in the amount. Capacity testing looks at whether a
software application or environment can handle the amount of traffic it was
specifically designed to handle.
vii. Volume testing, also called flood testing, is conducted to test how a software application performs with varying amounts of data. Volume tests are done by creating a sample file size, either a small amount of data or a larger volume, and then testing the application's functionality and performance with that file size.
viii. Cloud performance testing: Developers can carry out performance testing in the cloud
as well. Cloud performance testing has the benefit of being able to test applications at
a larger scale, while also maintaining the cost benefits of being in the cloud.
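As a minimal illustration of the load-testing idea referenced above, the Python sketch below simulates a number of concurrent users against a hypothetical endpoint; real tools such as JMeter or BlazeMeter add ramp-up schedules, think time and reporting on top of this basic pattern.

import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Tiny load-test sketch: N simulated concurrent users each make one request
# and report their response time. The URL and user count are assumptions.
URL = "http://localhost:8080/"   # hypothetical system under test
USERS = 50                       # assumed number of concurrent users

def one_user(_):
    start = time.perf_counter()
    with urllib.request.urlopen(URL, timeout=10) as resp:
        resp.read()
    return time.perf_counter() - start

with ThreadPoolExecutor(max_workers=USERS) as pool:
    times = list(pool.map(one_user, range(USERS)))

times.sort()
print(f"median = {times[len(times) // 2] * 1000:.1f} ms, "
      f"p95 = {times[int(len(times) * 0.95)] * 1000:.1f} ms")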
At first, organizations thought moving performance testing to the cloud would ease the
performance testing process, while making it more scalable. The thought process was they could
offload the process to the cloud, and that would solve all their problems. However, when
organizations began doing this, they started to find that there were still issues in conducting
performance testing in the cloud, as the organization won't have in-depth, white box knowledge
on the cloud provider's side.
One of the challenges with moving an application from an on-premises environment to the cloud
is complacency. Developers and IT staff may assume that the application works the same once it
reaches the cloud. They might minimize testing and quality assurance, deciding instead to
proceed with a quick rollout. Because the application is being tested on another vendor's
hardware, testing may not be as accurate as on-premises testing.
Development and operations teams should check for security gaps; conduct load testing; assess
scalability; consider UX; and map servers, ports and paths.
Inter-application communication can be one of the biggest issues in moving an app to the cloud.
Cloud environments typically have more security restrictions on internal communications than
on-premises environments. An organization should construct a complete map of which servers,
ports and communication paths the application uses before moving to the cloud. Conducting
performance monitoring may help as well.
Performance testing tools have limitations, and organizations should keep the following in mind:
Some tools may only support web applications.
Free variants of tools may not work as well as paid variants, and some paid tools may be expensive.
Tools may have limited compatibility.
It can be difficult for some tools to test complex applications.
Organizations should watch out for performance bottlenecks in the following:
o CPU.
o Memory.
o Network utilization.
o Disk usage.
o OS limitations.
o Poor scalability.
An IT team can use a variety of performance test tools, depending on its needs and preferences.
Some examples of performance testing tools are the following:
Akamai CloudTest is used for performance and functional testing of mobile and web
applications. It can simulate millions of concurrent users for load testing as well. Its
features include customizable dashboards; stress tests on AWS, Microsoft Azure and
other clouds; a visual playback editor; and visual test creation.
BlazeMeter, acquired by Perforce Software, simulates a number of test cases and operates
load and performance testing. It provides support for real-time reporting and works with
open source tools, application programming interfaces and more. This testing service
includes features such as continuous testing for mobile and mainframe applications and
real-time reporting and analytics.
JMeter, an Apache performance testing tool, can generate load tests on web and
application services. JMeter plugins provide flexibility in load testing and cover areas
such as graphs, thread groups, timers, functions and logic controllers. JMeter supports
an integrated development environment for test recording for browsers or web
applications, as well as a command-line mode for load testing from any Java-compatible OS.
NeoLoad, developed by Neotys, provides load and stress tests for web and mobile
applications and is specifically designed to test apps before release for DevOps
and continuous delivery. An IT team can use the program to monitor web, database and
application servers. NeoLoad can simulate millions of users, and it performs tests in-
house or via the cloud.