Slot02 03 CH02 ComputerEvolutionAndPerformace 59 Slides
Chapter 2
Computer Evolution and Performance
William Stallings: Computer Organization and Architecture, 9th Edition
Objectives
After studying this chapter, you should be able
to:
Present an overview of the evolution of
computer technology from early digital
computers to the latest microprocessors.
Understand the key performance issues that
relate to computer design.
Explain the reasons for the move to multicore
organization, and understand the trade-off
between cache and processor resources on a
single chip.
Contents
ENIAC: Electronic Numerical Integrator And Computer
(Read by yourself)
- Weighed 30 tons
- Occupied 1500 square feet of floor space
- Contained more than 18,000 vacuum tubes
- 140 kW of power consumption
- Capable of 5000 additions per second
- Decimal rather than binary machine
- Memory consisted of 20 accumulators, each capable of holding a 10 digit number
- Major drawback was the need for manual programming by setting switches and plugging/unplugging cables
IAS computer
Princeton Institute for Advanced Studies
Prototype of all subsequent general-purpose
computers
Completed in 1952
[Figure: IAS memory formats for a data word and an instruction word]
One word contains 2 instructions
Structure of IAS Computer
AC: Accumulator
MQ: Multiplier Quotient
MBR: Memory Buffer Register
IBR: Instruction Buffer Register
PC: Program Counter
IR: Instruction Register
MAR: Memory Address Register
Table 2.1 The IAS Instruction Set
Hexadecimal Code: 010FA210FB
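Since one word contains two instructions, the hexadecimal code above can be split into its fields. The following is a minimal sketch, assuming the usual description of the IAS format: a 40-bit word holding two 20-bit instructions, each with an 8-bit opcode followed by a 12-bit address. The slide itself does not state these widths, and the function name `decode_ias_word` is made up for illustration.

```python
def decode_ias_word(hex_word: str):
    """Split an IAS word into (opcode, address) pairs for the left and right instruction.

    Assumed layout (not stated on the slide): 40-bit word,
    two 20-bit instructions, each = 8-bit opcode + 12-bit address.
    """
    word = int(hex_word, 16)
    if word >= (1 << 40):
        raise ValueError("an IAS word is assumed to be 40 bits wide")

    left = (word >> 20) & 0xFFFFF   # upper 20 bits: left instruction
    right = word & 0xFFFFF          # lower 20 bits: right instruction

    def split(instruction: int):
        opcode = (instruction >> 12) & 0xFF  # top 8 bits
        address = instruction & 0xFFF        # low 12 bits
        return opcode, address

    return split(left), split(right)


(left_op, left_addr), (right_op, right_addr) = decode_ias_word("010FA210FB")
print(f"left : opcode {left_op:02X}, address {left_addr:03X}")
print(f"right: opcode {right_op:02X}, address {right_addr:03X}")
```

Under these assumptions the word splits into opcode 01 with address 0FA and opcode 21 with address 0FB; the opcodes can then be looked up in Table 2.1.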
Commercial Computers: UNIVAC
(Read by yourself)
1947 – Eckert and Mauchly formed the Eckert-Mauchly
Computer Corporation to manufacture computers commercially
UNIVAC I (Universal Automatic Computer)
First successful commercial computer
Was intended for both scientific and commercial applications
Commissioned by the US Bureau of Census for 1950 calculations
Backward compatible
Series of 700/7000
computers established IBM as
the overwhelmingly dominant
computer manufacturer
IBM 7094 Configuration
(Read by yourself)
Microelectronics
A computer consists of gates, memory cells, and interconnections among these elements
[Figure: relationship among wafer, chip, and gate]
Chip Growth
[Figure: chip growth by year; m: million, bn: billion]
Moore’s Law
The pace slowed to a doubling every 18 months in the 1970’s but has sustained that rate ever since
Consequences of Moore’s law:
- The cost of computer logic and memory circuitry has fallen at a dramatic rate
- The electrical path length is shortened, increasing operating speed
- Computer becomes smaller and is more convenient to use in a variety of environments
- Reduction in power and cooling requirements
- Fewer interchip connections
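The 18-month doubling rate quoted for Moore’s law can be turned into a quick growth estimate. This is a rough sketch only: the function names and the starting count are illustrative assumptions, and real chips do not follow the curve exactly.

```python
def growth_factor(years: float, doubling_period_years: float = 1.5) -> float:
    """Factor by which density grows after `years`, doubling every 18 months by default."""
    return 2.0 ** (years / doubling_period_years)


def projected_count(initial_count: float, years: float,
                    doubling_period_years: float = 1.5) -> float:
    """Projected element count, starting from a hypothetical initial count."""
    return initial_count * growth_factor(years, doubling_period_years)


# Two doubling periods (3 years) give a factor of 4;
# ten doubling periods (15 years) give a factor of 1024.
print(growth_factor(3))
print(growth_factor(15))
print(projected_count(1_000_000, 15))  # hypothetical chip with 1 million elements
```

The exponential form is why a modest-sounding doubling period leads to a thousandfold increase in about 15 years.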
System/360 Family
Generations
VLSI: Very Large Scale Integration
ULSI: Ultra Large Scale Integration
Semiconductor Memory
Microprocessors
Semiconductor Memory
In 1974 the price per bit of semiconductor memory dropped below the price per bit of core memory
There has been a continuing and rapid decline in memory cost accompanied by a corresponding increase in physical memory density
Developments in memory and processor technologies changed the nature of computers in less than a decade
Each generation has provided four times the storage density of the previous generation,
accompanied by declining cost per bit and declining access time
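The fourfold step per generation compounds quickly. As a small worked relation, with $D_0$ an assumed starting density and $n$ the number of generations (both symbols are introduced here for illustration only):

```latex
D_n = 4^{\,n} D_0, \qquad \text{for example } D_3 = 4^{3} D_0 = 64\, D_0 .
```

So three generations on, the same chip area would hold roughly 64 times as many bits, if the fourfold rule held exactly.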
Microprocessors
The density of elements on processor chips continued to
rise
More and more elements were placed on each chip so that
fewer and fewer chips were needed to construct a single
computer processor
• Image processing
• Speech recognition
• Videoconferencing
• Multimedia authoring
• Simulation modeling
Microprocessor Speed
Performance Balance
Adjust the organization and architecture to compensate for the mismatch among the capabilities of the various components
Architectural examples include:
- Increase the number of bits that are retrieved at one time by making DRAMs “wider” rather than “deeper” and by using wide bus data paths
- Reduce the frequency of memory access by incorporating increasingly complex and efficient cache structures between the processor and main memory
- Change the DRAM interface to make it more efficient by including a cache or other buffering scheme on the DRAM chip
- Increase the interconnect bandwidth between processors and memory by using higher speed buses and a hierarchy of buses to buffer and structure data flow
Typical I/O Device Data Rates
Improvements in Chip
Organization and Architecture
Increase hardware speed of processor
Fundamentally due to shrinking logic gate size
More gates, packed more tightly, increasing clock rate
Memory latency
Memory speeds lag behind processor speeds
Processor Trends
Multicore
Multicore CPU: a CPU with several cores running concurrently.
MIC: Many Integrated Core
GPGPU: General-Purpose Graphics Processing Unit
Multicore
- The use of multiple processors on the same chip provides the potential to increase performance without increasing the clock rate
- Strategy is to use two simpler processors on the chip rather than one more complex processor
- As caches became larger it made performance sense to create two and then three levels of cache on a chip
MIC
- Leap (fast growth) in performance as well as the challenges in developing software to exploit such a large number of cores
- The multicore and MIC strategy involves a homogeneous (same kind) collection of general purpose processors on a single chip
GPU
- Core designed to perform parallel operations on graphics data
- Traditionally found on a plug-in graphics card, it is used to encode and render 2D and 3D graphics as well as process video
- Used as vector processors for a variety of applications that require repetitive computations
(Read by yourself)
Some definitions:
CISC: Complex Instruction Set Computer. The CPU is equipped with a
large set of instructions.
RISC: Reduced Instruction Set Computer. The CPU is equipped with basic
instructions only, on the reasoning that a high-level instruction can be
built from several basic instructions.
ARM: Advanced RISC Machine
2.6 Performance Assessment
Factors affecting computer performance:
Clock Speed and Instructions per Second
Instruction execution rate
Methods: Benchmarks
Some laws: Read by yourself
Amdahl’s Law
Little’s Law
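Amdahl’s Law is only named here, so the following is a minimal sketch under the standard textbook statement: if a fraction f of a program can be spread over N processors and the rest stays serial, the speedup is 1 / ((1 - f) + f / N). The function and parameter names are illustrative assumptions.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Speedup predicted by Amdahl's Law for a given parallelizable fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_part = 1.0 - parallel_fraction
    return 1.0 / (serial_part + parallel_fraction / processors)


# With 90% of the work parallelizable, extra cores give diminishing returns:
for n in (1, 2, 8, 64, 1024):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

The loop illustrates why multicore chips depend on software that exposes parallelism: with a 10% serial part, the speedup can never exceed 10, however many cores are added.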
System Clock
- Digital devices need pulses to operate. Pulses are created by a
clock generator (hardware built around a crystal oscillator)
- The rate of pulses is known as the clock rate, or clock speed.
- The time between pulses is the cycle time.
- One increment, or pulse, of the clock is referred to as a clock
cycle, or a clock tick.
- Unit: cycles per second, Hertz (Hz)
- Operations performed by a processor, such as fetching an
instruction, decoding the instruction, performing an arithmetic
operation, and so on, are governed by a system clock.
A higher clock rate generally means higher performance, all else being equal.
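The relation between clock rate, cycle time, and instruction execution rate can be sketched in a few lines. The average cycles per instruction (CPI) used below is an assumption brought in for illustration, since the slide does not define it; the formulas are the usual ones: cycle time = 1 / clock rate, execution time = instruction count × CPI × cycle time, and MIPS rate = clock rate / (CPI × 10^6).

```python
def cycle_time_seconds(clock_rate_hz: float) -> float:
    """Time between clock pulses: the reciprocal of the clock rate."""
    return 1.0 / clock_rate_hz


def execution_time_seconds(instruction_count: float, cpi: float,
                           clock_rate_hz: float) -> float:
    """Processor time for a program, given an assumed average CPI."""
    return instruction_count * cpi * cycle_time_seconds(clock_rate_hz)


def mips_rate(clock_rate_hz: float, cpi: float) -> float:
    """Millions of instructions per second for an assumed average CPI."""
    return clock_rate_hz / (cpi * 1e6)


clock = 2e9  # hypothetical 2 GHz processor
print(cycle_time_seconds(clock))            # seconds per cycle
print(mips_rate(clock, cpi=2.0))            # MIPS at 2 cycles per instruction
print(execution_time_seconds(1e9, 2.0, clock))  # seconds for 1e9 instructions
```

The sketch also shows why clock rate alone is not the whole story: halving the CPI improves the execution rate as much as doubling the clock.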
Benchmark
- The design of fair benchmarks is something of an art,
because various combinations of hardware and software
can exhibit widely variable performance under different
conditions. Often, after a benchmark has become a
standard, developers try to optimize a product to run that
benchmark faster than similar products run it in order to
enhance sales (MS Computer Dictionary)
Beginning in the late 1980s and early 1990s, industry
and academic interest shifted to measuring the
performance of systems using a set of benchmark
programs
Queuing system
If the server is idle, an item is served immediately; otherwise an
arriving item joins a queue
There can be a single queue for a single server or for multiple
servers, or multiple queues, one for each of multiple
servers
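The single-queue, single-server case described above can be sketched as a small deterministic simulation and tied to Little’s Law, taken here in its standard form L = λW (average items in the system = arrival rate × average time in the system). The function names and the sample arrival and service times are made up for illustration.

```python
def simulate_single_server(arrivals, service_times):
    """First-come-first-served single server; returns each item's time in the system."""
    server_free_at = 0.0
    times_in_system = []
    for arrival, service in zip(arrivals, service_times):
        # Idle server: service starts at arrival. Busy server: the item waits in the queue.
        start = max(arrival, server_free_at)
        server_free_at = start + service
        times_in_system.append(server_free_at - arrival)
    return times_in_system


def little_law_items(arrival_rate: float, mean_time_in_system: float) -> float:
    """Little's Law: L = lambda * W."""
    return arrival_rate * mean_time_in_system


arrivals = [0.0, 1.0, 2.0, 6.0]   # hypothetical arrival times
services = [2.0, 2.0, 2.0, 1.0]   # hypothetical service times
times = simulate_single_server(arrivals, services)

window = 7.0                          # observation window: time 0 to the last departure
arrival_rate = len(arrivals) / window
mean_time = sum(times) / len(times)
print(times)
print(little_law_items(arrival_rate, mean_time))
print(sum(times) / window)            # time-averaged number of items in the system
```

In this run the last two printed values agree, which is what Little’s Law predicts when the system starts and ends empty over the observation window.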
2.3 At the integrated circuit level, what are the three principal
constituents of a computer system?
Chapter 2: Computer Evolution and Performance
- First generation computers
  - Vacuum tubes
- Second generation computers
  - Transistors
- Third generation computers
  - Integrated circuits
- Performance designs
  - Microprocessor speed
  - Performance balance
  - Chip organization and architecture
- Multi-core
- MICs
- GPGPUs
- Performance assessment
  - Clock speed and instructions per second
  - Benchmarks
  - Amdahl’s Law
  - Little’s Law