Slot02 03 CH02 ComputerEvolutionAndPerformace 65 Slides

Uploaded by

Gia Quyên Vòng
Copyright
© © All Rights Reserved

Chapter 2
Computer Evolution and Performance
William Stallings: Computer Organization and Architecture, 9th Edition
+ 2

Objectives
CLO2 Present an overview of the evolution of computer
technology from early digital computers to the latest
microprocessors.
CLO3 Understand the key performance issues that relate to
computer design

Why should we study this chapter?

- How did computers develop? (generations)
- What applications require very powerful computers?
- What are multicore, MICs (many integrated cores), and GPGPUs (general-purpose graphics processing units)?
- How can computer performance be assessed?
+ 3

Objectives
After studying this chapter, you should be able
to:
 Present an overview of the evolution of
computer technology from early digital
computers to the latest microprocessors.
 Understand the key performance issues that
relate to computer design.
 Explain the reasons for the move to multicore
organization, and understand the trade-off
between cache and processor resources on a
single chip.
+ 4

15 questions that must be answered:
No. Question
1 Check correctness: 263.75(d) = 1010110111.1001 (b)
10011100100111(b) = FAB (h) 1ABED(h) = 0011100011011011101(b)
2 Convert: 909(d) = ? (b) 1011011(b) = ? (d) 1023(d) = ? (b) 579(d) = ? (h)
3 Compute:
Not 100111001110(b) = ? Not (100111001110(b) AND 001100110011(b)) = ?
Not (100111001110(b) OR 001100110011(b)) = ? Not (100111001110(b) XOR 001100110011(b)) = ?
4 From the 3rd computer generation, what is the basic technology?
5 What is a stored program computer?
6 Explain the general-purpose computer structure introduced by John Von Neumann?
7 At the integrated circuit level, what are the three principal constituents of a computer system?
8 Explain Moore’s law.
9 List and explain the key characteristics of a computer family (refer to the Intel microprocessors).
10 Refer to the table 2.1 (The first code line of the following program will be explained in this slide).
Given the memory contents of the IAS computer shown below,
Address Contents
08A 010FA210FB
08B 010FA0F08D
08C 020FA210FB
Show the assembly language code for the program, starting at address 08A. Explain what this program does.
+ 5

15 questions that must be answered (continued):
No. Question
11 List and briefly define some of the techniques used in contemporary processors to increase speed.
12 Explain the concept of performance balance.
13 Distinguish among multicore, MIC, and GPU organizations.
14 Explain the system clock.
15 Summarize some of the issues in computer performance assessment.
+ 6

Contents
CLO2 Present an overview of the evolution of computer
technology from early digital computers to the latest
microprocessors.
CLO3 Understand the key performance issues that relate to
computer design

 Basics: Number Systems


 2.1- A Brief History of Computers
 2.2- Designing for Performance
 2.3- Multicore, MICs, and GPGPUs
 2.6- Performance Assessment
+ 7

Number Systems:

- A computer is an electronic device, so all data is stored as direct-current electrical states: CHARGED (1) / NOT CHARGED (0). Data is therefore stored as a string of the digits 0/1. Each such symbol is called a BIT. Operations on data are operations on bits, so we need to understand number systems.
- A number system exists by convention: a community agrees to accept it. People could therefore just as well agree to replace the order 0 1 2 3 4 5 6 7 8 9 with 0 1 2 8 9 3 4 5 6 7.
- How a number system works:
  - How to count (assign an order by value)
  - How to represent a numeric quantity
  - How to perform operations
+ 8

Number Systems: Definition

Definition        Base-10 / Decimal   Base-2 / Binary   Base-8 / Octal     Base-16 / Hexadecimal
                  system (d)          system (b)        system (q)         system (h)
Base value        10                  2                 8                  16
Set of digits     { 0, 1, 2, …, 9 }   { 0, 1 }          { 0, 1, 2, …, 7 }  { 0, 1, 2, …, 9, A, B, C, D, E, F }
Basic operations  +, -, *, /          +, -, *, /        +, -, *, /         +, -, *, /

A number system consists of a base value, a set of digits, and a set of basic operations.
- People can define arbitrary number systems (base-5, base-16, base-25, …).
- The four systems above are the common ones in computing.
+ 9

Number Systems:
Representing a quantity
- A number is represented as a string of digits of a specific number system. Each digit has its own position (positional expansion).
+ 10

Number Systems: Counting

 Decimal (d) count: 0, 1, 2, …, 9, 10, 11, 12 ,…


 Binary (b) count: 0, 1, 10, 11, 100, …
 Octal (q) count: 0,1, 2, 3, 4, 5, 6, 7, 10, 11, 12,….
 Hexadecimal (h) count: 0, 1, …, 9, A, B, C, D, E, F,
10, 11, …
+ 11

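Decimal Positive Number To Binary Number
123.627 (d) → 1111011.10100 (b)

Integer part: divide repeatedly by 2, then write the remainders in reverse order.
123 : 2 = 61, remainder 1
 61 : 2 = 30, remainder 1
 30 : 2 = 15, remainder 0
 15 : 2 =  7, remainder 1
  7 : 2 =  3, remainder 1
  3 : 2 =  1, remainder 1
  1 : 2 =  0, remainder 1 (stop)
Remainders in reverse order: 1111011

Fractional part: multiply repeatedly by 2 and take the integer parts in forward order.
0.627 x 2 = 1.254 → 1
0.254 x 2 = 0.508 → 0
0.508 x 2 = 1.016 → 1
0.016 x 2 = 0.032 → 0
0.032 x 2 = 0.064 → 0
(stop, because only 5 fractional digits are wanted)
Integer parts in forward order: 10100

Result: 1111011.101

Exercises: 426.52 (d) = ? (b); 426.52 (d) = ? (octal); 426.52 (d) = ? (hex)
Proceed the same way to convert decimal to base 8 or base 16.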
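The divide-and-multiply procedure above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `to_base` and the `frac_digits` parameter are assumptions chosen for the example, and `Fraction` is used only so the fractional part stays exact.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEF"

def to_base(value: str, base: int, frac_digits: int = 5) -> str:
    """Convert a non-negative decimal string such as "123.627" to the given base."""
    number = Fraction(value)
    integer = int(number)          # whole part
    fraction = number - integer    # fractional part, kept exact

    # Integer part: divide repeatedly, collect remainders, read them in reverse.
    int_digits = []
    while True:
        integer, remainder = divmod(integer, base)
        int_digits.append(DIGITS[remainder])
        if integer == 0:
            break

    # Fractional part: multiply repeatedly, collect integer parts in forward order.
    out_frac = []
    for _ in range(frac_digits):
        fraction *= base
        digit = int(fraction)
        out_frac.append(DIGITS[digit])
        fraction -= digit

    return "".join(reversed(int_digits)) + "." + "".join(out_frac)

print(to_base("123.627", 2))  # the slide's example, 5 fractional digits
```

Passing `base=8` or `base=16` applies the same procedure to the octal and hexadecimal exercises.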
+ 12

Binary Positive Number To Decimal Number
1111011.10100 (b) → 123.625 (d)

Digit         1   1   1   1   0   1   1   .  1    0     1      0       0
Position (p)  6   5   4   3   2   1   0      -1   -2    -3     -4      -5
2^p           64  32  16  8   4   2   1      0.5  0.25  0.125  0.0625  0.03125

Add the 2^p row, skipping the values whose digit is 0:
64 + 32 + 16 + 8 + 2 + 1 = 123 and 0.5 + 0.125 = 0.625
Result: 123.625
A longer binary fraction gives a more accurate result (the value originally converted was 123.627).

Proceed the same way to convert base 8 or base 16 to decimal.
Note: A:10, B:11, C:12, D:13, E:14, F:15
Exercises: 426.5 (octal) = ? (d); 4AC7.81 (h) = ? (d)
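The positional expansion used above (each digit times the base raised to its position) can be written as a short Python sketch. The function name `from_base` is an assumption for illustration; it does not check that every digit is valid for the chosen base.

```python
DIGITS = "0123456789ABCDEF"

def from_base(text: str, base: int) -> float:
    """Evaluate a positional numeral such as "1111011.10100" in the given base."""
    integer_part, _, fraction_part = text.upper().partition(".")
    total = 0.0
    # Integer digits occupy positions len-1 down to 0.
    for position, digit in enumerate(reversed(integer_part)):
        total += DIGITS.index(digit) * base ** position
    # Fractional digits occupy positions -1, -2, ...
    for offset, digit in enumerate(fraction_part, start=1):
        total += DIGITS.index(digit) * base ** -offset
    return total

print(from_base("1111011.10100", 2))  # the slide's binary example
```

The same call with `base=8` or `base=16` handles octal and hexadecimal inputs.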
+ 13

Number Systems:
Conversions
(Decimal  Binary/Hexa expansion)

37d = ?b = ?h
69d = ?b =?h
42d = ?b= ?h
+ 14

Number Systems:
Conversions
(Decimal  Binary/Hexa expansion)

+ 15

Number Systems:
Conversions
(Binary  Hexa expansion)

1001100b = ?h 11001110b = ? h
2AFh = ?b 49Ch= ?b
BF7h = ?b 7EAh = ?b
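Because 16 = 2^4, one hexadecimal digit corresponds to exactly four bits, so these conversions need no arithmetic beyond grouping. A minimal Python sketch follows; the function names are assumptions, and the sample values are deliberately not the exercise values.

```python
def bin_to_hex(bits: str) -> str:
    """Group bits in fours from the right and map each group to one hex digit."""
    padded = bits.zfill((len(bits) + 3) // 4 * 4)   # left-pad to a multiple of 4
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(format(int(group, 2), "X") for group in groups)

def hex_to_bin(digits: str) -> str:
    """Expand each hex digit into its 4-bit pattern."""
    return "".join(format(int(digit, 16), "04b") for digit in digits)

print(bin_to_hex("10100101"))
print(hex_to_bin("3C"))
```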
+ 16

Number Systems:
Basic Binary Operators
+ 17

2.1- History of Computers


A human generation is counted when a woman gives birth to her first child.
A computer generation is marked by an event or an essential invention.

Miniaturization technology (milli → micro → nano)

IC: Integrated Circuit
+ First Generation: Vacuum Tubes 18

- Basic technology: vacuum tubes
- Building block: composition and operation of the vacuum tube (https://en.wikipedia.org/wiki/Vacuum_tube)
- Typical computers:
  - ENIAC (Electronic Numerical Integrator And Computer)
  - EDVAC (Electronic Discrete Variable Computer) and John von Neumann
  - IAS computer (Princeton Institute for Advanced Studies)
  - Commercial computers: UNIVAC (Universal Automatic Computer)
  - IBM computers (International Business Machines)
+ First Generation: ENIAC Computer (Read by yourself) 19

- Electronic Numerical Integrator And Computer
- Designed and constructed at the University of Pennsylvania
  - Started in 1943, completed in 1946, by John Mauchly and John Eckert
- World’s first general purpose electronic digital computer
  - The Army’s Ballistics Research Laboratory (BRL) needed a way to supply trajectory tables for new weapons accurately and within a reasonable time frame
  - Was not finished in time to be used in the war effort
- Its first task was to perform a series of calculations that were used to help determine the feasibility of the hydrogen bomb
- Continued to operate under BRL management until 1955, when it was disassembled
+ 20

ENIAC: Characteristics
- Weighed 30 tons
- Occupied 1500 square feet of floor space
- Contained more than 18,000 vacuum tubes
- 140 kW power consumption
- Capable of 5000 additions per second
- Decimal rather than binary machine
- Memory consisted of 20 accumulators, each capable of holding a 10-digit number
- Major drawback was the need for manual programming by setting switches and plugging/unplugging cables
+ 21

John von Neumann Principle

EDVAC (Electronic Discrete Variable Computer)
- First publication of the idea was in 1945
- Stored program concept
  - Attributed to ENIAC designers, most notably the mathematician John von Neumann
  - Program represented in a form suitable for storing in memory alongside the data (program = data + instructions)
- IAS computer
  - Princeton Institute for Advanced Studies
  - Prototype of all subsequent general-purpose computers
  - Completed in 1952
22

Structure of von Neumann Machine

CA: central arithmetic part, the module that carries out operations as directed by the machine instruction, i.e. the execution unit.
CC: central control part, the module that determines how execution proceeds, i.e. the control unit.
+ 23

IAS Memory
- The memory of the IAS consists of 1000 storage locations (called words, the unit of storage) of 40 bits each.
- Both data and instructions are stored there.
- Numbers are represented in binary form and each instruction is a binary code.
- To access the value stored in a memory location, 2 registers are needed:
  - MAR, memory address register (holds the address of the memory location to be accessed)
  - MBR, memory buffer register (holds the value read from the memory location)
+ 24

IAS Memory Formats

Data

Instruction

- One word holds 2 instructions.
- Program execution rule: instructions run sequentially, and the address of the first instruction in memory is known in advance. This requires a PC (program counter) register that holds the address of the instruction to be fetched from memory.
- Rule for executing an instruction word: left instruction first, right instruction second.
- Structure of one instruction: what the instruction does (opcode) and where the data is (address).
+
Structure of IAS Computer
PC: Program Counter
IBR: Instruction Buffer Register (temporarily holds instruction contents)
IR: Instruction Register (holds the current instruction)
AC: Accumulator (holds general computation results)
MQ: Multiplier Quotient (holds multiplication/division results)
MAR: Memory Address Register (holds the address of a memory location)
MBR: Memory Buffer Register (holds the value of a memory location)
+
Structure of IAS Computer
Read instruction:

The instruction address in the PC is transferred to the MAR. The CU triggers (BLUE) a memory READ, and the instruction (opcode, addr) is transferred from memory into the MBR. The PC is automatically incremented by 1 (RED).

If this is the first instruction, the instruction in the MBR moves to the IR. The opcode part in the IR triggers the CU to select the component that executes this instruction.

If this is not the first instruction, the next instruction is prefetched into the IBR while the current instruction is executing.
+
Structure of IAS Computer
Read data:

After the execution unit has been selected in the previous step, data is fed to it as follows: the Addr part of the IR (the data address) is placed in the MAR, the CU triggers a memory read, and the data travels from memory into the MBR and then into the ALU.

The same reasoning applies when data is written from the AC register to memory.
+ 28

Table 2.1 The IAS Instruction Set
+Run an IAS Instruction word: 29

A part of exercise 2.7

IAS instruction length: 20 bits
Word length: 40 bits → 2 instructions
Hexadecimal code: 010FA210FB

Left instruction: 010FA
  Opcode: 01 (h) → 0000 0001 (b)
  Address: 0FA
  Load the data in memory word 0FA into AC: AC = [0FA]

Right instruction: 210FB
  Opcode: 21 (h) → 0010 0001 (b)
  Address: 0FB
  Store AC to memory word 0FB: [0FB] = AC

Example: if [0FA] = 7, then AC = 7 and afterwards [0FB] = 7.
Conclusion: [0FB] = [0FA]
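The manual decoding above (a 40-bit word split into two 20-bit instructions, each an 8-bit opcode followed by a 12-bit address) can be sketched in Python. The function name `decode_word` and the small `KNOWN_OPCODES` table are assumptions for illustration; only the two opcodes explained on this slide are named, and any other opcode is printed as raw hex.

```python
# Only the two opcodes explained on the slide are named; others are shown in hex.
KNOWN_OPCODES = {0x01: "LOAD M(X)", 0x21: "STOR M(X)"}

def decode_word(word_hex: str):
    """Split a 40-bit IAS word (10 hex digits) into two (opcode, address) pairs."""
    word = int(word_hex, 16)
    left = (word >> 20) & 0xFFFFF      # upper 20 bits: executed first
    right = word & 0xFFFFF             # lower 20 bits: executed second
    # Each 20-bit instruction is an 8-bit opcode followed by a 12-bit address.
    return [((half >> 12) & 0xFF, half & 0xFFF) for half in (left, right)]

for opcode, address in decode_word("010FA210FB"):
    name = KNOWN_OPCODES.get(opcode, f"opcode {opcode:02X}")
    print(f"{name:<10} address {address:03X}")
```

Extending `KNOWN_OPCODES` with the remaining entries of Table 2.1 would turn this into a small disassembler for the other memory words in the exercise.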
+ 30

Commercial Computers: UNIVAC (Read by yourself)

- 1947: Eckert and Mauchly formed the Eckert-Mauchly Computer Corporation to manufacture computers commercially
- UNIVAC I (Universal Automatic Computer)
  - First successful commercial computer
  - Was intended for both scientific and commercial applications
  - Commissioned by the US Bureau of Census for 1950 calculations
- The Eckert-Mauchly Computer Corporation became part of the UNIVAC division of the Sperry-Rand Corporation
- UNIVAC II, delivered in the late 1950s
  - Had greater memory capacity and higher performance
  - Backward compatible
+ 31

IBM (Read by yourself)
- Was the major manufacturer of punched-card processing equipment
- Delivered its first electronic stored-program computer (701) in 1953
  - Intended primarily for scientific applications
- Introduced the 702 product in 1955
  - Hardware features made it suitable for business applications
- Series of 700/7000 computers established IBM as the overwhelmingly dominant computer manufacturer
+ 32

Second Generation: Transistors

- Transistor = transfer + resistor (a device that can either pass or resist current)
- Building block: composition and operation of the transistor
  More details: https://en.wikipedia.org/wiki/Transistor
- Its function is similar to that of a vacuum tube
- Smaller and cheaper
- Dissipates less heat than a vacuum tube
- Is a solid state device made from silicon
- Was invented at Bell Labs in 1947
- It was not until the late 1950s that fully transistorized computers were commercially available
- Typical computers: IBM 700/7000 series
+ 33

Second Generation Computers

- Introduced:
  - More complex arithmetic and logic units and control units
  - The use of high-level programming languages
  - Provision of system software which provided the ability to:
    - load programs
    - move data to peripherals and libraries
    - perform common computations
- Appearance of the Digital Equipment Corporation (DEC) in 1957
- PDP-1 (Programmed Data Processor) was DEC’s first computer
- This began the minicomputer phenomenon that would become so prominent (leading) in the third generation
+ 34

Table 2.3: Example Members of the IBM 700/7000 Series


35

IBM 7094 Configuration (Read by yourself)

Multiplexer: centrally manages several devices.
Mag: magnetic
Drum: magnetic drum for storing data
36

Third Generation: Integrated Circuits (IC)
- 1958: the invention of the integrated circuit
  - All components of a circuit are miniaturized to micro size, so all of them can be packed into a chip
- Discrete component
  - Single, self-contained transistor
  - Manufactured separately, packaged in their own containers, and soldered or wired together onto Masonite-like circuit boards
  - Manufacturing process was expensive and cumbersome (complex)
- The two most important members of the third generation were the IBM System/360 and the DEC PDP-8
+ 37

Microelectronics
The three basic constituents of a microelectronic computer:
(1) Gate: for processing
(2) Memory cell: for storing data
(3) Connections: the wiring between them
+ 38

Integrated Circuits

ICs are the means of implementing the 4 basic functions of a computer:
- Data storage: provided by memory cells
- Data processing: provided by gates
- Data movement: the paths among components are used to move data from memory to memory and from memory through gates to memory
- Control: the paths among components can carry control signals

- A computer consists of gates, memory cells, and interconnections among these elements
- The gates and memory cells are constructed of simple digital electronic components
- Exploits the fact that such components as transistors, resistors, and conductors can be fabricated from a semiconductor such as silicon
- Many transistors can be produced at the same time on a single wafer (thin piece) of silicon
- Transistors can be connected with a process of metallization (covering with metal) to form circuits

More details: https://en.wikipedia.org/wiki/Silicon
+ 39

Wafer, Chip, and Gate Relationship
Wafer: a thin piece of silicon (< 1 mm)
Structure of an IC
+ Chip Growth 40

Figure 2.8 Growth in Transistor Count on Integrated Circuits
(Axes: year versus number of transistors; m: million, bn: billion)
Moore’s Law 41

- 1965, Gordon Moore (co-founder of Intel)
- Observed that the number of transistors that could be put on a single chip was doubling every year
- The pace slowed to a doubling every 18 months in the 1970s but has sustained that rate ever since

Consequences of Moore’s law:
- The cost of computer logic and memory circuitry has fallen at a dramatic rate
- The electrical path length is shortened, increasing operating speed
- Computer becomes smaller and is more convenient to use in a variety of environments
- Reduction in power and cooling requirements
- Fewer interchip connections
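To get a feel for the difference between the two doubling periods mentioned for Moore’s law, the growth can be written as count = initial × 2^(years / doubling period). The Python sketch below is illustrative only: the function name and the starting count of 1,000 transistors are hypothetical values, not data from the slides.

```python
def projected_count(initial: float, years: float, doubling_period_years: float) -> float:
    """Transistor count after `years`, doubling once per `doubling_period_years`."""
    return initial * 2 ** (years / doubling_period_years)

# Hypothetical starting point of 1,000 transistors, projected over 6 years.
print(projected_count(1_000, 6, 1.0))   # doubling every year
print(projected_count(1_000, 6, 1.5))   # doubling every 18 months
```

Over the same six years, yearly doubling gives six doublings while an 18-month period gives four, so the first projection is four times the second.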
+ 42

Table 2.4: Characteristics of the System/360 Family


43

Table 2.5: Evolution of the PDP-8 (Read by yourself)
PDP: Programmed Data Processor
Produced by Digital Equipment Corporation (DEC)
+ 44

DEC - PDP-8 Bus Structure


DEC: Digital Equipment Corporation
PDP: Programmed Data Processor

Omni (Latin) = for all


+ Later Generations

LSI: Large Scale Integration
VLSI: Very Large Scale Integration
ULSI: Ultra Large Scale Integration

Semiconductor Memory
Microprocessors
+ Semiconductor Memory 46
+ 47

Microprocessors
- The density of elements on processor chips continued to rise
  - More and more elements were placed on each chip so that fewer and fewer chips were needed to construct a single computer processor
- 1971: Intel developed the 4004
  - First chip to contain all of the components of a CPU on a single chip
  - Birth of the microprocessor
- 1972: Intel developed the 8008
  - First 8-bit microprocessor
- 1974: Intel developed the 8080
  - First general purpose microprocessor
  - Faster, with a richer instruction set and a larger addressing capability
Evolution of Intel Microprocessors 48
Evolution of Intel Microprocessors 49
+ 50

2.2- Designing for Performance
Desktop applications (media software) that require the great power of today’s microprocessor-based systems include:

• Image processing
• Speech recognition
• Videoconferencing
• Multimedia authoring
• Voice and video annotation of files (for example, advertisements added to a video)
• Simulation modeling

+ 51

2.2- Designing for Performance
How can a computer be made faster / more powerful?
- Increase the processing capability of the CPU
  - Raise the CPU clock rate
  - Use a multicore CPU
  - Add a graphics processor so that the CPU does not have to carry the work of presenting data → GPU (Graphics Processing Unit)
- Increase the data transfer capability: memory must be faster / buses wider
- I/O devices need to be faster → enhanced I/O
+ Microprocessor Speed 52

Techniques built into contemporary (current) processors include:

Technique: Description
- Pipelining: Processor moves data or instructions into a conceptual pipe with all stages of the pipe processing simultaneously
- Branch prediction: Processor looks ahead in the instruction code fetched from memory and predicts which branches, or groups of instructions, are likely to be processed next
- Data flow analysis: Processor analyzes which instructions are dependent on each other’s results, or data, to create an optimized schedule of instructions
- Speculative execution: Using branch prediction and data flow analysis, some processors speculatively execute instructions ahead of their actual appearance in the program execution, holding the results in temporary locations, keeping execution engines as busy as possible
+ 53

Performance Balance
- Adjust the organization and architecture to compensate for the mismatch among the capabilities of the various components
- Architectural examples include:
  - Increase the number of bits that are retrieved at one time by making DRAMs “wider” rather than “deeper” and by using wide bus data paths
  - Reduce the frequency of memory access by incorporating increasingly complex and efficient cache structures between the processor and main memory
  - Change the DRAM interface to make it more efficient by including a cache or other buffering scheme on the DRAM chip
  - Increase the interconnect bandwidth between processors and memory by using higher speed buses and a hierarchy of buses to buffer and structure data flow

Increasing the data transfer capability:
- Faster memory
- Wider bus
Typical I/O Device Data Rates 54
+ 55

Improvements in Chip
Organization and Architecture
 Increase hardware speed of processor
 Fundamentally due to shrinking logic gate size
 More gates, packed more tightly, increasing clock rate

 Propagation time for signals reduced

 Increase size and speed of caches


 Dedicating part of processor chip
 Cache access times drop significantly

 Change processor organization and


architecture
 Increase effective speed of instruction execution
 Parallelism
+ 56

Problems with Clock Speed and Logic Density
- Power
  - Power density increases with density of logic and clock speed
  - Dissipating heat
- RC (resistance and capacitance) delay
  - Speed at which electrons flow is limited by the resistance and capacitance of the metal wires connecting them
  - Delay increases as the RC product increases
  - Wire interconnects become thinner, increasing resistance (the thinner the wire, the higher its resistance; cf. the formula for resistance R)
  - Wires are closer together, increasing capacitance (cf. the formula for the capacitance C of a capacitor)
- Memory latency
+ Processor Trends
57
+ 58

2.3- Multicore, MICs, and GPGPUs
- Multicore CPU: a CPU with several cores running concurrently
- MIC: many integrated cores
- GPGPU: general-purpose graphics processing unit
- If one processor is not fast enough, use several processors running concurrently → multicore / MIC.
- To let the CPU do more computation, do not make it present data on the screen → GPU.
Multicore
- The use of multiple processors on the same chip provides the potential to increase performance without increasing the clock rate
- Strategy is to use two simpler processors on the chip rather than one more complex processor
- With two processors, larger caches are justified
- As caches became larger, it made performance sense to create two and then three levels of cache on a chip

Several processors are placed in one chip, and cache memory is added to the same chip.
+ 60

Many Integrated Core (MIC)
Graphics Processing Unit (GPU)

MIC
- Leap (fast growth) in performance as well as the challenges in developing software to exploit such a large number of cores
- The multicore and MIC strategy involves a homogeneous (same kind) collection of general purpose processors on a single chip

GPU
- Core designed to perform parallel operations on graphics data
- Traditionally found on a plug-in graphics card, it is used to encode and render 2D and 3D graphics as well as process video
- Used as vector processors for a variety of applications that require repetitive computations

The GPU takes over from the CPU the work of presenting data on the screen → better hardware performance.
Read by Yourself 61

2.4- The Evolution of The Intel x86 Architecture


2.5- Embedded Systems and the ARM

Some definitions:
CISC: Complex Instruction Set Computer; the CPU is equipped with a large set of instructions.
RISC: Reduced Instruction Set Computer; the CPU is equipped with basic instructions only, based on the idea that a high-level instruction can be built from a few basic instructions.
ARM: Advanced RISC Machine
+ 62

2.6- Performance
Assessment
Factors affecting computer performance:
- Clock speed and instructions per second
- Instruction execution rate
Methods: benchmarks
Some laws (read by yourself):
- Amdahl’s Law
- Little’s Law
+ 63

System Clock
- Digital devices need pulses to operate. Pulses are created by a
clock generator (a hardware using crystal oscillator) – Read note
for more details.
- The rate of pulses is known as the clock rate, or clock speed.
- The time between pulses is the cycle time.
- One increment, or pulse, of the clock is referred to as a clock
cycle, or a clock tick.
- Unit: cycles per second, Hertz (Hz)
- Operations performed by a processor, such as fetching an instruction, decoding the instruction, performing an arithmetic operation, and so on, are governed by a system clock.
- Higher clock rate → higher performance.
+ 64

Instruction Execution Rate

- Floating-point processing is slower (needs more clock cycles) than integer processing.
- Unit: MIPS (millions of instructions per second)
- Unit: MFLOPS (floating-point performance is expressed as millions of floating-point operations per second)
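The relationships between clock rate, cycle time, and MIPS can be sketched numerically. This is a minimal illustration under stated assumptions: the cycle time is the reciprocal of the clock rate, and the MIPS rate is taken as clock rate / (CPI × 10^6), where CPI (average cycles per instruction) is a quantity the slides do not name. The 400 MHz clock and CPI of 2 are hypothetical values.

```python
def cycle_time_seconds(clock_rate_hz: float) -> float:
    """Cycle time is the reciprocal of the clock rate."""
    return 1.0 / clock_rate_hz

def mips_rate(clock_rate_hz: float, cycles_per_instruction: float) -> float:
    """Millions of instructions per second for a given average CPI (assumed known)."""
    return clock_rate_hz / (cycles_per_instruction * 1e6)

clock = 400e6  # hypothetical 400 MHz processor
print(cycle_time_seconds(clock))   # seconds per clock cycle
print(mips_rate(clock, 2.0))       # MIPS at an assumed CPI of 2
```

A workload that needs more cycles per instruction, such as floating-point code, lowers the MIPS rate at the same clock rate, which is why clock rate alone does not determine performance.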
+ 65

Benchmark
Software that measures the speed and performance of a system
- A test used to measure hardware or software
performance.
- Benchmarks for hardware use programs that test the
capabilities of the equipment
- Benchmarks for software determine the efficiency,
accuracy, or speed of a program in performing a
particular task, such as recalculating data in a
spreadsheet.
- The same data is used with each program tested, so the
resulting scores can be compared to see which programs
perform well and in what areas.
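The idea of running each candidate on the same data and comparing the scores can be sketched with Python’s standard `timeit` module. This is only an illustration, not a real benchmark suite: the two summing functions and the helper `run_benchmark` are hypothetical names invented for the example, and the timings depend on the machine.

```python
import timeit

def sum_with_loop(values):
    total = 0
    for value in values:
        total += value
    return total

def sum_builtin(values):
    return sum(values)

def run_benchmark(candidates, data, repeats=5, number=20):
    """Time each candidate on the same data; return best seconds per call."""
    results = {}
    for name, func in candidates.items():
        timings = timeit.repeat(lambda: func(data), repeat=repeats, number=number)
        results[name] = min(timings) / number
    return results

data = list(range(10_000))  # identical input for every candidate
scores = run_benchmark({"loop": sum_with_loop, "builtin": sum_builtin}, data)
for name, seconds in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds * 1e6:.1f} microseconds per call")
```

Taking the minimum over several repeats is a common way to reduce noise from other activity on the machine.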
Benchmarks … 66

For example, consider this high-level language statement:

A = B + C /* assume all quantities in main memory */

With a traditional instruction set architecture, referred to as a complex instruction set computer (CISC), this instruction can be compiled into one processor instruction (assuming a complex CPU that allows memory operands to be added directly):

    add mem(B), mem(C), mem (A)

On a typical RISC machine, the compilation would look something like this (assuming a reduced CPU that does NOT allow memory operands to be added directly and must go through registers):

    load mem(B), reg(1);
    load mem(C), reg(2);
    add reg(1), reg(2), reg(3);
    store reg(3), mem (A)

The 2 code sequences may need the same amount of time when they execute on the 2 machines.
+ 67

Benchmark
- The design of fair benchmarks is something of an art,
because various combinations of hardware and software
can exhibit widely variable performance under different
conditions. Often, after a benchmark has become a
standard, developers try to optimize a product to run that
benchmark faster than similar products run it in order to
enhance sales (MS Computer Dictionary)
 Beginning in the late 1980s and early 1990s, industry
and academic interest shifted to measuring the
performance of systems using a set of benchmark
programs
+ 68

Desirable Benchmark Characteristics

1. It is written in a high-level language, making


it portable across different machines.
2. It is representative of a particular kind of
programming style, such as system
programming, numerical programming, or
commercial programming.
3. It can be measured easily.
4. It has wide distribution.
+ 69

Standard Performance Evaluation Corporation (SPEC)
- Benchmark suite
  - A collection of programs, defined in a high-level language
  - Attempts to provide a representative test of a computer in a particular application or system programming area
- SPEC
  - An industry consortium
  - Defines and maintains the best known collection of benchmark suites
  - Performance measurements are widely used for
+ SPEC CPU2006
- Best known SPEC benchmark suite
- Industry standard suite for processor intensive applications
- Appropriate for measuring performance for applications that spend most of their time doing computation rather than I/O
- Consists of 17 floating point programs written in C, C++, and Fortran and 12 integer programs written in C and C++
- Suite contains over 3 million lines of code

+ Amdahl’s Law (Read by yourself)
- Gene Amdahl [AMDA67]
- Deals with the potential speedup of a program using multiple processors compared to a single processor
- Illustrates the problems facing industry in the development of multi-core machines
  - Software must be adapted to a highly parallel execution environment to exploit the power of parallel processing
- Can be generalized to evaluate and design technical
+ Amdahl’s Law (Read by 72

yourself)

f:
CPU frequency
+ 73

Little’s Law (Read by yourself)

- It was introduced by John Little.
- The general setup is that we have a steady state system to which items arrive at an average rate of λ items per unit time. The items stay in the system an average of W units of time. Finally, there is an average of L units in the system at any one time. Little’s Law relates these three variables as L = λ W.
- Fundamental and simple relation with broad applications
- Can be applied to almost any system that is statistically in steady state, and in which there is no leakage
- λ: average throughput (number of tasks completed per unit of time)
- L: average work in progress (average number of jobs in the system)
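The relation L = λ W can be checked with a one-line calculation. The Python sketch below is illustrative: the function name and the example server (40 requests per second, 0.25 seconds in the system) are hypothetical.

```python
def items_in_system(arrival_rate: float, time_in_system: float) -> float:
    """Little's Law: L = lambda * W."""
    return arrival_rate * time_in_system

# Hypothetical server: 40 requests arrive per second, each spends 0.25 s in the system.
print(items_in_system(40, 0.25))  # average number of requests in the system
```

The same relation can be rearranged to estimate W from measured L and λ, which is often easier than timing individual items.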
+ 74

Little’s Law (Read by yourself)
- Queuing system
  - If the server is idle, an item is served immediately; otherwise an arriving item joins a queue.
  - There can be a single queue for a single server or for multiple servers, or multiple queues with one being for each of multiple servers.
- Average number of items in a queuing system equals the average rate at which items arrive multiplied by the time that an item spends in the system
  - Relationship requires very few assumptions
  - Because of its simplicity and generality it is extremely useful
+ Summary 75

Computer Evolution and Performance
Chapter 2
- First generation computers
  - Vacuum tubes
- Second generation computers
  - Transistors
- Third generation computers
  - Integrated circuits
- Performance designs
  - Microprocessor speed
  - Performance balance
  - Chip organization and architecture
- Multi-core
- MICs
- GPGPUs
- Performance assessment
  - Clock speed and instructions per second
  - Benchmarks
  - Amdahl’s Law
  - Little’s Law
