
Lecture 7: Components of a computer & the CPU
Part 2 – some more CPU
Foundations of Computing: Technical Strand
João Filipe Ferreira
Clocks [1]

• Computers need a method of synchronisation – i.e. the clock – to function correctly
• A clock emits a series of pulses of fixed length and delay – it serves to synchronise the components of the system
• The clock cycle is the interval between two consecutive pulses
• This frequency is measured in Hz (hertz) and is commonly quoted in MHz (megahertz) or GHz (gigahertz)
• For example:
  • My work PC runs at 3.4 GHz
  • This means that 3,400,000,000 pulses per second are sent simultaneously to all components in the system
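As a quick sanity check, the relationship between clock frequency and clock-cycle length can be sketched in Python (the function name is ours, purely for illustration):

```python
# Clock period is the reciprocal of clock frequency.
def clock_period_ns(freq_hz: float) -> float:
    """Return the clock-cycle length in nanoseconds for a frequency in Hz."""
    return 1e9 / freq_hz

# The 3.4 GHz example from the slide:
print(clock_period_ns(3.4e9))  # ~0.294 ns per cycle
```

At 3.4 GHz, each clock cycle lasts under a third of a nanosecond.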
Registers
• Registers are small, fast memory locations on board the processor
• They are typically implemented using flip-flops [1]
• They are designed to be small scratch pads for programs
• Their management is under the direct control of the ISA
The Microarchitecture Level
• The microarchitecture level defines what structure a CPU has and therefore what instructions it is capable of executing
• The level of complexity varies greatly:
  • Accumulator-based CPU (single operand)
  • Stack-based CPU (single operand, stack control)
  • Register-based CPU (two+ operands, register management)
  • Other designs: VLIW (Very Long Instruction Word), SIMD (Single Instruction, Multiple Data), MIMD (Multiple Instruction, Multiple Data), … the list is quite extensive
Data Path
• The ALU is just one element of the CPU data path; we can largely divide data-path elements into two types:
• Combinatorial elements
  • Output follows input instantly (in an ideal world!)
  • Combinations of arithmetic / logic operations
  • Examples: ALU, sign extender, number format translator, adder, multiplexer (MUX)
• State elements
  • Sequential circuits (memory-based components that synchronise using the clock)
  • Outputs / inputs change only on the clock edge
  • Examples: registers, memory, program counter
Fetch-Decode-Execute
A modern CPU can be described as implementing a fetch–decode–execute cycle for instructions. These steps dictate how instructions are interpreted and executed:
1. Fetch the next instruction from memory into the Instruction Register
2. Change the Program Counter to point to the next instruction (why 2nd in the order? Principle of locality – to discuss later)
3. Determine the type of instruction just fetched
4. If the instruction uses a word in memory, determine where it is and fetch it, if needed, into a CPU register
5. Execute the instruction
6. Go to step 1 to begin executing the following instruction

Video: https://www.youtube.com/watch?v=jFDMZpkUWCw
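The six steps above can be sketched as a toy interpreter. The tiny accumulator-style instruction set (LOAD/ADD/STORE/HALT) is invented purely for illustration and is not taken from any real ISA:

```python
# A toy accumulator machine illustrating the fetch-decode-execute cycle.
# Instructions are (opcode, operand) tuples; plain ints are data words.
def run(memory):
    pc = 0   # Program Counter
    acc = 0  # Accumulator
    while True:
        ir = memory[pc]      # 1. Fetch next instruction into the Instruction Register
        pc += 1              # 2. Advance the PC to point to the next instruction
        op, arg = ir         # 3. Decode: determine the type of instruction
        if op == "LOAD":     # 4./5. Fetch the memory word, then execute
            acc = memory[arg]
        elif op == "ADD":
            acc += memory[arg]
        elif op == "STORE":
            memory[arg] = acc
        elif op == "HALT":
            return memory
        # 6. Loop back to fetch the following instruction

# Program: mem[6] = mem[4] + mem[5]
mem = [("LOAD", 4), ("ADD", 5), ("STORE", 6), ("HALT", 0), 2, 40, 0]
run(mem)
print(mem[6])  # 42
```

Note that the PC is incremented immediately after the fetch (step 2), exactly as in the cycle above.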
Video 2: Making a Processor
https://www.youtube.com/watch?v=-KTKg0Y1snQ
Sidebar: How CPUs are made - Video Links
• Intel Factory Tour - http://www.youtube.com/watch?v=SeGqCl3YAaQ
• How an Intel Processor is made - http://www.youtube.com/watch?v=Cg-mvrG-K-E
• Making the Intel Core i7 - http://www.youtube.com/watch?v=tKX8bdHWgu8
• How Microchips are made - http://www.youtube.com/watch?v=F2KcZGwntgg
• View in your own time…
Modern Design Principles
• Most modern processors are based on a few common principles:
  • All instructions are directly executed by hardware
  • Maximise the rate at which instructions are issued
  • Instructions should be easy to decode
  • Only loads and stores should reference memory
  • Provide plenty of registers
• In the past we had:
  • RISC (reduced instruction set computers)
  • CISC (complex instruction set computers)
• Most ISAs are now a mixture of these two design principles
• Have a look at the Intel x86 instruction set:
  https://en.wikipedia.org/wiki/X86_instruction_listings
Computation takes time
• A single-stage computer is limited in the speed at which it can operate by the depth of its gates. We need to speed up its operation!
• Consider this circuit:
  • The longest path (called the critical path) is 7 NAND gates long
  • If each NAND gate takes 5 ns to operate, this circuit takes 35 ns to give us an output
  • This means that at best this circuit can operate at 1/35 ns ≈ 28.6 MHz!
  • This seems really slow: think of a laptop running at 2.9 GHz (roughly 100× faster!)
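The slide's arithmetic can be checked with a small helper (the function name is ours, for illustration):

```python
# Maximum clock frequency from critical-path depth: gate delays add in
# series along the longest path, and the clock period must cover them all.
def max_freq_mhz(gate_delay_ns: float, critical_path_gates: int) -> float:
    period_ns = gate_delay_ns * critical_path_gates
    return 1e3 / period_ns  # 1 / period, converted from ns to MHz

# The slide's circuit: 7 NAND gates at 5 ns each -> a 35 ns period
print(max_freq_mhz(5, 7))  # ~28.6 MHz
```

Deepening the combinatorial logic lowers the achievable clock rate, which is exactly the motivation for the pipelining techniques below.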
Video 3: Moore's Law
https://www.youtube.com/watch?v=aWLBmapcJRU
There are lots of videos on the breaking of Moore's Law…
Memory / CPU Gap
• Consider the following graph as an illustration of the problem CPU designers face

[Figure: log-scale plot of relative performance, 1980–2000. CPU performance ("Moore's Law") climbs steeply while DRAM performance grows slowly, opening a processor–memory performance gap that grows roughly 50% per year.]
Improving Performance
• In order to address the circuit-depth issue and the memory gap we must devise solutions to improve performance:
  • Memory hierarchy (e.g. caches) – directed study exercise
  • Pre-fetching
  • Locality optimisation – further reading topic
  • Pipelining
  • Superscalar designs
  • Multi-processor designs
  • Out-of-order execution – further reading topic
  • Multithreading – further reading topic (but we will look at this from an Operating Systems perspective later this term)
Caches
• Caches are small, fast (i.e. faster than main memory) blocks of memory located closer to the CPU than main memory
• They hold blocks of data that the cache believes are currently in use, have just been used, or will likely be used by the processor, plus nearby memory locations (principle of locality)
• Level 1 and 2 caches are typically on the CPU chip
• Level 3 cache can be off chip, but in newer designs it can be found on chip
• A combination of circuit design and being physically closer to the processor makes these caches faster than main memory
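A minimal sketch of a direct-mapped cache illustrates the hit/miss behaviour described above; the class, the line count, and the whole-address-as-tag scheme are simplifying assumptions for illustration, not a model of any real CPU:

```python
# A toy direct-mapped cache: each address maps to exactly one cache line.
class DirectMappedCache:
    def __init__(self, num_lines=4):
        self.num_lines = num_lines
        self.tags = [None] * num_lines  # which address each line currently holds
        self.data = [None] * num_lines

    def read(self, address, memory):
        line = address % self.num_lines     # index chosen by low-order address bits
        if self.tags[line] == address:      # hit: the word is already in the cache
            return self.data[line], "hit"
        # miss: fetch from (slower) main memory and keep a copy for next time
        self.tags[line] = address
        self.data[line] = memory[address]
        return self.data[line], "miss"

memory = list(range(100, 116))  # pretend main memory: memory[a] == 100 + a
cache = DirectMappedCache()
print(cache.read(3, memory))  # (103, 'miss') -- first access goes to memory
print(cache.read(3, memory))  # (103, 'hit')  -- temporal locality pays off
```

Re-reading the same address hits in the cache, which is precisely the principle-of-locality bet that caches make.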
Pre-fetch Buffers
• As with caches, pre-fetch buffers are designed to manage the expected data flow into the processor

Image from: University of Massachusetts Amherst © 2007
Pipelining
• In a factory, objects are operated on in sequence (stage 1, stage 2, stage 3, …), but the stages of operation happen in parallel
• In a CPU, instructions are executed in sequence (stage 1, stage 2, stage 3, …), but the stages of execution happen in parallel
Pipelining [1]
• A five-stage pipeline
• The state of each stage is shown as a function of time; nine clock cycles are illustrated
• Read the explanation in the book to understand this…
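The book's stage-by-cycle table can be reproduced with a short sketch; the five stage names (IF/ID/EX/MEM/WB) are the classic textbook labels and are assumed here rather than taken from the slide:

```python
# Which instruction occupies each pipeline stage in each clock cycle.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_schedule(num_instructions, num_cycles=9):
    rows = []
    for cycle in range(1, num_cycles + 1):
        row = {}
        for s, stage in enumerate(STAGES):
            instr = cycle - 1 - s  # instruction index in this stage, if any
            if 0 <= instr < num_instructions:
                row[stage] = instr + 1
        rows.append(row)
    return rows

for cycle, row in enumerate(pipeline_schedule(5), start=1):
    print(f"cycle {cycle}: {row}")
# By cycle 5 the pipeline is full: instruction 5 is being fetched while
# instruction 1 writes back -- five instructions in flight at once.
```

After the pipeline fills, one instruction completes every cycle even though each individual instruction still takes five cycles end to end.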
Superscalar Architectures
• Superscalar architectures act by having multiple physical hardware units
  • The CPU has multiple copies of each pipeline stage
  • Each pipeline is independent of the other pipelines
  • The CPU executes multiple instructions at once
• Effectively a precursor to multithreading (each running execution thread takes turns at being executed by the CPU) [1]
Superscalar Architectures
• Having completely separate pipelines is a very expensive way to add efficiency to the system; however, it is simple
• Alternatively, we can add multiple functional units, as the decode and fetch/store stages are likely to be simpler
• This adds complexity, but it is more hardware-efficient than the raw pipeline-multiplication version [1]
Multi-processor Systems
• Using multiple processors is an obvious way to make a system more effective without redesigning the processor
• A multicore system can be thought of as a natural evolution from superscalar and multithreading systems:
  • Multiple cores are capable of running multiple physical threads at once (like multithreading systems)
  • Multiple cores have multiple functional units / pipelines (like superscalar systems)
• Example: Intel i7/i5/i3 or AMD [1]
Many-core Processors
• Multi-core systems have found their natural evolution in array and stream processor systems
• Graphics cards (GPUs) are typical examples of this kind of processor [1]

An array processor of the ILLIAC IV type.

Video 4: Multicore and Hyperthreading
https://www.youtube.com/watch?v=VcoVYfDVEww
We'll cover this in greater depth in a couple of weeks…
Summary
In these lectures we have seen the following:
• Processors come in many different varieties.
• The design of the processor influences the way in which it can be utilised and the facilities it will offer.
• A simple CPU has a very limited operational speed.
• Simple improvements such as doubling hardware are expensive.
• More complex improvements such as pipelining allow for greater efficiency at greater complexity.
Directed Study
1. Read the chapter on the microarchitecture level – 6th Edition, Structured Computer Organization.
2. Read http://www.eecs.berkeley.edu/~knight/cs267/papers/cache_memories.pdf
Investigate:
3. RISC and CISC architectures.
4. Moore's Law (visit http://www.intel.co.uk/content/www/uk/en/history/museum-gordon-moore-law.html)
5. Locality optimisation.
6. Out-of-order execution.
7. Multithreading.
8. The ARM 3 CPU.
9. View the video links.
References
[1] "Structured Computer Organization", Andrew Tanenbaum, 2008.
Real CPUs: MIPS CPU
• If we consider a simple generic processor architecture such as MIPS, we have a simple CPU architecture diagram.
• Other ISAs will have different implementations, as we will see on the next slides (for the ARM and Pentium); each is tailored to the specific CPU design.
Real CPUs: The Pentium 3

Real CPUs: UltraSPARC III Cu Pipeline
A simplified representation of the UltraSPARC III Cu pipeline.
Real CPUs: The Microarchitecture of the 8051 CPU
The microarchitecture of the 8051.
