02_Computer Evolution and Performance

Chapter 2 of 'Computer Organization and Architecture' discusses the evolution of computers, starting with the ENIAC and the introduction of the stored program concept by von Neumann. It outlines the transition from vacuum tubes to transistors, detailing their advantages and the development of microelectronics. The chapter also covers various generations of computers, advancements in memory technology, and the implications of Moore's Law on computing performance.


William Stallings

Computer Organization
and Architecture
7th Edition

Chapter 2
Computer Evolution and
Performance
ENIAC - background
• Electronic Numerical Integrator And
Computer
• Eckert and Mauchly
• University of Pennsylvania
• Trajectory tables for weapons
• Started 1943
• Finished 1946
—Too late for war effort
• Used until 1955
ENIAC - details
• Decimal (not binary)
• 20 accumulators of 10 digits
• Programmed manually by switches
• 18,000 vacuum tubes
• 30 tons
• 15,000 square feet
• 140 kW power consumption
• 5,000 additions per second
The Turing Award is known as the "Nobel Prize of computing"
von Neumann/Turing
• Stored Program concept
• Main memory storing programs and data
• ALU operating on binary data
• Control unit interpreting instructions from
memory and executing
• Input and output equipment operated by
control unit
• Institute for Advanced Study (IAS),
Princeton, NJ, USA
• IAS computer completed 1952
Structure of von Neumann machine
IAS - details
• 1000 x 40 bit words
—Binary numbers; one word holds two instructions
—2 x 20 bit instructions
• Set of registers (storage in CPU)
—Memory Buffer Register
—Memory Address Register
—Instruction Register
—Instruction Buffer Register
—Program Counter
—Accumulator
—Multiplier Quotient
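The two-instructions-per-word layout can be made concrete: in the IAS format each 20-bit instruction is an 8-bit opcode plus a 12-bit address. A small Python sketch (the opcode values below are made up for illustration):

```python
def unpack_ias_word(word):
    """Split a 40-bit IAS word into two 20-bit instructions,
    each an 8-bit opcode plus a 12-bit address."""
    left = (word >> 20) & 0xFFFFF    # left instruction: high-order 20 bits
    right = word & 0xFFFFF           # right instruction: low-order 20 bits

    def decode(instr):
        opcode = (instr >> 12) & 0xFF   # top 8 of the 20 bits
        address = instr & 0xFFF         # bottom 12 bits
        return opcode, address

    return decode(left), decode(right)

# Pack opcode 0x01 with address 0x123, then opcode 0x05 with address 0x456:
word = (0x01 << 32) | (0x123 << 20) | (0x05 << 12) | 0x456
print(unpack_ias_word(word))   # ((1, 291), (5, 1110))
```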
Structure of IAS –
detail

• Memory Buffer Register (MBR)
• Memory Address Register (MAR)
• Instruction Register (IR)
• Instruction Buffer Register (IBR)
• Program Counter (PC)
• Accumulator (AC)
• Multiplier Quotient (MQ)
• MBR: the register in a computer's processor, or central
processing unit, CPU, that stores the data being transferred to and
from the immediate access store. It acts as a buffer allowing the
processor and memory units to act independently without being
affected by minor differences in operation.
• MAR: register that either stores the memory address from which
data will be fetched to the CPU or the address to which data will
be sent and stored.
• IR: the part of a CPU's control unit that stores the instruction
currently being executed or decoded.
• IBR (Instruction Buffer Register): temporarily holds the right-hand
instruction of a fetched word until it is needed
• PC: commonly called the instruction pointer (IP), is a processor
register that indicates where a computer is in its program
sequence.  holds the memory address of (“points to”) the next
instruction that would be executed.
• AC: a register in which intermediate arithmetic and logic results
are stored. Without a register like an accumulator, it would be
necessary to write the result of each calculation (addition,
multiplication, shift, etc.) to main memory.
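To see how these registers cooperate, here is a toy Python sketch of the fetch step; the class name and memory contents are invented for illustration:

```python
class MiniCPU:
    """Toy model of the instruction fetch step; register names follow
    the text above, memory is modeled as a list of words."""
    def __init__(self, memory):
        self.memory = memory   # main memory
        self.PC = 0            # address of the next instruction
        self.MAR = 0           # address sent to memory
        self.MBR = 0           # data arriving from / going to memory
        self.IR = 0            # instruction currently being decoded

    def fetch(self):
        self.MAR = self.PC                 # PC names the word to read
        self.MBR = self.memory[self.MAR]   # memory read lands in the buffer
        self.IR = self.MBR                 # instruction moves on for decoding
        self.PC += 1                       # point at the next instruction
        return self.IR

cpu = MiniCPU(memory=[0xA1, 0xB2, 0xC3])
print(cpu.fetch(), cpu.PC)   # 161 1
```

The MBR's buffering role is visible here: the memory read targets the MBR, and the processor then moves the value on at its own pace.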
Videos
1. CPU Update - Buses, Registers, and RAM
http://www.youtube.com/watch?v=TcAUHp9jjf8
2. CPU Registers
http://www.youtube.com/watch?v=RbZDezRyFQc
Commercial Computers
• 1947 - Eckert-Mauchly Computer
Corporation
• UNIVAC I (Universal Automatic Computer)
• US Bureau of the Census, 1950
calculations
• Became part of Sperry-Rand Corporation
• Late 1950s - UNIVAC II
—Faster
—More memory
UNIVAC
IBM
• Punched-card processing equipment
• 1953 - the 701
—IBM’s first stored program computer
—Scientific calculations
• 1955 - the 702
—Business applications
• Led to 700/7000 series
Vacuum tube
• In electronics, a vacuum tube, electron tube (in North America), thermionic
valve, or just valve (elsewhere, especially in Britain) is a device used to amplify,
switch, otherwise modify, or create an electrical signal by controlling the
movement of electrons in a low-pressure space. Some special function vacuum
tubes are filled with low-pressure gas: these are so-called soft valves (or tubes), as
distinct from the hard vacuum type which have the internal gas pressure reduced as
far as possible. Almost all depend on the thermal emission of electrons, hence
thermionic.
• For most purposes, the vacuum tube has been replaced by solid-state devices such
as transistors and solid-state diodes. Solid-state devices last much longer, are
smaller, more efficient, more reliable, and cheaper than equivalent vacuum tube
devices. However, tubes are still used in specialized applications: for engineering
reasons, as in high-power radio frequency transmitters; or for their aesthetic appeal,
as in audio amplification. Cathode ray tubes are still used as display devices in
television sets, video monitors, and oscilloscopes, although they are being replaced
by LCDs and other flat-panel displays. A specialized form of the electron tube, the
magnetron, is the source of microwave energy in microwave ovens and some radar
systems
Vacuum tube (真空管): Wikipedia
• An electronic component that controls the direction of electron flow in a circuit and amplifies signals.
• Because of its high cost, poor durability, large size, and low efficiency, the vacuum tube was eventually replaced by the transistor.
• Vacuum tubes can still be found, however, in audio equipment, microwave ovens, and the high-frequency transmitters of satellites. Some fighter aircraft also use vacuum-tube electronics to protect against the electromagnetic pulse of a nuclear blast. The cathode ray tubes in television sets and computer CRT monitors, and the X-ray tubes in X-ray machines, are special kinds of vacuum tube.
• Transistor sound tends to be thin and hard; vacuum-tube sound is fuller and livelier.
Vacuum tube
Transistor

• From Wikipedia, the free encyclopedia

• In electronics, a transistor is a semiconductor device
commonly used to amplify or switch electronic signals. A
transistor is made of a solid piece of a semiconductor
material, with at least three terminals for connection to an
external circuit. A voltage or current applied to one pair of
the transistor's terminals changes the current flowing
through another pair of terminals. Because the controlled
(output) power can be much larger than the controlling
(input) power, the transistor provides amplification of a
signal. The transistor is the fundamental building block of
modern electronic devices, and is used in radio, telephone,
computer and other electronic systems. Some transistors
are packaged individually but most are found in integrated
circuits
Transistor (電晶體)

• From Wikipedia, the free encyclopedia

• A solid-state semiconductor device used for amplification, switching, voltage regulation, signal modulation, and many other functions.
• As a variable switch, a transistor controls the output current based on the input voltage, so it can act as a current switch, much like a mechanical switch (e.g. a relay).
• In analog circuits, transistors are used in amplifiers (audio amplifiers, RF amplifiers) and voltage-regulator circuits; in computer power supplies they are mainly used in switching regulators.
• Transistors are also used in digital circuits, mainly as electronic switches. Digital circuits include logic gates, random access memory (RAM), and microprocessors.
Transistor
Transistors
• Replaced vacuum tubes
• Smaller
• Cheaper
• Less heat dissipation
• Solid State device
• Made from Silicon (Sand)
• Invented 1947 at Bell Labs
• William Shockley et al.
Transistor Based Computers
• Second generation machines
• NCR & RCA produced small transistor
machines
• IBM 7000
• DEC - 1957
—Produced PDP-1
Videos

1. How It's Made - vacuum tubes
http://www.youtube.com/watch?v=8n4WVRKkmww
2. How Transistors Work
http://www.youtube.com/watch?v=ZaBLiciesOU
Microelectronics
• Literally - “small electronics”
• A computer is made up of gates, memory
cells and interconnections
• These can be manufactured on a
semiconductor
• e.g. silicon wafer
Wafer (electronics) (晶圓)

• A wafer is a thin slice of semiconductor material,
such as a silicon crystal, used in the fabrication
of integrated circuits and other microdevices. The
wafer serves as the substrate for microelectronic
devices built in and over the wafer and
undergoes many microfabrication process steps
such as doping or ion implantation, etching,
deposition of various materials, and
photolithographic patterning.
• Several types of solar cells are made from such
wafers. A solar wafer is a circular solar cell
made from the entire wafer (rather than cutting
into smaller rectangular solar cells).
Relationship Among Wafer, Chip, and Gate

Silicon Wafer Processing Animation
http://www.youtube.com/watch?v=LWfCqpJzJYM
Generations of Computer
• Vacuum tube - 1946-1957
• Transistor - 1958-1964
• Small scale integration - 1965 on
—Up to 100 devices on a chip
• Medium scale integration - to 1971
—100-3,000 devices on a chip
• Large scale integration - 1971-1977
—3,000 - 100,000 devices on a chip
• Very large scale integration - 1978 -1991
—100,000 - 100,000,000 devices on a chip
• Ultra large scale integration – 1991 -
—Over 100,000,000 devices on a chip
Moore’s Law
• Increased density of components on chip
• Gordon Moore – co-founder of Intel
• Number of transistors on a chip will double every
year
• Since 1970’s development has slowed a little
— Number of transistors doubles every 18 months
• Cost of a chip has remained almost unchanged
• Higher packing density means shorter electrical
paths, giving higher performance
• Smaller size gives increased flexibility
• Reduced power and cooling requirements
• Fewer interconnections increase reliability
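The doubling rule above is easy to express numerically. A small sketch, with illustrative numbers only:

```python
def projected_transistors(start_count, years, doubling_months=18):
    """Project a transistor count under an assumed fixed doubling period
    (18 months, the revised rate quoted above). Purely illustrative."""
    doublings = years * 12 / doubling_months
    return start_count * 2 ** doublings

# Two doublings in three years quadruples the count:
print(projected_transistors(2300, 3))   # 9200.0
```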
Growth in CPU Transistor Count
IBM 360 series

• 1964: replaced (and not compatible with) the
7000 series
• First planned “family” of computers
—Similar or identical instruction sets
—Similar or identical O/S
—Increasing speed
—Increasing number of I/O ports
(i.e. more terminals)
—Increased memory size
—Increased cost
• Multiplexed switch structure
DEC PDP-8
• 1964
• First minicomputer (after miniskirt!)
• Did not need air conditioned room
• Small enough to sit on a lab bench
• $16,000
—$100k+ for IBM 360
• Embedded applications & OEM
• BUS STRUCTURE
DEC - PDP-8 Bus Structure

PCI Express bus card slots (from top to bottom: x4, x16, x1 and x16),
compared to a traditional 32-bit PCI bus card slot (bottom).
Drum memory (磁鼓存儲器)
• Drum memory is a magnetic data storage device and
was an early form of computer memory widely used in
the 1950s and into the 1960s, invented by Gustav
Tauschek in 1932 in Austria. For many machines, a
drum formed the main working memory of the
machine, with data and programs being loaded on to
or off the drum using media such as paper tape or
punch cards. Drums were so commonly used for the
main working memory that these computers were
often referred to as drum machines. Drums were
later replaced as the main working memory by
memory such as core memory and a variety of other
systems which were faster as they had no moving
parts, and which lasted until semiconductor memory
entered the scene.
• A drum is a large metal cylinder that is coated on the
outside surface with a ferromagnetic recording
material. It is, simply put, a hard disk platter in the
form of a drum rather than a flat disk. A row of read-
write heads runs along the long axis of the drum, one
for each track.
Magnetic core memory
• Magnetic core memory, or ferrite-core
memory, is an early form of random access
computer memory. It uses small magnetic
ceramic rings, the cores, through which wires are
threaded to store information via the polarity of
the magnetic field they contain. Such memory is
often just called core memory, or, informally,
core.
• Although computer memory long ago moved to
silicon chips, memory is still occasionally called
"core". This is most obvious in the naming of the
core dump, which refers to the contents of
memory recorded at the time of a program error.
Magnetic core memory (磁芯記憶體)
• Magnetic core memory is an early form of computer memory built from magnetic material. Its principle: a magnetic ring (core) is magnetized in one direction or the other to represent a 1 or 0 bit, and a long string of 1s and 0s represents the stored information.
• Core memory is a random access memory and can serve as a computer's main memory.
• Core memory is non-volatile: even after a crash or power failure, as long as no erroneous write signal occurs, it retains its contents.
Semiconductor Memory
• 1970
• Fairchild
• Size of a single core
—i.e. 1 bit of magnetic core storage
• Holds 256 bits
• Non-destructive read
• Much faster than core
• Capacity approximately doubles each year
Fairchild
• Present day Fairchild Semiconductor
International, Inc. is a spin-off
company resulting from reconstitution of
assets in National Semiconductor. It
inherits the Fairchild name of the original
Fairchild Camera and Instrument, which
had been the cornerstone of the
semiconductor industry since 1957. The
original Fairchild had been acquired by
Schlumberger which then sold it to
National Semiconductor.
flip-flop (正反器)
• In digital circuits, a flip-flop is a term referring
to an electronic circuit (a bistable multivibrator)
that has two stable states and thereby is capable
of serving as one bit of memory. Today, the term
flip-flop usually refers to clocked or edge-
triggered devices (i.e., devices that are a
conceptual combination of a transparent-high
latch with a transparent-low latch).
• A flip-flop is usually controlled by one or two
control signals and/or a gate or clock signal. The
output often includes the complement as well as
the normal output. As flip-flops are implemented
electronically, they require power and ground
connections.
flip-flop (正反器)
• A flip-flop (FF; called 触发器 in mainland China and 正反器 in Taiwan), formally a bistable multivibrator, is a sequential logic element with memory used in digital circuits; it records the binary digits "1" and "0".
• Flip-flops are the basic logic units from which sequential logic circuits and all kinds of complex digital systems are built.
Set-Reset flip-flops (SR flip-flops).

• The fundamental latch is the simple SR flip-flop,
where S and R stand for set and reset
respectively. It can be constructed from a pair of
cross-coupled NOR logic gates. The stored bit is
present on the output marked Q.
• Normally, in storage mode, the S and R inputs
are both low, and feedback maintains the Q and
Q̄ outputs in a constant state, with Q̄ the
complement of Q. If S (Set) is pulsed high while
R is held low, then the Q output is forced high,
and stays high even after S returns low;
similarly, if R (Reset) is pulsed high while S is
held low, then the Q output is forced low, and
stays low even after R returns low.
Set-Reset flip-flops (SR flip-flops).
• RS flip-flop
• The basic RS flip-flop, also called the SR latch, is the simplest kind of flip-flop and the basic building block of all other flip-flop types. Cross-coupling the inputs and outputs of two NAND or two NOR gates produces a basic RS flip-flop.
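The cross-coupled NOR behaviour described above can be simulated directly. A minimal gate-level Python sketch (no timing modeled; the function name is made up):

```python
def settle_sr_latch(S, R, Q, Qbar):
    """Iterate two cross-coupled NOR gates until the outputs stop
    changing, returning the settled (Q, Q-bar) pair."""
    while True:
        new_Q = int(not (R or Qbar))     # NOR gate driven by R and Q-bar
        new_Qbar = int(not (S or Q))     # NOR gate driven by S and Q
        if (new_Q, new_Qbar) == (Q, Qbar):
            return Q, Qbar
        Q, Qbar = new_Q, new_Qbar

Q, Qbar = settle_sr_latch(S=1, R=0, Q=0, Qbar=1)     # pulse S high: set
Q, Qbar = settle_sr_latch(S=0, R=0, Q=Q, Qbar=Qbar)  # S back low: Q holds
print(Q, Qbar)   # 1 0
```

The second call shows the storage behaviour from the text: with S and R both low, feedback keeps Q high even after S returns low.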
Shift register
• In digital circuits, a shift register is a group of
flip flops set up in a linear fashion which have
their inputs and outputs connected together in
such a way that the data is shifted down the line
when the circuit is activated.

Shift registers can have both serial and parallel inputs and
outputs, including serial-in, parallel-out (SIPO)
and parallel-in, serial-out (PISO) types. There
are also types that have both serial and parallel
input and types with serial and parallel output.
There are also bi-directional shift registers
which allow you to vary the direction of the shift
register. The serial input and outputs of a
register can also be connected together to create
a circular shift register. One could also create
multi-dimensional shift registers, which can
perform more complex computation.
Shift register (移位暫存器)
• In digital circuits, a shift register is a device based on flip-flops working under a common clock pulse. Data enters the device serially or in parallel, and on each clock pulse it shifts one bit to the left or right before being presented at the output.
SIPO shift register
Destructive readout
• These are the simplest kind of shift register. The data string is
presented at 'Data In', and is shifted right one stage each time
'Data Advance' is brought high. At each advance, the bit on the
far left (i.e. 'Data In') is shifted into the first flip-flop's output.
The bit on the far right (i.e. 'Data Out') is shifted out and lost.
• The data are stored after each flip-flop on the 'Q' output, so there are
four storage 'slots' available in this arrangement, hence it is a 4-Bit
Register. To give an idea of the shifting pattern, imagine that the register
holds 0000 (so all storage slots are empty). As 'Data In' presents
1,1,0,1,0,0,0,0 (in that order, with a pulse at 'Data Advance' each time;
this is called clocking or strobing), the register steps through the states
0000, 1000, 1100, 0110, 1011, 0101, 0010, 0001, 0000. The left-hand bit
corresponds to the left-most flip-flop's output pin, and so on.
• So the serial output of the entire register is 11010000. As you can
see, if we were to continue to input data, we would get
exactly what was put in, but offset by four 'Data Advance' cycles.
This arrangement is the hardware equivalent of a queue. Also, at
any time, the whole register can be set to zero by bringing the
reset (R) pins high.
• This arrangement performs destructive readout: each datum is
lost once it has been shifted out of the right-most bit.
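The 4-bit destructive-readout register described above can be modeled in a few lines of Python (a list-based sketch; the class and method names are made up):

```python
class ShiftRegister4:
    """List-based model of a 4-bit serial-in shift register with
    destructive readout."""
    def __init__(self):
        self.slots = [0, 0, 0, 0]   # left-most slot is the input end

    def advance(self, data_in):
        data_out = self.slots.pop()      # right-most bit shifted out and lost
        self.slots.insert(0, data_in)    # new bit enters at the left
        return data_out

reg = ShiftRegister4()
for bit in [1, 1, 0, 1]:
    reg.advance(bit)
print(reg.slots)   # [1, 0, 1, 1]
```

Clocking in four more zeros shifts 1, 1, 0, 1 out of 'Data Out', reproducing the input four cycles late, exactly the queue behaviour the text describes.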
• The animation at the link below shows the write/shift
sequence, including the internal state of the shift register:
http://en.wikipedia.org/wiki/Shift_register#Non-destructive_readout
Non-destructive readout
• Non-destructive readout can be achieved using
the configuration shown (in image link provided)
below. Another input line is added - the
Read/Write Control. When this is high (i.e. write)
then the shift register behaves as normal,
advancing the input data one place for every
clock cycle, and data can be lost from the end of
the register. However, when the R/W control is
set low (i.e. read), any data shifted out of the
register at the right becomes the next input at
the left, and is kept in the system. Therefore, as
long as the R/W control is set low, no data can
be lost from the system.
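The Read/Write control can be added to the same kind of list-based model: in read mode the output bit recirculates to the input, so nothing is lost. A sketch (names made up):

```python
class RecirculatingShiftRegister:
    """4-bit shift register with a Read/Write control. In write mode the
    right-most bit is lost; in read mode it recirculates to the input."""
    def __init__(self):
        self.slots = [0, 0, 0, 0]

    def clock(self, data_in=0, write=True):
        data_out = self.slots.pop()
        self.slots.insert(0, data_in if write else data_out)  # recirculate on read
        return data_out

reg = RecirculatingShiftRegister()
for bit in [1, 1, 0, 1]:
    reg.clock(bit, write=True)                     # load the register
first = [reg.clock(write=False) for _ in range(4)]
second = [reg.clock(write=False) for _ in range(4)]
print(first, second)   # [1, 1, 0, 1] [1, 1, 0, 1]
```

Reading the four bits twice returns the same values both times: with R/W held low, the data stay in the system.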
Intel
• 1971 - 4004
—First microprocessor
—All CPU components on a single chip
—4 bit
• Followed in 1972 by 8008
—8 bit
—Both designed for specific applications
• 1974 - 8080
—Intel’s first general purpose microprocessor
Speeding it up
• Pipelining (see following page)
• On board cache
• On board L1 & L2 cache
• Branch prediction
• Data flow analysis
• Speculative execution
—Instructions are scheduled when ready, independently of the original
program order
—Using branch prediction and data flow analysis, some processors
speculatively execute instructions ahead of their actual appearance
in the program execution, holding the results in temporary locations
Performance Balance
• Processor speed increased
• Memory capacity increased
• Memory speed lags behind processor
speed
Logic and Memory Performance Gap
Solutions
• Increase number of bits retrieved at one
time
—Make DRAM “wider” rather than “deeper”
(more bits)
• Change DRAM interface
—Cache
• Reduce frequency of memory access
—More complex cache and cache on chip
• Increase interconnection bandwidth
—High speed buses
—Hierarchy of buses
I/O Devices
• Peripherals with intensive I/O demands
• Large data throughput demands
• Processors can handle this
• Problem moving data
• Solutions:
—Caching
—Buffering
—Higher-speed interconnection buses
—More elaborate bus structures
—Multiple-processor configurations
Typical I/O Device Data Rates
Key is Balance
• Processor components
• Main memory
• I/O devices
• Interconnection structures
Improvements in Chip Organization and
Architecture
• Increase hardware speed of processor
—Fundamentally due to shrinking logic gate size
– More gates, packed more tightly, increasing clock
rate
– Propagation time for signals reduced
• Increase size and speed of caches
—Dedicating part of processor chip
– Cache access times drop significantly
• Change processor organization and
architecture
—Increase effective speed of execution
—Parallelism
Problems with Clock Speed and Logic
Density
• Power
— Power density increases with density of logic and clock
speed
— Dissipating heat
(Capacitance is the ability of a body to hold an electrical charge.)
• RC delay (resistor-capacitor (RC) circuit, also called RC filter or RC network)
— Speed at which electrons flow is limited by the resistance and
capacitance of the metal wires connecting them
— Delay increases as the RC product increases
— Wire interconnects become thinner, increasing resistance
— Wires are closer together, increasing capacitance
• Memory latency
— Memory speeds lag processor speeds
• Solution:
— More emphasis on organizational and architectural
approaches
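The RC-delay point above is just the time constant of the wire. A tiny sketch with illustrative (not process-accurate) numbers:

```python
def rc_time_constant(resistance_ohms, capacitance_farads):
    """Wire delay scales with the RC product (the time constant, seconds)."""
    return resistance_ohms * capacitance_farads

# Thinner wires double R; closer wires double C; the delay quadruples.
base = rc_time_constant(100, 1e-13)     # 100 ohms, 0.1 pF: roughly 10 ps
scaled = rc_time_constant(200, 2e-13)   # shrunk interconnect: roughly 40 ps
print(base, scaled)
```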
Intel Microprocessor Performance
Increased Cache Capacity
• Typically two or three levels of cache
between processor and main memory
• Chip density increased
—More cache memory on chip
– Faster cache access
• Pentium chip devoted about 10% of chip
area to cache
• Pentium 4 devotes about 50%
More Complex Execution Logic
• Enable parallel execution of instructions
• Pipeline works like assembly line
—Different stages of execution of different
instructions at same time along pipeline
• Superscalar allows multiple pipelines
within single processor
—Instructions that do not depend on one
another can be executed in parallel

Pentium 4 instruction pipeline scheduling


Instruction pipeline

• The classic RISC pipeline is broken into five stages,
with a set of flip-flops between each stage:

— 1.Instruction fetch
— 2.Instruction decode and register fetch
— 3.Execute
— 4.Memory access
— 5.Register write back
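The five stages above overlap in time. A small Python sketch of the cycle-by-cycle occupancy of an ideal (stall-free) pipeline:

```python
def pipeline_timeline(n_instructions, stages=("IF", "ID", "EX", "MEM", "WB")):
    """Cycle-by-cycle occupancy of an ideal pipeline: instruction i is in
    stage s during cycle i + s (no stalls or hazards modeled)."""
    depth = len(stages)
    timeline = []
    for cycle in range(n_instructions + depth - 1):
        row = {stages[s]: cycle - s          # which instruction is in stage s
               for s in range(depth)
               if 0 <= cycle - s < n_instructions}
        timeline.append(row)
    return timeline

# Three instructions finish in 3 + 5 - 1 = 7 cycles instead of 15 sequential ones:
for cycle, row in enumerate(pipeline_timeline(3)):
    print(cycle, row)
```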
Instruction pipeline (指令管線化)

• Instruction pipelining is a technique designed to speed up the rate at which instructions pass through computers and other digital electronic devices (the number of instructions executed per unit time).
• http://zh.wikipedia.org/wiki/%E6%8C%87%E4%BB%A4%E7%AE%A1%E7%B7%9A%E5%8C%96
superscalar

• A superscalar CPU architecture implements a form of parallelism
called instruction-level parallelism within a single processor. It
thereby allows faster CPU throughput than would otherwise be
possible at the same clock rate. A superscalar processor executes
more than one instruction during a clock cycle by simultaneously
dispatching multiple instructions to redundant functional units on
the processor. Each functional unit is not a separate CPU core but
an execution resource within a single CPU such as an arithmetic
logic unit, a bit shifter, or a multiplier.
• While a superscalar CPU is typically also pipelined, they are two
different performance enhancement techniques. It is theoretically
possible to have a non-pipelined superscalar CPU or a pipelined
non-superscalar CPU.
• The superscalar technique is traditionally associated with several
identifying characteristics. Note these are applied within a given
CPU core.
• Instructions are issued from a sequential instruction stream
• CPU hardware dynamically checks for data dependencies between
instructions at run time (versus software checking at compile
time)
• Accepts multiple instructions per clock cycle
From scalar to superscalar

• The simplest processors are scalar processors. Each instruction executed by a
scalar processor typically manipulates one or two data items at a time. By
contrast, each instruction executed by a vector processor operates
simultaneously on many data items. An analogy is the difference between
scalar and vector arithmetic. A superscalar processor is sort of a mixture of the
two. Each instruction processes one data item, but there are multiple redundant
functional units within each CPU thus multiple instructions can be processing
separate data items concurrently.
• Superscalar CPU design emphasizes improving the instruction dispatcher
accuracy, and allowing it to keep the multiple functional units in use at all
times. This has become increasingly important when the number of units
increased. While early superscalar CPUs would have two ALUs and a single
FPU, a modern design such as the PowerPC 970 includes four ALUs, two
FPUs, and two SIMD units. If the dispatcher is ineffective at keeping all of
these units fed with instructions, the performance of the system will suffer.
• In a superscalar CPU the dispatcher reads instructions from memory and
decides which ones can be run in parallel, dispatching them to redundant
functional units contained inside a single CPU. Therefore a superscalar
processor can be envisioned having multiple parallel pipelines, each of which
is processing instructions simultaneously from a single instruction thread.
Single instruction, multiple data (SIMD)
• Like vector addition and matrix operations
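The scalar-versus-SIMD distinction can be illustrated in Python; below, one conceptual "instruction" operates on whole vectors at once:

```python
def simd_add(a, b):
    """One conceptual SIMD 'instruction': a single operation applied to
    many data items at once, modeled as element-wise addition."""
    return [x + y for x, y in zip(a, b)]

# A scalar processor needs one add per element; a SIMD unit does all four:
print(simd_add([1, 2, 3, 4], [10, 20, 30, 40]))   # [11, 22, 33, 44]
```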
Diminishing Returns
• Internal organization of processors
complex
—Can get a great deal of parallelism
—Further significant increases likely to be
relatively modest
• Benefits from cache are reaching limit
• Increasing clock rate runs into power
dissipation problem
—Some fundamental physical limits are being
reached
New Approach – Multiple Cores
• Multiple processors on single chip
— Large shared cache
• Within a processor, increase in performance
proportional to square root of increase in
complexity
• If software can use multiple processors, doubling
number of processors almost doubles
performance
• So, use two simpler processors on the chip
rather than one more complex processor
• With two processors, larger caches are justified
— Power consumption of memory logic less than
processing logic
• Example: IBM POWER4
— Two cores based on PowerPC
POWER4 Chip Organization
Pentium Evolution (1)
• 8080
— first general purpose microprocessor
— 8 bit data path
— Used in first personal computer – Altair
• 8086
— much more powerful
— 16 bit
— instruction cache, prefetch few instructions
— 8088 (8 bit external bus) used in first IBM PC
• 80286
— 16 Mbyte memory addressable
— up from 1Mb
• 80386
— 32 bit
— Support for multitasking
Pentium Evolution (2)
• 80486
—sophisticated powerful cache and instruction
pipelining
—built in maths co-processor
• Pentium
—Superscalar
—Multiple instructions executed in parallel
• Pentium Pro
—Increased superscalar organization
—Aggressive register renaming
—branch prediction
—data flow analysis
—speculative execution
register renaming
• register renaming refers to a technique used to avoid
unnecessary serialization of program operations imposed
by the reuse of registers by those operations.
• Programs are composed of instructions which operate on
values. The instructions must name these values in order
to distinguish them from one another. A typical instruction
might say, add X and Y and put the result in Z. In this
instruction, X, Y, and Z are the names of storage locations.
• In order to have a compact instruction encoding, most
processor instruction sets have a small set of special
locations which can be directly named. For example, the
x86 instruction set architecture has 8 integer registers,
x86-64 has 16, many RISCs have 32, and IA-64 has 128.
In smaller processors, the names of these locations
correspond directly to elements of a register file.
Out-of-order execution & register renaming
• Consider this piece of code running on an
out-of-order CPU:
• Instructions 4, 5, and 6 are independent of
instructions 1, 2, and 3, but the processor
cannot finish 4 until 3 is done, because 3
would then write the wrong value.
• We can eliminate this restriction by
changing the names of some of the
registers:
• Now instructions 4, 5, and 6 can be
executed in parallel with instructions 1, 2,
and 3, so that the program can be
executed faster.
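The code listing the slide refers to is not reproduced above; the standard illustration (as in the Wikipedia article this text follows) has instructions 1-3 and 4-6 both reusing register R1. The Python sketch below renames every write to a fresh physical register; the (op, dest, srcs) instruction encoding is invented for this example:

```python
def rename_registers(program, num_arch_regs=8):
    """Give every write to an architectural register a fresh physical
    register, so later reuses of a register name no longer conflict
    with earlier reads."""
    mapping = list(range(num_arch_regs))   # architectural -> physical
    next_free = num_arch_regs
    renamed = []
    for op, dest, srcs in program:
        phys_srcs = [mapping[s] for s in srcs]   # read via current mapping
        phys_dest = None
        if dest is not None:
            mapping[dest] = next_free            # fresh name for each write
            phys_dest = next_free
            next_free += 1
        renamed.append((op, phys_dest, phys_srcs))
    return renamed

# Instructions 1-3 and 4-6 both use R1; after renaming, the two halves
# touch disjoint physical registers and can execute in parallel:
program = [
    ("load",  1, []),      # 1: R1 = M[1024]
    ("add",   1, [1]),     # 2: R1 = R1 + 2
    ("store", None, [1]),  # 3: M[1032] = R1
    ("load",  1, []),      # 4: R1 = M[2048]
    ("add",   1, [1]),     # 5: R1 = R1 + 4
    ("store", None, [1]),  # 6: M[2056] = R1
]
print(rename_registers(program))
```

After renaming, instructions 1-3 use one pair of physical registers and 4-6 use another, removing the false dependence the text describes.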
Pentium Evolution (3)
• Pentium II
— MMX technology
— graphics, video & audio processing
• Pentium III
— Additional floating point instructions for 3D graphics
• Pentium 4
— Note Arabic rather than Roman numerals
— Further floating point and multimedia enhancements
• Itanium
— 64 bit
— see chapter 15
• Itanium 2
— Hardware enhancements to increase speed
• See Intel web pages for detailed information on
processors
MMX
• Short for Multimedia Extensions, a set of
57 multimedia instructions built into Intel
microprocessors and other x86-
compatible microprocessors. MMX-enabled
microprocessors can handle many
common multimedia operations, such as
digital signal processing (DSP), that are
normally handled by a separate sound or
video card. However, only software
especially written to call MMX instructions
-- so-called MMX-enabled software -- can
take advantage of the MMX instruction
set.
PowerPC
• 1975: 801 minicomputer project (IBM) originates RISC
• Berkeley RISC I processor
• 1986, IBM commercial RISC workstation product, RT PC.
— Not commercial success
— Many rivals with comparable or better performance
• 1990, IBM RISC System/6000
— RISC-like superscalar machine
— POWER architecture
• IBM alliance with Motorola (68000 microprocessors), and
Apple, (used 68000 in Macintosh)
• Result is PowerPC architecture
— Derived from the POWER architecture
— Superscalar RISC
— Apple Macintosh
— Embedded chip applications
PowerPC Family (1)
• 601:
— Quickly to market. 32-bit machine
• 603:
— Low-end desktop and portable
— 32-bit
— Comparable performance with 601
— Lower cost and more efficient implementation
• 604:
— Desktop and low-end servers
— 32-bit machine
— Much more advanced superscalar design
— Greater performance
• 620:
— High-end servers
— 64-bit architecture
PowerPC Family (2)
• 740/750:
—Also known as G3
—Two levels of cache on chip
• G4:
—Increases parallelism and internal speed
• G5:
—Improvements in parallelism and internal
speed
—64-bit organization
Internet Resources
• http://www.intel.com/
—Search for the Intel Museum
• http://www.ibm.com
• http://www.dec.com
• Charles Babbage Institute
• PowerPC
• Intel Developer Home
