Chapter Overview: IA-32 Processor Architecture
• General Concepts
• IA-32 Processor Architecture
• IA-32 Memory Management
• Components of an IA-32 Microcomputer
• Input-Output System
Javed Ahmed Shahani
[Figure: block diagram with labels ALU, CU, clock, control bus, address bus]
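Clock
• synchronizes all CPU and bus operations
• machine (clock) cycle measures time of a single operation
[Figure: clock signal alternating between 1 and 0, with one cycle marked]
Instruction Execution Cycle
• Decode
• Execute
• Store output
[Figure: instruction execution cycle, with labels PC, program, I-1 to I-4, flags, ALU, write, (output)]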
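[Figure: timing diagrams for a six-stage pipeline. Without pipelining, I-2 cannot start until I-1 has finished and runs through cycle 12: many wasted cycles. With pipelining, k stages and n instructions require k + (n – 1) cycles.]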
Wasted Cycles (pipelined)
• When one of the stages requires two or more clock cycles, clock cycles are again wasted.
• For k stages and n instructions, the number of required cycles is: k + (2n – 1)
[Figure: six-stage pipeline S1 to S6 in which the execute stage S4 takes two cycles; instructions I-1 to I-3 complete in 11 cycles]
Superscalar
A superscalar processor has multiple execution pipelines. In the following, note that Stage S4 has left and right pipelines (u and v).
• For k stages and n instructions, the number of required cycles is: k + n
[Figure: six-stage superscalar pipeline in which S4 is split into u and v; instructions I-1 to I-4 complete in 10 cycles]
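The cycle-count formulas from the pipeline slides can be compared side by side. The sketch below is only an illustration: the function names are hypothetical, and the non-pipelined count k * n is an assumption that follows from each instruction finishing all k stages before the next one starts (the slides show it only as a diagram).

```python
def cycles_non_pipelined(k: int, n: int) -> int:
    """Assumed model: every instruction occupies all k stages in turn."""
    return k * n


def cycles_pipelined(k: int, n: int) -> int:
    """Ideal single pipeline: k + (n - 1)."""
    return k + (n - 1)


def cycles_pipelined_two_cycle_stage(k: int, n: int) -> int:
    """Single pipeline where one stage needs two cycles: k + (2n - 1)."""
    return k + (2 * n - 1)


def cycles_superscalar(k: int, n: int) -> int:
    """Two execution pipelines (u and v) at the two-cycle stage: k + n."""
    return k + n


# Print a small comparison table for a six-stage processor.
for n in (2, 3, 4):
    print(
        n,
        cycles_non_pipelined(6, n),
        cycles_pipelined(6, n),
        cycles_pipelined_two_cycle_stage(6, n),
        cycles_superscalar(6, n),
    )
```

With k = 6, the values match the diagrams: 12 cycles for two instructions without pipelining, 11 cycles for three instructions with a two-cycle stage, and 10 cycles for four instructions on the superscalar design.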
How a Program Runs
[Figure: the user sends a program name to the operating system; the operating system searches for the program in the current directory and the system path, gets the starting cluster from the directory entry, then loads and starts the program; the program returns to the operating system]
Multitasking
• OS can run multiple programs at the same time.
• Multiple threads of execution within the same program.
• Scheduler utility assigns a given amount of CPU time to each running program.
• Rapid switching of tasks
– gives illusion that all programs are running at once
– the processor must support task switching.
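The idea of a scheduler handing out fixed amounts of CPU time can be sketched as a round-robin simulation. This is a simplified model, not how any particular operating system schedules tasks; `round_robin`, the task names, and the time units are all hypothetical.

```python
from collections import deque


def round_robin(tasks: dict[str, int], time_slice: int) -> list[tuple[str, int]]:
    """Return the sequence of (task name, time units run) slices.

    tasks maps each task name to the total CPU time it still needs.
    """
    if time_slice <= 0:
        raise ValueError("time_slice must be positive")
    queue = deque(tasks.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        run = min(time_slice, remaining)
        timeline.append((name, run))
        if remaining > run:
            # Task is not finished: switch away and put it at the back.
            queue.append((name, remaining - run))
    return timeline


for name, run in round_robin({"editor": 3, "compiler": 5}, time_slice=2):
    print(name, run)
```

Because the slices are short and the tasks alternate, both appear to make progress together even though only one runs at any instant.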
Basic Execution Environment
• Addressable memory
• General-purpose registers
• Index and base registers
• Specialized register uses
• Status flags
• Floating-point, MMX, XMM registers
Addressable Memory
• Protected mode
– 4 GB
– 32-bit address
• Real-address and Virtual-8086 modes
– 1 MB space
– 20-bit address
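The link between address width and addressable memory is simple arithmetic: an n-bit address can name 2^n bytes. A minimal sketch, with the hypothetical helper `address_space_bytes`:

```python
def address_space_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses an address of this width can name."""
    return 1 << address_bits


protected_mode = address_space_bytes(32)  # 32-bit address
real_mode = address_space_bytes(20)       # 20-bit address

print(protected_mode // 1024**3, "GB, highest address", hex(protected_mode - 1))
print(real_mode // 1024**2, "MB, highest address", hex(real_mode - 1))
```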
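[Figure: basic program execution registers: general-purpose registers EAX, EBX, ECX, EDX, EBP, ESP, ESI, EDI; segment registers CS, SS, DS, ES, FS, GS; EFLAGS; EIP]
[Figure: accessing parts of registers, as applies to EAX, EBX, ECX, and EDX: AH and AL (8 bits + 8 bits) form AX (16 bits), which is the lower half of EAX (32 bits)]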
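How the 8-bit, 16-bit, and 32-bit register names overlap can be shown with bit masks. The sketch below models EAX as a plain integer; `split_eax` is a hypothetical helper, not an instruction or library call.

```python
def split_eax(eax: int) -> dict[str, int]:
    """Return the values visible through the names EAX, AX, AH, and AL."""
    eax &= 0xFFFFFFFF          # EAX holds 32 bits
    ax = eax & 0xFFFF          # AX is the lower 16 bits of EAX
    return {
        "EAX": eax,
        "AX": ax,
        "AH": (ax >> 8) & 0xFF,  # AH is the upper 8 bits of AX
        "AL": ax & 0xFF,         # AL is the lower 8 bits of AX
    }


for name, value in split_eax(0x12345678).items():
    print(name, hex(value))
```

The same layout applies to EBX, ECX, and EDX with their corresponding 16-bit and 8-bit names.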
Index and Base Registers
• Some registers have only a 16-bit name for their lower half: ESI (SI), EDI (DI), EBP (BP), ESP (SP)
Some Specialized Register Uses (1 of 2)
• General-Purpose
– EAX – accumulator
– ECX – loop counter
– ESP – stack pointer
– ESI, EDI – index registers
– EBP – extended frame pointer (stack)
• Segment
– CS – code segment
– DS – data segment
– SS – stack segment
Floating-Point, MMX, XMM Registers
• Eight 80-bit floating-point data registers
– ST(0), ST(1), . . . , ST(7)
– arranged in a stack
– used for all floating-point arithmetic
• Eight 64-bit MMX registers
• Eight 128-bit XMM registers for single-instruction multiple-data (SIMD) operations
[Figure: FPU registers: 80-bit data registers ST(0) to ST(7); 48-bit pointer registers (FPU Instruction Pointer, FPU Data Pointer); 16-bit control registers (Tag Register, Control Register, Status Register); Opcode Register]
Intel Microprocessor History
• Intel 8086, 80286
• IA-32 processor family
• P6 processor family
• CISC and RISC
Intel IA-32 Family
• Intel386
– 4 GB addressable RAM, 32-bit registers, paging (virtual memory)
• Intel486
– instruction pipelining
• Pentium
– superscalar, 32-bit address bus, 64-bit internal data path
Intel P6 Family
• Pentium Pro
– advanced optimization techniques in microcode
• Pentium II
– MMX (multimedia) instruction set
• Pentium III
– SIMD (streaming extensions) instructions
• Pentium 4
– NetBurst micro-architecture, tuned for multimedia
Real-Address mode
• 1 MB RAM maximum addressable
• Application programs can access any area of memory
Segmented Memory
Segmented memory addressing: absolute (linear) address is a combination of a 16-bit segment value, shifted left by 4 bits (multiplied by 10h), added to a 16-bit offset
[Figure: 1 MB memory map from 00000 to F0000 in steps of 10000h; one segment spans 8000:0000 to 8000:FFFF, written as seg:ofs]
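The segment-offset calculation above can be written out directly. This is a minimal sketch; `linear_address` is a hypothetical helper name, and the check values are the two ends of the 8000h segment from the memory map.

```python
def linear_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a linear address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shifting left by 4 bits multiplies the segment value by 10h (16).
    return (segment << 4) + offset


print(hex(linear_address(0x8000, 0x0000)))  # start of segment 8000h
print(hex(linear_address(0x8000, 0xFFFF)))  # last byte of segment 8000h
```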
Your turn . . .
What segment addresses correspond to the linear address 28F30h?
Many different segment-offset addresses can produce the linear address 28F30h. For example:
28F0:0030, 28F3:0000, 28B0:0430, . . .
Protected Mode (1 of 2)
• 4 GB addressable RAM
– (00000000 to FFFFFFFFh)
• Each program assigned a memory partition which is protected from other programs
• Designed for multitasking
• Supported by Linux & MS-Windows
• Program structure
[Figure: memory layout with a region marked not used, extending to the top of the address space (4GB)]
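Returning to the "Your turn" exercise on linear address 28F30h, the claim that many segment-offset pairs reach the same address can be checked by brute force. The sketch assumes real-address mode arithmetic (segment times 16 plus offset); `segment_offset_pairs` is a hypothetical helper.

```python
def segment_offset_pairs(linear: int):
    """Yield every (segment, offset) pair of 16-bit values that maps to linear."""
    for segment in range(0x10000):
        offset = linear - (segment << 4)
        if 0 <= offset <= 0xFFFF:
            yield segment, offset


pairs = list(segment_offset_pairs(0x28F30))

# Show a few of the pairs in the usual seg:ofs notation.
for segment, offset in pairs[-3:]:
    print(f"{segment:04X}:{offset:04X}")
```

The three examples listed in the exercise all appear in `pairs`, along with every other segment whose 64 KB window covers the address.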
Multi-Segment Model
• Each program has a local descriptor table (LDT)
– holds descriptor for each segment used by the program
[Figure: Local Descriptor Table entries pointing to segments in RAM at 26000, 8000, and 3000]
Local Descriptor Table
base      limit  access
00026000  0010
00008000  000A
00003000  0002
Paging
• Supported directly by the CPU
• Divides each segment into 4096-byte blocks called pages
• Sum of all programs can be larger than physical memory
• Part of running program is in memory, part is on disk
• Virtual memory manager (VMM) – OS utility that manages the loading and unloading of pages
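With 4096-byte pages, any address splits into a page number and an offset within that page. The sketch below illustrates only that arithmetic, not how the CPU or the VMM locates a page; `split_into_page`, `PAGE_SIZE`, and the sample address are hypothetical.

```python
PAGE_SIZE = 4096  # bytes per page, as on the slide (2**12)


def split_into_page(address: int) -> tuple[int, int]:
    """Return (page number, offset within the page) for an address."""
    page_number = address >> 12          # same as address // PAGE_SIZE
    offset = address & (PAGE_SIZE - 1)   # same as address % PAGE_SIZE
    return page_number, offset


page, offset = split_into_page(0x00026ABC)
print(hex(page), hex(offset))
```

Two addresses with the same page number live in the same page, so they are loaded from disk or unloaded together.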
Components of an IA-32 Microcomputer
• Motherboard
• Video output
• Memory
• Input-output ports
Motherboard
• CPU socket
• External cache memory slots
• Main memory slots
• BIOS chips
• Sound synthesizer chip (optional)
• Video controller chip (optional)
• IDE, parallel, serial, USB, video, keyboard, joystick, network, and mouse connectors
• PCI bus connectors (expansion cards)
Intel D850MD Motherboard
[Figure: Intel D850MD motherboard, with labels: mouse, keyboard, parallel, serial, and USB connectors; Video; Audio chip; PCI slots; memory controller hub; Pentium 4 socket; AGP slot]
Video Output
• Video controller
– on motherboard, or on expansion card
– AGP (accelerated graphics port technology)*
• Video memory (VRAM)
• Direct digital LCD monitors
– no raster scanning required
Input-Output Ports
• USB (universal serial bus)
– intelligent high-speed connection to devices
– up to 12 megabits/second
– USB hub connects multiple devices
– enumeration: computer queries devices
– supports hot connections
• Parallel
– short cable, high speed
– common for printers
– bidirectional, parallel data transfer
Input-Output Ports (cont)
• Serial
– RS-232 serial port
– one bit at a time
– uses long cables and modems
– 16550 UART (universal asynchronous receiver transmitter)
– programmable in assembly language
ASM Programming levels
ASM programs can perform input-output at each of the following levels:
• Level 2: OS function
• Level 0: Hardware