The Microprocessor and Its Architecture
INTRODUCTION
This chapter presents the microprocessor as a programmable device by first looking at its
internal programming model and then how its memory space is addressed. The architecture of
the family of Intel microprocessors is presented simultaneously, as are the ways that the family
members address the memory system.
The addressing modes for this powerful family of microprocessors are described for the real,
protected, and flat modes of operation. Real mode memory (DOS memory) exists at locations
00000H–FFFFFH, the first 1M byte of the memory system, and is present on all versions of the
microprocessor. Protected mode memory (Windows memory) exists at any location in the entire
protected memory system, but is available only to the 80286–Core2, not to the earlier 8086 or
8088 microprocessors. Protected mode memory for the 80286 contains 16M bytes; for the 80386–
Pentium, 4G bytes; and for the Pentium Pro through the Core2, either 4G or 64G bytes. With the
64-bit extensions enabled, the Pentium 4 and Core2 address 1T byte of memory in a flat memory
model. Windows Vista or Windows 64 is needed to operate the Pentium 4 or Core2 in 64-bit mode
using the flat mode memory to access the entire 1T byte of memory.
CHAPTER OBJECTIVES
Before a program is written or any instruction investigated, the internal configuration of the
microprocessor must be known. This section of the chapter details the program-visible internal
architecture of the 8086–Core2 microprocessors. Also detailed are the function and purpose of
each of these internal registers. Note that in a multiple core microprocessor each core contains
the same programming model. The only difference is that each core runs a separate task or
thread simultaneously.
[Figure 2–1: programming model; register labels recovered from the figure include FLAGS, IP, CS, DS, ES, and SS]
The 8086, 8088, and 80286 contain 16-bit internal
architectures, a subset of the registers shown in Figure 2–1. The 80386 through the Core2
microprocessors contain full 32-bit internal architectures. The architectures of the earlier 8086
through the 80286 are fully upward-compatible to the 80386 through the Core2. The shaded
areas in this illustration represent registers that are found in early versions of the 8086, 8088, or
80286 microprocessors and are provided on the 80386–Core2 microprocessors for
compatibility to the early versions.
The programming model contains 8-, 16-, and 32-bit registers. The Pentium 4 and Core2
also contain 64-bit registers when operated in the 64-bit mode, as illustrated in the programming
model. The 8-bit registers are AH, AL, BH, BL, CH, CL, DH, and DL, and an instruction refers
to them by these two-letter designations. For example, an ADD AL,AH instruction adds the
8-bit contents of AH to AL. (Only AL changes due to this instruction.) The 16-bit registers are
AX, BX, CX, DX, SP, BP, DI, SI, IP, FLAGS, CS, DS, ES, SS, FS, and GS. Note that the first
four 16-bit registers (AX, BX, CX, and DX) each contain a pair of 8-bit registers. An example
is AX, which contains AH and AL. The 16-bit registers are referenced with two-letter
designations such as AX. For example, an ADD DX, CX instruction adds the 16-bit contents of
CX to DX. (Only DX changes due to this instruction.) The extended 32-bit registers are EAX,
EBX, ECX, EDX, ESP, EBP, EDI, ESI, EIP, and EFLAGS. These 32-bit extended registers, and
the 16-bit registers FS and GS, are available only in the 80386 and above. The two new 16-bit
registers are referenced by the designations FS and GS, and the 32-bit registers by a
three-letter designation. For example, an ADD ECX, EBX instruction adds the 32-bit contents
of EBX to ECX. (Only ECX changes due to this instruction.)
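The way the 8-, 16-, and 32-bit names overlap can be sketched in a few lines of Python. This is only an illustrative model, not how the hardware is built: the class name Reg32 and the attribute names e, x, h, and l are hypothetical, and the flag bits that a real ADD also updates are ignored.

```python
class Reg32:
    """Hypothetical model of one multipurpose register such as EAX."""

    def __init__(self, value=0):
        self.e = value & 0xFFFFFFFF  # full 32-bit contents (EAX)

    @property
    def x(self):  # low-order 16 bits (AX)
        return self.e & 0xFFFF

    @x.setter
    def x(self, v):
        self.e = (self.e & 0xFFFF0000) | (v & 0xFFFF)

    @property
    def l(self):  # low-order byte (AL)
        return self.e & 0xFF

    @l.setter
    def l(self, v):
        self.e = (self.e & 0xFFFFFF00) | (v & 0xFF)

    @property
    def h(self):  # bits 8 through 15 (AH)
        return (self.e >> 8) & 0xFF

    @h.setter
    def h(self, v):
        self.e = (self.e & 0xFFFF00FF) | ((v & 0xFF) << 8)


def add_al_ah(reg):
    """Model ADD AL,AH: the 8-bit sum wraps at 8 bits and only AL changes."""
    reg.l = (reg.l + reg.h) & 0xFF


eax = Reg32(0x12345678)
add_al_ah(eax)  # AL = 78H + 56H = CEH; AH and the upper 16 bits are untouched
print(f"{eax.e:08X}H")
```

Writing through the l setter touches only bits 0 through 7, which is the sense in which "only AL changes" in the text.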
Some registers are general-purpose or multipurpose registers, while some have special
purposes. The multipurpose registers include EAX, EBX, ECX, EDX, EBP, EDI, and ESI.
These registers hold various data sizes (bytes, words, or doublewords) and are used for almost
any purpose, as dictated by a program.
The 64-bit registers are designated as RAX, RBX, and so forth. In addition to the renaming
of the registers for 64-bit widths, there are also additional 64-bit registers that are called
R8 through R15. The 64-bit extensions have multiplied the available register space by more
than 8 times in the Pentium 4 and the Core2 when compared to the original microprocessor
architecture as indicated in the shaded area in Figure 2–1. An example 64-bit instruction is
ADD RCX, RBX, which adds the 64-bit contents of RBX to RCX. (Only RCX changes due to
this instruction.) One difference exists: these additional 64-bit registers (R8 through R15)
are addressed as a byte, word, doubleword, or quadword, but only the rightmost 8 bits can be
addressed as a byte. R8 through R15 have no provision for directly addressing bits 8 through
15 as a byte. In the 64-bit mode, a legacy high byte register (AH, BH, CH, or DH) cannot be
addressed in the same instruction with an R8 through R15 byte. Because legacy software does
not access R8 through R15, this causes no problems with existing 32-bit programs, which
function without modification.
Table 2–1 shows the overrides used to access portions of a 64-bit register. To access the
low-order byte of the R8 register, use R8B (where B is the low-order byte). Likewise, to access
the low-order word of a numbered register, such as R10, use R10W in the instruction. The letter
D is used to access a doubleword. An example instruction that copies the low-order doubleword
from R8 to R11 is MOV R11D, R8D. There is no special letter for the entire 64-bit register.
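As a rough illustration of what the B, W, and D overrides select, the following Python sketch masks off the low-order part of a 64-bit value. The function name read_part and the sample register contents are assumptions made for the example; the sketch models only reading a portion of a register, not the full behavior of an instruction such as MOV R11D, R8D.

```python
# Width in bits selected by each override letter; "" means the whole register.
SUFFIX_BITS = {"B": 8, "W": 16, "D": 32, "": 64}


def read_part(value64, suffix=""):
    """Return the low-order byte, word, doubleword, or quadword of a register."""
    bits = SUFFIX_BITS[suffix]
    return value64 & ((1 << bits) - 1)


r8 = 0x1122334455667788  # hypothetical contents of R8

r8b = read_part(r8, "B")  # low-order byte
r8w = read_part(r8, "W")  # low-order word
r8d = read_part(r8, "D")  # low-order doubleword
print(f"{r8b:02X}H {r8w:04X}H {r8d:08X}H")
```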
CS (code) The code segment is a section of memory that holds the code
(programs and procedures) used by the microprocessor. The code
segment register defines the starting address of the section of
memory holding code. In real mode operation, it defines the start of
a 64K-byte section of memory; in protected mode, it selects a
descriptor that describes the starting address and length of a section
of memory holding code. The code segment is limited to 64K bytes
in the 8088–80286, and 4G bytes in the 80386 and above when
these microprocessors operate in the protected mode. In the 64-bit
mode, the code segment register is still used in the flat model, but
its use differs from other programming modes as explained in
Section 2-5.
DS (data) The data segment is a section of memory that contains most data
used by a program. Data are accessed in the data segment by an
offset address or the contents of other registers that hold the offset
address. As with the code segment and other segments, the length is
limited to 64K bytes in the 8086–80286, and 4G bytes in the 80386
and above.
ES (extra) The extra segment is an additional data segment that is used by
some of the string instructions to hold destination data.
SS (stack) The stack segment defines the area of memory used for the stack.
The stack entry point is determined by the stack segment and
stack pointer registers. The BP register also addresses data within
the stack segment.
FS and GS The FS and GS segments are supplemental segment registers available
in the 80386–Core2 microprocessors to allow two additional memory
segments for access by programs. Windows uses these segments for
internal operations, but no definition of their usage is available.
The 80286 and above operate in either the real or protected mode. Only the 8086 and 8088
operate exclusively in the real mode. In the 64-bit operation mode of the Pentium 4 and Core2,
there is no real mode operation. This section of the text details the operation of the
microprocessor in the real mode. Real mode operation allows the microprocessor to address
only the first 1M byte of memory space, even if it is the Pentium 4 or Core2 microprocessor.
Note that the first 1M byte of memory is called the real memory, conventional memory, or
DOS memory system. The DOS operating system requires that the microprocessor operate in
the real mode. Windows does not use the real mode. Real mode operation allows application
software written for the 8086/8088, which address only 1M byte of memory, to function in the
80286 and above without changing the software. The upward compatibility of software is
partially responsible for the continuing success of the Intel family of microprocessors. In all
cases, each of these microprocessors begins operation in the real mode by default whenever
power is applied or the microprocessor is reset. Note that if the Pentium 4 or Core2 operates in
the 64-bit mode, it cannot execute real mode applications; hence, DOS applications will not
execute in the 64-bit mode unless a program that emulates DOS is written for the 64-bit mode.
Protected mode memory addressing (80286 and above) allows access to data and programs
located above the first 1M byte of memory, as well as within the first 1M byte of memory.
Protected mode is where Windows operates. Addressing this extended section of the memory
system requires a change to the segment plus an offset addressing scheme used with real mode
memory addressing. When data and programs are addressed in extended memory, the offset
address is still used to access information located within the memory segment. One difference is
that the segment address, as discussed with real mode memory addressing, is no longer present
in the protected mode. In place of the segment address, the segment register contains a selector
that selects a descriptor from a descriptor table. The descriptor describes the memory
segment’s location, length, and access rights. Because the segment register and offset address
still access memory, protected mode instructions are identical to real mode instructions. In fact,
most programs written to function in the real mode will function without change in the protected
mode. The difference between modes is in the way that the segment register is interpreted by
the microprocessor to access the memory segment. Another difference, in the 80386 and
above, is that the offset address can be a 32-bit number instead of a 16-bit number in the
protected mode. A 32-bit offset address allows the microprocessor to access data within a
segment that can be up to 4G bytes in length. Programs that are written for the 32-bit protected
mode execute in the 64-bit mode of the Pentium 4.
EXAMPLE 2–1
Base = Start = 10000000H
G = 0
End = Base + Limit = 10000000H + 001FFH = 100001FFH
Example 2–2 uses the same data as Example 2–1, except that the G bit = 1. Notice that
FFFH is appended to the limit to determine the ending segment address.
EXAMPLE 2–2
Base = Start = 10000000H
G = 1
End = Base + Limit = 10000000H + 001FFFFFH = 101FFFFFH
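The arithmetic in Examples 2–1 and 2–2 can be checked with a short Python sketch. The function name segment_end is hypothetical; the rule it applies (append FFFH to the 20-bit limit when G = 1) is the one stated in the text.

```python
def segment_end(base, limit, g):
    """Return the ending address of a segment from its base, 20-bit limit, and G bit."""
    if g:
        # G = 1: the limit counts 4K-byte units, so FFFH is appended to it.
        limit = (limit << 12) | 0xFFF
    return base + limit


end_g0 = segment_end(0x10000000, 0x001FF, 0)  # Example 2-1
end_g1 = segment_end(0x10000000, 0x001FF, 1)  # Example 2-2
print(f"{end_g0:08X}H {end_g1:08X}H")
```

With G = 0 the segment spans 200H bytes; with G = 1 the same limit field spans 200000H bytes.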
The AV bit, in the 80386 and above descriptor, is used by some operating systems to
indicate that the segment is available (AV = 1) or not available (AV = 0). The D bit indicates
how the 80386 through the Core2 instructions access register and memory data in the protected or
real mode. If D = 0, the instructions are 16-bit instructions, compatible with the 8086–80286
microprocessors. This means that the instructions use 16-bit offset addresses and 16-bit registers
by default. This mode is often called the 16-bit instruction mode or DOS mode. If D = 1, the
instructions are 32-bit instructions. By default, the 32-bit instruction mode assumes that all offset
addresses and all registers are 32 bits. Note that the default for register size and offset address
can be overridden in both the 16- and 32-bit instruction modes. Both the MSDOS and PCDOS
operating systems require that the instructions are always used in the 16-bit instruction mode.
Windows 3.1, and any application that was written for it, also requires that the 16-bit instruction
mode is selected. Note that the instruction mode is accessible only in a protected mode system
such as Windows Vista. More detail on these modes and their application to the instruction set
appears in Chapters 3 and 4.
The access rights byte (see Figure 2–7) controls access to the protected mode segment.
This byte describes how the segment functions in the system. The access rights byte allows
complete control over the segment. If the segment is a data segment, the direction of growth is
specified. If the segment grows beyond its limit, the microprocessor’s operating system
program is interrupted, indicating a general protection fault. You can even specify whether a
data segment can be written or is write-protected. The code segment is also controlled in a
similar fashion and can have reading inhibited to protect software. Again, note that in 64-bit
mode there is only a code segment and no other segment descriptor types. A 64-bit flat model
program contains its data and stacks in the code segment.
Descriptors are chosen from the descriptor table by the segment register. Figure 2–8
shows how the segment register functions in the protected mode system. The segment register
contains a 13-bit selector field, a table selector bit, and a requested privilege level field. The
13-bit selector chooses one of the 8192 descriptors from the descriptor table. The TI bit selects
either the global descriptor table (TI = 0) or the local descriptor table (TI = 1). The requested
privilege level (RPL) requests the access privilege level of a memory segment. The highest
privilege level is 00 and the lowest is 11. If the requested privilege level matches or is higher in
priority than the privilege level set by the access rights byte, access is granted. For example, if
the requested privilege level is 10 and the access rights byte sets the segment privilege level at
11, access is granted because 10 is higher in priority than privilege level 11. Privilege levels are
used in multiuser environments. Windows uses privilege level 00 (ring 0) for the kernel and
driver programs and level 11 (ring 3) for applications. Windows does not use levels 01 or 10. If
privilege levels are violated, the system normally indicates an application or privilege level
violation.
Figure 2–9 shows how the segment register, containing a selector, chooses a descriptor
from the global descriptor table. The entry in the global descriptor table selects a segment in the
memory system. In this illustration, DS contains 0008H, which accesses descriptor number 1
from the global descriptor table using a requested privilege level of 00. Descriptor number 1
contains a descriptor that defines the base address as 00100000H with a segment limit of 000FFH.
This means that a value of 0008H loaded into DS causes the microprocessor to use memory
locations 00100000H–001000FFH for the data segment with this example descriptor table. Note
that descriptor zero is called the null descriptor, must contain all zeros, and may not be used
for accessing memory.
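The three selector fields can be pulled apart with shifts and masks, as in the Python sketch below. The function names are hypothetical, and access_granted implements only the simplified comparison described in the text (a numerically lower level has higher priority); it is an assumption of this sketch that nothing else, such as the privilege level of the running program, needs to be modeled.

```python
def decode_selector(selector):
    """Split a 16-bit protected mode selector into (index, TI, RPL)."""
    index = (selector >> 3) & 0x1FFF  # leftmost 13 bits: one of 8192 descriptors
    ti = (selector >> 2) & 1          # 0 = global table, 1 = local table
    rpl = selector & 0x3              # rightmost 2 bits: requested privilege level
    return index, ti, rpl


def access_granted(rpl, segment_level):
    """Simplified rule from the text: grant if RPL matches or has higher priority."""
    return rpl <= segment_level


index, ti, rpl = decode_selector(0x0008)  # the DS value used with Figure 2-9
print(index, ti, rpl)
print(access_granted(0b10, 0b11))         # the example from the text
```

For DS = 0008H the sketch selects descriptor number 1 in the global descriptor table with a requested privilege level of 00, matching the description of Figure 2–9.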
The EMM386.EXE program reassigns extended memory, in 4K blocks, to the system
memory between the video BIOS and the system BIOS ROMs for upper memory blocks.
Without the paging mechanism, the use of this area of memory is impossible.
In Windows, each application is allowed a 2G linear address space from location
00000000H–7FFFFFFFH even though there may not be enough memory or memory available
at these addresses. Through paging to the hard disk drive and paging to the memory through the
memory paging unit, any Windows application can be executed.
Paging Registers
The paging unit is controlled by the contents of the microprocessor’s control registers. See Figure
2–11 for the contents of control registers CR0 through CR4. Note that these registers are available to
the 80386 through the Core2 microprocessors. Beginning with the Pentium, an additional control
register labeled CR4 controls extensions to the basic architecture provided in the Pentium or newer
microprocessor. One of these features is a 2M- or a 4M-byte page that is enabled by controlling
CR4.
The registers important to the paging unit are CR0 and CR3. The leftmost bit (PG)
position of CR0 selects paging when placed at a logic 1 level. If the PG bit is cleared (0), the
linear address generated by the program becomes the physical address used to access memory.
If the PG bit is set (1), the linear address is converted to a physical address through the paging
mechanism. The paging mechanism functions in both the real and protected modes.
CR3 contains the page directory base or root address, and the PCD and PWT bits. The
PCD and PWT bits control the operation of the PCD and PWT pins on the microprocessor. If
PCD is set (1), the PCD pin becomes a logic one during bus cycles that are not paged. This
allows the external hardware to control the level 2 cache memory. (Note that the level 2 cache
memory is an internal [on modern versions of the Pentium] high-speed memory that functions
as a buffer between the microprocessor and the main DRAM memory system.) The PWT bit
also appears on the PWT pin during bus cycles that are not paged to control the write-through
cache in the system. The page directory base address locates the directory for the page
translation unit. Note that this address locates the page directory at any 4K boundary in the
memory system because it is appended internally with 000H. The page directory contains
1024 directory entries of 4 bytes each. Each page directory entry addresses a page table that
contains 1024 entries.
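A minimal Python sketch of how these control register fields might be read follows. The function names are hypothetical. The bit positions used (PG as bit 31 of CR0, PWT as bit 3 and PCD as bit 4 of CR3) are not given in this section and are stated here as assumptions taken from Intel's register layout.

```python
CR0_PG = 1 << 31   # leftmost bit of CR0 (assumed bit position 31)
CR3_PWT = 1 << 3   # assumed bit position of PWT in CR3
CR3_PCD = 1 << 4   # assumed bit position of PCD in CR3


def paging_enabled(cr0):
    """True when the PG bit of CR0 is set."""
    return bool(cr0 & CR0_PG)


def page_directory_base(cr3):
    """Upper 20 bits of CR3 with 000H appended: a 4K-aligned address."""
    return cr3 & 0xFFFFF000


def directory_entry_address(cr3, linear):
    """Address of the 4-byte directory entry chosen by the leftmost 10 address bits."""
    return page_directory_base(cr3) + ((linear >> 22) & 0x3FF) * 4


cr3 = 0x00010018  # hypothetical value: base 00010000H with PCD and PWT set
print(f"{page_directory_base(cr3):08X}H")
print(f"{directory_entry_address(cr3, 0x00400000):08X}H")
```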
The linear address, as it is generated by the software, is broken into three sections that are
used to access the page directory entry, page table entry, and memory page offset address.
Figure 2–12 shows the linear address and its makeup for paging. Notice how the leftmost 10
bits address an entry in the page directory. For linear addresses 00000000H–003FFFFFH, the
first page directory entry is accessed. Each page directory entry represents or repages a
4M-byte section of the memory system. The contents of the page directory select a page table
that is indexed by the next 10 bits of the linear address (bit positions 12–21). This means that
addresses 00000000H–00000FFFH select page directory entry 0 and page table entry 0. Notice
this is a 4K-byte address range. The offset part of the linear address (bit positions 0–11) next
selects a byte in the 4K-byte memory page. In Figure 2–12, if page table entry 0 contains
address 00100000H, then the physical address is 00100000H–00100FFFH for linear addresses
00000000H–00000FFFH. This means that when the program accesses a location between
00000000H and 00000FFFH, the microprocessor physically addresses location 00100000H–
00100FFFH.
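The 10-10-12 split of the linear address can be sketched in Python as shown below. The function names and the nested-dictionary stand-in for the page directory and page tables are assumptions made for illustration; real entries also carry control bits that this sketch leaves out.

```python
def split_linear(addr):
    """Break a 32-bit linear address into (directory index, table index, offset)."""
    directory = (addr >> 22) & 0x3FF  # leftmost 10 bits
    table = (addr >> 12) & 0x3FF      # bit positions 12-21
    offset = addr & 0xFFF             # bit positions 0-11
    return directory, table, offset


def translate(addr, page_tables):
    """Look up the page address and add the offset (no control bits modeled)."""
    directory, table, offset = split_linear(addr)
    page_address = page_tables[directory][table]
    return page_address + offset


# Page table entry 0 of the first page table holds 00100000H, as in the text.
page_tables = {0: {0: 0x00100000}}

physical = translate(0x00000ABC, page_tables)
print(f"{physical:08X}H")
```

Any linear address from 00000000H through 00000FFFH lands in the same physical page, differing only in the 12-bit offset.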
Because the act of repaging a 4K-byte section of memory requires access to the page
directory and a page table, which are both located in memory, Intel has incorporated a special
type of cache called the TLB (translation look-aside buffer). In the 80486 microprocessor, the
cache holds the 32 most recent page translation addresses. This means that the last 32 page
table translations are stored in the TLB, so if the same area of memory is accessed, the
address is already present in the TLB, and access to the page directory and page tables is not
required. This speeds program execution. If a translation is not in the TLB, the page directory
and page table must be accessed, which requires additional execution time. The Pentium–
Pentium 4 microprocessors contain separate TLBs for each of their instruction and data caches.
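The benefit of keeping recent translations can be illustrated with a small software model. This is a sketch only: the class name TinyTLB, the walk callback, and the simple "discard the least recently used entry" policy are assumptions chosen for clarity and are not a description of the replacement logic inside any Intel microprocessor.

```python
from collections import OrderedDict


class TinyTLB:
    """Toy cache of the most recent page translations (32 by default)."""

    def __init__(self, capacity=32):
        self.capacity = capacity
        self.entries = OrderedDict()  # linear page number -> physical page address
        self.hits = 0
        self.misses = 0

    def lookup(self, linear, walk):
        """Translate a linear address, calling walk(page) only on a miss."""
        page = linear >> 12
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)        # mark as most recently used
        else:
            self.misses += 1
            self.entries[page] = walk(page)       # the slow directory/table access
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # drop the least recently used
        return self.entries[page] | (linear & 0xFFF)


def identity_walk(page):
    """Hypothetical page walk that maps every page to itself."""
    return page << 12


tlb = TinyTLB()
first = tlb.lookup(0x00001234, identity_walk)   # miss: the tables are consulted
second = tlb.lookup(0x00001FFF, identity_walk)  # hit: same 4K-byte page
print(tlb.hits, tlb.misses)
```

The second access falls in the same 4K-byte page as the first, so the model counts it as a hit and never calls the walk function again.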
If the entire 4G byte of memory is paged, the system must allocate 4K bytes of memory for the
page directory, and 4K times 1024 or 4M bytes for the 1024 page tables. This represents a
considerable investment in memory resources.
The DOS system and EMM386.EXE use page tables to redefine the area of memory
between locations C8000H–EFFFFH as upper memory blocks. This is done by repaging
extended memory to backfill this part of the conventional memory system to allow DOS access
to additional memory. Suppose that the EMM386.EXE program allows access to 16M bytes of
extended and conventional memory through paging and locations C8000H–EFFFFH must be
repaged to locations 110000H–137FFFH, with all other areas of memory paged to their normal
locations. Such a scheme is depicted in Figure 2–14.
Here, the page directory contains four entries. Recall that each entry in the page directory
corresponds to 4M bytes of physical memory. The system also contains four page tables with
1024 entries each. Recall that each entry in the page table repages 4K bytes of physical
memory. This scheme requires a total of 16K of memory for the four page tables and 16 bytes
of memory for the page directory.
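The memory cost of the paging structures follows directly from the entry counts given above, and can be worked out with a short Python sketch. The function name paging_overhead is hypothetical; the constants (4-byte entries, 1024 entries per table, 4K-byte pages) come from the text.

```python
PAGE_SIZE = 4 * 1024        # bytes in one memory page
ENTRY_SIZE = 4              # bytes in one directory or table entry
ENTRIES_PER_TABLE = 1024    # entries in one page table
BYTES_PER_TABLE = PAGE_SIZE * ENTRIES_PER_TABLE  # 4M bytes covered by one table


def paging_overhead(mapped_bytes):
    """Return (page tables needed, bytes of page tables, bytes of directory entries)."""
    tables = -(-mapped_bytes // BYTES_PER_TABLE)  # round up to whole tables
    table_bytes = tables * ENTRIES_PER_TABLE * ENTRY_SIZE
    directory_bytes = tables * ENTRY_SIZE
    return tables, table_bytes, directory_bytes


small = paging_overhead(16 * 1024 * 1024)        # the 16M-byte EMM386.EXE case
full = paging_overhead(4 * 1024 * 1024 * 1024)   # the entire 4G-byte memory
print(small, full)
```

For 16M bytes the sketch gives four page tables, 16K bytes of tables, and 16 bytes of directory entries; for 4G bytes it gives 1024 tables, 4M bytes of tables, and a full 4K-byte directory.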
As with DOS, the Windows program also repages the memory system. At present,
Windows version 3.11 supports paging for only 16M bytes of memory because of the amount
of memory required to store the page tables. Newer versions of Windows repage the entire
memory system. On the Pentium–Core2 microprocessors, pages can be 4K, 2M, or 4M bytes in
length. In the 2M and 4M variations, there is only a page directory and a memory page, but no
page table.
The memory system in a Pentium-based computer (Pentium 4 or Core2) that uses the 64-bit
extensions uses a flat mode memory system. A flat mode memory system is one in which there
is no segmentation. The address of the first byte in the memory is at 00 0000 0000H and the
last location is at FF FFFF FFFFH (the address is 40 bits wide). The flat model does not use a
segment register to address a location in the memory. The CS segment register is used to select
a descriptor from the descriptor table that defines the access rights of only a code segment. The
segment register still selects the privilege level of the software. The flat model does not select
the memory address of a segment using the base and limit in the descriptor (see Figure 2–6). In
64-bit mode the actual address is not modified by the descriptor as in 32-bit protected mode.
The offset address is the actual physical address in 64-bit mode. Refer to Figure 2–15 for the
flat mode memory model.
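The memory sizes quoted in this chapter all follow from the number of address bits, since n bits address 2^n bytes. The Python sketch below checks that relationship; the function names and the list of widths are chosen for illustration, with the 36-bit entry inferred from the 64G-byte figure rather than stated in the text.

```python
UNITS = ["", "K", "M", "G", "T", "P"]


def address_space(bits):
    """Number of bytes addressable with the given number of address bits."""
    return 1 << bits


def human(size):
    """Express a byte count using the K/M/G/T/P notation of the text."""
    unit = 0
    while size >= 1024 and size % 1024 == 0 and unit < len(UNITS) - 1:
        size //= 1024
        unit += 1
    return f"{size}{UNITS[unit]}"


# 20 bits: real mode; 24: 80286; 32: 80386 and above; 36: 64G-byte variants;
# 40: flat mode in this chapter; 52: the planned width mentioned in the summary.
sizes = {bits: human(address_space(bits)) for bits in (20, 24, 32, 36, 40, 52)}
print(sizes)
```

The 40-bit flat mode address therefore spans 1T byte, from 00 0000 0000H through FF FFFF FFFFH.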
This form of addressing is much easier to understand, but it offers the system little hardware
protection, unlike the protected mode system discussed in Section 2.3. The
real mode system is not available if the processor operates in the 64-bit mode. Protection and
paging are allowed in the 64-bit mode. The CS register is still used in the protected mode
operation in the 64-bit mode.
In the 64-bit mode if set to IA32 compatibility (when the L bit = 0 in the descriptor), an
address is 64 bits, but since only 40 bits of the address are brought out to the address pins, any
address above 40 bits is truncated. Instructions that use a displacement address can only use a
32-bit displacement, which allows a range of ±2G from the current instruction. This
addressing mode is called RIP relative addressing, and is explained in Chapter 3. The move
immediate instruction allows a full 64-bit address and access to any flat mode memory
location. Other instructions do not allow access to a location above 4G because the offset
address is still 32 bits. If the Pentium is operated in the full 64-bit mode (where L = 1 in
the descriptor), the address may be 64 bits or 32 bits. This is shown in examples in the next
chapter with addressing modes and in more detail in Chapter 4. Most programs today are
operated in the IA32 compatible mode so current versions of Windows software operate
properly, but this will change in a
6. Protected mode operation allows memory above the first 1M byte to be accessed by the
80286 through the Core2 microprocessors. This extended memory system (XMS) is
accessed via a segment address plus an offset address, just as in the real mode. The
difference is that the segment address is not held in the segment register. In the protected mode,
the segment starting address is stored in a descriptor that is selected by the segment
register.
7. A protected mode descriptor contains a base address, limit, and access rights byte. The
base address locates the starting address of the memory segment; the limit defines the last
location of the segment. The access rights byte defines how the memory segment is accessed
via a program. The 80286 microprocessor allows a memory segment to start at any of its 16M
bytes of memory using a 24-bit base address. The 80386 and above allow a memory segment
to begin at any of its 4G bytes of memory using a 32-bit base address. The limit is a 16-bit
number in the 80286 and a 20-bit number in the 80386 and above. This allows an 80286
memory segment limit of 64K bytes, and an 80386 and above memory segment limit of
either 1M bytes (G = 0) or 4G bytes (G = 1). The L bit selects 64-bit address operation in
the code descriptor.
8. The segment register contains three fields of information in the protected mode. The
leftmost 13 bits of the segment register address one of 8192 descriptors from a descriptor
table. The TI bit accesses either the global descriptor table (TI = 0) or the local descriptor
table (TI = 1). The rightmost 2 bits of the segment register select the requested privilege
level for the memory segment access.
9. The program-invisible registers are used by the 80286 and above to access the descriptor
tables. Each segment register contains a cache portion that is used in protected mode to
hold the base address, limit, and access rights acquired from a descriptor. The cache allows
the microprocessor to access the memory segment without again referring to the descriptor
table until the segment register’s contents are changed.
10. A memory page is 4K bytes in length. The linear address, as generated by a program, can
be mapped to any physical address through the paging mechanism found within the 80386
through the Pentium 4 microprocessor.
11. Memory paging is accomplished through control registers CR0 and CR3. The PG bit of
CR0 enables paging, and the contents of CR3 addresses the page directory. The page
directory contains up to 1024 page table addresses that are used to access paging tables.
The page table contains 1024 entries that locate the physical address of a 4K-byte memory
page.
12. The TLB (translation look-aside buffer) caches the 32 most recent page table translations.
This precludes page table translation if the translation resides in the TLB, speeding the
execution of the software.
13. The flat mode memory contains 1T byte of memory using a 40-bit address. In the future,
Intel plans to increase the address width to 52 bits to access 4P bytes of memory. The flat
mode is only available in the Pentium 4 and Core2 that have their 64-bit extensions
enabled.