White Paper: PCI Express Technology
WHITE PAPER
February 2004
1. This term does not connote an actual operating speed of 1 Gbps. For high-speed transmission, connection to a Gigabit Ethernet server and network infrastructure is required.
2. The term, Northbridge, refers to the controller for the processor bus, memory bus, AGP bus, and the link to the Southbridge. The term, Southbridge, refers to the I/O device controller.
...form architecture by offloading various functions to higher-bandwidth PCI derivatives, including AGP and PCI-X, both of which are PCI variants. Table 1 presents the peak bandwidth of the PCI, PCI-X, and AGP buses.
... of a typical client PC system and the bandwidth of its I/O and graphics buses.
3. On a multidrop bus, all devices attached to it are connected to the same set of wires. When a device is using the PCI bus, no other device can communicate over the bus. All connected
devices must share the bus and wait their turn before sending or receiving data.
The PCI Special Interest Group (PCI SIG) has been devel-
oping the PCI-X 2.0 specification, which will effectively
create a 64-bit, 266-MHz PCI-X bus with double the data
rate of the 133-MHz PCI-X bus. However, there are sig-
nificant design issues associated with extending these
parallel PCI-X bus variants. The connectors are large and
expensive, and stringent design requirements drive up
the cost of system boards significantly as frequencies
are increased. In addition, to avoid excessive electrical
loading at the higher speeds of PCI-X 2.0, only one I/O
device can be attached in a point-to-point configuration
to the PCI-X bus. It cannot be implemented as a shared
bus.
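The peak figures quoted for the parallel buses follow from bus width multiplied by transfer rate. The sketch below is an illustrative calculation only, not part of any PCI specification. The function name `parallel_bus_peak_mb_per_sec` is hypothetical, and the sketch assumes one transfer per quoted megahertz and decimal megabytes. Nominal clock values give results that differ slightly from the rounded figures quoted in this paper (for example, 528 rather than 532 MB/sec for a 64-bit, 66-MHz bus).

```python
def parallel_bus_peak_mb_per_sec(width_bits: int, rate_mhz: float) -> float:
    """Return the peak bandwidth of a parallel bus in MB/sec.

    Assumes one transfer of `width_bits` bits per quoted MHz
    (an illustrative simplification) and decimal megabytes.
    """
    bytes_per_transfer = width_bits / 8
    return bytes_per_transfer * rate_mhz


if __name__ == "__main__":
    # Bus configurations named in the text: (label, width in bits, rate in MHz).
    buses = [
        ("PCI 32-bit, 33 MHz", 32, 33),
        ("PCI 64-bit, 66 MHz", 64, 66),
        ("PCI-X 64-bit, 133 MHz", 64, 133),
        ("PCI-X 2.0 64-bit, 266 MHz", 64, 266),
    ]
    for label, width, rate in buses:
        peak = parallel_bus_peak_mb_per_sec(width, rate)
        print(f"{label}: {peak:.0f} MB/sec")
```

Because the bus is shared and half duplex, these values are upper bounds for all attached devices combined, in one direction at a time.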
Server-System Bottlenecks
Figure 3 shows the internal system interconnects in a typical dual-processor server system. In this architecture, high-bandwidth expansion is provided via a proprietary interface between the Northbridge and PCI-X bridge chips. Multiple PCI-X buses connect to high-speed expansion slots, 10-Gigabit Ethernet, and SAS/SATA drives. This architecture has some drawbacks. The proprietary PCI-X bridge chips connect multiple parallel PCI-X buses to the chip set's proprietary serial interconnect. This approach is expensive, inefficient, and introduces latency between I/O devices and the Northbridge. For example, the approach connects a serial 10-Gbps fabric to a point-to-point, 64-bit parallel bus that is, in turn, connected via a proprietary PCI-X bridge chip to a proprietary serial interconnect into the Northbridge.
Figure 2. Bandwidth of Devices Serviced by PCI Bus
... today's AGP8X. AGP8X operates at 2.134 gigabytes per second (GB/sec). Despite this bandwidth, the progressive performance demands on the AGP bus are putting considerable pressure on board design and interconnection costs. Like the PCI bus, extending the AGP bus becomes more difficult and expensive as frequencies increase.
Link Between Northbridge and Southbridge. Congestion on the PCI bus also affects the link between the Northbridge and the Southbridge. SATA drives and USB devices further stress this link. A higher-bandwidth link will be required in the future.
Server Systems
In servers, the original 32-bit, 33-MHz PCI bus was ex-
tended to a 64-bit, 66-MHz bus with a bandwidth of 532
MB/sec. The 64-bit bus was recently extended to 100
and 133 MHz, referred to as PCI-X. The PCI-X bus con-
nects the server-system (and the high-end, dual-proces-
sor workstation-system) chip set to expansion slots,
Gigabit Ethernet controllers, and Ultra320 SCSI control-
lers embedded on the system board. A 64-bit PCI-X bus
at 133 MHz delivers 1 GB/sec of peak bandwidth be-
tween the system chip set and the I/O device. This is
sufficient bandwidth for the majority of immediate serv-
er I/O requirements, including Gigabit Ethernet,
Ultra320 SCSI, and 2-Gbps Fibre Channel. However,
like PCI, PCI-X is a shared bus and is likely to require a
higher-bandwidth alternative in 2004.
Figure 3. Current Dual-Processor Server Architecture
www.dell.com/r&d
PCI Express Technology
In addition, next-generation external server I/O technologies are expected to require much greater bandwidth than a 133-MHz PCI-X bus can provide. These technologies include system-area fabrics such as 10-Gigabit Ethernet, 10-Gbps Fibre Channel, and 4x Infiniband. They also include future higher-speed hard-drive interfaces such as 3-Gbps SATA and SAS. In the case of a 10-Gbps fabric, each 10-Gbps port will be able to transmit bidirectional data at a peak bandwidth of 2 GB/sec. The 133-MHz PCI-X bus delivers a maximum of 1 GB/sec in one direction at a time. This suggests that the 133-MHz PCI-X bus could throttle the peak bandwidth of these fabrics by as much as 50 percent. Although PCI-X 2.0 at 266 MHz would double the PCI-X peak bandwidth to 2 GB/sec, it would still fall short of the total 4 GB/sec required by a dual-ported 10-Gbps fabric controller.
... store (and flat address space) model. The layered architecture is discussed in the sidebar below, “PCI Express Layered Architecture.”
The PCI Express architecture defines a high-performance, point-to-point, scalable, serial bus. A PCI Express link consists of dual simplex channels, implemented as a transmit pair and a receive pair, for simultaneous transmission in each direction. Each pair consists of two low-voltage, differentially driven signals. A data clock is embedded in each pair, using an 8b/10b clock-encoding scheme to achieve very high data rates. Figure 4 compares the PCI and PCI Express links.
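The PCI-X shortfall figures quoted in this section (a 50 percent throttle for a single 10-Gbps port on 133-MHz PCI-X, and a dual-ported controller still outrunning 266-MHz PCI-X 2.0) can be checked with a few lines of arithmetic. This is a minimal sketch that takes the paper's rounded GB/sec figures as inputs; the function name `throttle_fraction` is hypothetical.

```python
def throttle_fraction(required_gb_per_sec: float, available_gb_per_sec: float) -> float:
    """Return the fraction of required peak bandwidth that the bus cannot supply.

    0.0 means the bus keeps up; 0.5 means half of the peak demand is unmet.
    """
    if required_gb_per_sec <= 0:
        raise ValueError("required bandwidth must be positive")
    shortfall = max(0.0, required_gb_per_sec - available_gb_per_sec)
    return shortfall / required_gb_per_sec


if __name__ == "__main__":
    # Figures taken from the text, in GB/sec.
    single_port_demand = 2.0   # one 10-Gbps port, both directions combined
    dual_port_demand = 4.0     # dual-ported 10-Gbps fabric controller
    pcix_133 = 1.0             # 133-MHz PCI-X peak
    pcix_266 = 2.0             # PCI-X 2.0 at 266 MHz peak

    print(f"Single port on PCI-X 133: {throttle_fraction(single_port_demand, pcix_133):.0%}")
    print(f"Dual port on PCI-X 266:   {throttle_fraction(dual_port_demand, pcix_266):.0%}")
```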
The bandwidth of a PCI Express link can be scaled by adding signal pairs to form multiple lanes between the two devices. The specification supports x1, x4, x8, and x16 lane widths and stripes the byte data across the lanes accordingly. Once the two agents at each end of the PCI Express link negotiate lane widths and frequency of operation, the striped data bytes are transmitted with 8b/10b encoding.
In contrast to PCI, PCI Express has minimal sideband signals, and the clocks and addressing information are embedded in the data. Because PCI Express is a serial technology with few sideband signals, it provides a very high bandwidth per I/O connector pin compared to PCI. This is designed to provide more efficient, smaller, and cheaper connectors. Figure 5 compares the bandwidth per I/O connector pin of PCI, PCI-X, AGP, and PCI Express.
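Byte striping across the lanes of a multi-lane link can be pictured as dealing bytes out to the lanes in round-robin order and reassembling them at the receiver. The sketch below is a simplified illustration under that assumption: it ignores 8b/10b encoding, framing, and lane negotiation, and the function names `stripe_bytes` and `merge_lanes` are hypothetical.

```python
from typing import List

SUPPORTED_LANE_WIDTHS = (1, 4, 8, 16)  # widths named in the text


def stripe_bytes(data: bytes, lane_count: int) -> List[bytes]:
    """Distribute bytes across lanes in round-robin order (byte i goes to lane i % lane_count)."""
    if lane_count not in SUPPORTED_LANE_WIDTHS:
        raise ValueError(f"unsupported lane width: x{lane_count}")
    return [data[lane::lane_count] for lane in range(lane_count)]


def merge_lanes(lanes: List[bytes]) -> bytes:
    """Reassemble the original byte stream from per-lane byte sequences."""
    lane_count = len(lanes)
    merged = bytearray(sum(len(chunk) for chunk in lanes))
    for lane, chunk in enumerate(lanes):
        # Each lane's bytes occupy every lane_count-th position, starting at its index.
        merged[lane::lane_count] = chunk
    return bytes(merged)


if __name__ == "__main__":
    payload = bytes(range(8))
    lanes = stripe_bytes(payload, 4)
    for index, chunk in enumerate(lanes):
        print(f"lane {index}: {list(chunk)}")
    assert merge_lanes(lanes) == payload
```

With a x4 link, each lane carries one quarter of the bytes, which is why the peak bandwidth scales with the number of lanes.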
Figure 5. Comparison of I/O Bus Bandwidth Per Pin
PCI Express technology achieves high data rates reliably by using low-voltage differential signaling. In this approach, the signal is sent from the source to the receiver over two lines. One contains a “positive” image and the other, a “negative” or “inverted” image of the signal. The lines are routed using strict routing rules so that any noise that affects one line also affects the other line. The receiver collects both signals, inverts the negative version back to the positive and sums the two collected signals, which effectively removes the noise.
The basic “x1” link has a peak raw bandwidth of 2.5 Gbps. Because the bus is bidirectional (that is, data can be transferred in both directions simultaneously), the effective raw data transfer rate is 5 Gbps. Table 2 summarizes the encoded and unencoded data rates (see sidebar) of x1, x4, x8, and x16 implementations, which are defined in the initial generation of PCI Express.
PCI Express “Coded” and “Unencoded” Bandwidth
PCI Express bandwidth is commonly expressed as “encoded” bandwidth. PCI Express uses 8b/10b encoding, which encodes 8-bit data bytes into 10-bit transmission characters. This approach improves the physical signal so that bit synchronization is easier, design of receivers and transmitters is simplified, error detection is improved, and control characters can be distinguished from data characters.
The “encoded” bandwidth of a basic x1 PCI Express lane is 5 Gbps. However, a more accurate bandwidth figure is the “unencoded” bandwidth, which is 80 percent of 5 Gbps or 4 Gbps. Table 2 presents both encoded and unencoded PCI Express bandwidth. In this paper, we follow the common industry practice of citing the higher encoded bandwidth figures.
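The encoded and unencoded rates listed in Table 2 can be reproduced from two numbers given in the text: the 2.5-Gbps raw rate of one lane in one direction and the 8-bits-in-10 efficiency of 8b/10b encoding. The following is a minimal sketch of that arithmetic; the function name `link_bandwidth` and the constant names are hypothetical, and decimal units (1 GB = 1000 MB) are assumed.

```python
LANE_RATE_GBPS = 2.5   # raw signaling rate of one lane in one direction
DATA_BITS = 8          # 8b/10b: 8 data bits ...
CODED_BITS = 10        # ... carried in each 10-bit transmission character


def link_bandwidth(lanes: int):
    """Return (encoded Gbps, unencoded Gbps, unencoded MB/sec) for a link.

    All three values count both directions together, as Table 2 does.
    """
    encoded_gbps = LANE_RATE_GBPS * 2 * lanes
    unencoded_gbps = encoded_gbps * DATA_BITS / CODED_BITS
    unencoded_mb_per_sec = unencoded_gbps * 1000 / 8
    return encoded_gbps, unencoded_gbps, unencoded_mb_per_sec


if __name__ == "__main__":
    for lanes in (1, 4, 8, 16):
        encoded, unencoded, mb_per_sec = link_bandwidth(lanes)
        print(f"x{lanes}: {encoded:g} Gbps encoded, "
              f"{unencoded:g} Gbps unencoded ({mb_per_sec:g} MB/sec)")
```

Note that the unencoded figure for each width is the sum of both directions; a single direction carries half of it.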
PCI Express Implementation   Encoded Data Rate   Unencoded Data Rate
x1                           5 Gbps              4 Gbps (500 MB/sec)
x4                           20 Gbps             16 Gbps (2 GB/sec)
x8                           40 Gbps             32 Gbps (4 GB/sec)
x16                          80 Gbps             64 Gbps (8 GB/sec)
Table 2. PCI Express Bandwidth
Future implementations of PCI Express will raise the channel communication frequency to even higher levels. For example, a second generation of PCI Express could increase the communication frequency by a factor of 2 or more.
Because it is a point-to-point architecture, the entire bandwidth of each PCI Express bus is dedicated to the device at the end of the link. Multiple PCI Express devices can be active without interfering with each other.
The original PCI Express specification defines graphics cards with up to 75 watts of power. In addition, a new high-end PCI Express graphics specification is under development that defines cards of up to 150 watts. These higher power levels accommodate the requirements of graphics adapters, which currently peak at 41 watts for mainstream AGP cards and 110 watts for AGP Pro 110 cards.
PCI Express Advanced Features
PCI Express has advanced features that will be phased in as operating system and device support is developed and as customer applications require them:
• Advanced power management
• Support for real-time data traffic
• Hot plug and hot swap
• Data integrity and error handling
(Source: Intel)
Figure 8. Comparison of PCI and Transitional PCI Express System Boards
Figure 9. PCI Express Mini Versus Mini PCI
The first devices that will migrate to PCI Express cards will be those that require the bandwidth. For client systems, these devices will include graphics, 1394, Gigabit Ethernet, and TV tuner cards. For server systems, Ultra320 SCSI RAID cards, Fibre Channel host bus adapters (HBAs), and 1- and 10-Gigabit Ethernet cards will be available initially. The cost of these cards is expected to be comparable to (and, in some cases, lower than) that of PCI-X alternatives. Other cards are expected to gradually migrate to PCI Express, but it will be many years before inexpensive and low-bandwidth cards such as modems are migrated. Similar to the transition from the ISA to PCI bus, systems with both PCI and PCI Express will exist for many years.
PCI Express Mini Card
The PCI Express Mini Card replaces the Mini PCI card, which is a small internal card functionally identical to standard desktop computer PCI cards. Mini PCI cards are used mainly to add communications functions to portable computers that are built- or customized-to-order. The PCI Express Mini Card is half the size of the Mini PCI card, as shown in Figure 9. This allows system designers to include one or two cards, depending on the size constraints of a particular portable computer.
A PCI Express Mini Card socket on the system board must support both a x1 PCI Express link and a USB 2.0 link. A PCI Express Mini Card can use either PCI Express or USB 2.0 (or both). USB 2.0 support will help during the transition to PCI Express, because peripheral vendors will need time to design PCI Express into their chip sets. During the transition, PCI Express Mini Cards can be quickly implemented using USB 2.0.
ExpressCard
ExpressCard is a small, modular add-in card designed to replace the PC Card over the next few years. The ExpressCard specification was developed by the Personal Computer Memory Card International Association (PCMCIA). The ExpressCard form factors shown in Figure 10 are designed to provide a small, less-expensive, and higher-bandwidth replacement for the PC Card. Like the PCI Express Mini Card, an ExpressCard module can support a x1 PCI Express and a USB 2.0 link. Its low cost also makes it feasible for small form-factor desktop systems. The ExpressCard module also has low power requirements and is hot pluggable. It is likely to be used for communications, hard-disk storage, and emerging I/O technologies. ExpressCard modules are expected in the second half of 2004.
Figure 10. ExpressCard Modules
PCI Express Server I/O Module
The SIOM specification is currently being defined. SIOMs are expected with the second generation of the PCI Express technology. The PCI Express SIOM will provide a robust form factor that can be easily installed or replaced. It will be modular, allowing I/O cards to be installed and serviced in a system while it is still operating and without opening the chassis.
Client Systems
Figure 11 shows how PCI Express could be implemented in a client system. Initially, a x16 PCI Express link will replace the AGP bus between the graphics subsystem and the Northbridge. A PCI Express variant could also replace the link between the Northbridge and Southbridge, relieving the bottleneck between peripheral I/O devices and the Northbridge. There will also be multiple PCI Express links off the Southbridge for the network interface controller (NIC), 1394 devices, and other peripherals. The Southbridge will continue to support legacy PCI slots.
4. This term does not connote an actual operating speed of 1 Gbps. For high-speed transmission, connection to a Gigabit Ethernet server and network infrastructure is required.
Portable Computers
Figure 12 shows how PCI Express could be implemented in a portable computer system. As in desktop systems, PCI Express will replace the AGP bus, and a PCI Express variant is a candidate to replace the link between the Northbridge and Southbridge. In addition, PCI Express could be used to replace the PCI bus between the Northbridge and the build-to-order/customize-to-order (BTO/CTO) slot. This slot currently accommodates Mini PCI cards, but in new systems it may be used for PCI Express Mini Cards.
Figure 12. Sample Portable Computer Architecture
The PCI bus between the Northbridge and the docking station could also migrate to PCI Express. A x1 ExpressCard slot that uses a USB 2.0 link will replace the PC Card slot. Finally, individual PCI Express links will replace the PCI bus that supports integrated peripheral devices such as Gigabit Ethernet, audio, and graphics.
Server Systems
Figure 13 shows how PCI Express could be implemented in a dual-processor server architecture. PCI Express can help to significantly reduce server system complexity. PCI Express links for I/O devices and slots are placed directly off the Northbridge. This approach is expected to provide the following potential advantages:
• Higher bandwidth for next-generation I/O such as 10-Gbps Ethernet and 4x Infiniband fabrics. For example, a x8 PCI Express link can accommodate the peak bandwidth required for a dual-ported 10-Gbps controller.
• Lower implementation cost. More slots and embedded I/O devices can be connected to the system chip set with fewer bridge chips and fewer signal routing requirements on the system board.
• Lower latency. Transmission latency between I/O devices and the CPU and memory can be reduced by eliminating the PCI-X bridge chip.
Figure 13. Sample Server Architecture
Enabling Future Modular Designs
The PCI SIG is also working on the PCI Express cable specification. Because PCI Express has high data rates and low-pin-count connectors, it is likely to be used as a high-speed interconnect between components in client and server systems. Modular systems with separate high-speed components can be connected with PCI Express cables. Figure 14 illustrates the concept of a “split” system that separates components that generate heat such as processors, memory, and graphics from other components such as removable storage, display devices, and I/O ports. It may also make sense to separate high-end graphics subsystems, which require more power and generate heat, from the main processor chassis. This approach would make it easier to deliver appropriate power and cooling to the graphics subsystem.
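The statement in the server discussion that a x8 PCI Express link can accommodate a dual-ported 10-Gbps controller can be checked against the unencoded rates in Table 2. The sketch below picks the narrowest supported link width that meets a given demand. It is an illustration only: the function name `smallest_link_width` is hypothetical, and it assumes the paper's figures of 500 MB/sec unencoded per lane (both directions combined) and 4 GB/sec total for a dual-ported 10-Gbps controller.

```python
from typing import Optional

SUPPORTED_LANE_WIDTHS = (1, 4, 8, 16)   # first-generation widths named in the text
UNENCODED_GB_PER_SEC_PER_LANE = 0.5     # Table 2: x1 = 500 MB/sec, both directions combined


def smallest_link_width(required_gb_per_sec: float) -> Optional[int]:
    """Return the narrowest supported lane width that meets the demand, or None."""
    for width in SUPPORTED_LANE_WIDTHS:
        if width * UNENCODED_GB_PER_SEC_PER_LANE >= required_gb_per_sec:
            return width
    return None


if __name__ == "__main__":
    # Demands taken from the text, in GB/sec (both directions combined).
    for label, demand in [("single 10-Gbps port", 2.0),
                          ("dual-ported 10-Gbps controller", 4.0)]:
        width = smallest_link_width(demand)
        print(f"{label}: x{width}")
```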
Figure 14. Examples of Split Systems That Separate Processor From I/O
THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT
IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.
Trademarks used in this text: Dell and the DELL logo are trademarks of Dell Inc.; Intel is a registered trademark of Intel Corporation. Other trademarks and trade
names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell Inc. disclaims any proprietary interest
in trademarks and trade names other than its own.