Chapter3 2024 Prev 4in1

Uploaded by

Elias Derese
Copyright
© © All Rights Reserved

Outline

 Introduction
 Bus Interconnection Structures
 Background
 Evolution of Buses
 Operation of the Bus
 Classification of Bus Lines Based on Functions
 Elements of Bus Design
 PCI Bus
 Point‐to‐point Interconnection Structures
 Background
 PCIe


Introduction
 A computer consists of a set of components or modules of three basic types that communicate with each other
   Processor, memory, I/O modules
 All the units must be connected, and the collection of paths connecting the various units is called the
   Interconnection structure
 In effect, interconnection structures are the glue that holds the computer system together

Introduction…
 Connection requirements of the three basic components of a computer system (figure)

Computer Architecture and Organization (CoEng4091) ‐ Tinbit A. 3 Computer Architecture and Organization (CoEng4091) ‐ Tinbit A. 4



Introduction…
 CPU
   Reads instructions and data
   Writes out data (after processing)
   Sends control signals to other units
   Receives (and acts on) interrupts

Introduction…
 Memory
   Receives and sends data
   Receives addresses (of locations)
   Receives control signals
     Read
     Write
     Timing

Introduction…
 Input/Output Module
   Similar to memory from the computer's viewpoint
   Output
     Receive data from computer
     Send data to peripheral
   Input
     Receive data from peripheral
     Send data to computer

Introduction…
 The interconnection structure must support the following types of transfers
   Memory to processor
     The processor reads an instruction or a unit of data from memory
   Processor to memory
     The processor writes a unit of data to memory
   I/O to processor
     The processor reads data from an I/O device via an I/O module
   Processor to I/O
     The processor sends data to the I/O device
   I/O to or from memory
     An I/O module is allowed to exchange data directly with memory, without going through the processor, using direct memory access
 The most common interconnection structures are
   The bus and various multiple-bus structures
     E.g. PCI bus (many PCs), ISA bus (PC/AT), EISA (80386), SCSI bus (PCs and workstations), NuBus (Macintosh), IBM PC bus (PC/XT), Universal Serial Bus (modern PCs), and FireWire (consumer electronics)
   Point-to-point interconnection structures
     E.g. PCI Express (PCIe), QuickPath Interconnect (QPI)


Bus Interconnection Structure - Background
 What is a bus?
   A shared communication pathway connecting multiple (two or more) devices
   Consists of multiple lines, each line capable of transmitting signals representing binary 1 or binary 0
     Allowing parallel movement of information
   Physically, buses are little more than bunches of wires
 Key characteristic of a bus
   It is a shared transmission medium
   Multiple devices connect to the bus, and a signal transmitted by any one device is available for reception by all other devices attached to the bus

Evolution of Buses
 Early personal computers had a single external bus, called the system bus
   Connecting the major components of the computer: CPU, memory, and I/O
 As computer components (CPU, memory, I/O devices) got faster, a single bus could no longer handle the load, hence
   Various types of buses have been proposed
     Each having its own speed and performance characteristics
   A hierarchy of buses is employed, each connecting a subset of the computer components
     Multiple buses laid out in a hierarchy

Evolution of Buses…
 Single Bus Problems
   If a large number of devices are connected to a single bus, computer system performance will be poor, because
     The more devices, the greater the bus length and hence the greater the propagation delay
       Coordination of bus use can adversely affect performance
     The bus becomes a bottleneck as the aggregate data transfer demand approaches the capacity of the bus; possible remedies are
       Increasing the data rate that the bus can carry
       Using a wider bus
   Most early systems used multiple buses, laid out in a hierarchy, to overcome these problems

Evolution of Buses…
 A Traditional Bus Architecture
   A local bus
     Directly connects the CPU to the cache via the cache controller
     The cache controller also connects the cache to the system bus
   A system bus
     Connects main memory, CPU, and some I/O
   An expansion bus
     Ties the system bus to I/O devices


Evolution of Buses…
 An Enhanced Traditional Bus Architecture
   Incorporates a high-speed bus that brings bandwidth-intensive devices into closer integration with the processor via the bridge

Evolution of Buses…
 Architecture of early Pentium buses (mid/late 1990s) (figure)

Evolution of Buses…
 Architecture of Intel PC buses (mid 2000s)
   North bridge (memory controller hub)
     Contains high-bandwidth interfaces, connecting the CPU, memory, and PCIe bus
     Provides a fast communication pathway to the CPU through the front-side bus
     Provides a fast connection to main memory
   South bridge (I/O controller hub)
     Connects to slower I/O buses, such as SATA and USB, that connect slower I/O devices to the computer system
     Contains legacy interfaces and devices:
       ISA bus (audio, LAN), interrupt controller, DMA controller, timer/counter

Evolution of Buses…
 Buses of contemporary PCs (early 2010s)
   Manufacturers continue to move bus control hardware onto the same chip as the CPU
   Intel included the functionality of the memory controller hub on the same chip as the CPU in 2008, and AMD included it in 2011


Evolution of Buses…
 Multicore configurations using QPI (recent) (figure)

Evolution of Buses…
 Bus structure of a Core i7 system (figure)

Bus Terminology
 Bus Transaction
   A sequence of bus operations that includes a request and may include a response, either of which may carry data
   A transaction is initiated by a single request and may take many individual bus operations
   The complete activity of doing either of the following
     Memory read/write, I/O read/write
 Bus Cycle Time
   The time between two consecutive ticks of the bus clock
 Clock Skew
   Difference in propagation time of signals sent on parallel paths
   Drift in the clock; occurs when signals on different lines travel at slightly different speeds
   The longer the bus and the faster the bus clock, the greater the skew

Operations of the Bus
 If one module wishes to send data to another, it must
   Obtain the use of the bus
   Transfer data via the bus
 If one module wishes to request data from another, it must
   Obtain the use of the bus
   Transfer a request to the other module over the appropriate control and address lines
   Wait for the second module to send the data


Operations of the Bus…
 Devices attached to the bus are classified into master and slave categories
   Bus masters are active and initiate bus transfers
   Bus slaves are passive and wait for bus transfer requests
   Memory is always a bus slave
 Possible bus master and slave configurations (figure)

Classification of Bus Lines Based on Functions
 On any bus, the lines can be classified into three functional groups
   Data, address, and control lines

Data Bus
 The data lines provide a path for moving data among system modules
   Remember that there is no difference between "data" and "instruction" at this level
 These lines, collectively, are called the data bus
 The data bus may consist of 32, 64, 128, or more separate lines
   The number of lines is referred to as the width of the data bus
 Width is a key determinant of performance

Address Bus
 Identifies the source or destination of data on the data bus
   If the processor wishes to read a word (8, 16, 32, … bits) of data from memory, it puts the address of the desired word on the address lines
 Address bus width determines the maximum memory capacity of the system
   Example

Intel Processor      Address Bus Width (bits)   Maximum Memory Capacity
8080                 16                         64K
8086                 20                         1M
80286                24                         16M
80386                32                         4G
Pentium 4/Core 2     40                         1T
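The relation behind the address bus table is that n address lines can select 2^n distinct locations. The sketch below reproduces the table's last column under that assumption; the function names `max_addressable` and `human` are illustrative, not part of the slides.

```python
def max_addressable(address_bits: int) -> int:
    """Number of distinct locations selectable with the given number of address lines."""
    return 2 ** address_bits


def human(count: int) -> str:
    """Format a power-of-two count with binary prefixes (K, M, G, T)."""
    for prefix in ("", "K", "M", "G", "T"):
        if count < 1024:
            return f"{count}{prefix}"
        count //= 1024
    return f"{count}P"


# Address bus widths taken from the table above
for cpu, bits in [("8080", 16), ("8086", 20), ("80286", 24),
                  ("80386", 32), ("Pentium 4/Core 2", 40)]:
    print(f"{cpu:18} {bits:2} bits -> {human(max_addressable(bits))}")
```

Each extra address line doubles the addressable range, which is why going from 20 to 24 lines raises the limit from 1M to 16M.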


Control Bus
 Transmits both command and timing information
   Timing signals indicate the validity of data and address information
   Command signals specify operations to be performed
 Controls the access to and the use of the data and address lines
   As the data and address lines are shared by all components, there must be a means of controlling their use

Control Bus…
 Lines forming the control bus can be roughly grouped into the following major categories
   Bus control, interrupts, bus arbitration, coprocessor signaling, status, miscellaneous
 Bus control
   Memory write
     Causes data on the bus to be written into the addressed location
   Memory read
     Causes data from the addressed location to be placed on the bus
   I/O write
     Causes data on the bus to be output to the addressed I/O port
   I/O read
     Causes data from the addressed I/O port to be placed on the bus
   Transfer ACK
     Indicates that data have been accepted from or placed on the bus

Control Bus…
 Interrupts
   Interrupt request
     Indicates that an interrupt is pending
   Interrupt ACK
     Acknowledges that the pending interrupt has been recognized
 Bus arbitration
   Bus request
     Indicates that a module needs to gain control of the bus
   Bus grant
     Indicates that a requesting module has been granted control of the bus
 Coprocessor signaling
 Status
   Clock
     Used to synchronize operations
 Miscellaneous

Elements of Bus Design
 Basic design elements that serve to classify and differentiate buses
   Bus Types
     Dedicated, Multiplexed
   Method of Arbitration
     Centralized, Decentralized
   Timing
     Synchronous, Asynchronous
   Bus Width
     Address, Data
   Data Transfer Types
     Read, Write, Read-modify-write, Read-after-write, Block


Elements of Bus Design…
 Bus Types
   Dedicated Bus
     Functionally dedicated bus
       A bus line that is permanently assigned to one function
       E.g. separate data and address lines
     Physically dedicated bus
       Refers to the use of multiple buses, each of which connects only a subset of modules
       E.g. use of an I/O bus to interconnect all I/O modules
     Advantage
       High throughput, because there is less contention
     Disadvantage
       Increases the size and cost of the system

Elements of Bus Design…
 Bus Types...
   Multiplexed Bus
     A bus that uses the same lines for multiple purposes at different times, using time multiplexing
     E.g. address and data information may be transmitted over the same set of lines using an Address Valid control line (8086)
     Advantage
       Uses fewer lines, which saves space and usually cost
     Disadvantage
       More complex circuitry is needed within each module
       Reduction in performance, because certain events that share the same lines cannot take place in parallel

Elements of Bus Design…
 Method of Arbitration
   In systems with more than one potential bus master, what happens if two or more devices want to become bus master at the same time?
     Solution: some bus arbitration mechanism is needed to prevent chaos
   Potential bus master devices include
     CPU, I/O controllers, coprocessors
   Bus arbitration schemes must
     Provide priority to certain master devices and, at the same time,
     Make sure that low-priority devices are not starved out

Elements of Bus Design…
 Method of Arbitration...
   Broadly classified into
     Centralized
     Decentralized (distributed)
   Centralized
     A single hardware device (bus controller/arbiter) controls bus access (allocating time on the bus)
     It may be part of the processor or a separate module
     Further divided into: daisy-chain arbitration, centralized parallel arbitration
   Distributed
     There is no central controller
     Each module contains access control logic
     The modules act together to share the bus
     Further divided into: distributed arbitration using self-selection, distributed arbitration using token passing, distributed arbitration using collision detection


Elements of Bus Design…
 Centralized Method of Arbitration
   Daisy-Chain Arbitration
     The bus request line can be asserted by one or more devices at any time
     When the arbiter sees a bus request, it issues a grant by asserting the bus grant line. This line is wired through all of the I/O devices in series
     Devices are effectively assigned priorities depending on how close to the arbiter they are. The closest device wins

Elements of Bus Design…
 Centralized Method of Arbitration...
   Centralized Parallel Arbitration
     Uses multiple request/grant lines, one for each priority level
     Addresses the implicit, distance-based priorities of daisy-chain arbitration
     However, the grant line is still daisy-chained among devices of the same priority level
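The daisy-chain scheme can be sketched in a few lines. This is a simplified model, not a hardware description: it assumes position 0 is the device electrically closest to the arbiter, and the function name `daisy_chain_grant` is made up for illustration.

```python
def daisy_chain_grant(requests):
    """Return the position of the device that keeps the bus grant, or None.

    requests[i] is True when device i wants the bus; position 0 is assumed
    to be the device closest to the arbiter.
    """
    if not any(requests):
        return None  # no bus request asserted, so the arbiter issues no grant
    for position, wants_bus in enumerate(requests):
        if wants_bus:
            # The first requesting device absorbs the grant and does not
            # pass it further down the chain.
            return position
    return None


print(daisy_chain_grant([False, True, True]))  # device 1 wins over device 2
print(daisy_chain_grant([True, False, True]))  # device 0 wins
```

The model also makes the starvation risk visible: as long as a device near the arbiter keeps requesting, devices further down the chain never see the grant.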

Elements of Bus Design…
 Distributed Method of Arbitration
   Distributed Arbitration using Self-Selection
     Each device has its own request line, which is prioritized
     All devices monitor all the request lines, so at the end of each bus cycle
       The devices themselves determine who has the highest priority and who is permitted to use the bus during the next cycle
     Requires more bus lines but avoids the potential cost of the arbiter
     Limits the number of devices to the number of request lines

Elements of Bus Design…
 Distributed Method of Arbitration...
   Distributed Arbitration using Token Passing
     Uses only three lines, no matter how many devices are present
     The BUSY line is asserted by the current bus master
     The arbitration line is daisy-chained through all the devices and passes (grants) or denies the token
     A device holding the token has been given exclusive access to the bus
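Self-selection works because every device applies the same rule to the same request lines, so all of them reach the same answer without an arbiter. A minimal sketch follows, assuming (this is not stated in the slides) that a higher line index means higher priority; `self_select` is a hypothetical name.

```python
def self_select(request_lines):
    """Decision rule that every device evaluates independently.

    request_lines[i] is True when device i asserts its own request line.
    Assumption for this sketch: a higher index means a higher priority.
    """
    requesting = [i for i, asserted in enumerate(request_lines) if asserted]
    return max(requesting) if requesting else None


lines = [True, False, True, False]  # devices 0 and 2 request the bus
# Each of the four devices reads the same lines and computes the same winner.
decisions = [self_select(lines) for _device in range(len(lines))]
print(decisions)
```

Because the list of request lines is fixed in width, the sketch also shows why the number of devices is limited to the number of request lines.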


Elements of Bus Design…
 Distributed Method of Arbitration...
   Distributed Arbitration using Collision Detection
     Each device is allowed to make a request for the bus
     If the bus detects any collisions (multiple simultaneous requests), the device must make another request
     Much like the old Ethernet method of arbitration

Elements of Bus Design…
 Timing
   Refers to the way in which events are coordinated on the bus
   Buses use either synchronous timing or asynchronous timing
   Synchronous Bus
     Occurrence of events on the bus is determined by a master bus clock
     All bus activities take an integral number of bus clock cycles (bus cycles)
     All devices on the bus can read the clock line
     All events start at the beginning of a clock cycle; an event usually takes a single bus cycle
     Drawbacks
       Everything works in multiples of the bus cycle
       When a heterogeneous collection of devices, some fast and some slow, is located on the bus, the bus has to be geared to the slowest one and the fast ones cannot use their full potential
       Difficult to take advantage of future improvements in technology
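The first drawback of a synchronous bus can be shown with a small calculation: an operation that needs slightly more than one bus cycle still occupies two whole cycles. The numbers and the function name `sync_transfer` below are illustrative assumptions, not values from the slides.

```python
import math


def sync_transfer(device_time_ns: int, bus_cycle_ns: int):
    """Return (cycles used, time occupied on the bus in ns) on a synchronous bus.

    Every activity is rounded up to a whole number of bus cycles.
    """
    cycles = math.ceil(device_time_ns / bus_cycle_ns)
    return cycles, cycles * bus_cycle_ns


print(sync_transfer(10, 10))  # fits exactly in one cycle
print(sync_transfer(12, 10))  # needs 12 ns but occupies two full cycles
```

In the second case 8 ns of bus time is wasted, which is the cost an asynchronous handshake avoids.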

Elements of Bus Design…
 Timing...
   Synchronous Bus…
     A simplified timing diagram for synchronous read and write (figure)

Elements of Bus Design…
 Timing...
   Asynchronous Bus
     Control lines coordinate the bus operations/transactions, and a complex handshaking protocol is used to enforce timing
     It does not tie everything to the clock
     Each event is caused by a prior event, not by a clock pulse
       The occurrence of one event on the bus follows and depends on the occurrence of a previous event
     If a particular master/slave pair is slow, a subsequent, much faster master/slave pair is not affected
     Scales better with technology and can support a wider variety of devices (because protocols, not the clock, coordinate transactions)


Elements of Bus Design…
 Timing...
   Asynchronous Bus…
     A simplified timing diagram for asynchronous write (figure)

Elements of Bus Design…
 Bus Width
   Has an impact on system performance
   The wider the data bus
     The greater the number of bits transferred at one time
     Example

Intel Processor      Data Bus Width (bits)
8080                 8
8086                 16
80286                16
80386                32
Pentium 4/Core 2     64

   The wider the address bus
     The greater the range of locations that can be referenced

Elements of Bus Design…
 Data Transfer Types / Bus Operation Types
   Write
     Master to slave
   Read
     Slave to master
   Read-modify-write
     A read followed immediately by a write to the same address
     An indivisible operation, to prevent any access to the data element by other potential bus masters
   Read-after-write
     An indivisible operation consisting of a write followed immediately by a read from the same address
     The read operation may be performed for checking purposes
   Block
     One address cycle is followed by n data cycles
     The first data item is transferred to or from the specified address; the remaining data items are transferred to or from subsequent addresses

PCI Bus
 Introduced by Intel in 1993 to succeed older buses such as ISA and EISA
 An upgrade to the older ISA bus with higher speeds and more bits transferred in parallel
   Can be configured as a 32-bit or 64-bit bus
   Successive generations operate at 33 MHz and 66 MHz
 Extended by PCI-X (PCI eXtended), first released in the late 1990s
   PCI-X basically doubled the bandwidth of regular PCI
     Operates at 133 MHz
 Every Intel-based computer since the Pentium has a PCI bus
 The PCI bus can be used in many configurations (figure)
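The widths and clock rates listed for PCI and PCI-X translate into peak bandwidth by a simple product. The sketch below assumes one transfer per clock and ignores arbitration, address phases, and wait states, so the results are upper bounds; the function name is illustrative.

```python
def peak_bandwidth_mb_s(width_bits: int, clock_mhz: float) -> float:
    """Peak transfer rate in MB/s, assuming one full-width transfer per clock."""
    return width_bits / 8 * clock_mhz


for name, width, clock in [("PCI 32-bit @ 33 MHz", 32, 33),
                           ("PCI 64-bit @ 66 MHz", 64, 66),
                           ("PCI-X 64-bit @ 133 MHz", 64, 133)]:
    print(f"{name:24} {peak_bandwidth_mb_s(width, clock):7.0f} MB/s")
```

With the nominal 33 MHz clock the first line gives 132 MB/s; the figure is often quoted as 133 MB/s because the actual clock is 33.33 MHz. The last two lines show the doubling from 64-bit PCI at 66 MHz to PCI-X at 133 MHz.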


PCI Bus…
 PCI Bus Operation
   A synchronous bus, using centralized arbitration
   Multiplexed address and data lines, to keep the pin count low
   A slave can insert "wait states" when it is not ready to supply the requested data, by activating the appropriate control line
   Different kinds of bus cycles are possible
     Block transfers, …

PCI Bus…
 PCI Bus Signals
   Divided into mandatory (49 signals) and optional (51 signals)
   The signals can be divided into functional groups
 Mandatory (Functional Groups)
   System pins
     Include the clock and reset pins
   Address and data pins
     Include 32 lines, time multiplexed for addresses and data
     Other lines in this group are used to interpret and validate the signal lines that carry the addresses and data
   Interface control pins
     Control the timing of transactions and provide coordination among initiators and targets
   Arbitration pins
     Each PCI master has its own pair of arbitration lines that connect it directly to the PCI bus arbiter
   Error reporting pins
     Used to report parity and other errors

PCI Bus…
 Mandatory (Functional Groups) (figure)

PCI Bus…
 Optional (Functional Groups)
   Interrupt pins
     For PCI devices that must generate interrupt requests for service
     Each PCI device has its own interrupt line or lines to an interrupt controller
   Cache support pins
     Needed to support a memory on PCI that can be cached in the processor or another device
   64-bit bus extension pins
     Needed to support 64-bit data transfers
   JTAG/boundary scan pins
     To support testing procedures


PCI Bus…
 Optional (Functional Groups)… (figure)

PCI Bus…
 PCI Commands
   When a bus master acquires control of the bus, it determines the type of transaction that will occur next
   The C/BE lines (4 bits wide) are used to signal the transaction type during the address phase of the transaction
   The commands are as follows
     Memory Read/Read Line/Read Multiple
     Memory Write/Write and Invalidate
     I/O Read/Write
     Interrupt Acknowledge
     Special Cycle
     Configuration Read/Write
     Dual Address Cycle
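Since the C/BE lines are 4 bits wide, each command corresponds to one of 16 codes. The lookup below is a sketch of how a target might decode them; the specific bit patterns are not given in the slides and follow the commonly published PCI command table, so they should be checked against the PCI specification before being relied on.

```python
# 4-bit C/BE command codes (assumed from the commonly published PCI command
# table; unlisted codes are treated as reserved).
PCI_COMMANDS = {
    0b0000: "Interrupt Acknowledge",
    0b0001: "Special Cycle",
    0b0010: "I/O Read",
    0b0011: "I/O Write",
    0b0110: "Memory Read",
    0b0111: "Memory Write",
    0b1010: "Configuration Read",
    0b1011: "Configuration Write",
    0b1100: "Memory Read Multiple",
    0b1101: "Dual Address Cycle",
    0b1110: "Memory Read Line",
    0b1111: "Memory Write and Invalidate",
}


def decode_command(cbe: int) -> str:
    """Map the value seen on the C/BE lines during the address phase to a command name."""
    if not 0 <= cbe <= 0b1111:
        raise ValueError("C/BE carries only 4 bits")
    return PCI_COMMANDS.get(cbe, "Reserved")


print(decode_command(0b0110))
print(decode_command(0b0100))
```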

PCI Bus…
 PCI Bus Transactions
   Example: timing diagram of 32-bit PCI bus transactions. The first three cycles are used for a read operation, and the next three cycles for a write operation (figure)

Point-to-point Interconnection Structures - Background
 The bus was the dominant means of computer system component interconnection for decades
   For general-purpose computers, it has gradually given way to various point-to-point interconnection structures
 Reasons why the bus did not rise to the challenge
   Many I/O devices became too fast for the PCI bus
     Increasing the bus clock frequency further is not a solution, due to electrical constraints
       Problems with bus skew, crosstalk between the wires, and capacitance effects just get worse
   At higher and higher data rates, it becomes increasingly difficult to perform the synchronization and arbitration functions in a timely fashion
   The advent of multicore chips, with multiple processors and significant memory on a single chip
     Use of a conventional shared bus on the same chip magnified the difficulties of increasing bus data rate and reducing bus latency to keep up with the processors


Point-to-point Interconnection Structures - Background
 Point-to-point interconnection structures use a high-speed serial connection
   As there is no clock skew, a serial connection can run at a much higher speed, which more than offsets the loss of parallelism
 Popular point-to-point interconnection structures include
   PCIe (PCI Express), QPI (Intel's QuickPath Interconnect)
 Key characteristics of point-to-point interconnect schemes
   Multiple direct connections
     Multiple components within the system enjoy direct pairwise connections to other components
     This eliminates the need for arbitration found in shared transmission systems
   Layered protocol architecture
     The processor-level interconnects use a layered protocol architecture, as in TCP/IP-based data networks, rather than the control signals found in shared bus arrangements
   Packetized data transfer
     Data are sent not as a raw bit stream but as a sequence of packets
     Each packet includes control headers and error control codes

Point-to-point Interconnection - PCIe
 Introduced in 2004 to replace PCI and its successor PCI-X
   Progressively replaced the AGP (Accelerated Graphics Port) graphics interface designed by Intel specifically for 3D graphics; PCIe generations are listed in the table below
 Represents a radical change from the PCI bus
   In fact, it is not even a bus at all
   It is a point-to-point network using bit-serial lines and packet switching, more like the Internet than like a traditional bus

PCIe Generation   Year Introduced   Transfer Rate   Effective one-way data rate (per lane)
1.0a              2003              2.5 GT/s        250 MB/s
2.0               2007              5 GT/s          500 MB/s
3.0               2010              8 GT/s          985 MB/s
4.0               2017              16 GT/s         1.969 GB/s
5.0               2019              32 GT/s         3.938 GB/s
6.0               2022              64 GT/s         7.563 GB/s
7.0               2025 (planned)    128 GT/s        15.125 GB/s
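The effective data rates in the PCIe table follow from the raw transfer rate and the line encoding overhead. The sketch below reproduces generations 1.0a through 5.0; it relies on a fact not stated in the slides, namely that generations 1.x and 2.0 use 8b/10b encoding while 3.0 to 5.0 use 128b/130b. Generations 6.0 and 7.0 use a different signaling and framing scheme and are not modeled here.

```python
# (payload bits, transmitted bits) for each line code
ENCODING = {"8b/10b": (8, 10), "128b/130b": (128, 130)}


def effective_mb_per_s(gt_per_s: float, encoding: str) -> float:
    """One-way payload rate of a single lane in MB/s."""
    payload_bits, line_bits = ENCODING[encoding]
    payload_bits_per_s = gt_per_s * 1e9 * payload_bits / line_bits
    return payload_bits_per_s / 8 / 1e6  # bits -> bytes -> megabytes


for gen, rate, code in [("1.0a", 2.5, "8b/10b"), ("2.0", 5, "8b/10b"),
                        ("3.0", 8, "128b/130b"), ("4.0", 16, "128b/130b"),
                        ("5.0", 32, "128b/130b")]:
    print(f"PCIe {gen:4} {effective_mb_per_s(rate, code):8.0f} MB/s per lane")
```

The jump from 500 MB/s to 985 MB/s between generations 2.0 and 3.0, despite the transfer rate rising only from 5 to 8 GT/s, comes from the lower overhead of 128b/130b.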

PCIe…
 Sample architecture of a PCIe system with three PCIe ports (figure)
 Typical configuration of a multi-computer using PCIe (figure)
 As depicted in the two figures, a PC with PCIe is a miniature packet-switching network

PCIe Architecture
 A root complex device (a chipset / a host bridge)
   Connects the processor and memory subsystem to the PCI Express switch fabric comprising one or more PCIe and PCIe switch devices
   Acts as a buffering device
     To deal with differences in data rates between I/O controllers and memory and processor components
   Translates between PCIe transaction formats and the processor and memory signal and control requirements
   Typically supports multiple PCIe ports, each of which can be attached to one of the following devices that implement PCIe: switch, PCIe endpoint, PCIe/PCI bridge, legacy endpoint
   Physically, it is on the motherboard
 Switch
   Manages multiple PCIe streams coming from PCIe endpoints, legacy endpoints, and PCIe/PCI bridges
   Could be connected to the root complex, be part of it, or be integrated directly into the processor
 PCIe endpoint
   An I/O device or controller that implements PCIe
   E.g. a Gigabit Ethernet switch, a graphics or video controller, a disk interface, or a communications controller
 PCIe/PCI bridge
   Allows older PCI devices to be connected to PCIe-based systems


PCIe Communications
 Each I/O chip has a dedicated point-to-point connection to the switch
 Each connection consists of a pair of unidirectional channels, one to the switch and one from it
   This pair is called a lane (a bidirectional lane)
   Each channel is made up of two wires
     One for the signal and one for ground, to provide high noise immunity during high-speed transmission
 Devices are not limited to a single bidirectional lane to communicate with the root complex or a switch
   A device can have up to 32 lanes, which are not synchronous, so skew is not important
 When the CPU wants to talk to a device, it sends a packet to the device and generally later gets an answer. The packet goes through the root complex, which is on the motherboard, and then on to the device, possibly through a switch (or, if the device is a PCI device, through the PCIe/PCI bridge)
   This evolution from a system in which all devices listened to the same bus to one using point-to-point communications parallels the development of Ethernet (a popular local area network)

PCIe Protocol Stack…
 The data packets generated and consumed by the transaction layer and the data link layer are, respectively,
   TLPs: Transaction Layer Packets
   DLLPs: Data Link Layer Packets

PCIe Protocol Stack…
 Physical Layer
   Deals with moving bits from a sender to a receiver over a point-to-point connection
   Consists of the actual wires carrying the signals, as well as circuitry and logic to support ancillary features required in the transmission and receipt of the 1s and 0s
   Each PCIe port
     Consists of a number of bidirectional lanes
     Can provide 1, 2, 4, 8, 12, 16, or 32 lanes
   PCIe relies on the receiver synchronizing with the transmitter based on the transmitted signal (figure: (a) transmitter block diagram and (b) receiver block diagram)
     As it does not use a clock to synchronize bit streams
   PCIe 3.0 uses the following techniques to aid in synchronization
     Multilane distribution (figure)
       Example: distributing a byte stream into a PCIe port of four lanes
     Scrambling
     128b/130b encoding

PCIe Protocol Stack…
 Link Layer
   Responsible for ensuring reliable transmission and flow control across the PCIe link
     Has a provision for verification and retransmission of data sent over the PCIe link
     Has a mechanism that keeps a fast sender from overwhelming a slow receiver with floods of packets
   Data packets generated and consumed by the DLL are called Data Link Layer Packets (DLLPs)
   There are three important groups of DLLPs used in managing a link
     Flow control packets
       Regulate the rate at which TLPs and DLLPs can be transmitted across a link
     Power management packets
       Used in managing platform power budgeting
     TLP ACK and NAK packets
       Used for ensuring reliable transmission
       Used in TLP processing
   Also adds two fields to the core of the TLP created by the TL
     A 16-bit sequence number and an error-detecting code (32-bit link-layer CRC, LCRC)
     The two fields are processed at each intermediate node on the way from source to destination
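Multilane distribution can be pictured as dealing bytes to the lanes in round-robin order. The sketch below models only that striping step for a four-lane port; scrambling and 128b/130b framing are left out, and the function names `distribute` and `reassemble` are illustrative.

```python
def distribute(data: bytes, lanes: int):
    """Deal the byte stream round-robin: byte k goes to lane k mod lanes."""
    return [data[i::lanes] for i in range(lanes)]


def reassemble(lane_data) -> bytes:
    """Interleave the per-lane byte streams back into the original order."""
    lanes = len(lane_data)
    out = bytearray(sum(len(chunk) for chunk in lane_data))
    for i, chunk in enumerate(lane_data):
        out[i::lanes] = chunk  # lane i supplies bytes i, i+lanes, i+2*lanes, ...
    return bytes(out)


striped = distribute(b"ABCDEFGH", 4)
print(striped)
print(reassemble(striped))
```

With four lanes, each lane carries a quarter of the bytes, so the port moves the stream roughly four times as fast as a single lane would.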


PCIe Protocol Stack…
 Transaction Layer
   Receives read and write requests from the software layer and creates request packets for transmission to a destination via the link layer
     Handles bus actions
   Data packets generated and consumed by the TL are called Transaction Layer Packets (TLPs)
     PCIe transactions are conveyed using TLPs
   PCIe transactions can be
     Split transactions: a request packet that will be followed at a later time by a completion packet; i.e. a response is expected
     Posted transactions: no response is expected
   TLPs originate at the sending device and terminate at the receiving device
   A TLP consists of the following fields
     Header: describes the type of packet and includes the information needed by the receiver to process the packet, including any needed routing information
     Data: up to 4096 bytes; some TLPs contain no data field
     ECRC: end-to-end CRC field, for the destination TL

PCIe Protocol Stack…
 Software Layer
   Generates read/write requests that are transported by the TL to the I/O devices using a packet-based transaction protocol
   Sends to the TL the information needed to create the core of the TLP
     Header, Data, and ECRC
   Interfaces the PCIe system to the OS, emulating the PCI bus (so that an existing OS can run unmodified on PCIe)

PCIe Protocol Stack…
 Transaction Layer Packet Processing
   When the TLP arrives at the device, the DLL
     Strips off the sequence number and LCRC fields
     Checks the LCRC field
   If NO ERROR is detected
     The DLL sends an ACK packet back to the transmitter at the other end of the link
     The core portion of the TLP is handed up to the local TL
       If the receiving node is the intended final destination, the local TL processes the TLP
       If not, the TL determines a route for the TLP and passes the packet back down to the DLL for transmission over the next link on the way to the destination
       The DLL retains a copy of the TLP, which will be discarded from the buffer upon reception of an ACK DLLP from the subsequent node
   If an ERROR is detected
     The DLL schedules a NAK DLLP to return to the remote transmitter, and the TLP is discarded
     Upon receiving a NAK DLLP for its TLP with the right sequence number, the remote transmitter retransmits the TLP
   Note:
     The core fields created at the TL are only used at the destination TL
     But the two fields added by the DLL to the TLP are processed at each intermediate node on the way from source to destination

End of Chapter 3
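The ACK/NAK processing at the data link layer can be sketched as a toy simulation. Several things here are assumptions made for illustration only: `zlib.crc32` stands in for the real LCRC, the frame layout is simplified to a 2-byte sequence number plus a 4-byte CRC, and `flip_first_try` is a hypothetical link that corrupts one bit on the first attempt.

```python
import zlib


def dll_wrap(seq: int, tlp_core: bytes) -> bytes:
    """Sender DLL: add a sequence number and a CRC around the TLP core."""
    header = seq.to_bytes(2, "big")
    lcrc = zlib.crc32(header + tlp_core).to_bytes(4, "big")  # stand-in for LCRC
    return header + tlp_core + lcrc


def dll_receive(frame: bytes):
    """Receiver DLL: strip both fields, check the CRC, answer ACK or NAK."""
    header, core, lcrc = frame[:2], frame[2:-4], frame[-4:]
    if zlib.crc32(header + core).to_bytes(4, "big") != lcrc:
        return "NAK", None  # error detected: the TLP is discarded
    return "ACK", core      # no error: the core is handed up to the local TL


def flip_first_try(frame: bytes, attempt: int) -> bytes:
    """Hypothetical noisy link: corrupts one payload bit on the first attempt."""
    if attempt == 1:
        return frame[:2] + bytes([frame[2] ^ 0x01]) + frame[3:]
    return frame


def transmit(seq: int, tlp_core: bytes, link, max_tries: int = 3):
    """Sender keeps a copy of the frame and resends it until an ACK arrives."""
    retry_copy = dll_wrap(seq, tlp_core)
    for attempt in range(1, max_tries + 1):
        reply, core = dll_receive(link(retry_copy, attempt))
        if reply == "ACK":
            return attempt, core  # the retained copy can now be discarded
    raise RuntimeError("no ACK received")


print(transmit(7, b"read request", flip_first_try))
```

The first attempt is rejected with a NAK because of the flipped bit, and the retained copy is delivered intact on the second attempt.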


Buses - Physical Appearance (figure)
Single Bus System (figure)
Expansion Bus - ISA Bus (figure)
A System with PCI and ISA Bus (figure)
PCI Bus Based System (figure)
PCI Bus Based System… (figure)
PCI Bus Based System… (figure)
Core i7 Chip with Bus Control Hardware (figure)
Latest Intel Core i7 Based Systems (figure)
Latest Intel Core i7 Based Systems… (figure)
Latest Intel Core i7 Based Systems… (figure)
Motherboard Layout (figure)
Motherboard Expansion Slots (figure)
Motherboard Layout… (figure)
Motherboard Layout… (figure)
Various Buses Bandwidth Comparison (figure)
