Pcie 3.0

Overview of the physical layer, which contains the following: 1. MAC (Media Access Control) 2. PCS (Physical Coding Sublayer) 3. PMA (Physical Media Attachment Sublayer).

Uploaded by

Nagaraju Chakali

PCIe 3.

Aim: To develop an RTL design for the Physical Coding Sublayer (PCS) in the physical layer of
PCIe 3.0 2nd Generation.
Software: Questasim 10.7.
Overview of Physical Layer:
Physical layer of PCI Express consists of three sublayers as shown in the figure.
1. MAC (Media Access Control)
2. PCS (Physical Coding Sublayer)
3. PMA (Physical Media Attachment Sublayer).
The MAC comprises the Link Training and Status State Machine (LTSSM) and lane-to-lane
de-skew logic.
The PCS comprises the 8b/10b encoder/decoder, Rx detection, and the elastic buffer.
The PMA comprises the analog buffers, the SerDes, and the 10-bit interface.

Fig: Block Diagram of Physical Layer of PCI Express


Physical Coding Sublayer:
The PCS block diagram is shown in the figure. It has transmitter and receiver blocks. The
transmitter consists of a scrambler, an 8b/10b encoder and a serializer. The receiver consists
of a de-scrambler, an 8b/10b decoder and a de-serializer.

Fig: Physical Coding Sublayer Block Diagram.

Transmitter:

Scrambler:
• In telecommunications, a scrambler is a device that transposes, or inverts signals or
otherwise encodes a message at the sender's side to make the message unintelligible at a
receiver not equipped with an appropriately set descrambling device.

• Whereas encryption usually refers to operations carried out in the digital domain,
scrambling usually refers to operations carried out in the analog domain.

• Scrambling is accomplished by the addition of components to the original signal or the
changing of some important component of the original signal to make extraction of the
original signal difficult.
• A scrambler (also referred to as a randomizer) is a device that manipulates a data stream
before transmitting. The manipulations are reversed by a descrambler at the receiving side.
Scrambling is widely used in satellite, radio relay communications and PSTN modems.

• A scrambler replaces sequences with other sequences without removing undesirable
sequences; as a result, it changes the probability of occurrence of those undesirable
sequences.

There are two main reasons why scrambling is used:

• To enable accurate timing recovery on receiver equipment without resorting to redundant
line coding. It facilitates the work of a timing recovery circuit (see also clock recovery), an
automatic gain control and other adaptive circuits of the receiver (eliminating long
sequences consisting of '0' or '1' only).

• For energy dispersal on the carrier, reducing inter-carrier signal interference. It eliminates
the dependence of a signal's power spectrum upon the actual transmitted data, making it
more dispersed to meet maximum power spectral density requirements. If the power is
concentrated in a narrow frequency band, it can interfere with adjacent channels through
intermodulation (also known as cross-modulation) caused by non-linearities in the receive
path.

A scrambler (or randomizer) can be either:

• An algorithm that converts an input string into a seemingly random output string of the
same length (e.g., by pseudo-randomly selecting bits to invert), thus avoiding long
sequences of bits of the same value; in this context, a randomizer is also referred to as a
scrambler.

• An analog or digital source of unpredictable (i.e., high entropy), unbiased, and usually
independent (i.e., random) output bits. A "truly" random generator may be used to feed a
(more practical) deterministic pseudo-random number generator, which extends the random
seed value.

• Scramblers are essential components of the physical layer. They are usually defined based on
linear-feedback shift registers (LFSRs) due to their good statistical properties and ease of
implementation in hardware.

• PCI Express employs a technique called data scrambling to reduce the possibility of
electrical resonances on the link.
• The PCI Express specification defines a scrambling/descrambling algorithm that is
implemented using a linear feedback shift register.
• PCI Express accomplishes scrambling or descrambling by serially XORing the data with the
output of a Linear Feedback Shift Register that is synchronized between PCI Express
devices.
• In the current work, the polynomial used for the scrambler is x^5 + x^3 + 1.

Fig: 8-bit Scrambler.

Note: In the polynomial expression, the powers 3 and 5 indicate that the feedback XOR gate
takes its inputs from the outputs of the 3rd and 5th D flip-flops.
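
The serial XOR described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the RTL of this work: the class name `Lfsr5`, the all-ones seed, the use of flip-flop 5 as the key-stream output and the LSB-first bit order are all choices made for the example.

```python
class Lfsr5:
    """5-stage shift register with feedback from flip-flops 3 and 5."""

    def __init__(self, seed=0b11111):  # assumed seed; any non-zero 5-bit value works
        if not 0 < seed < 32:
            raise ValueError("seed must be a non-zero 5-bit value")
        # state[0] models flip-flop 1, state[4] models flip-flop 5
        self.state = [(seed >> i) & 1 for i in range(5)]

    def step(self):
        """Advance one clock and return the key-stream bit."""
        key_bit = self.state[4]                    # assumed output tap: flip-flop 5
        feedback = self.state[2] ^ self.state[4]   # XOR of flip-flops 3 and 5
        self.state = [feedback] + self.state[:4]   # shift towards flip-flop 5
        return key_bit


def scramble_byte(lfsr, byte):
    """XOR each bit of `byte` (LSB first, an assumed order) with the LFSR output."""
    result = 0
    for i in range(8):
        result |= (((byte >> i) & 1) ^ lfsr.step()) << i
    return result


# Transmitter and receiver start from the same seed, so their LFSRs stay in step.
tx, rx = Lfsr5(), Lfsr5()
data = [0x00, 0xFF, 0x6A]
scrambled = [scramble_byte(tx, b) for b in data]
recovered = [scramble_byte(rx, b) for b in scrambled]
print(recovered == data)
```

Because the operation is a plain XOR with a key stream, running the same routine a second time with an identically seeded LFSR restores the original bytes.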

8b/10b Encoder:
• The scrambled 8-bit data is given to the encoder.

• Each Lane of a device's transmitter implements an 8-bit to 10-bit encoder that encodes
8-bit data or control characters into 10-bit symbols. The coding scheme was invented by IBM
in 1982.
Purpose of Encoding a Character Stream:

• The primary purpose of this scheme is to embed a clock into the serial bit stream
transmitted on all Lanes. No clock is therefore transmitted along with the serial data bit
stream.

• This eliminates the need for a high frequency 2.5GHz clock signal on the Link which
would generate significant EMI noise and would be a challenge to route on a standard
FR4 board.

• Link wire routing between two ports is much easier given that there is no clock to route,
removing the need to match clock length to Lane signal trace lengths. Two devices are
connected by simply wiring their Lanes together.

Below is a summary of the advantages of 8b/10b encoding scheme:

• Embedded Clock. Creates sufficient 0-to-1 and 1-to-0 transition density (i.e.,
signal changes) to facilitate re-creation of the receive clock on the receiver
end using a PLL (by guaranteeing a limited run length of consecutive ones or
zeros). The recovered receive clock is used to clock inbound 10-bit symbols
into an elastic buffer. The figure illustrates the example case wherein 00h is
converted to 1101000110b: an 8-bit character with no transitions has
5 transitions when converted to a 10b symbol. These transitions keep the
receiver PLL synchronized to the transmit circuit clock:

i) Limited 'run length' means that the encoding scheme ensures the signal line
will not remain in a high or low state for an extended period. The run length
does not exceed five consecutive 1s or 0s.

ii) 1s and 0s are clocked out on the rising edge of the transmit clock. At the
receiver, a PLL can recreate the clock by syncing to the leading edges of 1s
and 0s.
iii) Limited run length ensures minimum frequency drift in the receiver's PLL
relative to the local clock in the transmit circuit.

Fig: Example of 8-bit Character of 00h Encoded to 10-bit Symbol

• DC Balance. Keeps the number of 1s and 0s transmitted as close to equal as
possible, thus maintaining DC balance on the transmitted bit stream to an
average of half the signal threshold voltage.
• This is very important in capacitive- and transformer-coupled circuits.

• It maintains a balance between the number of 1s and 0s on the signal line,
thereby ensuring that the received signal is free of any DC component. This
reduces the possibility of inter-bit interference. Inter-bit interference results
from the inability of a signal to switch properly from one logic level to the other
because the Lane coupling capacitor or intrinsic wire capacitance is over-charged.
• Encoding of Special Control Characters: Permits the encoding of special
control ('K') characters such as the Start and End framing characters at the start
and end of TLPs and DLLPs.
• Error Detection: A secondary benefit of the encoding scheme is that it
facilitates the detection of most transmission errors. A receiver can check for
'running disparity' errors, or the reception of invalid symbols. Via the running
disparity mechanism, the data bit stream transmitted maintains a balance of 1s
and 0s. The receiver checks the difference between the total number of 1s and
0s transmitted since link initialization and ensures that it is as close to zero as
possible. If it isn't, a disparity error is detected and reported, implying that a
transmission error occurred.
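
As a quick check of the embedded-clock example, the transition count and the longest run of a bit pattern can be computed directly. The helper names `count_transitions` and `longest_run` are illustrative only and not part of any specification.

```python
def count_transitions(bits):
    """Number of 0-to-1 and 1-to-0 changes between neighbouring bits."""
    return sum(a != b for a, b in zip(bits, bits[1:]))


def longest_run(bits):
    """Length of the longest run of identical consecutive bits."""
    if not bits:
        return 0
    longest = current = 1
    for prev, cur in zip(bits, bits[1:]):
        current = current + 1 if cur == prev else 1
        longest = max(longest, current)
    return longest


character = "00000000"   # 00h before encoding
symbol = "1101000110"    # the 10-bit pattern quoted in the text for 00h

print(count_transitions(character), longest_run(character))  # 0 8
print(count_transitions(symbol), longest_run(symbol))        # 5 3
```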

The disadvantage of the 8b/10b encoding scheme is that each 8-bit character expands into a
10-bit symbol prior to transmission. Two extra bits are sent for every eight bits of data, so
the transmission overhead is 25%, and only 80% of the raw bit rate carries character
data.
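
The arithmetic behind the overhead can be written out as a short sketch. The 2.5 Gb/s line rate used below is an assumption taken from the 2.5 GHz figure mentioned earlier; the function names are illustrative.

```python
DATA_BITS = 8      # bits in each character
SYMBOL_BITS = 10   # bits in each transmitted symbol


def overhead_ratio():
    """Extra bits sent per data bit: (10 - 8) / 8."""
    return (SYMBOL_BITS - DATA_BITS) / DATA_BITS


def payload_rate(line_rate_bps):
    """Character data rate left after 8b/10b expansion of the raw line rate."""
    return line_rate_bps * DATA_BITS / SYMBOL_BITS


print(overhead_ratio())       # 0.25
print(payload_rate(2.5e9))    # 2000000000.0
```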

Properties of 10-bit (10b) Symbols:

• For 10-bit symbol transmissions, the average number of 1s transmitted over
time is equal to the number of 0s transmitted, no matter what the 8-bit character
to be transmitted is; i.e., the symbol transmission is DC balanced.
• The bit stream never contains more than five continuous 1s or 0s (limited-run length).

• Each 10-bit symbol contains:

- Four 0s and six 1s (not necessarily contiguous), or

- Six 0s and four 1s (not necessarily contiguous), or

- Five 0s and five 1s (not necessarily contiguous).

• Each 10-bit symbol is subdivided into two sub-blocks: the first is six bits wide
and the second is four bits wide.

- The 6-bit sub-block contains no more than four 1s or four 0s.

- The 4-bit sub-block contains no more than three 1s or three 0s.


• Any symbol with other than the above properties is considered invalid, and a
receiver considers this an error.
• An 8-bit character is submitted to the 8b/10b encoder along with a signal
indicating whether the character is a Data (D) or Control (K) character. The
encoder outputs the equivalent 10-bit symbol along with a current running
disparity (CRD) that represents the balance of 1s and 0s transmitted on this link
since link initialization.
• The PCI Express specification defines Control characters that encode into the
following Control symbols: STP, SDP, END, EDB, COM, PAD, SKP, FTS,
and IDL.

Preparing 8-bit Character Notation:

8b/10b conversion lookup tables refer to all 8-bit characters using a special notation
(represented by Dxx.y for Data characters and Kxx.y for Control characters). The figure
illustrates the notation equivalent for any 8-bit D or K character. Below are the
steps to convert an 8-bit number to its notation equivalent.

Figure: Preparing 8-bit Character for Encode

• In the figure, the example character is the Data character 6Ah.

1. The bits in the character are identified by the capitalized alpha designators A through H.
2. The character is partitioned into two sub-blocks: one 3 bits wide and the other 5 bits
wide.
3. The two sub-blocks are flipped.
4. The character takes the written form Zxx.y, where:
   - Z = D or K for Data or Control,
   - xx = the decimal value of the 5-bit field,
   - y = the decimal value of the 3-bit field.
5. The example character is represented as D10.3 in the 8b/10b lookup tables.
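
The steps above can be sketched as a small helper. The function name `char_name` is hypothetical; it assumes that the 5-bit field is the low five bits of the character (EDCBA) and the 3-bit field is the high three bits (HGF), which matches the 6Ah example.

```python
def char_name(byte, is_control=False):
    """Return the Dxx.y / Kxx.y name of an 8-bit character."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    xx = byte & 0x1F   # 5-bit field, bits E D C B A
    y = byte >> 5      # 3-bit field, bits H G F
    prefix = "K" if is_control else "D"
    return f"{prefix}{xx}.{y}"


print(char_name(0x6A))   # D10.3
```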

Disparity:
Character disparity refers to the difference between the number of 1s and 0s in a 10-bit symbol:

• When a symbol has more 0s than 1s, the symbol has negative (–) disparity
(e.g., 0101000101b).
• When a symbol has more 1s than 0s, the symbol has positive (+) disparity
(e.g., 1001101110b).
• When a symbol has an equal number of 1s and 0s, the symbol has neutral
disparity (e.g., 0110100101b).
• Each 10-bit symbol contains one of the following numbers of ones and zeros
(not necessarily contiguous):
o Four 0s and six 1s (+ disparity).
o Six 0s and four 1s (– disparity).
o Five 0s and five 1s (neutral disparity).

There are two categories of 8-bit characters:

• Those that encode into 10-bit symbols with + or – disparity.


• Those that encode into 10-bit symbols with neutral disparity.
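
The disparity classes can be checked against the three example symbols with a few lines of Python. The function names are illustrative; the sketch treats a symbol as a string of ten '0'/'1' characters.

```python
def disparity(symbol):
    """Number of 1s minus number of 0s in a 10-bit symbol string."""
    return symbol.count("1") - symbol.count("0")


def polarity(symbol):
    """Classify a symbol as '+', '-' or 'neutral'; other weights are rejected."""
    d = disparity(symbol)
    if d not in (-2, 0, 2):
        raise ValueError("a valid symbol has four, five or six 1s")
    return {-2: "-", 0: "neutral", 2: "+"}[d]


for example in ("0101000101", "1001101110", "0110100101"):
    print(example, polarity(example))
```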

CRD (Current Running Disparity):


The CRD reflects the balance of 1s and 0s transmitted over the link since
link initialization and has the following characteristics:

• Its current state indicates the balance of 1s and 0s transmitted since link initialization.
• The CRD's initial state (before any characters are transmitted) can be + or –.
• The CRD's current state can be either positive (if more 1s than 0s have been
transmitted) or negative (if more 0s than 1s).
• Each character is converted via a table lookup with the current state of the
CRD factored in.
• As each new character is encoded, the CRD either remains the same (if the
newly generated 10-bit symbol has neutral disparity) or it flips to the
opposite polarity (if the newly generated symbol has + or – disparity).

8b/10b Encoding Procedure:

Refer to the figure below. The encode is accomplished by performing two table
lookups in parallel.

• First Table Lookup: Three elements are submitted to a 5-bit to 6-bit table for
a lookup (see Table 4-1 and Table 4-2):

- The 5-bit portion of the 8-bit character (bits A through E).
- The Data/Control (D/K#) indicator.
- The current state of the CRD (positive or negative).

The table lookup yields the upper 6 bits of the 10-bit symbol (bits abcdei).

• Second Table Lookup: Three elements are submitted to a 3-bit to 4-bit table
for a lookup (see Table 4-3 and Table 4-4):

- The 3-bit portion of the 8-bit character (bits F through H).
- The same Data/Control (D/K#) indicator.
- The current state of the CRD (positive or negative).

The table lookup yields the lower 4 bits of the 10-bit symbol (bits fghj).

Fig. 8-bit to 10-bit (8b/10b) Encoder

The 8b/10b encoder computes a new CRD based on the resultant 10-bit symbol and supplies this
CRD for the 8b/10b encode of the next character. If the resultant 10-bit symbol is neutral (i.e., it
has an equal number of 1s and 0s), the polarity of the CRD remains unchanged. If the resultant
10-bit symbol is + or –, the CRD flips to its opposite state. It is an error if the next 10-bit
symbol produced has non-neutral disparity of the same polarity as the current CRD.

The 8b/10b encoder feeds a Parallel-to-Serial converter which clocks 10-bit symbols out in the bit
order 'abcdeifghj' (shown in above figure).
The Lookup Tables:

The following four tables define the table lookup for the two sub-blocks of 8-bit
Data and Control characters.

Table 4-1: 5-bit to 6-bit Encode Table for Data Characters

Data Byte Name   Unencoded Bits EDCBA   Current RD – abcdei   Current RD + abcdei
D0               00000                  100111                011000
D1               00001                  011101                100010
D2               00010                  101101                010010
D3               00011                  110001                110001
D4               00100                  110101                001010
D5               00101                  101001                101001
D6               00110                  011001                011001
D7               00111                  111000                000111
D8               01000                  111001                000110
D9               01001                  100101                100101
D10              01010                  010101                010101
D11              01011                  110100                110100
D12              01100                  001101                001101
D13              01101                  101100                101100
D14              01110                  011100                011100
D15              01111                  010111                101000
D16              10000                  011011                100100
D17              10001                  100011                100011
D18              10010                  010011                010011
D19              10011                  110010                110010
D20              10100                  001011                001011
D21              10101                  101010                101010
D22              10110                  011010                011010
D23              10111                  111010                000101
D24              11000                  110011                001100
D25              11001                  100110                100110
D26              11010                  010110                010110
D27              11011                  110110                001001
D28              11100                  001110                001110
D29              11101                  101110                010001
D30              11110                  011110                100001
D31              11111                  101011                010100

Table 4-2: 5-bit to 6-bit Encode Table for Control Characters

Data Byte Name   Unencoded Bits EDCBA   Current RD – abcdei   Current RD + abcdei
K28              11100                  001111                110000
K23              10111                  111010                000101
K27              11011                  110110                001001
K29              11101                  101110                010001
K30              11110                  011110                100001

Table 4-3: 3-bit to 4-bit Encode Table for Data Characters

Data Byte Name   Unencoded Bits HGF   Current RD – fghj   Current RD + fghj
--.0             000                  1011                0100
--.1             001                  1001                1001
--.2             010                  0101                0101
--.3             011                  1100                0011
--.4             100                  1101                0010
--.5             101                  1010                1010
--.6             110                  0110                0110
--.7             111                  1110/0111           0001/1000

Table 4-4: 3-bit to 4-bit Encode Table for Control Characters

Data Byte Name   Unencoded Bits HGF   Current RD – fghj   Current RD + fghj
--.0             000                  1011                0100
--.1             001                  0110                1001
--.2             010                  1010                0101
--.3             011                  1100                0011
--.4             100                  1101                0010
--.5             101                  0101                1010
--.6             110                  1001                0110
--.7             111                  0111                1000
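
The four tables can be combined into a small software model of the encoder. The sketch below is illustrative rather than the RTL of this work, and it rests on two assumptions that the tables themselves do not spell out: the "Current RD" used for the 4-bit lookup is taken to be the running disparity after the 6-bit sub-block has been produced, and the second alternative of the --.7 data row is selected only for D17, D18 and D20 at RD – and for D11, D13 and D14 at RD +, as in the usual 8b/10b convention. All names in the code are hypothetical.

```python
RD_MINUS, RD_PLUS = -1, +1

# (RD-, RD+) columns of Table 4-1, indexed by the 5-bit value EDCBA.
D_5B6B = [
    ("100111", "011000"), ("011101", "100010"), ("101101", "010010"), ("110001", "110001"),
    ("110101", "001010"), ("101001", "101001"), ("011001", "011001"), ("111000", "000111"),
    ("111001", "000110"), ("100101", "100101"), ("010101", "010101"), ("110100", "110100"),
    ("001101", "001101"), ("101100", "101100"), ("011100", "011100"), ("010111", "101000"),
    ("011011", "100100"), ("100011", "100011"), ("010011", "010011"), ("110010", "110010"),
    ("001011", "001011"), ("101010", "101010"), ("011010", "011010"), ("111010", "000101"),
    ("110011", "001100"), ("100110", "100110"), ("010110", "010110"), ("110110", "001001"),
    ("001110", "001110"), ("101110", "010001"), ("011110", "100001"), ("101011", "010100"),
]
K28_5B6B = ("001111", "110000")   # Table 4-2; K23/K27/K29/K30 match the D rows
# Table 4-3, using the first alternative of the --.7 row
D_3B4B = [("1011", "0100"), ("1001", "1001"), ("0101", "0101"), ("1100", "0011"),
          ("1101", "0010"), ("1010", "1010"), ("0110", "0110"), ("1110", "0001")]
D7_ALT = ("0111", "1000")         # Table 4-3, second alternative of the --.7 row
# Table 4-4
K_3B4B = [("1011", "0100"), ("0110", "1001"), ("1010", "0101"), ("1100", "0011"),
          ("1101", "0010"), ("0101", "1010"), ("1001", "0110"), ("0111", "1000")]


def _next_rd(rd, bits):
    """Keep the running disparity for a balanced sub-block, otherwise flip it."""
    return rd if bits.count("1") * 2 == len(bits) else -rd


def encode(byte, is_control=False, rd=RD_MINUS):
    """Return (symbol 'abcdeifghj', new running disparity) for one character."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    x, y = byte & 0x1F, byte >> 5
    col = 0 if rd == RD_MINUS else 1
    if is_control:
        if not (x == 28 or (x in (23, 27, 29, 30) and y == 7)):
            raise ValueError("not a control character covered by Tables 4-2 and 4-4")
        six = K28_5B6B[col] if x == 28 else D_5B6B[x][col]
    else:
        six = D_5B6B[x][col]
    rd_mid = _next_rd(rd, six)    # assumed: disparity after the 6-bit sub-block
    col = 0 if rd_mid == RD_MINUS else 1
    if is_control:
        four = K_3B4B[y][col]
    elif y == 7 and ((rd_mid == RD_MINUS and x in (17, 18, 20)) or
                     (rd_mid == RD_PLUS and x in (11, 13, 14))):
        four = D7_ALT[col]        # assumed selection rule for the alternative code
    else:
        four = D_3B4B[y][col]
    return six + four, _next_rd(rd_mid, four)


print(encode(0xBC, is_control=True))   # K28.5 starting from RD-
print(encode(0x00, rd=RD_PLUS))        # D0.0 starting from RD+
```

Under these assumptions, D0.0 at RD + encodes to 0110001011; read from bit j back to bit a, that is the 1101000110 pattern quoted earlier for 00h.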

Serializer:
• A serializer/de-serializer (SerDes) circuit converts parallel data (in other words, multiple
streams of data) into a serial (one-bit-wide) stream of data that is transmitted over a
high-speed connection, such as LVDS, to a receiver that converts the serial stream back to
the original parallel data. A clock system puts parallel data into serial form by taking bits
from the multiple streams and alternating them on the rising and falling parts of the signal.

• Both the serializer and de-serializer are functional blocks on the transmitting and receiving
chips. The two functional blocks are Parallel In Serial Out (PISO) and the Serial In Parallel
Out (SIPO).

• LVDS (low-voltage differential signaling) uses two wires to carry one bit of data.

• Examples of interfaces that use SerDes: PCI Express, SATA, XAUI.

• SerDes has emerged as the primary solution in chips where there is a need for fast data
movement and limited I/O, but this technology is becoming significantly more challenging
to work with as speeds continue to rise to offset the massive increase in data.

• A serializer/de-serializer consists of functional blocks in a chip that are used to convert
parallel data into serial data, allowing designers to speed up data communication without
having to increase the number of pins. But as the volume of data increases, and as more
devices are connected to the Internet and ultimately the cloud, the need to move more data
much faster is growing. This, in turn, has made SerDes design increasingly complicated.

• Much of the demand for high-speed SerDes comes from large data centers, where the
current state-of-the-art throughput is 100 Gbps. Standards from IEEE and the Optical
Internetworking Forum are defining higher and higher data rates on a single lane, which
allow data to be aggregated to much larger systems. Then, to move SerDes technology to
the next level of performance, one of the major advancements is the adoption of PAM4
signaling above 28Gbps.
• The serializer converts the 10-bit parallel data obtained from encoder into serial form.
• This serial data is sent via a link to the receiver side.

Fig: Serializer.
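
A parallel-in, serial-out (PISO) register for 10-bit symbols can be modelled as below. This is only a behavioural sketch: the class name `Piso` and the choice to hold bit 'a' in the most significant position, so that it leaves first, are assumptions made for the example.

```python
class Piso:
    """Parallel-in, serial-out shift register for 10-bit symbols."""

    WIDTH = 10

    def __init__(self):
        self.reg = 0
        self.remaining = 0

    def load(self, symbol):
        """Load one 10-bit symbol; bit 'a' is assumed to sit in the MSB."""
        if not 0 <= symbol < (1 << self.WIDTH):
            raise ValueError("expected a 10-bit value")
        self.reg = symbol
        self.remaining = self.WIDTH

    def shift(self):
        """Shift out one bit per clock, MSB first."""
        if self.remaining == 0:
            raise RuntimeError("no symbol loaded")
        bit = (self.reg >> (self.WIDTH - 1)) & 1
        self.reg = (self.reg << 1) & ((1 << self.WIDTH) - 1)
        self.remaining -= 1
        return bit


def serialize(symbols):
    """Yield the serial bit stream for a sequence of 10-bit symbols."""
    piso = Piso()
    for symbol in symbols:
        piso.load(symbol)
        for _ in range(Piso.WIDTH):
            yield piso.shift()


print(list(serialize([0b1001110100])))
```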

Receiver:
De-serializer:

• A serializer/de-serializer (SerDes) is an integrated circuit or device used in high-speed
communications for converting between serial data and parallel interfaces in both
directions. A SerDes is used in a variety of applications and technologies, where its primary
purpose is to provide data transmission over a single or differential line by minimizing the
number of I/O pins and connections. In short, it converts parallel data into serial data so
that they can travel over media that do not support parallel data, or it is used to save
bandwidth.

• The basic SerDes function has two blocks: the Parallel in Serial Out (PISO) block or
parallel-to-serial converter, and the Serial In Parallel Out (SIPO) block or serial-to-parallel
converter. Each end of a communication link has a SerDes with these two fundamental
blocks; the PISO block is used for transmission and the SIPO block is used for reception.

• SerDes chips are available in several architectures:

- Parallel clock: This is used to serialize a parallel bus input together with data
addresses and control signals. A reference clock is used to synchronize the data stream,
which has a jitter tolerance at the serializer of 5–10 ps rms.

- Embedded clock: This serializes the data and the clock into a single stream. One clock
cycle is transmitted first followed by the actual data, creating a periodic rising edge at the
beginning of the data stream.

- 8b/10b SerDes: This maps the data to a 10-bit code right before serializing. The
de-serializer makes use of the reference clock to monitor the recovered clock from the bit
stream.

- Bit interleaved: This multiplexes multiple slower serial data streams into faster
streams, whereas the receiver demultiplexes the faster streams back into multiple slower
streams.

• The serial data sent from the transmitter arrives at the first block of the receiver, the
de-serializer.

• The de-serializer converts the serial data back into 10-bit parallel form and sends it to the
buffer.

Fig. De-serializer.
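
The serial-in, parallel-out (SIPO) side can be sketched in the same behavioural style. The class name `Sipo` is hypothetical, and the model assumes that the first received bit is bit 'a' and that the symbol boundary is already known; symbol alignment is outside the scope of this sketch.

```python
class Sipo:
    """Serial-in, parallel-out shift register that collects 10-bit symbols."""

    WIDTH = 10

    def __init__(self):
        self.reg = 0
        self.count = 0

    def shift_in(self, bit):
        """Shift in one bit; return the symbol once ten bits are in, else None."""
        self.reg = ((self.reg << 1) | (bit & 1)) & ((1 << self.WIDTH) - 1)
        self.count += 1
        if self.count == self.WIDTH:
            self.count = 0
            return self.reg
        return None


def deserialize(bits):
    """Group a serial bit stream into 10-bit symbols (first bit becomes the MSB)."""
    sipo = Sipo()
    symbols = []
    for bit in bits:
        word = sipo.shift_in(bit)
        if word is not None:
            symbols.append(word)
    return symbols


stream = [1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0]
print([format(s, "010b") for s in deserialize(stream)])
```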
Buffer:
• A buffer can be used as a delay element to overcome synchronization problems.

• Inserting a buffer splits a long wire into shorter segments, which reduces the net
capacitance each stage drives, and hence the delay from source to destination decreases.

• Here, the buffer is used as a delay element to hold back the parallel data from the
de-serializer before it is given to the decoder.

• The parallel data is given to the decoder all at once from the buffer after a delay of 10 clock
cycles.
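
The 10-cycle delay can be pictured as a fixed-depth pipeline. The sketch below is a behavioural model only; the class name `DelayBuffer` and the use of `None` as the idle value before valid data arrives are assumptions of the example.

```python
from collections import deque


class DelayBuffer:
    """Fixed-latency pipeline: a value clocked in emerges `depth` clocks later."""

    def __init__(self, depth=10, idle=None):
        # Pre-fill with the idle value so the first `depth` outputs are idle.
        self.pipe = deque([idle] * depth, maxlen=depth)

    def clock(self, value):
        """Advance one clock: return the oldest entry and store the new value."""
        oldest = self.pipe[0]
        self.pipe.append(value)   # a full deque drops its oldest entry on append
        return oldest


buffer = DelayBuffer(depth=10)
outputs = [buffer.clock(n) for n in range(15)]
print(outputs)
```

The first ten outputs are the idle value and the eleventh is the first symbol that was clocked in.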
8b/10b Decoder:
Each receiver Lane incorporates an 8b/10b Decoder which is fed from the buffer. The 8b/10b
Decoder uses two lookup tables (the D and K tables) to decode the 10-bit symbol stream into
8-bit Data (D) or Control (K) characters plus the D/K# signal. The state of the D/K# signal
indicates that the received symbol is:

• A Data (D) character if a match for the received symbol is discovered in the D table.
D/K# is driven High.
• A Control (K) character if a match for the received symbol is discovered in the K table.
D/K# is driven Low.

Fig. 8b/10b Decoder per Lane
Disparity Calculator:

The decoder determines the initial disparity value based on the disparity of the first symbol
received. After the first symbol, once the disparity is initialized in the decoder, it expects the
calculated disparity for each subsequent symbol received to toggle between + and - unless the
symbol received has neutral disparity in which case the disparity remains the same value.

Code Violation and Disparity Error Detection:

The error detection logic of the 8b/10b Decoder detects errors in the received symbol
stream. It should be noted that it doesn't catch all possible transmission errors. The
specification requires that these errors be detected and reported as a Receiver Error
indication to the Data Link Layer. The two types of errors detected are:

• Code violation errors (i.e., a 10-bit symbol could not be decoded into a valid
8-bit Data or Control character).
• Disparity errors.

There is no automatic hardware error correction for these errors at the Physical Layer.

Code Violations:

The following conditions represent code violations:

• Any 6-bit sub-block containing more than four 1s or four 0s is in error.
• Any 4-bit sub-block containing more than three 1s or three 0s is in error.
• Any 10-bit symbol containing more than six 1s or six 0s is in error.
• Any 10-bit symbol containing more than five consecutive 1s or five
consecutive 0s is in error.
• Any 10-bit symbol that doesn't decode into an 8-bit character is in error.
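
The first four conditions are simple bit-count checks and can be sketched as below. The function name and the labels it returns are illustrative. An empty result does not prove that a symbol is valid, because the last condition (the symbol must decode through the D or K table) is not modelled here.

```python
def code_violations(symbol):
    """Return labels for the structural rules a 10-bit symbol string breaks."""
    if len(symbol) != 10 or set(symbol) - {"0", "1"}:
        raise ValueError("expected a string of ten 0/1 characters")
    six, four = symbol[:6], symbol[6:]   # sub-blocks abcdei and fghj
    broken = []
    if max(six.count("0"), six.count("1")) > 4:
        broken.append("6-bit sub-block")
    if max(four.count("0"), four.count("1")) > 3:
        broken.append("4-bit sub-block")
    if max(symbol.count("0"), symbol.count("1")) > 6:
        broken.append("symbol weight")
    if "0" * 6 in symbol or "1" * 6 in symbol:
        broken.append("run length")
    return broken


print(code_violations("1001110100"))
print(code_violations("1111110000"))
```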

Disparity Errors:

• A character that encodes into a 10-bit symbol with disparity other than neutral is encoded
into a 10-bit symbol with polarity opposite to that of the CRD. If the next symbol does
not have neutral disparity and its disparity is the same as the CRD, a disparity error is
detected.
• If two bits in a symbol flip in error, the error may not be detected (and the symbol may
decode into a valid 8-bit character). The error goes undetected at the Physical Layer.
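
A receiver-side disparity check along these lines can be sketched as follows. The function name `disparity_errors` is hypothetical, symbols are modelled as strings of ten '0'/'1' characters, and the running disparity is initialised from the first non-neutral symbol received, which is one reading of the disparity calculator described above.

```python
def disparity_errors(symbols):
    """Return the indices of symbols that raise a disparity error."""
    crd = None      # unknown until the first non-neutral symbol arrives
    errors = []
    for index, symbol in enumerate(symbols):
        diff = symbol.count("1") - symbol.count("0")
        if diff == 0:
            continue                    # neutral symbol: CRD is unchanged
        polarity = 1 if diff > 0 else -1
        if crd is not None and polarity == crd:
            errors.append(index)        # same polarity twice in a row
        crd = polarity
    return errors


good = ["0011111010", "0110001011", "1100000101"]      # +, neutral, -
repeated = ["0011111010", "0110001011", "0011111010"]  # +, neutral, + again
two_bit_flip = ["0011111010", "1010001011", "1100000101"]  # middle symbol corrupted

print(disparity_errors(good))
print(disparity_errors(repeated))
print(disparity_errors(two_bit_flip))
```

In the last stream the first two bits of the middle symbol have been flipped; the symbol stays neutral, so the check reports nothing, which illustrates the undetected two-bit error mentioned above.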

Physical Layer Error Handling:

When the Physical Layer logic detects an error, it sends a Receiver Error indication
to the Data Link Layer. The specification lists a few of these errors, but it is far from
being an exhaustive error list. It is up to the designer to determine what Physical Layer
errors to detect and report. Some of these errors include:
• 8b/10b Decoder-related disparity errors
• 8b/10b Decoder-related code violation errors
• Elastic Buffer overflow or underflow caused by loss of symbol(s)
• The packet received is not consistent with the packet format rules

De-scrambler:
• It is used to reverse the scrambling applied by the scrambler.
• The de-scrambler circuit is the same as the scrambler.
• The 8-bit decoded character is given to the de-scrambler to recover the original data that
was initially given to the scrambler.

Fig. De-scrambler.
