Selecting PCI Express IP for Your Design

by Stephen Peltan, R&D Sr. Staff Engineer - Synopsys

PCI Express, the next generation of the PCI bus, is being widely adopted in today's high-performance PCs, servers and embedded applications. This high bandwidth protocol keeps the same software interface and many of the key features of PCI, but has a number of differences and new features. The biggest changes with PCI Express are the use of serial data transfers and gigahertz clock speeds, making the protocol more complex, but providing significant improvement in data throughput.

This application note provides you with a very brief introduction to the emerging PCI Express protocol and explains how selecting the right digital and mixed signal IP can accelerate the implementation of this new standard in your designs.

PCI Express Overview

PCI Express provides low pin count, high reliability, high speed data transfer at a rate of 2.5 GBits per second and up, for serial links on backplanes and printed wiring boards.

PCI Express System Example - PC Motherboard Based System

In this example of a PCI Express system, the dashed lines represent PCI Express links, the purple boxes indicate plug-in cards, and the other boxes are components found on a system card.

The black boxes represent PCI Express IP, which is comprised of a digital component and a mixed signal component.

The digital portion may implement one or more of the following:

Root Complex Port (RC) initializes and manages the PCI Express fabric
Switch Port (SW) routes data between multiple PCI Express links
Endpoint Port (EP) is associated with an I/O device and terminates a PCI Express hierarchy

The mixed signal block is:

PHY, which performs analog/digital conversion as well as serialization/deserialization

You can choose the bandwidth of each PCI Express link by selecting the number of lanes.

The PCI Express IP handles link initialization, error recovery, power management, data buffering, etc.

PCI Express System Example - Chip to Chip System

In this example, all circuits are on a single printed wiring board. The "complex endpoint" chip includes two independent PCI Express links. The "endpoint / root" chip includes an additional type of digital PCI Express IP, DM (dual mode), which can operate as a root port, as shown here. It can also operate as an endpoint, for example, when plugged into a PC motherboard slot.

PCI Express Protocol Stack

The base PCI Express protocol stack is common to all of the IP shown in the examples. PCI Express IP hides most of the complexity from your application logic. See the following diagram.

PCI Express Links

As shown in the diagram below, each PCI Express link:

is dual unidirectional with no sideband signals
is serial, with differential signaling
includes embedded clocking
operates at a scalable frequency (2.5 Gb/sec. initially)
can be implemented in low-voltage silicon at 0.25 µm and beyond

Data is transferred in packets, which include an address and a variable-size data payload.

Reliability is provided by character checks, format checks, CRC code, automatic retransmission in case of error, and exchange of buffer space information, in the form of credits.

A handshake protocol to power down an idle link is included, as well as messages to handle interrupts, error reporting, and hot-plug events.

Configuration registers are available to customize link behavior.

Wide ports automatically configure to narrower ports, as required.

Quality of service and isochronous traffic are supported via optional virtual channels.

Digital IP to PHY Interface (PIPE)

Transmit and receive data, as well as status and control, are transferred between the digital and mixed signal IP on the PIPE interface. There are two standard options for transferring data across the PIPE interface:

One byte per clock cycle per PCI Express lane. In this case, the interface and the digital IP operate at 250 MHz, or
Two bytes per clock cycle per PCI Express lane. In this case, the interface and the digital IP operate at 125 MHz.
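
The two PIPE options carry the same amount of data per lane; they only trade interface width against clock frequency. The following minimal sketch checks this arithmetic, using the 2.5 Gb/sec. line rate and the 8b10b encoding mentioned elsewhere in this application note. The function and constant names are illustrative only and are not part of the PIPE specification.

```python
# Sketch: both PIPE width options move the same decoded data rate per lane.
LINE_RATE_BPS = 2_500_000_000   # serial line rate per lane (2.5 Gb/sec.)

def decoded_lane_rate_bps(line_rate_bps: int) -> int:
    """Data rate left after 8b10b encoding (8 data bits per 10 line bits)."""
    return line_rate_bps * 8 // 10

def pipe_rate_bps(bytes_per_clock: int, clock_hz: int) -> int:
    """Data rate across the PIPE interface for one lane."""
    return bytes_per_clock * 8 * clock_hz

one_byte_option = pipe_rate_bps(1, 250_000_000)   # one byte per clock at 250 MHz
two_byte_option = pipe_rate_bps(2, 125_000_000)   # two bytes per clock at 125 MHz

print(decoded_lane_rate_bps(LINE_RATE_BPS), one_byte_option, two_byte_option)
```

All three values come out equal, which is why the choice between the one byte and two byte options is a question of timing closure and datapath width rather than of throughput.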

Selecting PCI Express IP for Your Application

To add a PCI Express port to your chip you must:

select the lane width
select mixed signal IP including PIPE interface frequency and optional features
select digital IP type (RC, EP, etc.) and optional features

Selecting the Lane Width

An N lane PCI Express port provides N x 2.5 GBits per second of raw throughput in each direction. The actual throughput is lower and depends on 8b10b encoding, packet payload size, and link overhead. The following table gives some examples.
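
As a rough guide, the raw figure and the ceiling imposed by 8b10b encoding can be computed as in the sketch below. It deliberately ignores packet and link overhead, so it gives only an upper bound on real throughput; the function names are illustrative.

```python
# Sketch: raw throughput and the 8b10b ceiling for an N lane port, per direction.
LINE_RATE_GBPS = 2.5  # per lane, per direction

def raw_gbps(lanes: int) -> float:
    """Raw line throughput of an N lane port."""
    return lanes * LINE_RATE_GBPS

def encoded_ceiling_gbps(lanes: int) -> float:
    """Upper bound on data throughput once 8b10b encoding is accounted for."""
    return raw_gbps(lanes) * 8 / 10

for lanes in (1, 4, 8, 16):
    print(f"x{lanes}: raw {raw_gbps(lanes)} GBits/s, "
          f"ceiling {encoded_ceiling_gbps(lanes)} GBits/s")
```

Packet headers, acknowledgements, and credit updates reduce throughput further, which is why small payloads fall well short of this ceiling.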

TABLE 1-1: PCI Express Link Throughput Examples

Real throughput examples in GBits per second, by packet payload size in bytes
Lane Width	16	128	256	4096
x1	0.5	1.7	1.7	2.0
x4	2.0	6.8	7.0	7.8
x8	4.0	13.7	14.0	15.7
x16	7.9	27.4	28.0	31.4

Remember that these are only reasonable examples; another PCI Express application note will explain in detail how to calculate these numbers for your application.
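
If you want to use Table 1-1 as a first-pass sizing aid, the lookup can be sketched as follows. The dictionary simply transcribes the table; the function name is hypothetical, and only the four payload sizes listed in the table are accepted.

```python
# Sketch: pick the narrowest lane width from Table 1-1 that meets a target.
# Keys: lane width -> {payload size in bytes: throughput in GBits per second}
TABLE_1_1 = {
    1:  {16: 0.5, 128: 1.7,  256: 1.7,  4096: 2.0},
    4:  {16: 2.0, 128: 6.8,  256: 7.0,  4096: 7.8},
    8:  {16: 4.0, 128: 13.7, 256: 14.0, 4096: 15.7},
    16: {16: 7.9, 128: 27.4, 256: 28.0, 4096: 31.4},
}

def narrowest_lane_width(required_gbps: float, payload_bytes: int):
    """Return the smallest tabulated lane width meeting the requirement.

    Returns None if no tabulated width is enough. Raises KeyError if the
    payload size is not one of the table columns.
    """
    for lanes in sorted(TABLE_1_1):
        if TABLE_1_1[lanes][payload_bytes] >= required_gbps:
            return lanes
    return None

print(narrowest_lane_width(5.0, 128))  # 128-byte payloads
print(narrowest_lane_width(5.0, 16))   # 16-byte payloads need a wider port
```

The same 5 GBits per second target needs a much wider port with 16-byte payloads than with 128-byte payloads, which shows how strongly payload size affects the choice.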

Selecting Mixed Signal PCI Express IP (PHY)

Mixed signal IP is generally sold as a "hard macro", which is tailored to your chip manufacturer's process.

Both the PCI Express and PIPE interfaces are standard. You can usually choose:

the number of PCI Express lanes
either a one byte or two byte PIPE interface. As explained above, this selection also determines the operating frequency of the PIPE interface and the digital IP.

In addition to these standard features, your PHY vendor may offer additional features:

lower cost ICs via smaller die size (some PHYs are 50% smaller than others)
better performance margin (some have twice the sensitivity and jitter margin)
yield and reliability
built-in diagnostics and system test

For example, the Synopsys PCI Express PHYs include unique built-in diagnostics that provide on-chip visibility into the actual performance of your 2.5 GBit per second links. The diagram at the right shows actual scope data from the Synopsys PHY.

Selecting Digital PCI Express IP Types

Digital IP for Basic Applications

Based on your application, operating frequency and lane width, you may select PCI Express digital IP optimized for your application. Here are some examples of the range of Synopsys endpoint (EP), switch port (SW), root port (RC), and dual mode (DM) digital IP available to you.

TABLE 1-2: Synopsys Digital Implementation IP Examples

IP Type Example (EP, SW, RC, DM)	Explanation	Comments
32-bit optimized x1	32 bit data path, one lane (x1). One lane only operation allows extra multi-lane logic to be removed for lower gate count and power.	Supports a single lane (x1) with: 8-bit PHY interface @250MHz 16-bit PHY interface @125MHz
32-bit	32 bit data path, one lane (x1) to four lanes (x4)	x1 to x4 with 8-bit PHY interface @250MHz x1 to x2 with 16-bit PHY interface @125MHz
64-bit	64 bit data path, one lane (x1) to eight lanes (x8)	x1 to x8 with 8-bit PHY interface @250MHz, or x1 to x2 with 16-bit PHY interface @125MHz
128-8 bit	128 bit data path, one lane (x1) to 16 lanes (x16)	Supports 8-bit PHY interface @250MHz
128-16 bit	Pin selectable - root port or endpoint, one lane (x1) to eight lanes (x8)	Supports 16-bit PHY interface @125MHz

You can trade off gate count, operating frequency, and power versus maximum lane width, i.e. throughput.
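
One way to reason about this trade-off is to match the datapath width against the total width of the PIPE interfaces it must serve. The sketch below assumes, as described in the PIPE section, that the digital IP runs at the same clock as the PIPE interface, so the bound depends only on the widths. It is an upper bound from bandwidth matching; a particular IP configuration may support fewer lanes.

```python
# Sketch: upper bound on lane count for a given datapath width.
# Assumption: the digital IP datapath and the PIPE interface share one clock.

def max_lanes(datapath_bits: int, pipe_bytes_per_clock: int) -> int:
    """Most lanes a datapath can keep up with, by bandwidth matching."""
    bits_per_lane_per_clock = pipe_bytes_per_clock * 8
    return datapath_bits // bits_per_lane_per_clock

for width in (32, 64, 128):
    print(f"{width}-bit datapath: "
          f"up to x{max_lanes(width, 1)} with 8-bit PIPE @250MHz, "
          f"up to x{max_lanes(width, 2)} with 16-bit PIPE @125MHz")
```

Halving the clock with the 16-bit PIPE option halves the lane count a given datapath can sustain, which is the gate count and power versus throughput trade described above.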

Digital IP for Advanced Applications - Overview

It is relatively simple to select among root, switch, endpoint, and dual-mode ports if your application fits one of the examples shown at the beginning of this application note.

However, if you have an advanced application, you may want to know more about the differences. The following sections are a summary; see the DesignWare IP product databooks for complete information.

Digital IP - Upstream vs. Downstream Differences

A PCI Express hierarchy contains one or more PCI Express links, and includes a root port, optional switch ports, and endpoint ports.

Each link in the hierarchy must include exactly one "downstream" facing port and one "upstream" facing port.

Root ports are always downstream
Endpoints are always upstream.
Switch ports may be configured to be either - see the PC system example at the beginning of this application note. In a switch device, the switch port closest to the root is always an upstream port; all the other switch ports are downstream.
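
These direction rules can be captured in a small model, sketched below, which checks whether two ports can legally share a link. The type and function names are illustrative and do not correspond to any DesignWare IP interface.

```python
# Sketch: check the upstream/downstream rules for the two ports on a link.
from enum import Enum

class PortType(Enum):
    ROOT = "RC"
    SWITCH = "SW"
    ENDPOINT = "EP"

class Direction(Enum):
    UPSTREAM = "upstream"
    DOWNSTREAM = "downstream"

# Directions each port type may face, per the rules listed above.
ALLOWED = {
    PortType.ROOT: {Direction.DOWNSTREAM},
    PortType.ENDPOINT: {Direction.UPSTREAM},
    PortType.SWITCH: {Direction.UPSTREAM, Direction.DOWNSTREAM},
}

def link_is_valid(port_a, port_b) -> bool:
    """Each port is a (PortType, Direction) tuple."""
    for port_type, direction in (port_a, port_b):
        if direction not in ALLOWED[port_type]:
            return False
    # A link needs exactly one upstream and one downstream facing port.
    return {port_a[1], port_b[1]} == {Direction.UPSTREAM, Direction.DOWNSTREAM}

root = (PortType.ROOT, Direction.DOWNSTREAM)
endpoint = (PortType.ENDPOINT, Direction.UPSTREAM)
print(link_is_valid(root, endpoint))      # root port to endpoint
print(link_is_valid(endpoint, endpoint))  # two upstream facing ports
```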

Why does this matter?

When the digital IP initializes the PCI Express link, the upstream and downstream ports use a slightly different protocol to automatically configure the link for width, and for lane and data reversal.
During idle times, the digital IP can autonomously transition the link into a deep low power state. This transition is requested by the downstream port, and "approved" by the upstream port.

Digital IP - Configuration Registers Differences

Each instance of PCI Express IP contains a set of configuration registers:

Root and switch ports contain "Type 1" configuration registers
Endpoint ports contain "Type 0" configuration registers

Type 0 configuration registers:

indicate to system software that this device is the "end" of a PCI Express hierarchy
contain a full set of so-called Base Address Registers (BARs) that help you filter and address map received packets

Type 1 configuration registers:

indicate to system software that there are more devices to discover beyond this device
contain limit registers to assist in packet routing to other devices
contain only minimal BARs for packets mapped to this device

Some other configuration register differences:

Endpoint devices may contain multiple copies of the configuration registers. This is used to implement "multi-function" devices.
Root ports include extra registers to summarize error status for a PCI Express hierarchy.
Root and switch ports contain registers to manage "hot-plug" events
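
System software distinguishes Type 0 from Type 1 registers, and single-function from multi-function devices, by reading the Header Type byte in configuration space. The sketch below decodes that byte. The offset (0x0E) and bit layout are taken from the PCI configuration header definition rather than from this application note, and the function name is illustrative.

```python
# Sketch: decode the Header Type byte of a PCI configuration header.
HEADER_TYPE_OFFSET = 0x0E      # byte offset within configuration space
LAYOUT_MASK = 0x7F             # bits 6:0 select the header layout
MULTI_FUNCTION_BIT = 0x80      # bit 7 is set for multi-function devices

def decode_header_type(config_space: bytes):
    """Return (layout, is_multi_function); layout 0 = Type 0, 1 = Type 1."""
    value = config_space[HEADER_TYPE_OFFSET]
    return value & LAYOUT_MASK, bool(value & MULTI_FUNCTION_BIT)

# Example: a made-up 64-byte header for a multi-function endpoint.
header = bytearray(64)
header[HEADER_TYPE_OFFSET] = 0x80
print(decode_header_type(bytes(header)))
```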

Digital IP - Configuration Transaction Differences

Configuration transactions can only be initiated by root ports, and can only be responded to by endpoint ports and upstream switch ports.

Configuration transactions are used to:

Determine the topology of a PCI Express hierarchy
Initialize the configuration registers after a PCI Express link is initialized. Many values can be initialized in hardware, e.g., using Synopsys coreConsultant. However, other values, such as memory space enable and base addresses, must be initialized via configuration transactions
Change the power state of a device
Read error report registers
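
The first use, topology discovery, amounts to a depth-first walk: software looks beyond a device only when that device reports Type 1 configuration registers. The toy model below illustrates the idea. The Device class and its fields are hypothetical stand-ins; real system software obtains this information through configuration read transactions, not attribute access.

```python
# Sketch: depth-first discovery of a PCI Express hierarchy (toy model).
from dataclasses import dataclass, field

@dataclass
class Device:
    name: str
    header_type: int                      # 0 = endpoint, 1 = root or switch port
    children: list = field(default_factory=list)

def discover(device, depth=0, found=None):
    """Return (depth, name) pairs in the order devices are discovered."""
    if found is None:
        found = []
    found.append((depth, device.name))
    if device.header_type == 1:           # Type 1: more devices lie beyond
        for child in device.children:
            discover(child, depth + 1, found)
    return found

# Example hierarchy: root port -> switch -> two endpoints.
root = Device("root port", 1, [
    Device("switch upstream port", 1, [
        Device("switch downstream port A", 1, [Device("endpoint A", 0)]),
        Device("switch downstream port B", 1, [Device("endpoint B", 0)]),
    ]),
])

for depth, name in discover(root):
    print("  " * depth + name)
```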

Digital IP - Interrupt and Error Message Differences

As described in detail in another application note, PCI Express devices emulate PCI interrupt wires (INTA, INTB, ...) by sending messages towards the root port:

Endpoints and upstream facing switch ports may initiate these messages
Downstream and upstream facing switch ports pass these messages through to switch core logic
Root ports may receive these messages

Note that other types of interrupt messages (MSI, MSI-X) do not have these restrictions.

Error messages are sent by PCI Express devices in response to link errors. Endpoints initiate these messages, switch ports initiate them and pass them on, and root ports receive them.

Implementing PCI Express into Your Design - An Introduction

The following diagram shows the major features of a simple endpoint design. See the DesignWare PCI Express IP databooks for details.

The Replay Buffer and Rx Buffer are single-port and dual-port RAMs, respectively. All logic to manage these buffers is included in the endpoint IP. For the receive (Rx) buffer, you may choose store and forward, cut-through, or bypass (no buffer) packet storage.

The Tx FIFO is optional - if your Tx DMA can continuously supply the data for an entire packet, the FIFO is not necessary.

The Internal Bus Adaptor is optional - it is only required if you wish to update so-called "read only" fields in the IP configuration registers before link communication begins. As an alternative, all of these fields can be configured at synthesis time with the Synopsys coreConsultant tool.

The Digital IP interfaces shown in the diagram may be configured to fit your application:

The Tx Client builds packets for you from data, address, and other attributes presented by your logic. It also gates your packets according to the PCI Express rules for buffer space ("credits") at the other end of the link. You can configure the IP to have up to three Tx Clients.
The Rx Target interface disassembles validated packets and presents them to you as data, address, and other attributes.
Use of the External Local Bus interface is optional - it provides a convenient way for the processor at the other end of the link to read and write your local application control registers. No additional application hardware is required, in this case. The endpoint IP can be configured to map these register read/writes to the Local Bus interface.
The Data Bus Interface provides "back-door" local access to the endpoint configuration registers. Usage of this interface is also optional. It was discussed above with respect to the Internal Bus Adaptor.
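
The credit gating performed by the Tx Client can be pictured with the simplified model below. It assumes one header credit per packet and one data credit per 16 bytes of payload, which is the credit unit defined by the PCI Express base specification. The class is a hypothetical illustration, not the interface of the endpoint IP, and it uses simple up/down counters where real hardware tracks running totals per credit type.

```python
# Sketch: gate packet transmission on credits advertised by the receiver.
DATA_CREDIT_BYTES = 16  # one data credit covers 16 bytes of payload

class CreditGate:
    def __init__(self, header_credits: int, data_credits: int):
        self.header_credits = header_credits
        self.data_credits = data_credits

    def try_send(self, payload_bytes: int) -> bool:
        """Consume credits and return True if the packet may be sent now."""
        needed = -(-payload_bytes // DATA_CREDIT_BYTES)  # ceiling division
        if self.header_credits < 1 or self.data_credits < needed:
            return False                                 # hold the packet
        self.header_credits -= 1
        self.data_credits -= needed
        return True

    def replenish(self, header_credits: int, data_credits: int) -> None:
        """Receiver has freed buffer space and returned credits."""
        self.header_credits += header_credits
        self.data_credits += data_credits

gate = CreditGate(header_credits=2, data_credits=8)
print(gate.try_send(128))   # uses all eight data credits
print(gate.try_send(16))    # blocked until credits are returned
gate.replenish(0, 1)
print(gate.try_send(16))    # now allowed
```

Because the IP performs this gating for you, your logic only sees a packet being accepted or held off; it does not need to track the far end's buffer space itself.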

Summary

PCI Express is a robust interface and selecting the right IP can help solve the complexities of implementing the protocol into your designs and accelerate your development process. The DesignWare IP for PCI Express is silicon proven in customer designs and is the industry standard, powering the PCI-SIG protocol test card and the first to pass the compliance test.

The DesignWare IP has gone through extensive interoperability testing with third party PHYs, verification IP and hardware. By providing a complete solution for PCI Express including digital controllers, verification IP, and mixed signal PHY IP, Synopsys helps lower your integration risk and overall deployment costs, while saving you significant time and effort.

For more information on DesignWare IP, visit www.synopsys.com/designware
