Enter the Inner Sanctum of RapidIO: Part 2
By Greg Shippen, Motorola, Inc., CommsDesign.com
November 24, 2003 (9:31 a.m. EST)
URL: http://www.eetimes.com/story/OEG20031124S0020
As designers begin evaluating new interconnect options, the RapidIO specification has once again emerged as one of the front-runner technologies for next-generation communication architectures. But, to successfully implement the technology, designers must first understand the key technical elements that make this specification come to life. This is the second installment in our detailed inside look at the RapidIO specification. In Part 1, we examined the three main layers that make RapidIO work. We also looked at the serial and parallel interfaces. Now, in Part 2, we'll further the discussion by detailing the bandwidth requirements and flow control mechanisms. We'll also look at the key requirements for building RapidIO switches and endpoints. Let's kick off the discussion by looking at the bandwidth requirements for both the serial and parallel interfaces.

Bandwidth

The RapidIO serial and parallel interfaces offer a range of bandwidth options. The 8-/16-bit parallel interface peak bandwidth ranges from 4 to 32 Gbit/s in each direction, depending on width and applied clock rate. The 1X/4X serial interface offers a peak bandwidth of 1 to 10 Gbit/s in each direction, depending on link speed and lane width. An early goal for the protocol was to minimize overhead. The parallel interface efficiency ranges from 48 to 87% for data payload sizes between 32 and 256 bytes. Over a similar payload size range, the serial interface efficiency ranges from 53 to 88%, not counting the 8B/10B encoding. These numbers include acknowledgement overhead. Given that PCI-64 reaches an efficiency of only 49 to 69% over a similar transfer size, there is evidence to suggest the efficiency goal was successfully met, an impressive feat for a system-level packet-oriented protocol.

Ordering, Flows, and Deadlock Avoidance

Most transactions in a system do not have specific ordering requirements. Some operations, however, do impose them. For example, the order of writes to an I/O device, or being able to read data updated by a preceding write, may be critical to correct operation. Deadlock avoidance is another important case in which ordering in a system matters. Unfortunately, ordering can impose significant performance limits on a system. For this reason, it is important to provide ordering only when needed and to allow system resources the freedom to reorder transactions for performance or quality-of-service purposes.

For ordering purposes, the concept of flows is defined at the logical layer of the specification. A flow is a sequence of ordered non-maintenance requests between a specific endpoint pair. Request transactions from the same source but targeting different destinations exist in unrelated flows and have no ordering requirements between them. Response transactions are not part of any flow, and there is no ordering between them. Multiple prioritized flows may exist between a source and destination pair. Request packets in higher priority flows may pass those in lower priority flows, but packets in a lower priority flow must never pass those of a higher priority. Prioritized flows are defined to allow different classes of service between endpoint pairs; the degree of differentiation in service between prioritized flows depends upon the implementation of the endpoints and switches along the flow. Within a flow, strict ordering of request transactions is required. This means writes may not pass writes, and reads push writes ahead. Because responses are not part of any flow, read and write responses may be serviced by an endpoint out of order. In practice, this means read requests may be performed out of order (though a read request must still push writes ahead). Ordering and flows are defined at the logical layer but implemented at the physical layer. Both physical layers define three flows through the use of a 2-bit packet priority field. Each packet is assigned one of four priorities. Request packets are assigned a priority based on flow level: requests in the lowest priority flow are assigned the lowest priority, the next highest priority flow is assigned the next priority, and so on.
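In practice, these passing rules reduce to a simple comparison. The following C sketch is illustrative only; the field names and the decision function are assumptions based on the rules described above, not code from the specification:

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimal view of a queued request packet. Field names are
 * hypothetical; the 2-bit priority encodes the flow level. */
typedef struct {
    uint8_t  prio;     /* physical priority, 0 (lowest) to 3 */
    uint16_t src_id;   /* source device ID */
    uint16_t dest_id;  /* destination device ID */
} req_pkt_t;

static bool same_endpoint_pair(const req_pkt_t *a, const req_pkt_t *b)
{
    return a->src_id == b->src_id && a->dest_id == b->dest_id;
}

/* May a younger request 'a' be serviced ahead of an older request 'b'?
 * Within an endpoint pair, only a strictly higher priority flow may
 * pass; equal priority means the same flow, where order is strict.
 * Unrelated flows carry no mutual ordering requirement. */
bool may_pass(const req_pkt_t *a, const req_pkt_t *b)
{
    if (same_endpoint_pair(a, b))
        return a->prio > b->prio;
    return true;
}
```

Note how the switch never needs to inspect the packet type or the transaction's function: the priority bits and the endpoint pair are sufficient to make a safe ordering decision.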
Maintenance transactions have a priority field but exist outside of other request flows. When routed through the network, maintenance packets following the same path may never pass maintenance packets of equal or higher priority. This effectively imposes strict ordering on maintenance packets between source/destination pairs.

Deadlocks occur when a dependence loop exists in the system: forward progress at any point in the loop requires progress to be made ahead of it, and no place in the loop can make forward progress. While network topologies in which loops exist for response-less transactions are forbidden, some transactions do require responses and thus have the potential for creating dependency loops. Deadlock avoidance for requests with responses must therefore be provided. The approach can be summarized as creating the circumstance in which responses can always make forward progress in the system, regardless of the presence of other transactions. This is accomplished at the physical layer by assigning responses a priority one higher than that of the associated request and optionally allowing endpoints to promote the priority of their response even higher until the packet can make forward progress. For this approach to work, all devices in the system must implement buffer management schemes that prevent higher priority packets from ever becoming blocked by lower priority packets.

Controlling Flows

RapidIO defines both link-level and end-to-end flow control. Link-level flow control exists at the physical layer and creates back-pressure from the receiver back to the transmitter side of the link. Both receiver-controlled and transmitter-based flow control are supported. Receivers retry, rather than acknowledge, a packet when buffers temporarily fill. Alternatively, transmitters may avoid retries by using receiver buffer status returned in control symbols to send packets only when buffers are available.

End-to-end flow control is supported at the logical layer and is used to control congestion when it occurs in the network. As traffic sources increase the amount of data they inject into the network, the capacity of some links can be exceeded, causing buffers behind those links to fill. This not only causes congestion along the primary pathways; head-of-line blocking can also cause congestion in unrelated paths that share common links. Flow control is accomplished using a congestion control packet (CCP) that is generated by a switch or endpoint experiencing congestion. This packet functions as an XON/XOFF: it is sent backward to turn off the source of packets and, later, to re-enable that source as congestion abates. CCP packets exist within their own flow and are independent of request and maintenance flows. CCP packets are ordered within their flow and are always sent at the highest physical priority, allowing them to pass requests from all other logical flows. CCP packets control logical flows, not physical priorities. Typically, CCP packets are generated for congestion caused by non-maintenance requests and not, for example, responses, since responses resolve congestion by releasing resources at their destination. Because logical flows are encoded in physical priority bits, and promotion of priority can complicate matters, a reverse mapping of priority to flow must be done. Unlike other packets, a CCP packet may be dropped by switches should buffers be filled. If this occurs for an XOFF packet, subsequent congestion backward from the original congestion point will cause additional CCP packets to be generated. If an XON packet is dropped, a timeout mechanism is provided to turn disabled flows back on.
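The XON/XOFF behavior at a traffic source can be sketched as a per-flow counter plus a lost-XON timeout, as described above and in the endpoint notes that follow. This is a minimal illustration; the structure, the three-flow count, and the timeout value are assumptions, not specification text:

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_FLOWS 3              /* three prioritized request flows */
#define XON_TIMEOUT_TICKS 1000   /* lost-XON recovery; value arbitrary */

/* Per-flow transmit gating state at a traffic source
 * (hypothetical structure). */
typedef struct {
    int      xoff_count;  /* outstanding XOFFs for this flow */
    uint32_t timer;       /* ticks since last XOFF; guards lost XONs */
} flow_gate_t;

static flow_gate_t gates[NUM_FLOWS];

/* Called when a CCP arrives for 'flow'; xon is true for an XON. */
void ccp_received(int flow, bool xon)
{
    flow_gate_t *g = &gates[flow];
    if (xon) {
        if (g->xoff_count > 0)
            g->xoff_count--;   /* resume only when every XOFF is matched */
    } else {
        g->xoff_count++;
        g->timer = 0;          /* restart the lost-XON timeout */
    }
}

/* Called once per timer tick: if an XON was dropped in the network,
 * the timeout eventually turns the flow back on. */
void ccp_tick(void)
{
    for (int f = 0; f < NUM_FLOWS; f++) {
        flow_gate_t *g = &gates[f];
        if (g->xoff_count > 0 && ++g->timer >= XON_TIMEOUT_TICKS)
            g->xoff_count = 0; /* assume the XON was lost; resume */
    }
}

/* Transmit gating: a flow may send only when no XOFF is pending. */
bool flow_enabled(int flow)
{
    return gates[flow].xoff_count == 0;
}
```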
Notes on Endpoints

Packets are created and consumed by endpoints, which are associated with a variety of functions including processors, bridges, and slaves. In addition to managing the RapidIO protocol itself, an endpoint's most important task is often translating to the transaction types and data sizes of some other protocol.

Each endpoint in a system is assigned a unique device ID at initialization. This ID is the routing address used as packets make their way through the network to the desired endpoint. Associated with each device ID is a set of capability, command, and status registers. Similar to those defined in PCI, these registers allow system software to identify the capabilities of the device as well as give access to control and status information. In addition, register space is set aside for extended and implementation-specific features. A set of required registers is associated with each device ID in the system; when endpoints have more than one device ID associated with them, they are required to duplicate the required registers. As a result, most implementations are likely to allocate one device ID per endpoint.

Buffer sizing and management is implementation dependent but must in general follow the deadlock avoidance rules, which require that packets and their associated operations cannot be blocked by lower priority packets and their associated operations. Endpoint designs that support end-to-end flow control, at a minimum, disable packet transmission for flows turned off by incoming CCP packets and associate a time-out counter with each in case the corresponding XON is lost in the network. XOFF CCP packets for a given flow are counted, and the flow is turned back on only when the corresponding number of XONs has been received. Endpoints may also issue CCP packets when internal buffers reach critical levels, much as switches would.

Notes on Switch Designs

Switches are key elements when creating large-scale systems, and the protocol was designed to minimize the burden placed on them. Simple switches need only examine the priority bits to decide how to order transactions in their buffers. More sophisticated switches could also examine the source and destination ID fields to determine when request packets are in the same flow and ordering must be maintained. The priority mechanism saves the switch from having to account for packet type, function, or interdependencies when making ordering decisions.

Switches are not endpoints and thus have no device ID. In general, they do not source or sink packets. The only exceptions to this rule are that switches must source and sink maintenance transactions and may optionally generate congestion control packets. The transport layer defines a destination-based routing scheme in which each switch examines the destination ID of an incoming packet, finds the ID in a routing table, and routes the packet to the output port indicated. With the exception of the link-specific AckID bits, packets proceed through the switch unmodified. No modification is necessary because the AckID bits are not covered by the packet CRC. Switch implementations can vary widely in complexity. Both store-and-forward and cut-through operation are supported. Cut-through is aided by the early CRC, which allows a switch to have confidence in the integrity of the header (and thus the priority and destination ID) before the packet has been fully routed to its destination port. The amount of buffering, the arbitration policies, and the level of service each flow receives are implementation specific. Some switches may elect to support end-to-end flow control. When supported, switches monitor the state of internal buffering and, when selected watermarks are reached, issue congestion control packets to turn off the associated flows at their sources. Switches keep track of outstanding XOFF packets on a per-flow basis and turn a flow on again when the buffer falls below the relevant watermark.
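Destination-based routing keeps the switch datapath simple: the destination ID indexes a table that yields an output port, and the packet otherwise passes through untouched. A minimal sketch, where the table layout and port encoding are assumptions for illustration:

```c
#include <stdint.h>

#define MAX_DEV_IDS  256   /* an 8-bit device ID space is assumed here */
#define PORT_INVALID 0xFF  /* marks IDs with no configured route */

/* Per-switch routing table, filled in by system software through
 * maintenance transactions at initialization. */
static uint8_t route_table[MAX_DEV_IDS];

/* Route an incoming packet: look up the destination ID and return
 * the output port. Apart from the link-level AckID bits, the packet
 * itself proceeds through the switch unmodified. */
uint8_t route_lookup(uint8_t dest_id)
{
    return route_table[dest_id]; /* PORT_INVALID if unconfigured */
}
```

Because the header (including the destination ID) is protected by the early CRC, a cut-through switch can perform this lookup and commit to an output port before the rest of the packet has arrived.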
Hot Swap

RapidIO switches and endpoints can support hot swap and high-availability features. However, the specification intentionally avoids limiting the design space and leaves some of the implementation detail to switch and endpoint implementations. The necessary protocol is present, though. For example, the maintenance port-write packet type allows switches to communicate hot swap events in-band to system hosts. The physical layers define how to detect when a link has failed and how to disable and re-enable links once the swap occurs. For high availability, multiple redundant links in both active and passive standby states are supported.
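A switch's hot-swap handling might look like the sketch below: detect a failed link, quiesce the port, and notify the host in-band with a maintenance port-write. All of the helper names and the event encoding are hypothetical; only the port-write notification mechanism comes from the text above:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hooks into a switch's link-management layer; these
 * names are illustrative, not from the RapidIO specification. */
extern bool link_trained(uint8_t port);
extern void port_disable(uint8_t port);
extern void send_maintenance_port_write(uint16_t host_id, uint8_t port,
                                        uint32_t event);

#define EVENT_LINK_FAILED 0x1  /* illustrative event code */

/* Poll each port; on a failed link, disable the port and notify the
 * system host in-band with a maintenance port-write. */
void hot_swap_poll(uint16_t host_id, uint8_t num_ports)
{
    for (uint8_t p = 0; p < num_ports; p++) {
        if (!link_trained(p)) {
            port_disable(p); /* quiesce until the swap completes */
            send_maintenance_port_write(host_id, p, EVENT_LINK_FAILED);
        }
    }
}
```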
PCI Compatibility

One of the biggest questions surrounding RapidIO is whether it is compatible with existing PCI interconnects. While PCI support is not native to the protocol, RapidIO does support both transparent and non-transparent PCI/PCI-X bridges. Transparent bridging is possible when a consistent mapping exists between PCI and RapidIO transaction types and PCI ordering rules are maintained across the RapidIO network. The RapidIO interoperability specification defines these transaction mappings, as well as the mapping of PCI transactions to RapidIO physical priority, in order to adhere to PCI ordering rules.
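To make the idea concrete, here is one plausible shape for such a mapping table. The transaction pairings and priority choices below are illustrative assumptions; the normative mappings are defined by the RapidIO interoperability specification:

```c
#include <stdint.h>

/* One plausible PCI-to-RapidIO transaction mapping, for illustration
 * only; the actual mapping is defined by the interoperability spec. */
typedef enum { PCI_MEM_READ, PCI_MEM_WRITE,
               PCI_CFG_READ, PCI_CFG_WRITE } pci_txn_t;
typedef enum { RIO_NREAD, RIO_NWRITE_R,
               RIO_MAINT_READ, RIO_MAINT_WRITE } rio_txn_t;

typedef struct {
    rio_txn_t type;
    uint8_t   prio;  /* physical priority chosen to preserve PCI ordering */
} rio_mapping_t;

static rio_mapping_t map_pci_txn(pci_txn_t t)
{
    switch (t) {
    case PCI_MEM_WRITE: return (rio_mapping_t){ RIO_NWRITE_R,    0 };
    case PCI_MEM_READ:  return (rio_mapping_t){ RIO_NREAD,       0 };
    case PCI_CFG_WRITE: return (rio_mapping_t){ RIO_MAINT_WRITE, 1 };
    case PCI_CFG_READ:  return (rio_mapping_t){ RIO_MAINT_READ,  1 };
    }
    return (rio_mapping_t){ RIO_NREAD, 0 }; /* unreachable */
}
```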
Wrap Up

Many viable interconnect solutions exist for chip- and board-level interconnects. However, at the system level, requirements converge to present a more demanding picture. To effectively address system-level architectures, arbitrary topologies and peer-to-peer operation must be supported. Reliable service for control plane traffic, as well as effective congestion control for the data plane, must be available. A variety of transaction types that can support existing software paradigms is required. In addition, an ideal system-level interconnect would also offer a straightforward and efficient protocol using industry-standard PHY technology.

RapidIO technology addresses each of these requirements while offering low overhead and wide functionality. RapidIO technology presents, for the first time, the opportunity for designers to leverage an open industry standard at the system level for both control and data plane applications.

Author's Note: To find out more information on the RapidIO specifications, visit the Trade Association's web site at www.rapidio.org.

Editor's Note: To view Part 1 of this article, click here.

About the Author
Greg Shippen currently serves as chairman of the Technical Working Group for the RapidIO Trade Association. He is also a system architect for Motorola Semiconductor, where he has been involved in PowerQUICC processor design and other RapidIO-related product definition and development. Greg holds an M.S.C.E. from the University of Southern California and a B.S.E.E. from Brigham Young University. He can be reached at greg.shippen@motorola.com.