Non-transparent bridging allows multiprocessor design with PCI Express
Larry Chisvin (08/02/2004 9:00 AM EDT) URL: http://www.eetimes.com/showArticle.jhtml?articleID=26100745
There are significant differences between the PCI Express architecture and its future cousin Advanced Switching Interconnect (ASI), but contrary to popular belief, the ability to build robust multiprocessor systems is not among them. The PCI Express standard offers a state-of-the-art serial interconnect technology with the same software architectural world view as PCI, and as such benefits from the same system-level capabilities. PCI-based systems today offer multiprocessor support, and PCI Express-based systems can and will be built with this same capability.
PCI-based systems that today provide multiprocessor capability use a mechanism called non-transparent bridging. Simply put, a transparent PCI-to-PCI bridge isolates two PCI bus segments, offering the ability to extend the number of loads, or to match different operating frequencies, bus widths or voltages. A non-transparent PCI-to-PCI bridge adds address-domain isolation between the primary and secondary bus segments. The bridge masquerades as an end point to discovery software, and translates addresses between the two domains.
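As a rough illustration, the translation performed by a non-transparent bridge can be modeled as a base-and-limit remapping between the two address domains. The sketch below is only a conceptual model; the structure and field names are hypothetical and do not correspond to any particular device's register set.

```c
#include <stdint.h>

/* Hypothetical model of one downstream window of a non-transparent bridge.
 * Real devices expose similar base/limit/translation registers, but the
 * names and layout are device-specific.                                   */
struct nt_window {
    uint64_t base;        /* window base in the local (primary) domain     */
    uint64_t limit;       /* window limit in the local domain              */
    uint64_t translation; /* corresponding base in the remote domain       */
};

/* Translate a local address into the remote address domain, or return 0
 * if the address does not fall inside the bridge's window.                */
static uint64_t nt_translate(const struct nt_window *w, uint64_t local_addr)
{
    if (local_addr < w->base || local_addr > w->limit)
        return 0;                    /* outside the window: not forwarded  */
    return w->translation + (local_addr - w->base);
}
```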
To understand the importance of domain separation to multiprocessor systems, consider what would happen without it. Single-processor PCI-based systems depend on a host processor to enumerate and configure the system, and to handle interrupts and error conditions. The host, in effect, owns the system. If there are two or more processors in the system and no special attention is given to separating them, each one will try to provide the host function and the processors will battle for control of the system.
In a multiprocessor system, non-transparent bridges place all but one of the processors in their own address domains. One processor, behind a transparent bridge, remains the fabric manager; it enumerates and configures the system and handles serious error conditions. The fabric manager in this type of system provides many of the functions normally associated with a host. The intelligent subsystems behind the non-transparent bridges enumerate up to the bridge, and do not directly sense that there is a larger system beyond.
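To see why discovery stops at the non-transparent bridge, recall that a transparent PCI-to-PCI bridge presents a Type 1 configuration header, which tells the enumerator to scan the secondary bus behind it, while a non-transparent bridge presents a Type 0 (endpoint) header. A simplified enumeration step might distinguish the two as sketched below; read_config_byte() and scan_bus() are placeholders for the platform's own routines, not a real API.

```c
#include <stdint.h>

#define HEADER_TYPE_OFFSET 0x0E   /* standard PCI config-space offset          */
#define HEADER_TYPE_BRIDGE 0x01   /* Type 1: transparent PCI-to-PCI bridge     */

/* Placeholders for the platform's configuration-space access routines. */
extern uint8_t read_config_byte(int bus, int dev, int fn, int offset);
extern void    scan_bus(int bus);

/* During enumeration, only a Type 1 header causes the scan to descend.
 * A non-transparent bridge reports a Type 0 (endpoint) header, so the
 * subsystem behind it is never discovered by this processor.            */
static void probe_function(int bus, int dev, int fn)
{
    uint8_t hdr = read_config_byte(bus, dev, fn, HEADER_TYPE_OFFSET) & 0x7F;

    if (hdr == HEADER_TYPE_BRIDGE) {
        int secondary = read_config_byte(bus, dev, fn, 0x19); /* secondary bus */
        scan_bus(secondary);   /* transparent: keep enumerating downstream     */
    }
    /* hdr == 0x00: an ordinary endpoint, or a non-transparent bridge
     * masquerading as one, so enumeration stops here.                   */
}
```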
Once the main issue in PCI-based multiprocessor design — namely address-domain separation — has been solved, other important matters need to be considered. These include inter-processor communication, peer-to-peer data transfer, distributed interrupt handling and host or fabric manager fail-over.
The processors in a system need to communicate with each other in some manner to share status and data, and in a PCI Express-based platform they accomplish this through address mapping and translation. This mapping mechanism allows specific, protected interaction through the use of base address and offset registers. Each intelligent subsystem opens a matching window in the other subsystem's memory. Communication can also occur through the use of so-called doorbell registers that initiate interrupts to the alternate domain, and scratchpad registers accessible from both sides.
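A minimal sketch of that interaction is shown below, assuming the doorbell and scratchpad registers have already been mapped into the local processor's address space. The register block layout, offsets and names are illustrative only, not those of a specific part.

```c
#include <stdint.h>

/* Illustrative register block of a non-transparent bridge, as seen once the
 * BAR that exposes it has been mapped into local memory space.  Offsets and
 * field names are hypothetical.                                             */
struct nt_regs {
    volatile uint32_t scratchpad[8];  /* readable/writable from both domains  */
    volatile uint32_t doorbell_out;   /* write: raise interrupt on the far side */
    volatile uint32_t doorbell_in;    /* read/clear: interrupts raised locally  */
};

/* Post a status word and ring the other processor's doorbell. */
static void notify_peer(struct nt_regs *regs, uint32_t status)
{
    regs->scratchpad[0] = status;   /* share data through a scratchpad        */
    regs->doorbell_out  = 1u << 0;  /* trigger an interrupt in the other domain */
}
```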
Peer-to-peer transfer is easily accomplished in both PCI and PCI Express, since all traffic flow is address-based. If data needs to be sent from one subsystem to another, the source merely has to target the data at the address range of the destination subsystem.
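In code, a peer-to-peer transfer is nothing more than a copy aimed at the locally mapped window that the non-transparent bridge translates into the destination's address domain. The sketch below assumes such a window has been set up as described earlier; in practice a DMA engine, rather than the CPU, would usually move the data.

```c
#include <stdint.h>
#include <string.h>

/* Copy a buffer to a peer subsystem.  'peer_window' is assumed to be the
 * locally mapped address range that a non-transparent bridge translates
 * into the destination subsystem's address domain.                       */
static void send_to_peer(void *peer_window, const void *buf, size_t len)
{
    memcpy(peer_window, buf, len);  /* writes are address-routed to the peer */
}
```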
Distributed interrupt handling takes a little more effort. There are several different interrupt mechanisms in PCI, one of which is the message signaled interrupt (MSI). This addition to the PCI-X specification is the main interrupt method for PCI Express, since interrupts are handled in-band.
An MSI interrupt in PCI Express uses a posted memory write packet format that allows it to be address-routed to any processor in the system. Non-transparent bridges also include incoming and outgoing mailbox interrupt registers that allow specially formatted interrupts to be sent from one processor to another.
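Because an MSI is just a posted memory write of a preprogrammed data value to a preprogrammed address, its routing falls out of the same address-based mechanism as ordinary traffic. The sketch below models that write; the address and data values are placeholders that system software would assign when MSI is configured, not real platform constants.

```c
#include <stdint.h>

/* An MSI is delivered as a posted memory write: the device writes a
 * preprogrammed data value to a preprogrammed address.  Because it is an
 * ordinary address-routed write, it can target any processor's interrupt
 * logic in the system.  The values here are placeholders.               */
struct msi_vector {
    volatile uint32_t *address;  /* message address programmed into the device */
    uint32_t           data;     /* message data programmed into the device    */
};

static void signal_msi(const struct msi_vector *v)
{
    *v->address = v->data;       /* the posted write that raises the interrupt */
}
```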
The final multiprocessor issue here is fabric manager fail-over, dealt with in a well-defined and time-tested manner. Since each subsystem sits behind a bridge, each processor node can be either transparent (fabric manager) or non-transparent (other intelligent node). The other intelligent nodes, behind non-transparent bridges, can be intelligent I/O (such as a RAID controller), backup fabric managers or active processor subsystems that operate in parallel.
Universal bridges that can operate in transparent or non-transparent mode offer the ability to move the fabric manager to another subsystem should it become necessary. Heartbeat messages are sent from the fabric manager to the backup processors, and checkpoint mechanisms provide a clear path to resuming operation, should fail-over become necessary.
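One way to picture the fail-over policy on a backup node: it watches for heartbeats from the fabric manager and, if they stop arriving, restores the last checkpoint and promotes itself. This is only an outline of the behavior described above; the helper functions and the timeout value are hypothetical.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical helpers: in a real system these might read a scratchpad or
 * mailbox register updated by the fabric manager, reload the last saved
 * checkpoint, and switch the local universal bridge to transparent mode. */
extern bool heartbeat_received_since(time_t t);
extern void restore_checkpoint(void);
extern void switch_bridge_to_transparent_mode(void);

#define HEARTBEAT_TIMEOUT_SECONDS 2   /* illustrative value */

/* Backup fabric manager: take over if heartbeats stop arriving. */
static void monitor_fabric_manager(void)
{
    time_t last_seen = time(NULL);

    for (;;) {
        if (heartbeat_received_since(last_seen))
            last_seen = time(NULL);
        else if (time(NULL) - last_seen > HEARTBEAT_TIMEOUT_SECONDS) {
            restore_checkpoint();                 /* resume from known state   */
            switch_bridge_to_transparent_mode();  /* become the fabric manager */
            break;
        }
    }
}
```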
Finally, the end point associated with a non-transparent bridge provides the opportunity to load a filter driver that can hide the non-transparent bridge and associated multiprocessor issues from standard software.
The anticipated release next year of early ASI silicon has created excitement in the high-end communications and storage community, since it promises a powerful mechanism to provide very large, scalable multiprocessor systems. It is worthwhile to explore how ASI accomplishes this, and compare it to the use of non-transparent bridging using PCI Express silicon.
ASI eliminates the need for non-transparent bridging by dispensing with address routing entirely. Instead, it uses routing based on the path through the switches in a system. Each end point has a network map, and the packet header contains the directions that the packet needs to get to its destination.
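The contrast with address routing shows up in what a switch has to examine to forward a packet. The structures below are an illustrative comparison only, under the assumption of a simple per-hop path encoding; they are not the actual PCI Express or ASI header formats.

```c
#include <stdint.h>

/* Illustrative contrast only; not the real packet formats.
 * PCI Express routes memory traffic by the target address, so each switch
 * compares the address against its configured windows.  ASI instead carries
 * the path from source to destination, so switches follow the recorded
 * turns and need no address map of their own.                              */
struct addr_routed_header {
    uint64_t target_address;     /* compared against each switch's windows   */
};

struct path_routed_header {
    uint8_t  turns[8];           /* exit port to take at each hop            */
    uint8_t  hop_count;          /* number of valid entries in 'turns'       */
};
```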
This is an elegant method that can be used to scale high-bandwidth, complex multiprocessor systems uniformly, but since it breaks with the address-routing paradigm of PCI, it also breaks the software's backward compatibility. It also means that new silicon, aware of the new protocol, is necessary.
The good news is that since ASI is compatible with the PCI Express architecture at the physical and data-link level, systems that contain a mixture of the two can reasonably be constructed, with the two interconnects complementing each other. By the time ASI silicon has been sampled and is ready to be integrated into actual systems, there will be a wide variety of native-mode PCI Express silicon products, including every major type of I/O and most of the major CPU architectures.
The CPUs that will be available with integrated PCI Express ports are especially important, since the initial deployment of ASI-based systems will rely on connecting multiple processors, none of which will understand the ASI protocol. These processors will connect to the ASI backbone through non-transparent PCI Express bridges, and will then co-exist with the native ASI silicon. Such powerful, scalable ASI networks connecting PCI Express-based subsystems will be the norm for the foreseeable future.
Larry Chisvin (lchisvin@plxtech.com) is vice president of marketing at PLX Technology Inc. (Sunnyvale, Calif.).