HyperTransport: an I/O strategy for low-cost, high-performance designs

By Brian Holden, Principal Engineer, PMC-Sierra MIPS Processor Division, Santa Clara, Calif., Technical Working Group Chair, HyperTransport Consortium, EE Times
January 17, 2003 (11:21 a.m. EST)
URL: http://www.eetimes.com/story/OEG20030116S0039

At a time when embedded processors are topping GHz clock frequencies and 64-bit processors are coming into their own in embedded systems, the choice of chip-to-chip I/O technology can be a significant factor in the success of a given embedded system. The reason is partly technical and partly business. On the technical side, whatever technology is chosen to interconnect high-speed processor complexes (often consisting of multiple 64-bit processors) must deliver high data throughput and must be compatible with existing legacy I/O protocols such as PCI.

On the business side, in today's telecom and embedded markets, performance must be delivered cost-effectively. This means the interconnect technology must deliver better price/performance than the buses it replaces and must scale in affordable ways.

On both counts, HyperTransport technology is a significant improvement over alternative I/O strategies. On the one hand, it can deliver up to 12.8 Gbytes/second of bandwidth to support multiple GHz+ 64-bit processors as well as emerging I/O technologies such as InfiniBand and 10 Gigabit Ethernet. On the other, it can be implemented in a scalable fashion throughout the system, so that wider and faster links are in place between processor complexes and narrower links are in place between slower legacy I/O devices and the main subsystems.

HyperTransport chip-to-chip interconnect technology lowers system cost in a number of ways. It is scalable, enabling designers to apply just the right amount of bandwidth at each point in the system. It uses enhanced, low-power 1.2V, point-to-point low voltage differential signaling (LVDS) technology to save on power consumption. Its links are specifically designed to enable the use of low-cost 4-layer printed circuit boards. It is also widely used in high volume products such as personal computers, servers, and even game consoles.

Finally, access to HyperTransport IP is inexpensive: any member company of the HyperTransport Consortium receives a license to use the technology as part of its modest membership dues. As a result, a number of companies support HyperTransport technology and there is a thriving HyperTransport ecosystem. This in turn reduces the price of HyperTransport components and lowers the cost of developing embedded systems that use HyperTransport technology.

HyperTransport interconnect technology was designed to replace and improve upon the existing multilevel buses used in systems such as personal computers, servers and embedded systems while maintaining software compatibility with PCI, the most ubiquitous I/O bus in use. To improve on speed and manufacturability, HyperTransport was defined as a dual, unidirectional point-to-point link as opposed to a multi-drop bus structure. Low-power 1.2V enhanced LVDS signaling was specified.

Double-data-rate transfers increase data throughput because information is exchanged on both the rising and falling edges of each clock. This yields a rate of 1.6 billion data transfers/second per data pair. Finally, a packetized data protocol was chosen that eliminates many sideband signals (control and command signals) and makes variable-width, asymmetric data paths possible.
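The bandwidth figures quoted in this article can be checked with a little arithmetic. The sketch below is illustrative only: the function name is hypothetical, the 800 MHz clock is an assumption inferred from 1.6 billion transfers/second at two transfers per clock, and the 12.8 Gbytes/second figure is assumed to count both directions of a 32-bit link.

```python
def link_bandwidth_gbytes(width_bits, clock_mhz=800.0, both_directions=True):
    """Peak bandwidth in Gbytes/s of a double-data-rate point-to-point link.

    clock_mhz=800 is an assumption derived from the article's figure of
    1.6 billion transfers/second per data pair (two transfers per clock).
    """
    transfers_per_sec = clock_mhz * 1e6 * 2      # rising and falling edge
    bits_per_sec = transfers_per_sec * width_bits
    if both_directions:
        bits_per_sec *= 2                        # dual unidirectional links
    return bits_per_sec / 8 / 1e9


for width in (2, 4, 8, 16, 32):
    print(f"{width:2d}-bit link: {link_bandwidth_gbytes(width):5.1f} Gbytes/s")
```

Under these assumptions a 32-bit link reaches the 12.8 Gbytes/second peak, while an 8-bit link to a legacy bridge provides 3.2 Gbytes/second.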

These electrical and protocol choices provide several benefits to the embedded system designer. HyperTransport interconnect technology is low-power, high-throughput and easily configured to fit the bandwidth requirements throughout the system. For example, in high-performance systems that use multiple processors, full 32-bit wide links can be employed to provide the maximum processor-to-processor bandwidth.

In the same system, 8-bit wide HyperTransport links can be deployed to connect to older PCI-based I/O subsystems through HyperTransport-to-PCI bridges. These narrow links are compatible with the wider links because HyperTransport moves its data in packets. The packets can travel over 2-, 4-, 8-, 16-, or 32-bit wide paths and move as easily (although at a lower data throughput) over 2-bit wide paths as over 32-bit wide paths.
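To see why the same packet can cross links of different widths, consider the following minimal sketch. It slices a four-byte packet into per-transfer values for a given link width and reassembles it at the far end. The function names and the least-significant-bits-first ordering are assumptions made for illustration, not the bit ordering defined by the HyperTransport specification.

```python
def serialize(packet: bytes, width_bits: int):
    """Split a packet into the values carried on each transfer of the link."""
    value = int.from_bytes(packet, "little")
    total_bits = len(packet) * 8
    mask = (1 << width_bits) - 1
    # One list entry per transfer; assumed order is least significant bits first.
    return [(value >> shift) & mask for shift in range(0, total_bits, width_bits)]


def deserialize(transfers, width_bits: int, packet_len: int) -> bytes:
    """Rebuild the original packet from the per-transfer values."""
    value = 0
    for i, chunk in enumerate(transfers):
        value |= chunk << (i * width_bits)
    return value.to_bytes(packet_len, "little")


packet = bytes([0x12, 0x34, 0x56, 0x78])  # hypothetical four-byte packet
for width in (2, 4, 8, 16, 32):
    transfers = serialize(packet, width)
    assert deserialize(transfers, width, len(packet)) == packet
    print(f"{width:2d}-bit link: {len(transfers):2d} transfers")
```

The packet contents are identical in every case; only the number of transfers changes, from 16 on a 2-bit link down to a single transfer on a 32-bit link.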

Speed and elegance
As compared to older I/O buses such as PCI, HyperTransport interconnect technology is faster, more elegant and less expensive to deploy. In addition, it is PCI software compatible so that for systems using PCI I/O devices, there is no additional overhead at the operating system or driver level. For example, HyperTransport devices are configured using the PCI configuration header protocol where information specific to the device is contained in the PCI configuration space.
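As a rough illustration of what PCI software compatibility means in practice, the sketch below walks the standard PCI capability list of a device's configuration space and looks for the HyperTransport capability block (capability ID 0x08). The register offsets follow the standard PCI configuration header; the configuration space contents, vendor ID and device ID are made-up test data, and the helper names are hypothetical.

```python
import struct

HT_CAPABILITY_ID = 0x08   # PCI capability ID used by HyperTransport devices
STATUS_CAP_LIST = 0x0010  # Status register bit 4: capability list present


def read_ids(cfg: bytes):
    """Return (vendor_id, device_id) from offsets 0x00 and 0x02."""
    return struct.unpack_from("<HH", cfg, 0x00)


def find_capability(cfg: bytes, cap_id: int):
    """Return the offset of the first capability with cap_id, or None."""
    status = struct.unpack_from("<H", cfg, 0x06)[0]
    if not status & STATUS_CAP_LIST:
        return None
    ptr = cfg[0x34] & 0xFC          # capabilities pointer, low 2 bits reserved
    seen = set()
    while ptr and ptr not in seen:  # guard against malformed, looping lists
        seen.add(ptr)
        if cfg[ptr] == cap_id:
            return ptr
        ptr = cfg[ptr + 1] & 0xFC   # next-capability pointer
    return None


# Made-up 256-byte configuration space for illustration only.
cfg = bytearray(256)
struct.pack_into("<HH", cfg, 0x00, 0xABCD, 0x1234)  # hypothetical IDs
struct.pack_into("<H", cfg, 0x06, STATUS_CAP_LIST)
cfg[0x34] = 0x40
cfg[0x40], cfg[0x41] = 0x01, 0x50              # power management, then next
cfg[0x50], cfg[0x51] = HT_CAPABILITY_ID, 0x00  # HyperTransport, end of list

print(read_ids(cfg), hex(find_capability(cfg, HT_CAPABILITY_ID)))
```

Because the lookup uses only ordinary PCI configuration accesses, existing enumeration code can discover such a device without knowing anything about the link underneath.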

Unlike traditional parallel multi-drop buses like PCI, HyperTransport needs far fewer signal lines, and the clocking scheme is simplified because of the use of differential signaling. This technique uses two wires for each signal, with the result being the difference between the two signals. This approach is free from the problems associated with single-ended signaling on high-speed parallel buses, such as bouncing signals, interference and cross-talk from adjacent lines.
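The noise immunity of differential signaling can be shown with a toy model. In the sketch below, noise couples equally onto both wires of a pair, so it cancels when the receiver takes the difference, whereas a single-ended receiver comparing one wire against a fixed threshold is fooled. The voltage levels and noise values are arbitrary illustrative numbers, not figures from the HyperTransport specification.

```python
COMMON_MODE = 0.6  # illustrative mid-point voltage
SWING = 0.3        # illustrative differential swing


def drive_pair(bit):
    """Return (true_wire, complement_wire) voltages for one bit."""
    half = SWING / 2
    if bit:
        return COMMON_MODE + half, COMMON_MODE - half
    return COMMON_MODE - half, COMMON_MODE + half


def receive_differential(true_v, comp_v):
    """The data value is the sign of the difference between the two wires."""
    return 1 if true_v - comp_v > 0 else 0


def receive_single_ended(true_v):
    """Compare a single wire against a fixed threshold."""
    return 1 if true_v > COMMON_MODE else 0


bits = [1, 0, 1, 1, 0]
noise = [0.25, -0.4, 0.1, -0.2, 0.35]  # same noise hits both wires of the pair

differential, single_ended = [], []
for bit, n in zip(bits, noise):
    t, c = drive_pair(bit)
    differential.append(receive_differential(t + n, c + n))
    single_ended.append(receive_single_ended(t + n))

print("sent:        ", bits)
print("differential:", differential)
print("single-ended:", single_ended)
```

In this model the differential receiver recovers every bit, while the single-ended receiver misreads the last two.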

Where the PCI bus may require 32 bits of address and 32 bits of data using multiplexed, bidirectional signal lines, the same amount of I/O data can be piped through HyperTransport links using as few as 2 signal pairs (4 lines total, 2 per data bit) or as many as 32 signal pairs (64 lines total, 2 per data bit). Since the lines are unidirectional, clock skew is minimized.

Because the command and address information is encoded in the four-byte data packets, there is no need for additional timing and control signals. An 8-bit signal path can be constructed using 16 signal lines, a control signal and a single clock signal. This simplifies the function of the I/O link to high-speed data movement while the protocol encodes data, address and command information. By simplifying the electrical characteristics of the link, HyperTransport makes implementation of high-speed links on a standard four-layer printed circuit board simple and low cost.
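The signal budget for one direction of a link can be tallied as follows. This is a rough sketch, not a pin-out: it assumes one control signal per direction regardless of width, one clock per group of up to eight data signals (as the specification is described in this article), and that the control and clock signals are differential pairs like the data signals. Power, ground and reset pins are ignored, and the function name is hypothetical.

```python
import math


def link_signals(width_bits):
    """Tally the signals for ONE direction of a link of the given width."""
    data_pairs = width_bits
    ctl_pairs = 1                          # assumed: one control per direction
    clk_pairs = math.ceil(width_bits / 8)  # one clock per up to 8 data signals
    total_pairs = data_pairs + ctl_pairs + clk_pairs
    return {
        "data_lines": 2 * data_pairs,      # two wires per data bit
        "ctl_pairs": ctl_pairs,
        "clk_pairs": clk_pairs,
        "total_wires": 2 * total_pairs,
    }


for width in (2, 4, 8, 16, 32):
    print(width, link_signals(width))
```

For an 8-bit path this gives the 16 data lines, one control signal and one clock signal mentioned above; a 32-bit path would carry four clocks under the same assumptions.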

Another concern for high volume embedded systems is designing low-cost printed circuit boards that are not susceptible to clock and signal skew. In addition to enabling much narrower links using easy-to-implement, low-power 1.2-V LVDS signals, the HyperTransport specification defines an independent clock for every grouping of up to eight LVDS signals. This greatly reduces clock skew. Signal skew is reduced by keeping propagation delays equal, by reducing electrical issues, and by minimizing routing distance variations. To keep trace lengths the same, designers often have to resort to "tricks" such as snake-like patterns or swizzles along the signal trace, which waste board space and are inefficient. HyperTransport technology improves on this by specifying the route layout from die pad to die pad instead of forcing the designer to work out the best way to minimize skew. This technique is called naturally compensating trace length matching.

Naturally compensating trace length matching involves designing the transmitter and receiver so that the package traces are routed to lengths that are either matched or deliberately mismatched, depending on the device ball-out or pin-out. Each HyperTransport signal pair consists of a "true" and a "complement" signal, the data value being the difference between the two. Inside the package, the true signal is connected to a ball one row further from the die than the complement, so that there is an intentional length mismatch within the package.

When signal traces are then connected from package to package, that mismatch is offset exactly by the mismatch on the board. This creates a zero mismatch and reduces the skew inherent in traditional designs. HyperTransport signals can be laid down in straight lines between packages while still meeting the stringent timing requirements of the high-speed I/O technology. Other I/O technologies require much more complex routing of signals to meet the electrical specifications, making them more costly and less reliable.
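The cancellation can be illustrated with a simple length budget. The sketch below uses an idealized geometry that is an assumption, not taken from the specification: the true ball sits one row further from the die in both packages, the extra package trace length equals one ball row pitch, and the two packages face each other so that straight board traces to the outer row are shorter by one pitch at each end. All dimensions are made-up numbers.

```python
def pair_lengths(pkg_trace_mm, board_trace_mm, row_pitch_mm):
    """Return (true_len, complement_len) measured from die pad to die pad."""
    # Complement: inner-row ball in both packages, full-length board trace.
    complement = pkg_trace_mm + board_trace_mm + pkg_trace_mm
    # True: one row further out in each package (longer package trace),
    # so the straight board trace between the packages is shorter.
    true = (
        (pkg_trace_mm + row_pitch_mm)
        + (board_trace_mm - 2 * row_pitch_mm)
        + (pkg_trace_mm + row_pitch_mm)
    )
    return true, complement


true_len, comp_len = pair_lengths(pkg_trace_mm=5.0, board_trace_mm=50.0,
                                  row_pitch_mm=1.0)
print(f"true: {true_len} mm, complement: {comp_len} mm, "
      f"mismatch: {true_len - comp_len} mm")
```

In this model the package mismatch and the board mismatch cancel, so both wires of the pair are the same length end to end without any serpentine routing.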

These electrical features make HyperTransport very easy to implement in low-cost four-layer printed circuit board technologies, greatly reducing the cost of high volume embedded systems while maintaining a high data throughput and excellent signal integrity.