Why Interlaken is a great choice for architecting chip-to-chip communications in AI chips
By Chip Interfaces
Modern Artificial Intelligence (AI) chips are designed to meet a variety of rigorous requirements to handle the intensive demands of AI and machine learning workloads. These chips need high computational throughput to execute numerous parallel operations efficiently, with low latency to support real-time processing. Energy efficiency is critical, balancing high performance with manageable power consumption to reduce operational costs. High bandwidth is essential for rapid data transfer between processing units, minimizing bottlenecks. The architecture must also support scalable and flexible designs to accommodate various AI models and deployment environments, from edge devices to large data centers. Efficient data handling and communication capabilities are vital to ensure swift data movement across different chip sections and between multiple chips, supporting heterogeneous computing environments. Additionally, robust error handling features are necessary to maintain data integrity. These architectural needs drive the development of sophisticated interconnect solutions that maximize the performance and reliability of AI chips.
The Interlaken protocol is an advanced interconnect technology that effectively addresses the architecture and design requirements of AI chips. It provides high bandwidth through multi-gigabit-per-second lanes, facilitating the handling of large data volumes and sustaining high computational throughput. Its design minimizes latency, ensuring efficient, low-overhead communication crucial for real-time AI applications. Interlaken is optimized for energy-efficient data transfer, reducing power consumption by maximizing payload efficiency. The protocol's scalable architecture allows for configurable bandwidth, making it suitable for a range of AI deployments from edge to cloud. Interlaken's high-speed serial links ensure fast data movement across different chip sections and between multiple chips, supporting heterogeneous computing environments. Robust error detection and correction mechanisms enhance data reliability. By fulfilling these critical requirements, Interlaken enhances the performance, efficiency, and reliability of modern AI chips.
Case study of what makes Interlaken great for chip-to-chip communications for AI
To understand what makes Interlaken great for chip-to-chip communications, let's explore an AI application case study in which multiple AI chips are placed on a single card. The same approach scales equally well to multiple AI chiplets within a single package, the only difference being the type of SerDes used. The short distances within the chip matrix call for lightweight point-to-point communication without the need for routing. In this use case, Interlaken, with its low overhead and small footprint, is the perfect candidate.
Card-to-card communication from the chips on the edge of the matrix can be carried out efficiently with Ethernet IP; when connected via switches, Ethernet additionally allows message routing and reduces the number of communication lines needed to reach multiple targets on other cards. While high-speed Ethernet MAC solutions offer a latency of around 200 ns, the Chip Interfaces Interlaken controller offers an ultra-low-latency solution with a sub-100 ns round trip that AI applications can truly benefit from.
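To put those figures in perspective, here is a minimal latency-budget sketch. It uses only the round-trip numbers quoted above (roughly 200 ns for a high-speed Ethernet MAC, under 100 ns for the Interlaken controller) and accumulates them over a chain of chip-to-chip hops; the hop counts and constants are illustrative assumptions, not measured vendor data.

```python
# Illustrative latency-budget sketch, not vendor data: it only uses the
# round-trip figures quoted above (~200 ns for a high-speed Ethernet MAC,
# <100 ns for the Interlaken controller) and hypothetical hop counts.

ETHERNET_MAC_RTT_NS = 200   # approximate figure quoted for a high-speed Ethernet MAC
INTERLAKEN_RTT_NS = 100     # upper bound of the sub-100 ns Interlaken round trip

def chain_latency_ns(hops: int, per_hop_rtt_ns: float) -> float:
    """Accumulated round-trip latency across a chain of chip-to-chip hops."""
    return hops * per_hop_rtt_ns

if __name__ == "__main__":
    for hops in (1, 2, 4):  # hypothetical depths of a chip matrix
        eth = chain_latency_ns(hops, ETHERNET_MAC_RTT_NS)
        ilkn = chain_latency_ns(hops, INTERLAKEN_RTT_NS)
        print(f"{hops} hop(s): Ethernet ~{eth:.0f} ns vs Interlaken <{ilkn:.0f} ns")
```

The gap widens with every hop in the matrix, which is why the controller latency dominates the budget in multi-chip topologies.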
Low latency and a small footprint are not the only benefits of Interlaken. The ability to apply flow control to individual source data channels, and the capability to recover from errors either via the retransmission mechanism or via Reed-Solomon Forward Error Correction, all handled inside the controller IP transparently to the application, are a great benefit to overall solution performance.
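To illustrate what per-channel flow control means in practice, the sketch below models a simple high/low watermark scheme in which one logical channel can be backpressured while the others keep flowing. The class, thresholds, and method names are hypothetical teaching constructs, not the interface of the Chip Interfaces controller IP.

```python
# Minimal, hypothetical model of per-channel backpressure: each logical channel
# is paused (XOFF) when its receive buffer crosses a high watermark and resumed
# (XON) when it drains below a low watermark. Names and thresholds are
# illustrative only, not part of the Chip Interfaces IP.

class ChannelFlowControl:
    def __init__(self, high_watermark: int, low_watermark: int):
        self.high = high_watermark
        self.low = low_watermark
        self.fill = 0          # current buffer occupancy in words
        self.xoff = False      # True -> sender must pause this channel

    def receive(self, words: int) -> None:
        self.fill += words
        if self.fill >= self.high:
            self.xoff = True   # assert backpressure for this channel only

    def drain(self, words: int) -> None:
        self.fill = max(0, self.fill - words)
        if self.fill <= self.low:
            self.xoff = False  # release backpressure, sender may resume

# Other channels keep flowing while one is backpressured:
channels = [ChannelFlowControl(high_watermark=48, low_watermark=16) for _ in range(4)]
channels[2].receive(64)               # only channel 2 fills up
print([ch.xoff for ch in channels])   # [False, False, True, False]
channels[2].drain(50)
print(channels[2].xoff)               # False again
```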
Interlaken can operate on the same SerDes technology as 100G Ethernet, effectively allowing an architecture where the limited device I/O is shared over the same PHY between Ethernet and Interlaken. This allows for a truly flexible, multiplexed design when building an array of chips or chiplets: devices are interconnected with Interlaken, while the edge devices of the matrix use their Ethernet capability for further Ethernet-switched networking. A single card carries multiple AI chips, and card-to-card connectivity is carried out over Ethernet with routing.
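The sketch below shows one way to think about such a multiplexed design: every device has a fixed pool of SerDes lanes, interior devices spend all of them on Interlaken links to their neighbours, and edge devices reserve a group for Ethernet toward the card-to-card switch. The device names, lane counts, and partitioning policy are assumptions made up for illustration.

```python
# Hypothetical lane-assignment sketch for the multiplexed architecture described
# above. Lane counts and device names are invented for illustration only.

from dataclasses import dataclass, field

@dataclass
class Device:
    name: str
    total_lanes: int
    is_edge: bool
    assignments: dict = field(default_factory=dict)  # protocol -> lane count

    def partition(self, ethernet_lanes: int = 8) -> None:
        # Edge devices donate some lanes to Ethernet for card-to-card traffic;
        # interior devices use every lane for point-to-point Interlaken links.
        eth = ethernet_lanes if self.is_edge else 0
        self.assignments = {
            "ethernet": eth,                      # card-to-card, switched
            "interlaken": self.total_lanes - eth, # chip-to-chip, point-to-point
        }

card = [Device("ai0", 32, is_edge=True),
        Device("ai1", 32, is_edge=False),
        Device("ai2", 32, is_edge=False),
        Device("ai3", 32, is_edge=True)]

for dev in card:
    dev.partition()
    print(dev.name, dev.assignments)
```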
Chip Interfaces Interlaken IP
The Chip Interfaces Interlaken IP core is a highly optimized, silicon- and PHY-agnostic implementation of the Interlaken protocol version 1.2 targeting both ASICs and FPGAs. The Interlaken controller supports up to 2.6 Tbps of high-bandwidth performance and comes with an integrated Media Access layer. It is well suited to chip-to-chip transfers, has an extensive feature set, and scales in number of logical channels (up to 2048), lanes (up to 48), and lane speed (up to 116 Gbps). It features the RS FEC extension for operation over PAM4 links and the Retransmit capability, removing the need for the application layer to handle data retransmission. In-band and out-of-band flow control is available to backpressure channels and allow rate matching on the links.
Key Features:
- 48 lanes at up to 116 Gbps per lane
- 2.6 Tbps total bandwidth per IP instance (see the bandwidth sketch after this list)
- Fast AMBA CXS application-side interface, 256-8192 bits wide, with 2-8 segments
- FEC extension for use on noisy links or with PAM4 signaling
- Retransmit extension to handle data errors without application assistance
- Up to 2048 channels; multiple data sources allow burst interleaving
- In-band and out-of-band per-channel flow control, allowing rate matching
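As a quick back-of-the-envelope check of the headline numbers above, raw serial bandwidth scales as lanes times lane rate, while the usable figure per IP instance is quoted at 2.6 Tbps and is further reduced by 64b/67b framing and protocol overhead. The configurations below are illustrative examples, not recommended settings.

```python
# Back-of-the-envelope aggregate bandwidth for a few example lane configurations.
# Raw serial bandwidth scales as lanes x lane rate; usable throughput is lower
# after framing and protocol overhead, and the controller's quoted ceiling is
# 2.6 Tbps per IP instance. Configurations are illustrative only.

IP_INSTANCE_LIMIT_GBPS = 2600  # 2.6 Tbps quoted per IP instance

def raw_aggregate_gbps(lanes: int, lane_rate_gbps: float) -> float:
    return lanes * lane_rate_gbps

for lanes, rate in [(12, 56.0), (24, 112.0), (48, 116.0)]:
    raw = raw_aggregate_gbps(lanes, rate)
    capped = min(raw, IP_INSTANCE_LIMIT_GBPS)
    print(f"{lanes} lanes @ {rate} Gbps: raw {raw:.0f} Gbps, "
          f"usable bounded by {capped:.0f} Gbps per instance")
```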