Enabling Composable Platforms with On-Chip PCIe Switching, PCIe-over-Cable

By PLDA, Inc.

1. Introduction

Modern enterprise workloads in AI and data analytics are driving the need for new compute and storage architectures in IT infrastructures. The growing use of accelerators (GPUs, FPGAs, custom ASICs) and emerging memory technologies (3D XPoint, Storage Class Memory, Persistent Memory), and the need to better distribute and utilize these resources are fueling the transition to composable/disaggregated infrastructures (CDI) in data centers.

With this changing landscape, a number of interconnect protocols have emerged (NVMe-oF, CCIX, Gen-Z, CXL) promising to address the challenges introduced by the composability model. While these interconnect technologies mature and make their way towards mainstream adoption, system vendors still have various options that leverage the well established PCI Express protocol to enable scale-up and scale-out composable fabrics.

In this article, we describe the most common options and present a new trend that involves combining on-chip PCIe switching and PCIe transport over cable to build intelligent, scalable, high-performance composable systems.

2. Fabric Composition with PCIe Switch ICs

PCIe switch semiconductor integrated circuits (ICs) have been available for over a decade, from vendors like PLX (Avago) and Microsemi (Microchip). These PCIe switch ICs have evolved to allow a variety of use models, ranging from simple PCIe fanout expansion using PCIe transparent switches, to more complex PCIe fabric topologies using Non-Transparent Bridging (NTB) as illustrated in Figure 1.

Figure 1 - Example PCIe Switch topologies

While OEMs, ODMs, and many system vendors widely employ transparent PCIe switches for fanout expansion, non-transparent fabric switches are intrinsically more complex, rely on custom NTB software to operate, and are therefore more difficult to integrate and deploy.
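To make the transparent fanout case more concrete, the sketch below models how a host might number the buses behind a transparent switch. It is an illustrative simplification, not PLDA's or any operating system's actual enumeration code: the `Device` class, the helper names, and the three-port topology are all hypothetical. It relies only on the standard PCI behavior that each switch port appears to software as a bridge with a secondary and a subordinate bus number, and that numbering proceeds depth-first.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Device:
    """Hypothetical model of a PCIe function: a bridge (port) or an endpoint."""
    name: str
    is_bridge: bool = False
    children: List["Device"] = field(default_factory=list)
    bus: Optional[int] = None          # bus the device itself sits on
    secondary: Optional[int] = None    # bus directly behind a bridge
    subordinate: Optional[int] = None  # highest bus number behind a bridge


def bridge(name, *children):
    return Device(name, is_bridge=True, children=list(children))


def endpoint(name):
    return Device(name)


def assign_buses(dev, bus, next_bus):
    """Depth-first bus numbering; returns the next unused bus number."""
    dev.bus = bus
    if not dev.is_bridge:
        return next_bus
    dev.secondary = next_bus
    next_bus += 1
    for child in dev.children:
        next_bus = assign_buses(child, dev.secondary, next_bus)
    dev.subordinate = next_bus - 1
    return next_bus


def dump(dev, depth=0):
    """Print the numbered hierarchy, one device per line."""
    pad = "  " * depth
    if dev.is_bridge:
        print(f"{pad}{dev.name}: bus {dev.bus}, "
              f"secondary {dev.secondary}, subordinate {dev.subordinate}")
    else:
        print(f"{pad}{dev.name}: bus {dev.bus}")
    for child in dev.children:
        dump(child, depth + 1)


# Assumed example: one root port feeding a switch with three downstream ports.
root = bridge(
    "root_port",
    bridge(
        "switch_upstream",
        bridge("switch_dsp0", endpoint("ep0")),
        bridge("switch_dsp1", endpoint("ep1")),
        bridge("switch_dsp2", endpoint("ep2")),
    ),
)
assign_buses(root, bus=0, next_bus=1)
dump(root)
```

In this model every endpoint lands in the same bus hierarchy and is visible to the single host, which is what makes the switch "transparent". A non-transparent bridge would instead terminate enumeration at the bridge and require address translation between domains, which this sketch does not attempt to model.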
Even as some companies like Liqid and Dolphin Interconnect Solutions are bringing PCIe fabric based solutions to the market, using discrete switch ICs for building PCIe interconnect fabrics presents several limitations:
For those technology companies that are designing their own chips, it makes sense to look at embedding PCIe switching capabilities into their SoCs as a way to differentiate, future-proof their designs, and implement the exact feature set required by their applications.

3. Fabric Composition with On-Chip Switch IP

SoC architects now have the option of integrating PCIe switch IP into their designs. The main benefits to this approach are:
Figure 2 - SoC with transparent switch IP and embedded endpoint

The capabilities of the endpoint functions can be further expanded with the use of virtualization, allowing resource sharing among multiple Virtual Machines. Multiple host domains can be supported by instantiating multiple switch IP instances in the SoC, along with an NTB mechanism that allows communication across the different PCIe domains, as shown in Figure 3.

Figure 3 - SoC with two PCIe domains connected via NTB

4. Expanding Server Reach with PCIe-over-Cable

16GT/s PCIe 4.0 signals can only travel 3 to 5 inches on a standard FR4 PCB, and only when no connector is in the path. Moving to a MEGTRON 6 PCB and adding retimer ICs helps extend that distance, but at a significant cost increase.

With the commoditization of optical communication, it is now possible to deploy, at scale, the infrastructure necessary to transport PCIe 4.0 signals at 16GT/s over hundreds of feet. Our latest experiment, pictured in Figure 4, has shown a PCIe 4.0 x4 link connecting two peers over 330 feet of optical cable with a slight latency penalty but no impact on data throughput.

Figure 4 - PCIe switching over optical cable demo setup

5. Putting the Pieces Together

We are seeing an increase in the number of IC designers looking to integrate intelligent switching capabilities into their PCIe based SoCs. For the type of architecture outlined in Figure 2 and Figure 3, PLDA XpressSWITCH transparent switch IP is the de facto solution, deployed since 2016. XpressSWITCH key features include:
Figure 5 provides an architecture overview of XpressSWITCH IP.

Figure 5 - XpressSWITCH IP architecture

XpressSWITCH IP is at the core of PLDA’s INSPECTOR for PCIe, a host platform with diagnostics capabilities used at PCI-SIG Compliance Workshops since 2016 for PCIe 4.0 FYI interoperability testing.

By coupling XpressSWITCH IP with PCIe transport over optical media, as demonstrated using Samtec FireFly™ Micro Flyover System™, system builders can further expand the reach of the fabric to build fully disaggregated PCIe based platforms.

6. Conclusion

Disaggregated, composable infrastructure is defining the next wave of data center architecture. While new memory semantic fabrics and communication protocols have emerged to enable this paradigm shift, data centers are years away from seeing these technologies deployed in silicon. Meanwhile, SoC designers are finding ways to leverage the well established PCIe protocol to build intelligent high performance fabrics.

With the commoditization of optical communication, SoCs are also able to transport PCIe traffic off-chip across long distances, with minimal impact on performance, thus enabling disaggregated PCIe-based architectures. By using off-the-shelf PCIe switch IP, such as PLDA XpressSWITCH, IC designers have found a flexible way to build composable PCIe systems, allowing them to define and control every aspect of the solution in terms of capabilities and features, and ultimately create differentiated products.
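As a back-of-envelope check on the optical link figures quoted in Section 4, the sketch below estimates the raw bandwidth of a PCIe 4.0 x4 link and the propagation delay added by 330 feet of fiber. It is not a model of the demo setup: the fiber group index of 1.47 is an assumed typical value for silica fiber, the function names are hypothetical, and the estimate ignores transceiver, retimer, and protocol overheads. The 128b/130b line encoding is the one PCIe uses at 16GT/s.

```python
FEET_TO_M = 0.3048            # exact length of the international foot
SPEED_OF_LIGHT = 299_792_458  # m/s in vacuum


def link_rate_gbytes(gt_per_s, lanes, enc_num=128, enc_den=130):
    """Per-direction rate after line encoding, in GB/s (10^9 bytes/s)."""
    return gt_per_s * lanes * enc_num / enc_den / 8


def fiber_delay_ns(length_m, group_index=1.47):
    """One-way propagation delay in nanoseconds (group_index is assumed)."""
    return length_m * group_index / SPEED_OF_LIGHT * 1e9


def bytes_in_flight(rate_gbytes, round_trip_ns):
    """Bandwidth-delay product: data needed in flight to keep the link busy."""
    return rate_gbytes * round_trip_ns  # (1e9 B/s) * (1e-9 s) = bytes


rate = link_rate_gbytes(gt_per_s=16, lanes=4)
one_way = fiber_delay_ns(330 * FEET_TO_M)
in_flight = bytes_in_flight(rate, 2 * one_way)

print(f"PCIe 4.0 x4 rate:      {rate:.2f} GB/s per direction")
print(f"One-way fiber delay:   {one_way:.0f} ns")
print(f"Round-trip in flight:  {in_flight / 1000:.1f} kB")
```

Under these assumptions the cable adds roughly half a microsecond each way, and only a few kilobytes need to be outstanding to cover the round trip, which is at least consistent with a small latency penalty and unchanged throughput.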
All material on this site Copyright © 2017 Design And Reuse S.A. All rights reserved.