Integrating a PCI Express Digital IP Core into a Gigabit Ethernet Controller
Jing-fan Zhang, Director, Business Development, Synopsys
Abstract
A Gigabit Ethernet controller incorporating a PCI Express interface takes advantage of the high-throughput, low latency capabilities of PCI Express to deliver true gigabit performance. These enhanced capabilities allow for optimal sizing and utilization of on-chip memory, providing significant power and area savings. This paper discusses the integration and system verification challenges encountered when integrating a PCI Express digital intellectual property (IP) core into a Gigabit Ethernet design. Techniques for configuration of the PCI Express IP are presented that achieve the lowest power, lowest latency and smallest memory size, as well as optimal system performance.
Introduction
The PCI® interface superseded the original ISA bus to keep pace with new and faster processors. PCI was introduced in 33- and 66-MHz versions, and as performance requirements increased, PCI-X® followed, running at speeds of 133 to 266 MHz. PCI Express is the next development in PCI technology.
PCI Express is a high-performance, general-purpose interconnect technology targeted at a wide variety of computing and communication platforms, including the next generation of PCs, servers and embedded applications. PCI Express is fully compatible with PCI in terms of its usage model, load-store architecture and software interface. New PCI Express performance-enhancing features include support for high-speed serial point-to-point links, a switch-based topology, and a packetized protocol. Other advanced features include power management, quality of service, Hot-Plug, data integrity and error handling.
Figure 1 depicts the evolution of the PCI family and, as a reference, indicates bandwidth requirements for a number of applications.
Figure 1: PCI Family History
PCI Express was introduced in 2003. Initial adoption rates have been high for two reasons. First, it yields smaller system printed circuit boards (PCBs) with fewer layers and simplified routing, reducing costs compared to PCI and PCI-X. Second, it supports higher speeds and new value-added features. IDC projects significant growth for the new standard, as illustrated in Figure 2 and Figure 3.
Figure 2: PCI Express System Shipments (source: IDC)
Figure 3: PCI Express Semiconductor Revenue (source: IDC)
PCI Express provides data transmission speeds of up to 250 MBps per lane, almost twice that of traditional PCI (133 MBps). These higher transfer rates raise the efficiency of peripheral devices such as Gigabit Ethernet network interfaces. PCI Express is rapidly replacing PCI and PCI-X in Gigabit Ethernet controllers because it enables superior performance at low cost. A PCI Express Gigabit Ethernet interface can achieve transmission speeds close to 2000 Mbps full-duplex. This enhanced performance can greatly improve the productivity of interconnected teams. The compatibility of PCI Express Gigabit Ethernet controllers with PCI and PCI-X software models eases network and system upgrades.
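As a rough sanity check on these figures, the short sketch below compares the peak rates quoted above against the bandwidth a full-duplex gigabit link actually demands. All numbers are theoretical peaks, not measured throughput.

```python
# Back-of-the-envelope bus budget for full-duplex Gigabit Ethernet.
# Figures are the peak rates quoted in the text; real buses deliver less.

GIGE_FULL_DUPLEX_MBPS = 2 * 1000 / 8   # ~250 MBps total (1 Gbps each way)

PCI_33_SHARED_MBPS = 133               # 32-bit/33-MHz PCI, shared by both directions
PCIE_X1_PER_DIR_MBPS = 250             # one PCI Express lane, per direction

print(f"Gigabit Ethernet full duplex needs ~{GIGE_FULL_DUPLEX_MBPS:.0f} MBps")
print(f"PCI shared bus peak: {PCI_33_SHARED_MBPS} MBps -> cannot sustain it")
print(f"PCIe x1 peak: {PCIE_X1_PER_DIR_MBPS} MBps per direction -> headroom remains")
```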
Figure 4 shows a high-level block diagram of a PCI Express Gigabit Ethernet controller design based on several pieces of digital and analog IP, both internal and licensed, with different levels of maturity and complexity. The Gigabit Ethernet PHY IP is based on complex proprietary digital and analog technology. The Gigabit MAC digital IP is a very mature block. The PCI Express SerDes IP is mostly a high-speed analog block. This article focuses on the challenges and experience of integrating licensed PCI Express digital IP into such a design.
Figure 4: The PCI Express Gigabit Ethernet Controller Diagram Highlighting IP Blocks
Challenges
Non-technical challenges
IP vendor selection
In today’s fast changing markets, there are many factors that affect the commercial success of a silicon product. The economic stakes are high, involving both time-to-market and direct costs.
Direct costs are easily measured. A partial or complete mask set can run from hundreds of thousands of dollars in 130-nanometer technology to over one million dollars at 90 nanometers. Time-to-market delays result in lost market share, and these costs can reach hundreds of millions of dollars depending on the product area and competition. With so much at stake, it is vital to minimize the risk of failure in the design phase. One way to reduce the risk is to use IP that is already proven. Choosing the right IP and IP provider then becomes a key issue.
The Gigabit Ethernet controller’s specification includes a PCI Express interface to provide connectivity to a host computer system. After weighing acquisition costs against development costs, the Gigabit Ethernet chip development team chose to license commercially available IP from a provider with a proven track record. However, selecting the IP and the vendor is never an easy task. Selecting PCI Express IP in 2003, when the technology was relatively new, was even more challenging.
A number of IP vendors were evaluated across a range of criteria:
- The quality of the external IP
- Ease-of-use and integration
- FPGA support and proven silicon
- Certification for interoperability, including PCI-SIG® compliance
- Support and maintenance
Beyond these criteria, it was also important to evaluate the IP vendor’s long-term commitment to the technology and the technical strength of its development team. This is crucial for early-stage IP development when proven silicon may not yet be available. In this situation, working closely with the vendor in partnership can be beneficial.
Partnership with IP provider
When the Gigabit Ethernet Controller’s engineering team first decided to incorporate PCI Express into the device, the PCI Express 1.0a specification was still in development. This created an additional challenge for both the IP developer and the engineering team: any changes in the specification could complicate the integration effort and the IP release process, directly impacting the project schedule.
In order to reduce the risk of IP integration, the Gigabit Ethernet Controller design team formed a partnership with the IP provider. Engineers from both companies worked closely to address the ambiguity within an early version of the PCI Express specification, seeking clarification when necessary from the PCI-SIG PCI Express workgroup.
Technical challenges
Design integration challenges
Integration of the PCI Express IP had to take into account the fact that the PCI Express specification was not yet finalized. One of the main challenges was to preserve PCI compliance in the course of the integration, as compliance requirements extend beyond the boundaries of the IP.
Integration of the PCI Express PHY with the PCI Express digital controller IP presented several technical issues that had to be addressed. Numerous asynchronous interactions occur at this interface. The Link Training and Status State Machine (LTSSM) and power management state machine implementations had to be robust against many unlikely, but possible, corner cases, since Physical Layer handshaking signals can be impaired on the serial link in many subtle ways. The PHY Interface for PCI Express (PIPE) architecture defines a standard interface between the PCI Express MAC and the Physical Coding Sublayer. This effectively insulates the digital design from high-speed and analog circuitry issues, and proved very effective in coordinating the work of the Gigabit Ethernet Controller’s analog and digital sub-teams.
Implementing PCI Express power management was especially challenging. A complete PCI Express power management solution touches multiple standards (ACPI, PCI Express and PIPE).
Understanding the platform context in which these protocols are exercised is essential. This includes the platform hardware hierarchical structure, and a generic model for platform software operation. The PCI Express specification broadly describes the former, but avoids precise references to the latter.
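To make the layering concrete, the sketch below shows one plausible mapping between the power states of the three standards involved. The general relationships (D-states gating which link L-states are reachable, and L-states mapping onto PIPE PHY states) follow the published specifications, but the exact states a given design supports are implementation-specific.

```python
# Illustrative mapping between the power-management layers named in the text.
# Device D-states (PCI-PM/ACPI) determine which link L-states (PCI Express)
# are reachable; each L-state corresponds to a PIPE PHY power state.

D_STATE_TO_L_STATES = {
    "D0": ["L0", "L0s", "L1"],   # active; L0s/L1 only if ASPM is enabled
    "D1": ["L1"],
    "D2": ["L1"],
    "D3hot": ["L1"],
    "D3cold": ["L2", "L3"],      # main power removed (L2 if aux power is present)
}

L_STATE_TO_PIPE = {"L0": "P0", "L0s": "P0s", "L1": "P1", "L2": "P2"}
```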
PCI Express IP customization and configuration
A black-box approach to IP integration can sometimes be taken with proven, mature technologies. This was not the case for PCI Express at the time of the Gigabit Ethernet Controller’s development. The priorities of the PCI Express IP team had to be focused on delivering a working functional block, which meant having to leave out some specific customer requirements.
The design team had to customize the IP to meet all of the Gigabit Ethernet Controller’s product requirements. Without care, customization of the IP can produce a non-compliant solution.
In order to support the design trade-offs allowed by the standard, PCI Express IP needs to be parameterized. The selection of the digital IP configuration parameters was critical to designing a high-performance PCI Express Gigabit Ethernet network interface with small on-chip memory and low power. Though the IP provider offered reference settings, this task could not be offloaded from the design team to the IP provider, whose charter was to create a generic, fully operational functional block.
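As an illustration of the kind of parameterization involved, the following is a hypothetical configuration set for a x1 Gigabit Ethernet endpoint. All parameter names are invented for this sketch and do not correspond to the licensed IP's actual configuration interface.

```python
# Hypothetical parameter set for a x1 Gigabit Ethernet endpoint; the names
# are illustrative, not the licensed IP's actual configuration options.
pcie_ip_config = {
    "lanes": 1,                          # x1 is ample for full-duplex GbE
    "max_payload_size_supported": 128,   # bytes; smallest legal value, lowest latency
    "outstanding_read_tags": 4,          # bounds read-completion tracking logic
    "retry_buffer_bytes": 2048,          # sized for Max_Payload_Size plus overhead
    "virtual_channels": 1,               # a single traffic class suffices here
    "aspm_l0s_l1_support": True,         # power management is a product requirement
}
```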
Verification challenges
Functional verification is a critical part of any complex SoC design, and in this case the fundamental requirement was to ensure that the Gigabit Ethernet port worked as expected with the PCI Express serial link. Even though most of the IP used within the design had already been verified in isolation, with some parts already validated in silicon, it was still necessary to prove the entire design as a whole. The verification strategy took into account the functional partitioning, project schedule in terms of IP availability, and the resources available.
The options available for verification were either to use a simulator-based verification combined with an FPGA implementation, or to focus all efforts on developing a single simulator-based verification environment.
Taking a purely simulator-based approach would raise further questions over the choice of stimulus generation and checking methods needed to find the fastest path to 100% code coverage and a high level of functional coverage.
Silicon validation and PCI-SIG compliance
Ultimately, the device has to operate flawlessly within complex platforms such as desktop and laptop computers. These platforms typically comprise components sourced from a multitude of vendors, all constrained by additional standards and specifications. Design teams need to appreciate these system issues for a successful integration of PCI Express IP. Taking this approach offers the best chance of delivering silicon that works the first time, and a flawless total solution including the silicon chip, drivers and reference design.
Certification and interoperability are critical to a successful product launch and acceptance. In the area of Gigabit Ethernet, the University of New Hampshire (UNH) certification is so extensive that it provides more confidence than proving interoperability with a given set of link partner vendors. Because PCI Express certification procedures and compliance platforms are still being developed and augmented, interoperability testing and claims remain vitally important. Beyond certification and interoperability, performance measurements and benchmarking allow the assessment of architectural and customization decisions.
Early validation and interoperability testing were very challenging for a technology such as PCI Express. Initially, it was difficult to find platforms supporting the new technology. It was also hard to find compliant and reasonably bug-free link partners designed and manufactured by root-complex/north-bridge chipset vendors.
PCI Express IP Integration and Verification Methodologies
This section provides an overview of the solutions the design team implemented when faced with the PCI Express digital IP integration and related Gigabit Ethernet Controller verification challenges.
Design partitioning and integration flow
Integration of the PCI Express IP had to be performed in precisely defined phases. The initial integration phases were the responsibility of the Gigabit Ethernet Controller PCI Express sub-system team – part of the overall Gigabit Ethernet Controller design team. The PCI Express sub-system team released the IP to the overall controller design team after carrying out preliminary integration steps. These steps paved the way from the PCI Express IP to the PCI Express sub-system:
- Review of the latest IP release
- Integration of the code base in the design database
- Customization and configuration of the released IP code base
- Integration with the PCI Express PHY (Physical Coding Sub-layer and SerDes)
Figure 5 presents an overview of the integrated (and customized) IP along with the surrounding Gigabit Ethernet Controller functional modules.
Figure 5: Customized PCI Express IP and surrounding Gigabit Ethernet Controller blocks
The PCI Express sub-system would ultimately be integrated with the Gigabit Controller, MAC and PHY, Clock and Reset, External ROM Controller and Test sub-systems. These additional integration steps would involve the entire Gigabit Ethernet Controller design team to produce the full system shown in Figure 4.
For successful integration, the PCI Express IP and its surrounding blocks have to follow the same interface rules. Seamless integration relies heavily on clear communication between teams, and precise documentation. While a standalone IP component may work perfectly well in isolation, it is clearly unacceptable if it is non-compliant when instantiated into the sub-system. Two examples illustrate key issues that must be considered during integration.
A first example can be found in the definition of the IP interface with the application logic (the Gigabit Ethernet controller's direct memory access, or DMA, engines). Its data section is logically segmented into signal sets for the transfer of PCI Express packet header fields as defined by the PCI Express specification. The content of these fields has to conform to high-level formation rules, also defined in the PCI Express specification. Ensuring that these rules are followed falls to the application logic, which converts raw data to and from the PCI Express packet format in the way most efficient for its basic data transfer functions.
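As an example of such formation rules, the sketch below assembles a 3DW Memory Write request header following the PCI Express 1.x TLP format. The helper is illustrative only; byte-enable handling is simplified and error cases are omitted.

```python
# Illustrative 3DW Memory Write TLP header (PCI Express 1.x, 32-bit address).
# Field packing is simplified; byte enables are hard-wired to "all bytes"
# (a compliant single-DW write would need Last DW BE = 0).

def mem_write_header(addr: int, length_dw: int, requester_id: int, tag: int) -> bytes:
    assert addr % 4 == 0 and 1 <= length_dw <= 1024
    dw0 = (0b10 << 29) | (0b00000 << 24) | (length_dw & 0x3FF)  # fmt=3DW+data, type=MWr
    dw1 = (requester_id << 16) | (tag << 8) | 0xFF              # Last/First DW byte enables
    dw2 = addr & 0xFFFFFFFC                                     # DW-aligned address
    return b"".join(dw.to_bytes(4, "big") for dw in (dw0, dw1, dw2))
```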
Another example comes from the implementation of PCI Express resets. Reset mechanisms require a thorough understanding of both PCI and PCI Express reset categories. These resets need to be properly mapped to the reset signals of the PCI Express IP for graceful transitions; each reset signal covers a specific reset domain (a set of configuration registers, datapath logic, and so on). This mapping is the responsibility of a Reset Module designed by the Gigabit Ethernet Controller team. Reset domain definitions were extended to the entire design (Gigabit Ethernet PHY, MAC, bridge controller and PCI Express SerDes PHY). Improper mapping or extension would result in malfunctions ranging from mildly impaired device operation to complete platform failure.
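A minimal sketch of such a mapping is shown below. The reset categories follow the PCI Express conventional reset taxonomy (cold, warm, hot), but the signal names and domain assignments are invented for illustration; the real IP defines its own.

```python
# Hypothetical mapping of PCI Express reset categories onto IP reset inputs.
# Signal names are invented for this sketch.

RESET_MAP = {
    "cold": ["perst_n", "core_rst_n", "cfg_rst_n", "sticky_rst_n"],  # power-on reset
    "warm": ["perst_n", "core_rst_n", "cfg_rst_n"],  # sticky registers survive
    "hot":  ["core_rst_n", "cfg_rst_n"],             # in-band, link-initiated
}
```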
PCI Express core customization and configuration
Figure 5 provides a visual summary of the customization work that was performed on the PCI Express core to create the Gigabit Ethernet Controller’s PCI Express sub-system. The individual customization tasks can be prioritized in order of increasing technical risk:
- Test and debug interface addition to support the chip validation phase
- Configuration space customization to meet some of the fundamental software programmability needs of the Gigabit Ethernet Controller
- SerDes control, status and test interface addition
- Local bus extension for EEPROM boot-load of default Configuration register values and PXE (Preboot Execution Environment) support for network remote boot
- Sizing of memory structures and substitution of high-level behavioral memory models with actual memory models
Verification strategy and action plan
This section explores the functional verification methodology in light of the predominant design approach, which uses a mix of commercial and proprietary IP.
The chosen verification strategy was based on extensive use of simulation. The availability of a centralized compute farm was essential in determining the verification approach. All of the design team efforts were concentrated on the creation of a single testbench. Verification was driven by the verification environment and testbench architecture definitions, the selected stimulus generation strategy and coverage model definitions.
Any complex verification environment is formed from a multitude of verification components. Each verification component contains one or more bus functional models (BFMs) and monitoring elements. The BFM translates high-level stimulus structures into low-level bit streams and drives them into the device input ports. Monitors contain automated checks and functional coverage items. The structural organization of the verification environment is naturally dictated by the structure of the device under test and its hierarchy of sub-systems and modules. In that respect, the Gigabit Ethernet Controller’s verification environment can be seen as a hierarchy of verification environments.

Sub-system environments are reused to form the full-chip environment by turning off the verification component drivers at the boundaries of the sub-system while leaving the monitors on. With this approach, monitors and their internal automated checks and coverage items are carried up to the next level. Some of the sub-system stimulus that used to be provided by sub-system drivers now comes from other parts of the design through internal interfaces. Beyond this plain structural integration, the stimulus generation component needs to be expanded in order to synchronize the actions of imported sub-system drivers, model the synchronization of higher-level processes, or explore specific corner cases. These considerations were all applied to the design of the PCI Express sub-system verification environment and its integration into the overall Gigabit Ethernet Controller environment.
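The driver-off, monitor-on reuse pattern just described can be sketched in a few lines. Class and interface names here are invented for illustration and do not reflect the team's actual environment.

```python
# Sketch of the sub-system-to-chip reuse pattern: at the full-chip level,
# boundary drivers are disabled while monitors (with their checks and
# coverage items) stay active.

class VerificationComponent:
    def __init__(self, name, has_driver=True):
        self.name = name
        self.driver_enabled = has_driver
        self.monitor_enabled = True        # monitors are always kept on

class SubsystemEnv:
    def __init__(self, components):
        self.components = components

def promote_to_chip_env(subsys_env, internal_interfaces):
    """Reuse a sub-system environment one level up: internal interfaces are
    now driven by the design itself, so their drivers must be turned off."""
    for comp in subsys_env.components:
        if comp.name in internal_interfaces:
            comp.driver_enabled = False    # stimulus now comes from the design
    return subsys_env
```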
There was a clear need to have a verification environment for the PCI Express sub-system itself, even though this part of the design was derived from the PCI Express IP, and the IP vendor was already putting significant effort into testing it. The first reason was the fact that the IP was being customized and configured for the needs of Gigabit Ethernet Controller. The second was the absolute need to thoroughly test integration of the PCI Express digital logic with the PCI Express PHY prior to its integration to the full design. The associated verification test plan focused mainly on the area of customization and PHY integration without trying to re-verify basic parts of the core such as the PCI Express Data Link Layer functionalities.
For the creation of stimulus, the Gigabit Ethernet Controller team used a directed-random approach, based on a random stimulus generation engine constrained by statements written in an abstract language. Basic constraints define data structures (for instance, PCI Express TLP header field formation rules); additional constraints implement the test scenarios defined in the team’s test plan (see the sketch after this list). The Gigabit Ethernet Controller’s design benefited from the advantages of such an approach:
- It “evenly” explores the huge functional coverage space with a limited amount of effort, since there is a minimum amount of test code to write and maintain
- It can generate combinations the test writers might not have thought of, or would have overlooked when writing tests following a directed test generation strategy
- It still allowed the team to ensure that specific areas were tested by applying additional code writing efforts in order to direct the stimulus generation engine into corner cases
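A minimal sketch of this directed-random scheme is shown below, in Python rather than the abstract verification language actually used: a generator draws random TLP attributes under basic legality constraints, and a test scenario overlays directed constraints to steer it into a corner case. All field names are illustrative.

```python
# Directed-random stimulus sketch: basic constraints bound the random draw;
# a "directed" overlay narrows it toward a test-plan scenario.

import random

def random_tlp(rng, max_payload=128, directed=None):
    tlp = {
        "kind": rng.choice(["MWr", "MRd", "CfgRd", "CfgWr"]),
        "length_dw": rng.randint(1, max_payload // 4),   # basic formation rule
        "addr": rng.randrange(0, 1 << 32, 4),            # DW-aligned address
        "tag": rng.randint(0, 31),
    }
    tlp.update(directed or {})                           # test-plan constraints win
    return tlp

rng = random.Random(42)                                  # seed for reproducibility
corner = {"kind": "MRd", "addr": 0xFFFFF000}             # steer toward a 4 KB boundary
print(random_tlp(rng))
print(random_tlp(rng, directed=corner))
```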
Progress of the simulator-based functional verification task could not be measured simply by the amount of CPU time consumed. A combination of code coverage, functional coverage and embedded assertions was used for that purpose. Code coverage provided a good measure of how extensively the design had been stressed and helped identify untested areas. The functional coverage model was built by performing a high-level partitioning of the stimulus set; functional coverage then measured how extensively device functions had been exercised through pseudo-randomized test cases run with a variety of seeds. The Gigabit Ethernet Controller design team also complemented its coverage model by leveraging the assertions embedded in the code of the PCI Express IP.
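In the same illustrative vein, a functional coverage model built by partitioning the stimulus set can be as simple as a set of bins sampled by each test case; the bins below are invented for this sketch.

```python
# Toy functional-coverage model: stimulus is partitioned into high-level
# bins, and each randomized test case marks the bins it hits.

coverage_bins = {
    ("MWr", "small"): 0, ("MWr", "max_payload"): 0,
    ("MRd", "small"): 0, ("MRd", "max_payload"): 0,
}

def sample(tlp, max_payload=128):
    size = "max_payload" if tlp["length_dw"] * 4 == max_payload else "small"
    key = (tlp["kind"], size)
    if key in coverage_bins:
        coverage_bins[key] += 1

def coverage_percent():
    hit = sum(1 for count in coverage_bins.values() if count)
    return 100.0 * hit / len(coverage_bins)
```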
Core Configuration for High Performance, Small Footprint and Low Power
The Gigabit Ethernet Controller team had to make trade-offs in order to achieve the lowest power, lowest latency, smallest memory size, and optimal system performance when customizing and configuring the PCI Express IP.
Payload size, DMA engines and TLP segmentation
Figure 6 illustrates the effect of each overhead type on the available data bandwidth for different TLP maximum payload sizes. Data Link Layer management overhead is of a different nature than packet structure overhead; in particular, its impact on one link direction is generally associated with the operation of the other direction (e.g. Ack/Nak DLLPs sent upstream for downstream traffic). The simplifying assumption in this figure is that transmitted TLPs are interleaved with DLLPs in a one-for-one ratio.
Figure 6: Effect on the maximum available bandwidth of different components of the PCI Express overhead (SEQ: Sequence number; FS/E: Frame Start/End; FLOW: Flow control)
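The numbers behind Figure 6 can be approximated with a few lines of arithmetic. The sketch below assumes Gen-1 signaling (250 MBps raw per lane after 8b/10b encoding), a 3DW header with no ECRC, and the one-for-one TLP/DLLP interleave stated above.

```python
# Approximate per-direction efficiency for posted writes on a Gen-1 x1 link,
# assuming a 3DW header, no ECRC, and one 8-byte DLLP per TLP.

LINK_RAW_MBPS = 250                  # 2.5 GT/s after 8b/10b encoding, one lane
TLP_OVERHEAD = 1 + 2 + 12 + 4 + 1    # STP + sequence + 3DW header + LCRC + END
DLLP_BYTES = 8                       # SDP + 4-byte body + 2-byte CRC + END

for payload in (128, 256, 512, 1024):
    wire_bytes = payload + TLP_OVERHEAD + DLLP_BYTES
    eff = payload / wire_bytes
    print(f"{payload:5d} B payload: {eff:5.1%} -> {eff * LINK_RAW_MBPS:6.1f} MBps")
# Even at 128 B the result (~205 MBps, ~1.6 Gbps) stays well above 1 Gbps.
```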
When using the PCI Express IP, data segmentation is performed by the application logic according to the PCI Express Max_Payload_Size parameter value and other rules, such as avoiding crossings of naturally aligned 4 KB address boundaries. From the application logic standpoint, the choice of Max_Payload_Size Supported is a trade-off between the desired peak throughput and the maximum link access latency experienced by each of the internal PCI Express Requester agents (here, the DMA engines). Even for the minimum Max_Payload_Size value of 128 bytes, the maximum available bandwidth is well above 1 Gbps, the Ethernet maximum data rate in a given direction. Yet the PCI Express bandwidth is not used solely to transfer network data to and from host memory; it also supports the network data transfer process itself, based on the interaction between the software-layer DMA management and the hardware DMA engines. The Gigabit Ethernet Controller DMA architecture and logical structure were optimized for PCI Express. In particular, the structure of the DMA descriptors was chosen so that a descriptor packs all relevant information into a single Max_Payload_Size packet.
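The segmentation rules just mentioned are compact enough to express directly. The sketch below splits a transfer at both the Max_Payload_Size limit and naturally aligned 4 KB boundaries, as the application logic must for Memory Requests.

```python
# Split a DMA transfer into TLP payloads that respect Max_Payload_Size and
# never cross a naturally aligned 4 KB address boundary.

def segment(addr: int, length: int, max_payload: int = 128):
    chunks = []
    while length > 0:
        to_4k_boundary = 0x1000 - (addr & 0xFFF)      # bytes left in this 4 KB page
        chunk = min(length, max_payload, to_4k_boundary)
        chunks.append((addr, chunk))
        addr += chunk
        length -= chunk
    return chunks

# A transfer straddling a 4 KB boundary is cut at the boundary first (96-byte
# head chunk), then at the 128-byte payload limit, with a 32-byte tail.
print(segment(0x0000_0FA0, 512))
```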
Since PCI Express is a full-duplex serial link, somewhat mirroring the Gigabit Ethernet duplex, the PCI Express Gigabit Ethernet controller can be built around two fully independent DMA engines: a Tx DMA engine for transmitting data onto the network and an Rx DMA engine for receiving data from the network. Beyond descriptor structure optimization, the PCI Express Gigabit Ethernet controller can exploit other PCI Express performance features.
To fully leverage the split-transaction paradigm, each engine can keep track of multiple outstanding PCI Express transactions when performing read operations in host memory. This feature applies particularly to the Tx DMA engine, whose main task is to move data from host memory to the network through the controller. It requires the Tx DMA engine to be able to re-assemble data in the order it was requested, even for a total data amount corresponding to an Ethernet jumbo frame, because the PCI Express specification does not require (and thus does not guarantee) that the order of Completions to Read Requests be preserved within the PCI Express fabric. The high throughput supported by PCI Express also makes possible a cut-through datapath on the Rx side: data can be transferred from the network to host memory without unnecessary on-chip buffering, reducing the required on-chip memory size.
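The reassembly requirement can be captured with a tag-based reorder buffer: completions may return in any order, but data is released only in request-issue order. The structure below is a generic sketch, not the controller's actual design.

```python
# Tag-based reorder buffer: read completions may arrive out of order, but
# data is released strictly in the order the requests were issued.

from collections import deque

class ReorderBuffer:
    def __init__(self):
        self.order = deque()      # tags in request-issue order
        self.done = {}            # tag -> completed data

    def issue(self, tag):
        self.order.append(tag)

    def complete(self, tag, data):
        self.done[tag] = data
        released = []
        while self.order and self.order[0] in self.done:  # release in-order prefix
            released.append(self.done.pop(self.order.popleft()))
        return released

rob = ReorderBuffer()
for tag in (0, 1, 2):
    rob.issue(tag)
print(rob.complete(1, b"B"))      # [] -- tag 0 is still outstanding
print(rob.complete(0, b"A"))      # [b'A', b'B']
print(rob.complete(2, b"C"))      # [b'C']
```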
Silicon Performance
Beyond silicon features, driver software is a critical requirement to fully realize the capabilities of the Gigabit Ethernet Controller. Standard throughput benchmarking tools and platforms indicate that sustained peak throughputs can easily exceed 940 Mbps for both Rx and Tx. Power consumption can stay below 900 mW even during continuous full-duplex data transfers in gigabit mode. Being based on a serial technology, the device fits in a low pin-count package such as a 68-pin MLCC or 64-pin QFN, resulting in a very small footprint.
Conclusion and recommendations
This paper presented the key challenges when integrating PCI Express digital IP into a high-speed interface device such as the PCI Express Gigabit Ethernet controller and some of the methods that were used to overcome them.
Selecting the right PCI Express IP from the right IP provider was the first critical step. Successfully integrating the IP required partnership efforts with the IP provider and a carefully managed IP release process for enhancements and bug fixes. This was particularly true when the PCI Express technology was still at an early stage.
For successful integration of the IP, the system-on-a-chip design and verification team needed to have a clear understanding of a complete PCI Express system. Lacking such an understanding could have easily led to a non-compliant, non-functional solution.
The Gigabit Ethernet Controller design team opted for simulator-based verification relying on an abstract verification language and directed-random tests to leverage the availability of extensive computing power. This approach was used to test the IP once it had been customized and connected to the chosen PCI Express PHY to form the controller’s PCI Express sub-system.
Gigabit-class throughput can be achieved without necessarily sacrificing area and power if the design architecture leverages the PCI Express performance features (high-throughput, split transactions, power management, etc.) and the IP is properly configured for that purpose.
Thanks to its scalability, PCI Express is able to support the product roadmaps of communication and networking silicon vendors and the requirements of future network interface devices with increased throughput and low power consumption, enabling rapid, feature-rich content delivery on fixed as well as portable platforms.
About the Authors
Jing-fan Zhang joined Synopsys in October 2004 with the acquisition of Cascade IP Inc., where he had served as CEO since 2002. Prior to founding Cascade, Mr. Zhang served in various technical and management roles at Intel Corp. and Tektronix starting in 1991. Mr. Zhang was a key contributor to the IEEE 802.3 (Gigabit Ethernet) and 802.11 (Wi-Fi) standards, and led Intel’s first-generation Gigabit Ethernet Controller development. He holds MSEE and BSEE degrees in electrical engineering and was a Ph.D. candidate in mixed-signal IC design. Mr. Zhang holds multiple US patents.
Fadi Saibi is a Senior Member of Technical Staff with the Architecture and Intellectual Property Department in the Telecommunications, Enterprise and Networking Division of Agere Systems. He started as a research trainee in the High Speed Communications VLSI Research group at Lucent Bell Laboratories in Holmdel, NJ, and joined the group in 2001 as part of the Lucent Microelectronics Division spin-off, later renamed Agere Systems. Since then he has been working on signal processing for optical and copper wireline communications, multi-gigabit backplane equalization, Gigabit Ethernet, PCI Express and VLSI integration. He received the Ingénieur degree from École Polytechnique, Paris, France, and a Ph.D. in Electronics and Communications from ENST (Télécom Paris), France.
For more information on Synopsys DesignWare IP, visit www.synopsys.com/designware