Verification of USB 3.0 Device IP Core in Multi-Layer SystemC Verification Environment
By Ireneusz Sobanski, Evatronix S.A.
Wojciech Sakowski, Institute of Electronics, Silesian University of Technology
Abstract
The paper describes the methodology used for functional verification of a USB 3.0 device controller core. The core has been developed at two different levels of abstraction: an RTL model for synthesis and a SystemC TLM model for high-speed simulation, early software development and early testbench creation.
The described verification environment, based on SystemC methodology, has been used in the process of functional verification of both models. The work presents how such a multi-model (RTL/TLM) design can be created and verified in a configurable, multi-layered, coverage-driven verification environment with third-party verification components.
Characteristics of the Design Under Test (DUT)
USB 3.0 Specification
USB 3.0 (USB SuperSpeed) is the next generation of the USB industry standard and operates at speeds as high as 5 Gbit/s (effectively about 3.2 Gbit/s, taking into account protocol overhead).
SuperSpeed devices [1] communicate through dual-simplex connections, where two data paths independently carry traffic in each direction. Such an approach not only increases the available throughput, but also heavily influences the architecture of the devices. Devices can send and receive data at the same time. A block of packets (a burst) can be sent or received without waiting for an acknowledgement. The acknowledgement packets must still be issued, but they can be delayed until after the burst is finished.
SuperSpeed bulk transfers have been extended with Streams (an additional addressing level within endpoints). Streams significantly increase the number of available data sources/sinks, but they also further increase the complexity of the architecture.
To fully utilize the dual-simplex connection, SuperSpeed devices communicate through pipes: point-to-point connections between host software and device endpoints or streams.
New power management features have been added and three levels of low-power modes have been defined.
The physical layer is similar to single-lane (x1) PCI Express. It uses 8b/10b encoding, data scrambling and spread-spectrum clocking, but also introduces low-frequency periodic signaling (LFPS), which is not present in the other, similar specifications (PCI Express, SATA, etc.).
Evatronix USB SuperSpeed Device Controller (USB3_DEV)
The device described in this paper has been designed to fully utilize the USB 3.0 specification (the available SuperSpeed bandwidth) and to minimize the required resources (gates as well as on-chip buffers).
The endpoint data are stored in the main system memory (RAM) and are fully described by a linked list of descriptors. The device continuously follows the descriptor list and reads/writes the endpoint (stream) data to/from its buffer as needed. Thanks to this approach the device buffers only the data that will be required by the host (IN transfers) or has been received from the host (OUT transfers).
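The actual descriptor layout is proprietary and not reproduced here; the following minimal C++ sketch only illustrates the linked-list idea, with hypothetical field names, widths and flags:

#include <stdint.h>

/* Hypothetical DMA descriptor: one node of a linked list kept in system RAM.
   Field names, widths and flag meanings are illustrative assumptions,
   not the actual USB3_DEV descriptor format. */
struct dma_descriptor {
    uint32_t buffer_addr;  /* physical address of the data buffer          */
    uint32_t next_desc;    /* address of the next descriptor, 0 = list end */
    uint16_t length;       /* number of bytes in the buffer                */
    uint16_t flags;        /* e.g. last-in-transfer, interrupt-on-complete */
};

The controller walks such a list autonomously: firmware only appends descriptors, and the core fetches or stores payload data exactly when a host transaction requires it.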
The device register access as well as the DMA interfaces are based on the OCP specification (emerging as a new universal IP interface), and can be supplemented with an additional wrapper (provided by the company) for easy access to popular on-chip buses (e.g. AMBA, OPB).
Access to the physical medium (here: the USB copper cable) is made through an external physical layer (PHY) providing all low-level functions (such as data serialization, 8b/10b coding, scrambling/descrambling, etc.).
As fully functional USB 3.0 physical layer chips were not available at development time, a standard RocketIO transceiver/serializer (available in large FPGAs) has been utilized. Although RocketIO does not natively support the USB SuperSpeed specification, most of the required functionality (data rates, coding schemes, etc.) is similar to that of other serial interfaces (such as PCI Express or SATA).
Low-frequency periodic signaling (LFPS), which is not available in RocketIO, has been implemented and attached in parallel to the transceiver. The LFPS functionality has been kept separate from the rest of the IP and can be removed when USB 3.0 physical layers become available.
Protocol and Link Layers
The specification divides USB 3.0 devices into three highly independent logical layers (physical, link and protocol) plus the device framework. The project concentrates only on the link and protocol layers, leaving the physical layer out of scope (in fact, due to the lack of a proper physical layer, some PHY functionality had to be implemented in the core: LFPS signaling, SKP insertion, etc.).
The link layer is responsible for establishing and maintaining link connectivity as well as for successful data transfer between two link partners. Besides that, the link layer also provides facilities for link training, link testing/debugging and power management.
The protocol layer is responsible for the end-to-end flow of data between a device and its host. This layer is built on the assumption that the link layer guarantees delivery of single packets (carried by the link), and it adds reliability for the (higher-level) USB transactions.
Both layers are complementary and may work separately. The link layer is not aware of the data transferred between the higher layers, and the protocol layer may fully assume that packets sent to and received from the link have been transferred correctly. Each packet transferred by the link layer has a unique identification number (separate from the protocol one) and has to be acknowledged by the link partner. Even isochronous data, unacknowledged at the protocol level, have to be confirmed at the link level.
Such a multi-layer architecture greatly simplifies design partitioning and the verification process. Both layers can be designed and verified separately and, even more importantly, the architecture simplifies reuse of the verification environment components.
Issues with multi-level modeling (RTL/TLM)
The project started well before the final USB 3.0 specification was released, and some specification changes were expected during the development process.
To minimize the impact of late specification changes on the project, the design work started with development of the TLM model [2][3], which helped the team learn the new specification and was used for architecture exploration. For instance, the endpoint and stream data buffering mechanism, based on a linked descriptor list, was first validated on the high-level TLM model. Later in the design process the TLM model was used for software API development and as a golden reference model for the RTL core.
Both models take full advantage of the OCP interface [6][7]. For the RTL model, OCP provides a kind of middleware interface which can easily be wrapped to any other standard if needed (AMBA, OPB, etc.). The OCP consortium also provides third-party verification modules (such as transactors and assertion sets) which can easily be used in the verification process.
Another issue is transaction-level modeling itself. The TLM 1.0 specification defines data passing mechanisms, but does not cover interoperability issues. The TLM 2.0 specification tries to address this, but without a clear interface definition (preferably from the specification maintainer) it still leaves broad space for interpretation by developers. The generic payload [4] only partially resolves the problem, even though it is a step in the right direction.
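The problem is visible directly in the TLM 1.0 interfaces: they are templated on an arbitrary, user-defined transaction type, so two independently developed models rarely share one. A minimal sketch (the usb_packet type is a hypothetical example, not part of any standard):

#include "tlm.h"    /* OSCI TLM 1.0 */
#include <vector>
#include <stdint.h>

/* TLM 1.0 leaves the transaction type entirely to the user; this packet
   definition is an illustrative assumption. */
struct usb_packet {
    uint8_t              type;  /* packet type code */
    std::vector<uint8_t> data;  /* payload bytes    */
};

/* The core interfaces are generic templates: two models agreeing on
   usb_packet interoperate, but a model built around a different payload
   type needs adaptors. */
typedef tlm::tlm_blocking_put_if<usb_packet> usb_packet_put_if;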
The OCP consortium rose to this challenge and provides not only a signal-level specification [7], but also full definitions of the other abstraction levels (TL1, TL2, TL3; TL0 means the RTL level) which can be used as TLM interfaces [3]. Such an approach makes the TLM model immediately available (without any adaptors) to all users/systems using OCP interfaces.
Virtual prototyping is constantly gaining in importance, but it still suffers from a shortage of fully compliant RTL/TLM model pairs. The SPRINT project [8] shows how cores equipped with multiple views can be used to accelerate development and verification of multi-core SoCs. Proper environment modules (such as transactors) enable very flexible switching between views at different levels of abstraction, to choose the best trade-off between performance (TLM) and design detail (RTL) [9].
Functional Verification
The testbench environment is based on a multi-layer SystemC infrastructure supporting constrained-random verification techniques. Because the project contains a synthesizable RTL model as well as a high-level TLM model, the TB was developed in a way enabling easy verification of both models.
Only one verification environment has been created for both models. The TB works at the TLM level, and the RTL model can be connected to it through appropriate transactors (in-house developed or third-party) [11][12][13].
Figure 1 shows the basic architectures of the TLM and RTL models, with their layered structure, and the relation between the corresponding interfaces in both models.
As presented in the SPRINT project [8][9], such conformable models may be seen as two separate views of the same design (here: the usb3_dev controller) and are fully interchangeable in the design process.
Figure 1 Block Diagrams of TLM and RTL Models
Multi-layer test environment
The verification environment has been architected in a similar way to the DUT itself. Such an approach makes it possible to effectively verify both layers of the controller separately, with a minimal number of additional components.
An additional requirement for the testbench was to maximize reuse not only of the components (BFMs, scoreboards, transactors) but also of the test cases. This goal has been satisfied, and nearly all the test cases can be used at the subcomponent level as well as on the full design (Figures 2, 3).
The link layer and protocol layer tests are created in a way which enables their independent operation. The layers communicate only at the packet level and pass only a minimum of other information.
The link layer is responsible for connection establishment, but the protocol layer does not have to be aware of all the details of this action. So the link layer informs the higher layer only whether it is active (can transfer SuperSpeed data) or not, and whether a USB reset shall be applied. Besides that basic information, the link layer receives from and transfers to the protocol layer all valid SuperSpeed packets. In this respect the link layer behaves as a transactor and just passes data between its interfaces.
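This narrow boundary can be expressed as a small SystemC interface; the sketch below is illustrative (the names are assumptions, not the actual TB code):

#include <systemc.h>

/* Status the link layer exposes upward; nothing about training, recovery
   or power states leaks through this boundary, only these two facts. */
struct link_status_if : virtual sc_core::sc_interface {
    virtual bool link_active() const = 0; /* link up, SuperSpeed data may flow */
    virtual bool usb_reset()   const = 0; /* a USB reset shall be applied      */
};

/* The packets themselves cross the boundary through ordinary TLM 1.0
   put/get interfaces, e.g. tlm::tlm_blocking_put_if<usb_packet>. */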
To enable more advanced link layer tests, the component is able to generate packets automatically, without input from the protocol layer. The packets are generated in the (host) link layer and received only by the companion (device) link layer. Usually they are not passed further (to the protocol layer), but there are also situations where a packet is correct from the link layer's point of view and is passed up to the protocol layer. In such situations, the generator has been constrained to generate only packets that are invalid at the protocol level (e.g. with an unknown packet type). This functionality was introduced to simplify link-only tests, but it also proved highly valuable with the protocol layer connected: invalid and corrupted packets could be massively inserted between valid protocol data.
A valuable function available in our test environment is also the possibility of removing the link layer from the DUT. As mentioned before, the link layer is mainly responsible for connection, recovery and data consistency, and the protocol layer can successfully work without this layer attached below it. It needs only basic information about the link state (SuperSpeed active, USB reset), and can then operate just by exchanging USB packets.
Removing the link layer from the communication chain gives a considerable gain in simulation speed. It is just like raising the level of abstraction: the low-level details are removed. And, more importantly here, it can be used for the TLM as well as for the RTL model.
Figure 2 Separate verification of the layers
Figure 3 Final DUT verification
Figure 2 shows the block diagram of the TB environment where the protocol and link layers are verified separately.
The Protocol (+Regs+DMA) module is directly connected to the Software API and to the Protocol Host module. The Software API and Protocol Host modules are configured (driven) by the TestCtrl#1 module, which contains all the information about the test case.
TestCtrl#1 just calls the functions defined in the Software API (available for device firmware creation) and in the (Protocol) Host, which mimics USB 3.0 host functions. Additionally (not shown in the figure), some environment signals, such as connection signaling, are driven directly from the test controller module.
The link layer in Figure 2 is connected to the Link Host module and to a simple data generator playing the role of the device protocol layer (it just generates random data for the link layer). Because the host link layer also provides a random data generator (as described before), it does not have to wait for protocol data. The Link Host module and the data generator are configured and driven by the TestCtrl#2 module, which is fully responsible for the test case.
When the DUT protocol and link layers are assembled together (Figure 3), the general TB architecture does not change. From the TB point of view, the host subcomponents are put together. The Software API and the tests remain unchanged, and the data generator (used for link layer verification) is removed. The test controllers (#1 and #2) must be merged, and all the tests can then be run on the complete device (with just minimal changes in the TestCtrl module).
Although in this configuration the (host) link layer takes its data mainly from the (host) Protocol module, the automatic packet generator is still working and inserts incorrect data packets into the true protocol data stream.
Use of the TLM model
The main idea behind developing the TLM model in this project was to learn the new specification, perform some architectural exploration and speed up the software development process. Nevertheless, the model also proved extremely useful during verification environment development and test case creation.
Most of the physical-layer-related tests (usually based on LFPS signaling) need very long simulation times. The time periods for these functions are counted in tens to hundreds of milliseconds, and simply bypassing counters in the RTL code was not always practical. The TLM model is event driven, without a clock signal, so all such low-frequency signaling is processed at the maximum rate. The situation in the other tests is very similar: the TLM model was always significantly faster in simulation.
Another advantage of using the TLM model in the verification process is its early availability for the verification team: test cases can be prepared before the RTL code exists. Admittedly, we failed to create the entire high-level model before the RTL code, and the TLM model was developed mostly in parallel; still, the TLM model was always a step or two ahead of the synthesizable core and could be used for early testbench creation.
Interfaces and Transactors
The testbench works at the TLM level and is attached to the RTL DUT through appropriate transactors, which translate high-level commands into bit-level sequences (Figure 1). The TLM model can be attached to the testbench directly. The transaction-level environment is based on the SystemC TLM library, and all communication mechanisms (interfaces) follow the TLM 1.0 specification.
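The general shape of such a transactor is a module that implements a TLM 1.0 interface on one side and drives pins on clock edges on the other. The following is a minimal, illustrative sketch (the port names, the 8-bit data path and the usb_packet type are assumptions, not the actual PIPE transactor):

#include <systemc.h>
#include "tlm.h"
#include <vector>
#include <stdint.h>

struct usb_packet { uint8_t type; std::vector<uint8_t> data; }; /* assumed payload */

/* TLM-to-pin-level transactor sketch: accepts whole packets through a
   blocking put interface and serializes them byte by byte on clock edges. */
struct packet_transactor
    : sc_core::sc_module
    , tlm::tlm_blocking_put_if<usb_packet>
{
    sc_core::sc_in<bool>                clk;
    sc_core::sc_out<bool>               tx_valid;
    sc_core::sc_out<sc_dt::sc_uint<8> > tx_data;
    sc_core::sc_export<tlm::tlm_blocking_put_if<usb_packet> > in;

    SC_CTOR(packet_transactor) { in(*this); }

    /* Called from the TLM testbench; returns when the packet is on the pins. */
    virtual void put(const usb_packet& pkt) {
        for (size_t i = 0; i < pkt.data.size(); ++i) {
            wait(clk.posedge_event());
            tx_valid.write(true);
            tx_data.write(pkt.data[i]);
        }
        wait(clk.posedge_event());
        tx_valid.write(false);
    }
};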
Most of the USB traffic is generated and verified automatically in the BFMs and the appropriate scoreboards. The RTL model is additionally verified with SVA assertions [10][13], placed to monitor internal components as well as the external interfaces (OCP, PIPE).
The device natively supports an OCP interface for the register set and DMA access, and a slightly modified PIPE (PCI Express) interface for USB SuperSpeed PHY communication. OCP is one of the well-defined standards with its own TL1, TL2 and TL3 transaction-level specifications [7]. The device follows the TL1 and TL2 definitions. The OCP consortium also provides SystemC transactors which can be used to connect the TL1 interface with the RTL (TL0), which significantly decreases the required development effort.
The PIPE interface defines the physical layer connection and is borrowed from the PCI Express specification. The interface is not fully compliant with the USB 3.0 specification and does not provide all the necessary functionality (e.g. LFPS signaling), so the missing low-frequency signaling must be supported externally to the interface. A transactor supporting this modified PIPE interface has been developed internally, based on the SystemC TLM 1.0 specification [3]. When USB 3.0 physical layers become available, the interface will be updated to the new specification.
Constraint-Based Verification
Constraint-based verification is one of the most widely recognized and accepted functional verification methodologies in industry. All major verification languages (SystemVerilog, e/Specman) [12][13] support this idea and use constraint solvers for random data generation. In SystemC environments, the SCV library provides a full set of functions and methods supporting this methodology [5].
The main advantage of the constrained-random verification methodology is automatic test generation. The user does not have to explicitly write test sequences or data values, but just configures the data/sequence generators and waits for the simulation results. The generated stimuli very often drive the DUT into unexpected conditions and enable discovery of even deeply hidden errors.
Constrained-random generation has been fully utilized in our verification environment. Nearly every aspect of DUT activity has been randomized and can be configured through the constraints mechanism. For example, all packet parameters, the possible errors (and their distribution), the delays, and the generated answers to DUT requests are randomized with the SCV library. Other activities, such as forcing a USB reset, forcing recovery or other low-level actions, in general have to be run explicitly in the test controller, but even their parameters can be randomized to simplify exercising some corner cases.
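As an illustration, a constrained packet generator written with SCV could look as follows; the fields, ranges and the send_packet helper are hypothetical, not the actual TB constraints:

#include <scv.h>

/* Hypothetical constrained-random packet descriptor: payload length, an
   error-type selector and an inter-packet delay, each with simple bounds. */
class packet_gen : public scv_constraint_base {
public:
    scv_smart_ptr<unsigned> length;    /* payload length in bytes             */
    scv_smart_ptr<unsigned> err_type;  /* 0 = none, 1 = bad CRC, 2 = bad type */
    scv_smart_ptr<unsigned> delay;     /* idle cycles before the packet       */

    SCV_CONSTRAINT_CTOR(packet_gen) {
        SCV_CONSTRAINT( length()   >  0 && length() <= 1024 );
        SCV_CONSTRAINT( err_type() <  4 );
        SCV_CONSTRAINT( delay()    <  100 );
    }
};

/* Usage: each next() call produces a new solution of all constraints.
     packet_gen gen("gen");
     gen.next();
     send_packet(*gen.length, *gen.err_type, *gen.delay);  // hypothetical helper */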
Coverage Driven Verification
Constrained-random verification needs an additional monitoring mechanism which provides a way to determine the verification status:
- Code Coverage
- Functional Coverage.
Coverage metrics, such as functional coverage and code coverage, are commonly used to determine the phase/level of the verification process. In the same way, they can be used to show the verification team where and how to improve the constraints for the randomization engines [13].
Such an approach to coverage data puts additional requirements on the simulation process. Coverage information should be collected whenever possible. Even at an early stage of the verification process it can be interesting to the verification team and may point out some randomization holes [13].
A valuable piece of information is how the coverage data change over simulation time. Constrained-random simulations usually work in an endless loop and can formally run for infinite simulation times, so the question arises how long a simulation actually makes sense.
The SystemVerilog specification [10] defines functions for access to the coverage database ($coverage_get, etc.). Nevertheless, at the moment there is no corresponding functionality for other environments (here: SystemC).
To enable at least basic coverage feedback to the tests, the components were equipped with a simple logging mechanism based on the C++ STL [14]. The components log all actions and the conditions related to them, and provide (export) a simple API for access to this information. The implementation (based on the STL map<key,value> construct) is very simple, but proved highly valuable in regression tests, where the simulation time could easily be bound to coverage goals.
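A minimal sketch of such a logger, assuming a map from event names to hit counters and a separate goal table (the actual in-house implementation is not reproduced here):

#include <map>
#include <string>

/* Illustrative coverage logger: counts named events and compares the
   counters against per-event goals set by the test controller. */
class coverage_log {
    std::map<std::string, unsigned> hits_;   /* event name -> occurrences    */
    std::map<std::string, unsigned> goals_;  /* event name -> required count */
public:
    void set_goal(const std::string& ev, unsigned n) { goals_[ev] = n; }
    void hit(const std::string& ev)                  { ++hits_[ev]; }

    /* True when every declared goal has been reached; a regression test can
       poll this to stop an otherwise endless constrained-random run. */
    bool goals_met() const {
        std::map<std::string, unsigned>::const_iterator g;
        for (g = goals_.begin(); g != goals_.end(); ++g) {
            std::map<std::string, unsigned>::const_iterator h = hits_.find(g->first);
            if (h == hits_.end() || h->second < g->second)
                return false;
        }
        return true;
    }
};

The stopping criterion then becomes "run until goals_met() or a timeout", binding the simulation length to coverage instead of a fixed time budget.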
Another useful function introduced into the logging mechanism is an "assert" facility. The component is not only aware of the goal of the test, but can also break the simulation when the test sequence strays from the expected path. For instance, if the test configuration guarantees that the link layer does not enter the Recovery state, the user can define such an assertion directly in the test and does not have to look into the SVA or PSL assertion logs.
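Extending the sketch above, such a guard could be a set of forbidden events checked on every logged hit (again purely illustrative):

#include <set>
#include <iostream>
#include <cstdlib>

/* Hypothetical guard extension: the test declares events that must never
   occur; logging one of them stops the simulation immediately. */
class guarded_coverage_log : public coverage_log {
    std::set<std::string> forbidden_;
public:
    void forbid(const std::string& ev) { forbidden_.insert(ev); }
    void hit(const std::string& ev) {
        if (forbidden_.count(ev)) {
            std::cerr << "guard violated: " << ev << std::endl;
            std::abort();  /* or sc_stop() inside a SystemC simulation */
        }
        coverage_log::hit(ev);
    }
};

/* Example: forbid entering Recovery in a test that guarantees it cannot occur.
     guarded_coverage_log log;
     log.forbid("link_recovery_entry"); */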
Such dynamic definition of coverage goals and assertions adds new possibilities to the standard assertion mechanism (which is rather static in nature: always/never). The goals and guards (assertions) may be changed during simulation and follow the functionality of the test case. Such goal sequences have been successfully used in many very tricky corner-case tests.
Summary
Developing the TLM model together with the RTL one helped us in many aspects of the development process. The TLM model is about three times less complex than the RTL one (RTL: 32k vs. TLM: 12k lines of code). The number of code lines may not fully express the complexity of the models, but we found that the development times show a comparable ratio (TLM development took one third of the time needed for the RTL model).
We showed that fast architectural exploration and easier software creation (a single-language environment is used) are not the only advantages of using TLM models in functional verification. Early TB and test case creation, as well as a significant simulation speed-up (at least 30 times, and exceeding 100 times when the link layer is removed from the simulation), nicely reduce the functional verification effort. Test cases may be created as soon as the TLM model provides a particular functionality, and very long tests may be run at the accelerated simulation speed.
The layered architecture of modern communication protocols itself turned out to be helpful in structuring the verification effort. The layers can be verified separately with minimal overhead in test development effort. Most of the verification components used at the subcomponent level may easily be reused at the chip and system level. We have also shown the possibility of reusing test cases developed for submodule verification at a higher level.
Our experience also showed that coverage information may be extremely useful for test case definition. Even simple coverage status feedback may be used to adjust the simulation time. This is all the more useful for constrained-random simulations, where it is very hard to determine the end of a test.
Literature
[1] Universal Serial Bus 3.0 Specification, Rev 1.0, 2008
[2] IEEE 1666-2005, IEEE Standard SystemC Language Reference Manual, IEEE, 2005
[3] TLM Transaction Level Modeling Library, Release 1.0, OSCI, 2005
[4] TLM-2.0 Language Reference Manual, OSCI, 2009
[5] SystemC Verification Library (SCV), Release 1.0p2, OSCI, 2006
[6] Open Core Protocol Specification, Release 2.2, OCP-IP, 2006
[7] A SystemC OCP Transaction Level Communication Channel, Version 2.2, OCP-IP, 2007
[9] Straightforward IP Integration with IP-XACT RTL-TLM Switching, M. Zys, E. Vaumorin, I. Sobanski, Whitepaper, 2008
[10] IEEE Std 1800-2005, IEEE Standard for SystemVerilog: Unified Hardware Design, Specification, and Verification Language, IEEE, 2005
[11] Advanced Verification Techniques: A SystemC Based Approach for Successful Tapeout - L. Singh, L. Drucker, N. Khan, Kluwer 2004
[12] Open Verification Methodology Cookbook - Mark Glasser, Springer 2009
[13] Verification Methodology Manual for SystemVerilog - J. Bergeron, E. Cerny, A. Hunter, A. Nightingale, Springer 2005
[14] Standard Template Library Programmer's Guide, SGI