Code coverage techniques -- a hands-on view

Code coverage techniques -- a hands-on view
By Alain Raynaud, EEdesign
February 28, 2003 (4:41 p.m. EST)
URL: http://www.eetimes.com/story/OEG20020912S0059

Coverage has become a key technology in the pursuit of efficient and accurate verification of large designs. Obviously, simulation is still the cornerstone of verification, but the time when a single designer could write exhaustive vectors for a chip is long gone. Fortunately, advances have been made that have streamlined the verification process. Accelerated simulation, random test generation and simulation server farms, to name just a few improvements in simulation methodology, have "solved" the initial problem: has the chip been tested? The answer is yes, but now the question is: how good are the tests?

In order to measure the progress of the verification effort, engineering managers are becoming more dependent on verification coverage metrics. These metrics provide an indication of the quality of the effort, and guide engineering teams' efforts in enhancing the test plans. The goal of verifying register-transfer-level (RTL) code with coverage feedback is to achieve confidence in the completeness of testing of the design prior to manufacturing, thereby ensuring that no functional bugs exist.

This paper discusses the use and goals of coverage at Tensilica on the Xtensa processor core. Use of industry standard coverage metrics and its relevance to the chip test plan are discussed. Specific examples of coverage and its limitations are also mentioned. The paper also provides insight into recent technology advances in this area, such as new observed coverage metrics and temporal based coverage.

RTL coverage
The easiest form of coverage to introduce into a verification methodology is register transfer level (RTL) coverage. Tools are available that, given an existing RTL design and a set of vectors, will provide valuable information. Apart from learning to use the tool and spending some time understanding the reports, RTL coverage does not fundamentally change the existing methodology and therefore can be added to any design project .

The basic idea behind RTL coverage is that, if the designer wrote a line of RTL, it was meant to be useful for something. While the tool does not know what for (a limitation that we'll discuss later), it can correctly identify problems when a line was not used for anything at all.

Thus, line or statement coverage measures whether each line of RTL code was exercised at least once. In the case of Tensilica's Xtensa processor core, most of the code is made up of Verilog continuous assignments, as opposed to procedural assignments. Technically, therefore, all those assignments get executed "all the time" and line coverage returns near-100% coverage. Better measures of RTL coverage were introduced, the main ones being path coverage and expression coverage. In our case, again, path coverage does not apply as our design contains very few paths. Expression coverage, on the other hand, is where we spend our time.

Expression coverage looks at the right-hand side of an assignment and gives a more c omplete picture of the circumstances that cause an assignment to actually execute.

Don't forget the test plan
RTL coverage users should heed one major warning: it is tempting, whenever the tool finds one case that was not covered, to add some vectors to cover it. While doing so will quickly increase the coverage results to whatever management asks for, it is the wrong approach from a methodology standpoint.

There should be a test plan that describes the kinds of tests that are required to fully exercise the design. Any hole that RTL coverage finds should be seen as a criticism of that test plan. The important questions to answer, therefore, are: why didn't we think of exercising that case in our test plan? Did we miss some functionality in the chip that we are not testing? In many cases, the corrective action will indeed be to add one more tests for that case, just like what the inexperienced verification engineer would have done. But sometimes, the hole detected by the RTL coverage tool w ill have a much greater meaning that would have been missed otherwise.

Limitations of RTL coverage
Often overlooked, the first limitation of RTL coverage tools is that they do not know anything about what the design is supposed to do. Therefore, the tools can only report problems in the RTL code that has been written. There is no way for them to detect that some code is missing. This simple fact means that RTL coverage will never find more than half of any design's bugs. Some bugs are due to incorrectly written RTL, while some bugs are due to RTL that is simply not there.

The second limitation of RTL coverage is the lack of a built-in formal engine. An expression coverage tool would see no problem in reporting that certain combinations of the expression inputs were not covered, even though by construction, they are mutually exclusive and therefore formally unreachable. This "spurious" reporting adds to the confusion and limits the effectiveness of coverage. Constant propagation and effecti ve dead code detection would also benefit from such an analysis.

Figure 1 - Instruction decoding logic

In this example, expression coverage on assignment to iAND_R reports that QRST_R was never false when RST0_R was true. This case cannot happen because RST0_R depends on QRST_R. The logic in this case is redundant, but was written this way to make it more regular and therefore reduce errors and typos.

The third limitation of RTL coverage tools is much more fundamental: what does achieving 100% coverage mean? As long as the tool finds holes in the test plan, this information is valuable. However, once coverage has reached 100%, it does not mean that verification is complete. On the contrary, it is the "unofficial" way for the tool to say that it can't provide any useful information any longer.

A common fallacy is to hope that 100% coverage brings some sort of guar antee about the quality of the vectors that have been run. This is misleading. The fact that all the lines were executed does not mean that all the lines have been tested. A line that is an input of a multiplexer could have been executed once but the multiplexer was selecting the other input during that cycle. Therefore, whether the line was feeding correct data or not, the simulation results would never have seen the difference.

Managing coverage data
From a manager's perspective, RTL coverage is a very convenient tool because its results can be summarized in one number. The challenge rests with the design and with verification engineers. As with any other tool, the amount of useful information it produces must be higher than the amount of work to extract that information. Otherwise, the tool will simply not be used.

Some of the limitations of RTL coverage discussed above, and especially the lack of a formal engine, mean that a verification engineer must analyze the RTL coverage report be fore passing it to the designer. The verification engineer has only so much design knowledge and therefore asks questions such as "Why do we not have the case where signal bufferIsFull is true at the same time as signal killStage?" The designer may answer, "Because the buffer cannot be full by definition if this pipeline stage is empty." Such questions illustrate that no matter how well the verification engineer knows the design, for tricky cases, only the original designer (and sometimes not even the original designer) will know that certain cases can't happen.

The challenge is not to waste precious designer bandwidth chasing problems that could be automatically eliminated. Indeed, most of the time, a formal engine could have automatically determined that such cases don't make sense due to the way the logic is implemented. We have had a case where the original designer was convinced that a certain hole in the coverage could be covered by more vectors. After spending a day trying to come u p with such vectors, we changed the approach and manually ran a formal check. It proved that the specific piece of RTL was redundant and could actually be eliminated. It's these kinds of stories that build respect for a tool and ultimately decide whether the tool will be given attention and taken seriously.

Observed coverage
Over time, various approaches have tried to correct the fundamental flaw of RTL coverage. The idea is to find a way to make sure that a given statement actually contributes to the design behavior. Functional fault simulation is one solution. By injecting faults into a gate-level simulation and running functional vectors, one gets very interesting insights into the design.

For each gate, this practice says whether the simulation would have produced a different result had the gate not been there. The provocative "do you mind if I remove this gate from the design, it's useless anyway" gets the attention of the designer every time. Because functional fault simulation consu mes so much time, only a few companies that could afford simulation accelerators have used it successfully. It never became mainstream.

Tensilica has been evaluating a new technology from Synopsys called Observed Coverage (OBC) that achieves the same accuracy, from the RTL, within a reasonable simulation time. Tensilica has had early access to this technology, prior to it being incorporated into Synopsys' upcoming release of its VCS Verilog simulator.

For each line of code, the official definition of OBC is "stimulation of a line whose effect can be subsequently observed at a user-specified point." In essence, OBC will report that a line is not covered if it could be removed from the source code without impacting the simulation. This is an extremely powerful result. For the first time, 100% coverage actually has a positive meaning. Designers spend time chasing real problems, not artifacts of other coverage methods.

Figure 2 - Loop termination logic

For the example above, based on our tests, traditional line coverage reports that the assignment to signal countIsOne_I is "covered" whereas OBC reports that the assignment is not "observed." The reasons the signal is not observed are as follows:

The signal countIsOne_I fans out to only one (complex) expression
This expression is assigned to signal iterEnd_I
This signal iterEnd_I fans out to a couple of places
After 3 levels of logic, a redundancy is created. Logic redundancy is why the original signal could not be observed.

Most likely, neither traditional RTL expression coverage tools nor synthesis tools would have figured out that signal countIsOne_I is unobservable because of redundancy. Because the logic is so complex, we would not have easily determined that countIsOne_I produces no effect on the design functionality without OBC.

Other examples of OBC's usefulness we re on tests that created much internal activity, but did not result in any visible activity and points of interest. These points consist of the processor major states and the locations of our monitors. In cases of low OBC coverage, we examined our test plans and modified the tests so that they resulted in activity propagating to specified points.

Usage of OBC in this manner significantly increases the chance that potential bugs will be made visible to the verification engineer. As a result of our experience with OBC, we have incorporated it into our deliverables and run it regularly. Because it provides a more accurate measurement of the completeness of verification stimulus on the design code, OBC allows us to produce higher quality designs compared to traditional coverage tools.

Functional coverage and data mining
OBC is definitely a significant step forward in the coverage world. However, it is only a building block upon which we would like to create a more sophisticated coverage infrastructure.

RTL is one implementation of a design specification. In the case of our Xtensa product, for instance, the RTL implements a processor according to a certain pipelined micro-architecture. Designers tend to think at the architectural or micro-architectural level when debugging the chip, not in terms of lines of RTL. The most obvious property in a processor is the notion of instruction.

Which instructions are in which pipeline stages is critical to understanding whether the design behaves correctly or not. More generally, coverage at this level means having a clear picture of what the processor is doing. This is where a temporal assertion language becomes handy. A hot topic today is whether a standard will be achieved soon and tools will start using temporal assertions to describe all kinds of coverage data.

In our case, we would want to define many temporal assertions corresponding to interesting architecture situations - for example, if a certain buffer is full, if some state machin e is in an interesting state, and so forth. We would then query this "database" of coverage events with sophisticated requests. For instance, was the state machine for the Instruction Fetch module in state A when an interrupt occurred? Furthermore, under the previous scenario, was the interrupted instruction aligned on a 32-bit boundary? Did that happen when this other buffer in the LoadStore unit was full? Each of these properties can be expressed fairly simply using a temporal assertion language. The step of building such a coverage database and using it would mean entering the world of data mining.

Figure 3 - Cross-product coverage table

This example shows how we wrote some monitors that extract coverage information and perform a cross-product between two states. The columns represent different alignment possibilities, the rows correspond to instruction classes. The different values in the table (Miss, Hit, Both, or eXecute) add a third dimension to the table. We would like to generalize this concept using data mining and perform more sophisticated analysis or our micro-architectural coverage. The challenge lies in visualizing and navigating data in multi-dimensional forms, where the number of dimensions (or independent variables) can easily reach 5 or 6.

Conclusion
As design verification becomes more challenging, coverage is about to play a more central role in the verification methodology. In the coming years, it could be expected that coverage will interface with a variety of tools such as formal engines, temporal assertion languages and data-mining tools, ultimately building what could be called the "Verification Grail."

In the short term, coverage is another means that contributes to improved verification. Observed coverage is a continued step in the direction of better verification technology. It is our experience that having different ways of questioning the design, by using dif ferent approaches, is the best guarantee of finding all the bugs. Each time a new methodology offers a fresh way to challenge the design, we usually gather interesting information that had been overlooked by other methods. Don't throw away your old simulator, as it will still find 80% of the bugs -- but the time to consider coverage has come.

Alain Raynaud is in charge of exploring advanced verification technologies at Tensilica in Santa Clara, CA. Previously Alain was with Mentor Graphics' Emulation Division in Paris, France where led the front-end group and obtained a patent on RTL debugging. Alain holds a MS from the University of Illinois and Ecole Superieure d'Electricite.