Benefits of Reconfigurable IP in the Back-end SoC Development Process

by Kenn Lamb & Tony Stansfield, Elixent Limited
Bristol, UK

Abstract:
This paper compares reconfigurable IP, a class of processing cores that provide high performance, low power consumption, and run-time flexibility, with other forms of intellectual property. It explains how reconfigurable IP, by acting as a platform for the implementation of arbitrary soft IP blocks, can simplify some significant issues in the back end of the SoC design process, issues that are becoming worse as process geometries continue to shrink.

Overview
Discussions of the use of Reconfigurable IP cores normally concentrate on the benefits that they can bring in terms of late design changes and post-fabrication (even in-the-field) changes in functionality. However, this is only a part of the benefit of using Reconfigurable IP: there are also speed, area and power savings when compared to other programmable approaches. In this paper we also consider the benefits that Reconfigurable IP can bring to the back end of the SoC design process: placement, routing, timing closure, and design for test/design for manufacturability. These back-end issues are becoming increasingly important in deep-submicron design flows, and are not addressed by traditional soft IP approaches. A combination of soft IP (to speed design capture) and reconfigurable IP (as a platform to implement that soft IP, and to ease the back-end problems) can therefore bring benefits across the entire design flow.
To illustrate this point, the paper is divided into three main sections:

An outline of the SoC design process,
A discussion of the types of IP that typically feed into this process, and
A description of the way that reconfigurable IP both feeds into and changes the process.

Design Flow
The SoC development process is typically viewed as consisting of the following stages: architecture definition, RTL implementation, functional verification (both of the individual blocks in the system, and of the system as a whole), synthesis, test generation/test logic insertion, placement and routing, timing verification, silicon manufacture, and silicon validation/debug.

Historically, the major bottlenecks in the design process were associated with the early stages: the designer’s most difficult problem was producing an architecture and implementation that was small enough to be cost-effective to manufacture with the available process technology. In the deep-submicron era the focus has shifted to later stages: to the verification of the very complex systems that can now be made on a single chip, and to the complexities of placement, routing, and timing for small process geometries. This trend shows no sign of changing. Finer geometries have worse crosstalk problems (which in turn increase the effort required for timing closure) and more complex DRC and design-for-manufacturability issues to consider. They also commonly have lower supply voltages, which increase the risk of problems due to IR drop.

The Contribution of Intellectual Property

Types of Intellectual Property

The current IP market is very diverse, with companies selling a wide variety of products. Rather than trying to understand it as a whole, it is better to subdivide it into three parts:

Star IP – the key enabling technologies that determine a large part of the system design, the classic examples being processor cores (as supplied by companies such as ARM and MIPS). The choice of such IP determines much more than the instruction set of the embedded processor: the choice of software development environment, operating system, and on-chip bus are all heavily influenced by the processor core. Although processor cores are the classic examples of star IP, there are others, e.g. reference designs for communications subsystems, with significant hardware and software components.
Block Level IP – Synthesizable logic blocks and verification suites that provide point components of a system. These components have little impact on the definition of the overall system architecture, and connect into the framework created by the star IP.
Libraries – Logic libraries, RAM generators etc. that provide the physical components from which the Soft IP will be built.

Gartner Dataquest IP revenue data over the last 4 years consistently shows IP suppliers of all three types in the top 10, with the “Star IP” suppliers typically having higher revenues than the others.

Use of IP in the design flow

These three forms of intellectual property are typically used to address different issues in the design process, which can be summarized as follows:

“Star” IP
As mentioned above, “Star” IP is a major factor in determining key parts of the system. It can directly influence the overall architecture; indeed, the existence of a largely predefined architecture for a large section of the system may be the reason for selecting a particular “Star” IP supplier. Star IP is therefore relevant to the architectural definition phase of the development process. Furthermore, since the supplier also provides a fully functional implementation (typically as RTL, plus any required software, and including any necessary synthesis scripts), it also assists with the design capture and block-level verification stages. System-level verification may also be simplified by the use of star IP, for instance by using the processor to run code to probe the behavior of other blocks in the system.

Block-Level IP
Block-level IP differs from Star IP in that it is not a significant determinant of the system architecture – it provides implementations of blocks identified in an architecture definition, but does not drive that definition. However, its other characteristics are similar to those of star IP:

Synthesizable logic blocks are typically provided as RTL code, and simplify design capture and block-level verification.
Protocol verification suites (another example of block-level IP) can be used to verify that multiple blocks in the design meet the required bus protocol, thereby simplifying system-level verification.

Library IP
Library IP (RAM generators, logic libraries etc.) has very different characteristics from the other forms of IP. It makes no contribution to system architecture, and very little to design capture and verification. Library IP is focused on the implementation phase:

Logic libraries are the key components required for synthesis, placement and routing, and a well characterized library can ease problems in this area.
RAM generators provide a fast way to create memories that have been optimized for a particular manufacturing process, and may also include BIST (Built-In Self Test) features that help with chip test.

Overall, the use of library IP is a way to achieve a degree of manufacturing process independence – if the supplier can provide the same libraries for multiple silicon processes then the task of design migration between processes will be eased.

Summary
The following table summarizes the areas of the design flow where these different types of IP impact the design process:

From this table it is apparent that existing IP products can be separated into two types: those that make their main contributions to the front end of the design process, and those that contribute to the later stages. Historically, most emphasis has been on “Star” and “Block-level” IP, which contribute to the early stages of the design flow but make no contribution to the back end of the flow, the areas that (as noted above) are becoming more complex with each new process generation.

The Contribution of Reconfigurable IP
Definitions
For the purposes of this paper we will adopt the following definition of Reconfigurable IP:

Reconfigurable – the function can be rapidly changed, post fabrication.
IP – supplied in a form that can be integrated as a component of a larger SoC device.

Reconfigurable IP is therefore distinct from “Configurable” IP, where parameters (such as filter coefficients, or even processor instruction sets) can be varied pre-synthesis but are fixed in any given silicon implementation.

Reconfigurable IP is also normally distinguished from “Programmable” IP (processors and arrays of processors). In a reconfigurable device a large number of simple processing elements (PEs) are connected together to create a function-specific processing block, through which large amounts of data are then passed. The function of the block remains fixed unless and until the connection pattern between the PEs (and any configuration state internal to the PEs) is changed. Reconfigurable IP therefore alternates between two main states:

Configuration, when the function of the block is defined, but no data processing takes place, and
Processing, where that function is applied to large amounts of data, and no further configuration information is required to enable this processing to proceed.

This dual-mode operation is a significant difference from a processor, where instruction fetches and data movements are intermingled, and may even both occur in the same cycle.
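The two states described above can be illustrated with a small Python model. This is only a sketch: the `ReconfigurableBlock` class, its method names, and the example function are hypothetical and do not correspond to any real reconfigurable IP interface.

```python
from enum import Enum
from typing import Callable, Iterable, List, Optional


class State(Enum):
    CONFIGURATION = "configuration"
    PROCESSING = "processing"


class ReconfigurableBlock:
    """Toy model: the function stays fixed between configurations."""

    def __init__(self) -> None:
        self.state = State.CONFIGURATION
        self._function: Optional[Callable[[int], int]] = None
        self.config_loads = 0  # how many times configuration data was loaded

    def configure(self, function: Callable[[int], int]) -> None:
        # Configuration state: the function is defined, no data is processed.
        self.state = State.CONFIGURATION
        self._function = function
        self.config_loads += 1

    def process(self, samples: Iterable[int]) -> List[int]:
        # Processing state: the fixed function is applied to a stream of
        # data, and no further configuration information is needed.
        if self._function is None:
            raise RuntimeError("block must be configured before processing")
        self.state = State.PROCESSING
        return [self._function(s) for s in samples]


block = ReconfigurableBlock()
block.configure(lambda x: 2 * x + 1)  # hypothetical function for illustration
out = block.process(range(5))
print(out, block.config_loads)
```

However many samples are passed to `process`, the configuration is loaded only once, which is the contrast with a processor that fetches instructions throughout.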

The separation of configuration and data processing means that a larger proportion of the available memory bandwidth, and power budget, is devoted to working on data in a reconfigurable device than in a processor. As an example, consider Elixent’s D-Fabrix Reconfigurable Algorithm Processor (RAP). A D-Fabrix array contains hundreds or thousands of 4-bit ALUs (the precise number can be tailored to the performance requirements of an application) and requires a few thousand cycles to reconfigure, but is then capable of high performance on complex algorithms, such as compressing or decompressing complete multi-megapixel JPEG images at a rate of one color component per cycle. Because a multi-megapixel image takes millions of processing cycles at that rate, configuration accounts for well under 1% of cycles, compared with an instruction fetch per cycle for a CPU. There are also very significant power savings available: D-Fabrix uses an order of magnitude less power than a DSP processor for JPEG-style applications.
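The overhead claim can be checked with a rough calculation. The sketch below assumes a 3-megapixel image with three color components and 5,000 reconfiguration cycles; these figures are illustrative choices within the ranges quoted above ("multi-megapixel", "a few thousand cycles"), not measured D-Fabrix data.

```python
def configuration_overhead(pixels: int, components: int, reconfig_cycles: int) -> float:
    """Fraction of all cycles spent on configuration for one image.

    Assumes one color component is processed per cycle, as stated in the text.
    """
    processing_cycles = pixels * components
    return reconfig_cycles / (reconfig_cycles + processing_cycles)


# Assumed example values (not taken from any datasheet).
overhead = configuration_overhead(pixels=3_000_000, components=3, reconfig_cycles=5_000)
print(f"configuration overhead: {overhead:.4%}")
```

With these assumed values the overhead is roughly 0.06% of cycles for a single image, and it shrinks further if several images are processed between reconfigurations.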

Reconfigurable IP in the design process

The key contribution of Reconfigurable IP to the SoC design process is that it allows for a much greater separation of application development from chip design than would otherwise be the case. Reconfigurable IP provides a platform for the implementation of arbitrary soft IP, and can be used to separate the development and/or integration of that IP into the application from the creation of the chip netlist and floorplan.

As an example, consider a digital still camera (DSC) controller chip. Such a device will typically consist of a CPU, both on-chip memory and an interface to off-chip memory (such as RAM, Flash, and SmartMedia or other removable storage), a compression engine, and data formatting and interpolation engines to connect to both the image sensor and the display. Early on, in the architectural stage of the design process, the CPU core and memory interfaces will be well defined, as will the compression engine. However, the formatting and interpolation engines are heavily dependent on the details of the sensor and display, which may not be settled until much later in the project, and which potentially depend on the results of negotiations with third-party suppliers. Adding a Reconfigurable IP block as a platform that can be used to implement an arbitrary interpolation engine allows the SoC hardware architecture to be fully specified well before these details are settled, and means that the detailed layout of the SoC can commence much earlier than would otherwise be the case.

The only other way to achieve such flexibility would be to use a processor – again as a platform that separates the task to be performed from the hardware required to perform it. However, using a processor may impose unacceptable constraints on the application, in terms of throughput or power dissipation. As mentioned above, a reconfigurable core like D-Fabrix can have much lower power requirements than a processor, and can also have much higher peak performance due to its inherently scalable architecture. Taken together these factors mean that there are situations where processors cannot be used, but reconfigurable IP can.

This ability to defer decisions about implementation details means that Reconfigurable IP is commonly viewed as a way of isolating the chip development from late specification changes, and as making it possible to design platforms that can be used in multiple products (to continue with the DSC example, in multiple cameras with different sensor and display options). However, this is only a part of the benefit of a reconfigurable approach. The ability to define the SoC netlist and floorplan early means that the back-end design process can be started earlier, and any problems identified (and fixed) earlier. Furthermore, Reconfigurable IP does more than just expose the problems earlier; it also reduces the complexity of the back-end design process:

Verification

With the use of reconfigurable IP, verification can be divided into two largely independent activities:

Verifying that the reconfigurable IP block works correctly in the SoC context, and
Verifying that the application IP works correctly on the reconfigurable IP.

This is almost equivalent to the separation between block-level and system-level verification – the alignment is not exact because multiple blocks may be combined into a single configuration for the reconfigurable IP.

In the DSC example, system-level verification is no longer about verifying that the processor, sensor pipeline, display interface and memory system all work correctly together, but becomes a check that processor, memory and reconfigurable IP interact correctly. This is a check of the behavior of standard components, not of code specially developed for the particular application, and is likely to be able to reuse results from previous designs that used similar components. Further, since system-level verification is largely decoupled from the details of the application, it can start earlier, and proceed in parallel with block-level verification.

Synthesis

Synthesis is also separated into two components: generation of the chip-level netlist, and generation of the “compiled” application code to run on the reconfigurable IP. This separation is analogous to adopting a hierarchical methodology for chip-level synthesis: some of the levels in the hierarchy correspond to separate configurations for the reconfigurable IP blocks.

Test Generation

In manufacturing test it is the actual hardware that is being tested, not the application that will run on it. In the context of reconfigurable IP, this means that it is the reconfigurable core that is to be tested.

The test program for a Reconfigurable IP block will be provided by the supplier of the block, and all that remains for the SoC integrator to do is to merge that test program into the one for the whole device. There is no need for scan chain insertion or other DFT techniques to be applied to any of the application code that will be configured onto the reconfigurable IP.

Similarly, design changes to enhance manufacturing yield (DFM, Design For Manufacturability), which are of increasing importance at finer process geometries, also apply at the level of the reconfigurable IP, not the application.

Placement & Routing

Just as for verification and synthesis, placement and routing is separated into two stages – one stage corresponding to placing and routing reconfigurable IP blocks in their SoC context, and the other being part of the “compilation” of the application to run on the blocks. It is only the first of these stages that needs to be complete before SoC tapeout – the other can be finished later.
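The scheduling consequence of this split can be sketched as a small dependency calculation. The task names and durations (in weeks) below are invented purely for illustration, using the DSC example; only the dependency structure reflects the argument in the text.

```python
from functools import lru_cache
from typing import Dict, List, Tuple

Tasks = Dict[str, Tuple[int, List[str]]]  # name -> (duration, dependencies)


def earliest_finish(tasks: Tasks, target: str) -> int:
    """Earliest completion time of `target`, given task durations and dependencies."""

    @lru_cache(maxsize=None)
    def finish(name: str) -> int:
        duration, deps = tasks[name]
        return duration + max((finish(d) for d in deps), default=0)

    return finish(target)


# Hypothetical conventional flow: chip place and route waits for the
# interpolation engine RTL, which waits for the sensor specification.
conventional: Tasks = {
    "sensor_spec_settled": (12, []),
    "interpolation_rtl": (6, ["sensor_spec_settled"]),
    "chip_place_route": (8, ["interpolation_rtl"]),
    "tapeout": (0, ["chip_place_route"]),
}

# Hypothetical flow with reconfigurable IP: chip place and route no longer
# depends on the application, which is "compiled" separately.
reconfigurable: Tasks = {
    "sensor_spec_settled": (12, []),
    "chip_place_route": (8, []),
    "tapeout": (0, ["chip_place_route"]),
    "application_compile": (6, ["sensor_spec_settled"]),
}

conventional_tapeout = earliest_finish(conventional, "tapeout")
reconfigurable_tapeout = earliest_finish(reconfigurable, "tapeout")
application_ready = earliest_finish(reconfigurable, "application_compile")
print(conventional_tapeout, reconfigurable_tapeout, application_ready)
```

In the second flow the application work finishes after tapeout without delaying it, which is the point made above: only the SoC-level stage sits on the path to tapeout.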

Timing verification

A reconfigurable IP block will have its clock tree already defined, and already balanced. It will also have power and ground distribution defined and characterized, so that timing problems due to clock skew and/or IR drops in power and ground grids will already be taken into account by the supplier of the reconfigurable IP. Thus timing verification is also separated into two components, and a significant amount of the work is already taken care of by the IP supplier.

Silicon manufacture

Silicon debug

The ability to change the function of the Reconfigurable IP can be used during silicon debug, for instance by loading bus-monitoring code into the SoC.

Reconfigurable IP and System Architecture

The preceding section concentrated on how reconfigurable IP could fit into a standard SoC flow, and the benefits that it could bring. However, reconfigurable IP also opens up a new range of architectural possibilities, and can therefore contribute to the architecture development stage of the design process.

To return to the DSC example, a camera is essentially a modal device – it can be used for both image capture and review, but only one at a time. Capture and review can themselves be further subdivided into mutually exclusive tasks. For image capture these include:

“Viewfinder” operation prior to image capture
high-resolution image capture
Compression

And for review:

Thumbnail generation for indexing
Decompression for local (low-resolution LCD) display
Decompression for external NTSC display
Decompression for external PAL display

In a conventional SoC design, several of these tasks would each have dedicated hardware to support them. Once reconfigurable IP is included, the possibility arises of using a single reconfigurable block whose function is changed according to the current mode of operation of the device. As mentioned above, a D-Fabrix array can completely change its function in a few thousand cycles, a timescale that is imperceptible to a typical user of a modal device. Reconfiguration thus becomes another tool available to the system architect, not simply a way to defer decisions about hardware implementation. In our initial description of the DSC controller chip, reconfigurable IP was presented as a solution to the problem of an undefined specification for the sensor pipeline, while the compression hardware was considered as a well-defined block that would be implemented as gates in the normal way. However, it is possible to decide to use the reconfigurable block for the compression too, saving the chip area that would otherwise have been occupied by compression hardware.
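A mode-driven use of a single block might be sketched as follows. The configuration names, the 5,000-cycle reconfiguration time, and the 100 MHz clock are all assumptions made for illustration; the text only states "a few thousand cycles" and gives no clock frequency.

```python
RECONFIG_CYCLES = 5_000     # assumed value for "a few thousand cycles"
CLOCK_HZ = 100_000_000      # assumed 100 MHz clock, not from the source

# Hypothetical mapping from camera mode to the configuration loaded for it.
MODE_CONFIGURATIONS = {
    "viewfinder": "viewfinder_pipeline",
    "capture": "high_resolution_capture",
    "compress": "jpeg_compression",
    "thumbnail": "thumbnail_generation",
    "review_lcd": "decompress_for_lcd",
    "review_ntsc": "decompress_for_ntsc",
    "review_pal": "decompress_for_pal",
}


class ModalBlock:
    """One reconfigurable block shared by mutually exclusive modes."""

    def __init__(self) -> None:
        self.loaded = None
        self.reconfigurations = 0

    def enter_mode(self, mode: str) -> float:
        """Switch to `mode`; return the reconfiguration delay in seconds."""
        wanted = MODE_CONFIGURATIONS[mode]
        if wanted == self.loaded:
            return 0.0  # already configured for this mode
        self.loaded = wanted
        self.reconfigurations += 1
        return RECONFIG_CYCLES / CLOCK_HZ


camera_block = ModalBlock()
delays = [camera_block.enter_mode(m)
          for m in ("viewfinder", "capture", "compress", "compress")]
print(delays, camera_block.reconfigurations)
```

Under these assumed figures each mode change costs about 50 microseconds, far below anything a user would notice when switching between capture and review.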

Summary
The following table indicates the areas of the design process in which Reconfigurable IP can bring benefits. It is separated into two areas: the application development benefits, and the SoC development benefits:

(Several items in the application column are marked as not applicable, n/a, because the use of reconfigurable IP removes the need to apply them to the application code.)

Thus we see that Reconfigurable IP can bring across-the-board benefits to the system integrator. These benefits arise for two main reasons:

Reconfigurable IP provides a packaged solution to some significant backend problems, and
Reconfigurable IP simplifies the problems of integrating Soft IP into the design, by providing a standard platform that isolates it from irrelevant details of the SoC context.

If this table is compared with the earlier one that compares Star, Block-level and Library IP, we can see that reconfigurable IP supplies the application designer with similar benefits to Star IP: it provides a design methodology that can simplify architectural definition, and also speed up design capture and verification. Simultaneously, reconfigurable IP provides the chip integrator with similar benefits to those provided by Library IP: it provides a building block that is well matched to the target manufacturing process, and which provides ready-made solutions to test, timing, and manufacturability problems.

Website: http://www.elixent.com