Benefits of Reconfigurable IP in the Back-end SoC Development Process
by Kenn Lamb & Tony Stansfield, Elixent Limited
Bristol, UK

Abstract

This paper compares reconfigurable IP – a class of processing cores that provide high performance, low power, and run-time flexibility – with other forms of intellectual property. It explains how reconfigurable IP, by acting as a platform for the implementation of arbitrary soft IP blocks, can simplify some significant issues in the back end of the SoC design process, issues that are becoming worse as process geometry continues to decrease.

Overview

Discussions of the use of Reconfigurable IP cores normally concentrate on the benefits that they can bring in terms of late design changes and post-fabrication (even in-the-field) changes in functionality. However, this is only a part of the benefit of using Reconfigurable IP – there are also speed, area and power savings when compared to other programmable approaches. In this paper we also consider the benefits that Reconfigurable IP can bring to the back end of the SoC design process – placement, routing, timing closure, and Design for Test/Design for Manufacturability. These back-end issues are becoming increasingly important in deep-submicron design flows, and are unaddressed by traditional soft IP approaches. A combination of soft IP (to speed design capture) and reconfigurable IP (as a platform to implement that soft IP, and to ease the back-end problems) can therefore bring benefits across the entire design flow.

To illustrate this point, the paper is divided into three main sections:
Design Flow

The SoC development process is typically viewed as consisting of the following stages: architecture definition, RTL implementation, functional verification (both of the individual blocks in the system, and of the system as a whole), synthesis, test generation/test logic insertion, placement and routing, timing verification, silicon manufacture, and silicon validation/debug.

Historically, the major bottlenecks in the design process were associated with the early stages – the designer’s most difficult problem was producing an architecture and implementation that was small enough to be cost-effective to manufacture with the available process technology. In the deep-submicron era the focus has shifted to later stages – to the verification of the very complex systems that can now be made on a single chip, and to the complexities of placement, routing, and timing for small process geometries. This trend shows no sign of changing – finer geometries have worse crosstalk problems (which in turn increases the effort required for timing closure), more complex DRC and Design-for-Manufacturability issues to consider, and also commonly have lower supply voltages, which increases the risk of problems due to IR drop.

The Contribution of Intellectual Property

Types of Intellectual Property

The current IP market is very diverse, containing companies selling a wide variety of types of product. Rather than trying to understand it as a whole, it is better to subdivide it into three parts:
Use of IP in the design flow

These three forms of intellectual property are typically used to address different issues in the design process, which can be summarized as follows:

“Star” IP

As mentioned above, “Star” IP is a major factor in determining key parts of the system. It can directly influence the overall architecture; indeed, the existence of a largely predefined architecture for a large section of the system may be the reason for selecting a particular “Star” IP supplier. Star IP is therefore relevant to the architectural definition phase of the development process. Furthermore, since the supplier also provides a fully functional implementation (typically as RTL, plus any required software, and including any necessary synthesis scripts), it also assists with the design capture and block-level verification stages. System-level verification may also be simplified by the use of Star IP – for instance by using the processor to run code to probe the behavior of other blocks in the system.

Block-Level IP

Block-level IP differs from Star IP in that it is not a significant determinant of the system architecture – it provides implementations of blocks identified in an architecture definition, but does not drive that definition. However, its other characteristics are similar to those of Star IP:
Library IP

Library IP – RAM generators, logic libraries, etc. – has very different characteristics from the other forms of IP. It makes no contribution to system architecture, and very little to design capture and verification. Library IP is focused on the implementation phase:
Summary

The following table summarizes the areas of the design flow where these different types of IP impact the design process:

From this table it is apparent that existing IP products can be separated into two types – those that make their main contributions to the front end of the design process, and those that contribute to the later stages. Historically, most emphasis has been on “Star” and “Block-level” IP, which contribute to the early stages of the design flow but make no contribution to the back end of the flow, the areas that (as noted above) are becoming more complex with each new process generation.

The Contribution of Reconfigurable IP

Definitions

For the purposes of this paper we will adopt the following definition of Reconfigurable IP:
Reconfigurable IP is also normally distinguished from “Programmable” IP (processors and arrays of processors). In a reconfigurable device a large number of simple processing elements (PEs) are connected together to create a function-specific processing block, through which large amounts of data are then passed. The function of the block remains fixed unless and until the connection pattern between the PEs (and any configuration state internal to the PEs) is changed. Reconfigurable IP therefore alternates between two main states:
The separation of configuration and data processing means that a larger proportion of the available memory bandwidth, and power budget, is devoted to working on data in a reconfigurable device than in a processor. As an example, consider Elixent’s D-Fabrix Reconfigurable Algorithm Processor (RAP). A D-Fabrix array contains hundreds or thousands of 4-bit ALUs (the precise number can be tailored to the performance requirements of an application) and requires a few thousand cycles to reconfigure, but is then capable of high performance on complex algorithms – such as compressing or decompressing complete multi-megapixel JPEG images at a rate of one color component per cycle. Configuration therefore accounts for well under 1% of cycles, compared with an instruction fetch per cycle for a CPU. There are also very significant power savings available – D-Fabrix uses an order of magnitude less power than a DSP processor for JPEG-style applications.

Reconfigurable IP in the design process

The key contribution of Reconfigurable IP to the SoC design process is that it allows for a much greater separation of application development from chip design than would otherwise be the case. Reconfigurable IP provides a platform for the implementation of arbitrary soft IP, and can be used to separate the development and/or integration of that IP into the application from the creation of the chip netlist and floorplan.

As an example, consider a digital still camera (DSC) controller chip. Such a device will typically consist of a CPU, both on-chip memory and an interface to off-chip memory (such as RAM, Flash, and SmartMedia or other removable storage), a compression engine, and data formatting and interpolation engines to connect to both the image sensor and the display. Early on, in the architectural stage of the design process, the CPU core and memory interfaces will be well defined, as will be the compression engine.
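As a rough check on the configuration-overhead claim made above for D-Fabrix, the arithmetic can be sketched in Python. The specific figures used here (4,000 configuration cycles, a 2-megapixel image, three color components) are illustrative assumptions chosen to match the phrases "a few thousand cycles" and "multi-megapixel"; they are not published Elixent numbers, and the function name is hypothetical.

```python
def config_overhead(config_cycles: int, pixels: int, components: int = 3) -> float:
    """Return the fraction of total cycles spent on configuration.

    Assumes one color component is processed per cycle, as the text
    states, and that the array is configured once per image.
    """
    processing_cycles = pixels * components
    return config_cycles / (config_cycles + processing_cycles)


# Assumed, illustrative figures (not vendor data).
overhead = config_overhead(config_cycles=4_000, pixels=2_000_000)
print(f"Configuration overhead: {overhead:.4%}")
```

Under these assumptions the overhead comes out well below 1%, consistent with the claim in the text. The result is sensitive to how often the array is reconfigured: if several configurations were loaded per image, the overhead would scale up roughly in proportion.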
However, the formatting and interpolation engines are heavily dependent on the details of the sensor and display, which may not be settled until much later in the project, and which potentially depend on the results of negotiations with third-party suppliers. Adding a Reconfigurable IP block as a platform that can be used to implement an arbitrary interpolation engine allows the SoC hardware architecture to be fully specified well before these details are settled, and means that the detailed layout of the SoC can commence much earlier than would otherwise be the case.

The only other way to achieve such flexibility would be to use a processor – again as a platform that separates the task to be performed from the hardware required to perform it. However, using a processor may impose unacceptable constraints on the application, in terms of throughput or power dissipation. As mentioned above, a reconfigurable core like D-Fabrix can have much lower power requirements than a processor, and can also have much higher peak performance due to its inherently scalable architecture. Taken together, these factors mean that there are situations where processors cannot be used, but reconfigurable IP can.

This ability to defer decisions on implementation details means that Reconfigurable IP is commonly viewed as a way of isolating the chip development from late specification changes, and as making it possible to design platforms that can be used in multiple products (to continue with the DSC example – in multiple cameras with different sensor and display options). However, this is only a part of the benefit of a reconfigurable approach. The ability to define the SoC netlist and floorplan early means that the back-end design process can be started earlier, and any problems identified (and fixed) earlier.
Furthermore, Reconfigurable IP does more than just expose the problems earlier; it also reduces the complexity of the back-end design process:

Verification

With the use of reconfigurable IP, verification can be separated into two largely independent activities:
In the DSC example, system-level verification is no longer about verifying that the processor, sensor pipeline, display interface and memory system all work correctly together, but becomes a check that the processor, memory and reconfigurable IP interact correctly. This is a check of the behavior of standard components, not of code specially developed for the particular application, and is likely to be able to reuse results from previous designs that used similar components. Further, since system-level verification is largely decoupled from the details of the application, it can start earlier, and proceed in parallel with block-level verification.

Synthesis

Synthesis is also separated into two components – generation of the chip-level netlist, and generation of the “compiled” application code to run on the reconfigurable IP. This separation is analogous to adopting a hierarchical methodology for chip-level synthesis – some of the levels in the hierarchy correspond to separate configurations for the reconfigurable IP blocks.

Test Generation

In manufacturing test it is the actual hardware that is being tested, not the application that will run on it. In the context of reconfigurable IP, this means that it is the reconfigurable core that is to be tested. The test program for a Reconfigurable IP block will be provided by the supplier of the block, and all that remains for the SoC integrator to do is to merge that test program into the one for the whole device. There is no need for scan chain insertion or other DFT techniques to be applied to any of the application code that will be configured onto the reconfigurable IP. Similarly, design changes to enhance manufacturing yield (DFM – Design For Manufacturability), which are of increasing importance at finer process geometries, also apply at the level of the reconfigurable IP, not the application.
Placement & Routing

Just as for verification and synthesis, placement and routing is separated into two stages – one stage corresponding to placing and routing reconfigurable IP blocks in their SoC context, and the other being part of the “compilation” of the application to run on the blocks. It is only the first of these stages that needs to be complete before SoC tapeout – the other can be finished later.

Timing verification

A reconfigurable IP block will have its clock tree already defined, and already balanced. It will also have power and ground distribution defined and characterized, so that timing problems due to clock skew and/or IR drops in power and ground grids will already be taken into account by the supplier of the reconfigurable IP. Thus timing verification is also separated into two components, and a significant amount of the work is already taken care of by the IP supplier.

Silicon manufacture

Silicon debug

The ability to change the function of the Reconfigurable IP can be used during silicon debug, for instance by loading bus-monitoring code into the SoC.

Reconfigurable IP and System Architecture

The preceding section concentrated on how reconfigurable IP could fit into a standard SoC flow, and the benefits that it could bring. However, reconfigurable IP also opens up a new range of architectural possibilities, and can therefore contribute to the architecture development stage of the design process. To return to the DSC example, a camera is essentially a modal device – it can be used for both image capture and review, but only one at a time. Capture and review can themselves be further subdivided into mutually exclusive tasks. For image capture these include:
Summary

The following table indicates the areas of the design process in which Reconfigurable IP can bring benefits. It is separated into two areas – the application development benefits, and the SoC development benefits. (Several items in the application column are marked as not applicable – n/a – because the use of reconfigurable IP removes the need to apply them to the application code.)

Thus we see that Reconfigurable IP can bring across-the-board benefits to the system integrator. These benefits arise for two main reasons:
Website: http://www.elixent.com