The Configurable VLIW Processor As The Base For A Cost Effective SoC Platform.

by Victor Berman, Improv Systems, Inc.
Beverly, MA USA

Abstract :
The concept of platform based design has been elaborated by many recent authors, perhaps most eloquently by A. Sangiovanni-Vincentelli et al. [1]. This paradigm has been held up as a key to handling issues of complexity, partitioning, and re-use. However, a significant contingent of industry analysts remains unconvinced. G. Smith rightly states that platform based design has been a "buzz term with no set definition" [2] and that building a platform is expensive and can take eighteen months to two years. Such critics cite issues of cost and lack of flexibility that keep platform based designs from achieving cutting edge performance. This paper reconciles these two divergent views by presenting an operational definition of a platform for SoC designs that enables cost effective and performance competitive designs for consumer electronics.

It is shown that in order for a platform to achieve the objectives of cost effectiveness and competitive performance a number of stringent requirements must be met. Among these requirements are:

1.   Software Reuse: applications code must be automatically ported across platform variants.
2.   Extensibility: the platform architecture must be easily configured for multiple applications.
3.   Implementation: the designed platform instance must have a known path to silicon implementation and testability.
4.   Design Environment: the platform system must include an integral user environment for the automation of code generation, platform variant generation, performance analysis, and SoC integration/implementation.

Economic models for platform development are explored. The tradeoffs between in-house development vs. third party licensing are looked at in terms of cost factors, time to market and product differentiation.

Introduction
The motivation behind the development and use of platforms for design is simply stated as the desire to reduce cost and time to market for products. This potential saving in cost and time may come from several factors which have been extensively discussed in the literature. These include component reuse, software reuse, and reduced need for verification at several levels of component integration. In order to achieve these benefits in the design process certain 'costs' must be paid. These costs fall into two categories: the constraint of design style imposed by the platform, and the actual cost of developing the platform. As is always the case, a cost/benefit analysis must be used to determine the suitability of this approach for any design or family of designs.

Looking at the issue of design constraints, clearly the minimal requirement is that the design requirements can be met within the constraints of the platform. But this is minimal indeed. Unless advantages can be shown relative to competing methodologies, there is no reason to proceed down this path. With respect to cost, the equation can become complex, since ideally future products and families of products that stem from the same basic design should be considered. Since in many instances this data is not known with precision, decisions are generally made weighting current factors more heavily. One of the most important is make vs. buy, which we will discuss later on.

Platform Definition

A great deal has been written about the definition of platforms. While some of this discussion has illuminated the important characteristics of this design style, much of it has led to confusion and skepticism about how real or useful a concept this is. This is unfortunate since, in the opinion of the author, use of appropriate platforms is a key to practical designs of future electronic systems. Since this line of reasoning can easily turn circular, I state for specificity that the systems I am referring to are complex embedded systems which are typical of current and near term consumer offerings such as xG (2, 2.5, 3, 4, ...) phones, multi-media set-top boxes, and automotive telematics. The common thread in these systems is that they are complex; power, cost, and time sensitive; and short lived, requiring frequent upgrades to meet new features and new standards.

The operational definition of a platform that addresses this design space (borrowing heavily from the concept of a System Platform in [1]) is an extensible, configurable micro-architecture with a coherent programming model and the tools to support this model. I call this an Application Platform.

The Application Platform is not formally defined but recognized by its operational characteristics. Other groups such as VSIA are engaged in taxonomic explorations, but for the present I propose that this simple definition is useful and timely. Notice that specific hardware implementations are neither prescribed nor proscribed, since a variety are applicable and more, perhaps not yet known, will become applicable. However, the critical components are the coherence of the programming model and the flexibility of the architecture.

The issue of flexibility is of interest since the stated lack of flexibility is a common criticism of platforms. In the author's view this criticism stems from experience with poorly designed platforms that constrain without providing needed support for application implementation and optimization. The well architected platform, on the other hand, provides structure and discipline without hampering productivity and creativity. To accomplish this feat requires deep understanding of the application space as well as the willingness to undertake the extensive research and development task of building the architecture with its support environment.

Platform Requirements

We posit that the criteria for a successful platform are such that the cost function for a set of designs is minimized over the universe of available methods. While a formal development of this thesis is of interest, and should be pursued in other venues, it is beyond the scope of this work. Appealing to practical considerations and engineering experience, we give an important but not exclusive set of criteria:

Software Reuse: We use the term software here in a general sense to mean the specification of the application. There is no implication about the ultimate implementation since it is quite likely that parts of the application will run as instructions on a processor while others will be implemented more directly in hardware. The effective mapping of the application to the architecture is arguably the most important determinant of the success of the platform since this will determine the efficiency of the design instance as well as the range of applicability for the platform.

For efficiency an embedded platform needs to support highly parallel execution; for generality it must support heterogeneous elements. The implication of this combination is that the mapping tool, which we will call a compiler, needs to perform both traditional compiler tasks such as code generation and memory mapping, and synthesis tasks such as task allocation and scheduling. An interesting starting point for such a tool is under development at UC Irvine [3]. This compiler must not only be retargetable in the traditional sense but also be capable of reading configuration information describing the architecture instance and basing its code generation, task allocation, and scheduling on this fresh data.
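As a minimal sketch of what configuration-driven scheduling could look like, the Python fragment below packs operations into parallel issue cycles using only the unit counts supplied by a configuration. It is an illustration, not the compiler discussed in this paper: the names Op and schedule, the dictionary form of the configuration, and the assumption that every operation takes one cycle are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Op:
    name: str            # unique operation name
    unit_type: str       # kind of computation unit the operation needs
    deps: tuple = ()     # names of operations that must complete first

def schedule(ops, config):
    """Greedy list scheduling (assumes single-cycle operations).

    config maps a unit type to the number of such units in the
    architecture instance. Returns a list of cycles, each a list of
    operation names issued in parallel in that cycle.
    """
    done = set()          # operations completed in earlier cycles
    pending = list(ops)
    cycles = []
    while pending:
        free = dict(config)   # units still free in the current cycle
        issued = []
        for op in pending:
            ready = all(d in done for d in op.deps)
            if ready and free.get(op.unit_type, 0) > 0:
                free[op.unit_type] -= 1
                issued.append(op)
        if not issued:
            raise ValueError("nothing can issue: missing unit type or cyclic dependency")
        cycles.append([op.name for op in issued])
        done.update(op.name for op in issued)
        pending = [op for op in pending if op.name not in done]
    return cycles

# Hypothetical application fragment: two multiplies feed an add, then a shift.
OPS = [
    Op("m0", "MAC"),
    Op("m1", "MAC"),
    Op("a0", "ALU", ("m0", "m1")),
    Op("s0", "SHIFT", ("a0",)),
]
```

Under these assumptions the same application description yields a four-cycle schedule on an instance with one MAC unit and a three-cycle schedule on an instance with two, without any change to the application itself.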

Extensibility: If a platform is not useful over a broad range of applications it will be difficult to justify its development cost. While flexibility is important, it must be accomplished with discipline if the programming model is to remain coherent and the mapping function is to remain feasible. This implies a family of processors, highly configurable for size and function but sharing a unified programming model. Practical constraints at this time may require the inclusion of components that do not fit this model precisely, in order to make use of an existing code base (legacy code). This implies that in addition to the base architecture, interface elements must be provided to manage coherent connections to external devices. These elements should be designed-in parts of the platform, not add-on artifacts, and the necessary converters and drivers should be controlled within the platform programming model.

Implementation: The designed platform instance must have a known path to silicon implementation and testability. While this may seem obvious, all too frequently IP cores that show high performance in simulation are either difficult to route or require custom cells that may be unavailable or expensive to fabricate.

Design Environment: The platform system must include an integral user environment for the automation of code generation, platform variant generation, performance analysis, and SoC integration/implementation. To take full advantage of the platform design capabilities, the user design environment should be developed in tandem with the architecture, not as an add-on or afterthought.

The compiler, in particular, must have a deep understanding of the architecture and its potential for application optimization.

Requirement Summary :
I have briefly enumerated some of the important characteristics of a useful application platform. At the risk of stating the obvious I point out that regardless of the sophistication and flexibility of the platform, unless it compares favorably to a conventional design in throughput, gate count and power consumption it will be a tough sell.

Platform Development Models

In order to meet these requirements we examine microarchitectures and design environments. Because of the need for flexibility and high levels of parallel execution, two choices are configurable VLIW processors and PLD devices. These are not strictly analogous since a PLD may be used to implement a VLIW processor. For this discussion the intent is that the VLIW be implemented in ASIC style technology, either gate array or custom, to take advantage of the lower unit cost and compact, low power designs achievable with this technology. This style gives up the flexibility achieved by the PLD in favor of the characteristics just mentioned. However, there is nothing to prevent a mixed style being used, reserving PLDs for small portions of the design that are known to need late changes.

The advantages of the VLIW processor stem from its inherent capability to provide a flexible level of parallelism. A configurable instance of a VLIW processor allows the addition of computation units to optimize particular applications. Further, if the microarchitecture supports a coherent programming model for an interconnected network of heterogeneous configurable processors, we go a long way toward satisfying the requirements for extensibility and software reuse.

As an illustration we look at the Improv Jazz DSP processor.

Fig. 1 Basic VLIW Processor

The processor is organized around a central Data Communications Module with a Control Unit and Task Queue to oversee execution sequencing. Depending on the instruction word length, which is variable, from 1 to n computation units and memory units execute in parallel. The number and type of units are configured as part of the design process. Computation units are selected from a library of standard units such as ALU, MAC, and Shift. Specialized units, which are really accelerators for common DSP operations such as FFT, DCT, and FIR, are also available. A mechanism is provided for designing custom instructions, called DDCUs, which are made known to the compiler through a configuration file.
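To make the idea of a configurable instance concrete, the sketch below describes one processor instance as data and renders it as text that a compiler could read. It is hypothetical throughout: the function names, the field names, and the use of JSON are illustrative assumptions, and nothing here reflects the actual Jazz configuration file format. Only the unit type names are taken from the description above.

```python
import json

# Unit types named in the text; how the real library is organized is not shown here.
STANDARD_UNITS = {"ALU", "MAC", "SHIFT"}
ACCELERATOR_UNITS = {"FFT", "DCT", "FIR"}

def make_instance(name, units, custom_units=()):
    """Describe one processor instance (hypothetical schema).

    units maps a library unit type to a count; custom_units lists the
    names of designer-supplied units (DDCUs in the text).
    """
    known = STANDARD_UNITS | ACCELERATOR_UNITS
    unknown = set(units) - known
    if unknown:
        raise ValueError(f"not in the unit library: {sorted(unknown)}")
    if any(count < 1 for count in units.values()):
        raise ValueError("unit counts must be positive")
    return {
        "name": name,
        "units": dict(units),
        "custom_units": list(custom_units),
        # Total number of computation units available to run in parallel.
        "parallel_units": sum(units.values()) + len(custom_units),
    }

def to_config_text(instance):
    """Serialize the instance so that tools can read it back unchanged."""
    return json.dumps(instance, indent=2, sort_keys=True)

# Hypothetical instance: two ALUs, one MAC, one FIR accelerator, one custom unit.
demo = make_instance("demo", {"ALU": 2, "MAC": 1, "FIR": 1}, ["my_custom_unit"])
```

The point of the sketch is only that the instance description is a single source of truth: the hardware generator and the compiler both consume the same record, so a change to the unit mix does not require hand edits to the tools.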

Fig. 2 Crescendo Media Processor

With the addition of interface blocks and standard bus/micro-processor support, a subsystem to support media processing may be developed. In the example of the Crescendo system shown, the media processor engines are in the Crescendo Platform and are programmed within the platform programming model. The microprocessors are running legacy code to illustrate how a system can be built up using a combination of the coherent programming model and external units. The ideal future system would have all the units within the programming model.

The multiprocessor media processing units are shown below.

Fig. 3 Multiprocessor for Media (MPEG4)

The subsystem for media processing, in this case MPEG4, is shown with its Host Bus Interface (HBI) and its Data Memory (DM) used for inter-processor communication. Since this application requires large quantities of data for the MPEG frame buffers, a high throughput Data Channel Interface is provided to talk to an external memory controller.

Fig. 4 Design Environment

Design Environment and Implementation
As shown in Figure 4, the platform development environment supports a parallel hardware/software development process with the application as the driver. The application is described in a notation called Notation, developed at Improv. It has Java and C based implementations and has a set of class definitions and APIs to supplement the existing language semantics for parallel and hardware related constructs. Ideally a standard system design language would be used, but none has yet been adopted.

Generally the process goes through the following steps:
-   Verify the functionality of the application using the Notation functional simulator.
-   Compile for a nominally appropriate target platform.
-   Analyze the result using a tool called Tuning Fork that provides execution profiles.
-   Optimize the software and/or hardware.
-   Reflect any hardware changes back to the compiler and analysis tools through the automatically generated configuration file.
-   Repeat until requirements are met.

The analysis tool provides throughput information as well as hot spot locations. It makes suggestions for adding or deleting computation units depending on the utilization profiles.
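The suggestion step can be sketched in a few lines. The fragment below is not how Tuning Fork works internally; it is a hypothetical illustration of utilization-driven suggestions, and the threshold values, function names, and the form of the utilization profile are all assumptions made for the example.

```python
def suggest(utilization, counts, high=0.9, low=0.1):
    """Propose unit changes from a utilization profile.

    utilization maps a unit type to the fraction of cycles in which the
    units of that type were busy; counts maps a unit type to the number
    of units in the current instance. The thresholds are illustrative.
    """
    suggestions = []
    for unit_type, busy in sorted(utilization.items()):
        if busy >= high:
            # Saturated units are likely hot spots: suggest another one.
            suggestions.append(("add", unit_type))
        elif busy <= low and counts.get(unit_type, 0) > 1:
            # Nearly idle duplicates cost gates and power: suggest removal.
            suggestions.append(("remove", unit_type))
    return suggestions

def apply_suggestions(counts, suggestions):
    """Return new unit counts; these would feed the regenerated configuration."""
    new_counts = dict(counts)
    for action, unit_type in suggestions:
        new_counts[unit_type] = new_counts.get(unit_type, 0) + (1 if action == "add" else -1)
    return new_counts

# Hypothetical profile from one analysis run.
counts = {"ALU": 2, "MAC": 1, "SHIFT": 1}
profile = {"ALU": 0.05, "MAC": 0.95, "SHIFT": 0.50}
changes = suggest(profile, counts)
```

In the iterative flow described above, the updated counts would be written back to the configuration, the application recompiled, and the profile regenerated, repeating until throughput, gate count, and power targets are met.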

When this process is complete, the design is synthesized and verified either separately or in conjunction with other components of the SoC.

Economic Models for Platform Development
As can be seen from the above discussion, the development of the tools, architecture, and flow for a generalized platform, in this case aimed at embedded DSP applications, is complex. The system described required more than three years of development at a cost of well over twenty-five million dollars. While other systems might be less complex, they would lack the generality and coherence that make this system effective. Not surprisingly, in-house systems, even at the largest companies, do not provide comparable capabilities. Only by amortizing the development costs of such a system over a large number of designs can the costs be justified. Especially given the current economic climate, it is unlikely that many companies would choose to duplicate these capabilities internally. Those who try to graft on generic compiler and analysis tools to cut costs will, I fear, be vastly disappointed in the results. My conclusion is that to produce competitive designs for today's electronic products, licensing platform technology and IP is the practical course of action.
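The amortization argument reduces to simple arithmetic, sketched below. The twenty-five million dollar figure is used only as the lower bound stated above; the design counts and the optional per-design cost term are illustrative assumptions, not data from this work.

```python
def cost_per_design(development_cost, designs, per_design_cost=0.0):
    """Platform cost attributed to each design when the one-time
    development cost is spread evenly over a number of designs."""
    if designs < 1:
        raise ValueError("at least one design is required")
    return development_cost / designs + per_design_cost

PLATFORM_COST = 25_000_000  # lower bound from the text, in dollars

# Hypothetical design counts: a single in-house product line versus a
# platform vendor serving many licensees.
single_user = cost_per_design(PLATFORM_COST, 1)
ten_designs = cost_per_design(PLATFORM_COST, 10)
fifty_designs = cost_per_design(PLATFORM_COST, 50)
```

Under this simple model the platform burden per design falls from the full development cost to a small fraction of it as the number of designs grows, which is the position a third party licensor is in and a single in-house team generally is not.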

Bibliography
[1] K. Keutzer, S. Malik, R. Newton, J. Rabaey, and A. Sangiovanni-Vincentelli, "System Level Design: Orthogonalization of Concerns and Platform-Based Design", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 19, No. 12, December 2000.
[2] R. Goering, "Dataquest Predicts Automation of RTL", EE Times, June 10, 2002.
[3] Carrie J. Brownhill, Alexandru Nicolau, Steve Novack, Constantine D. Polychronopoulos, "The PROMIS Compiler Prototype", IEEE PACT, 1997.