The meaning of "system on a chip" (SoC) as it relates to design methodology has been debated for years. The ability to put complete systems on a single IC and design with subsystems has introduced profound changes to the semiconductor industry: design reuse, soft and hard IP, system verification, formal proof. The complexity explosion and its related issues have been debated at length. Rarely discussed, however, is the quiet revolution starting to appear in many of today's designs. The revolution is a direct consequence of the shift to the SoC design model. As we put complete systems on chip, we are moving from placing a few simple processing elements with off-chip memories to embedding the memories with the processing elements and the control logic on chip. The result — largely ignored to date although completely obvious — is an explosion in the number of "hard" intellectual property (IP) blocks (also known as "hard macros") on every IC. These hard macros are microprocessors such as ARM, MIPS and proprietary microcontrollers, signal processors, graphics and multimedia processors, and massive amounts of memories. Typically the memories are not coming in the form of a single, large, embedded block, but as myriads of smaller hard macros. The real numbers Though data has been readily available, it has not been the focus of attention and has not been correlated among the different sources. The data presented here is based on our customer base and has been validated by IBS International Business Strategies Inc. The average number of hard macros in an SoC has increased from an average of under 20 in 2001 to an average of almost 100 last year, and is scheduled to reach almost 200 this year. An extrapolation of that trend shows that the average will likely reach almost 400 by next year and 600 in 2006, as shown in Figure 1. Figure 1 — Growth in the number of hard macros in SoC designs The International Technology Roadmap for Semiconductors (ITRS) confirms this trend and adds the area parameter. It demonstrates that hard macros already account for well over 50% of total IC area, and are expected to reach 70% by 2006 (see Figure 2). Figure 2 — Hard macros vs. standard cell area Note that the number of macros is increasing at a rate that is much faster than the rate of area increase. This points to obvious growth in many smaller hard IP blocks, such as memories, versus the number of larger hard IP blocks, such as microprocessors. Not surprisingly, despite the increased complexity of the systems found on modern chips, it seems unlikely that more than a few processing units would be required. A new survey collected at the 2004 Design Automation Conference from over 175 design teams confirmed that the growth in the number of hard macros has been underestimated. According to the survey, the average SoC already has more than 100 macros and is expected to be over 200 next year (see Table 1 below). Table 1 — Results of a survey at DAC 2004 The amazingly high number of hard macros found at the top of the range comes mostly from large switching and communications applications — again pointing to the rapid increase in embedded memory hard macros versus traditional processing power IP. From "sea of cells" to "sea of hard macros" The impact of the growth in hard macros is more important than most people estimate. 
In fact, as Figure 3 shows, we are rapidly moving from ICs with a sea of cells and a few hard macros on the side to ICs with a sea of hard macros and a few areas of standard cells connecting them.

Figure 3 — Traditional "sea of cells" IC vs. "sea of hard macros" SoC

Despite these profound changes, the placement of hard macros has until recently been viewed as a problem largely independent of the placement and routing of standard cells. In a traditional design flow, the placement of hard macros happens in a separate step, typically floorplanning, to which little time is dedicated and which is never revisited after it has been completed. Such a pure "forward" flow does not consider the interrelations between standard cell locations and hard macro locations, but instead leaves the task of connecting the macros to the standard cell place-and-route system. The observation that we are dealing with a sea of macros has vast consequences, not the least of which is that an optimized design flow becomes hard-macro-centric rather than following the classical pattern of quick floorplanning followed by massive implementation effort. Because of the increase in the number of hard macros and the area they occupy, it is no longer possible to reach design closure and optimal die size through optimization at the implementation step alone. A bad floorplan cannot be compensated for during implementation when hard macros occupy half of the IC.

Impact of macros on utilization and die size

The impact of hundreds of macros on utilization, and therefore on area, is far-reaching. Because hard macros are placed before and independently of standard cells, a complex network of interconnect is created to bridge between the standard cells and the hard macros they need to reach. The result is familiar to most physical designers: many long wires stretching across the design and high congestion as the standard cells all fight to be near the locations where the hard macros have been placed. The corollary is that a highly connected network of standard cells gets spread throughout the IC, adding to wire length and congestion. On such ICs, utilization drops dramatically as designers attempt to deal with the high congestion. In fact, very low utilization has recently plagued many advanced SoC ICs designed in new technologies, slowing the ramp-up and adoption of those technologies. Although this effect has not been clearly understood until recently, the chain of cause and effect is straightforward: lower utilization results in increased die size.

Automation and exploration

With the exploding number of hard macros in an SoC, the designer's ability to place them by hand has vanished. Though most experienced designers typically have a good understanding of the dataflow in their design and of the logical location of their hard macros, this simply does not hold true beyond 100 hard macros. In addition to the location of each hard macro, its orientation and possible flipping create an exponential number of solutions. The problem is even larger when one realizes that most embedded memory hard macros are available in multiple aspect ratios with multiple pin locations. Given the critical impact of hard macro placement on overall utilization and die size, it is clear that automation is required to meaningfully explore the design space; the data set is simply too large for a human being to comprehend.
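To give a sense of the scale of that design space, the short sketch below counts placement configurations for a 100-macro design. The specific counts (eight orientations per macro, three compiler-generated aspect ratios, one hundred candidate locations) are illustrative assumptions, not figures from the data above.

    # Back-of-the-envelope count of the hard macro placement space.
    # All counts below are illustrative assumptions.
    num_macros = 100
    orientations = 8        # 4 rotations x 2 mirror images
    aspect_ratios = 3       # alternative memory compiler outputs
    candidate_sites = 100   # coarse grid of candidate locations per macro

    choices_per_macro = orientations * aspect_ratios * candidate_sites
    total_configurations = choices_per_macro ** num_macros

    print(f"choices per macro: {choices_per_macro}")
    print(f"total configurations: roughly 10^{len(str(total_configurations)) - 1}")

Even with this coarse model, the count exceeds 10^330 configurations, which is why hand placement, let alone exhaustive search, is out of the question.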
Furthermore, with the continuous advances in high-level synthesis and hardware/software co-design, engineers have both the ability and the desire to quickly explore multiple high-level architectures. High-level tools enable trade-offs between architectures that rely on different combinations of memory access, resource sharing, and multiplexing. But without an understanding of the impact of each solution on die size, designers make such trade-offs blindly. For example, one might decide to share a significant resource between two otherwise separate parts of a system. Though it obviously saves area at a high level of abstraction, such sharing can cause the kind of interconnect concentration that dramatically affects congestion, utilization, and ultimately die size, making the sharing solution significantly less attractive at the implementation level than the non-sharing one, which would be uncongested and hence smaller. This combination of an exploding design space at the implementation level and the ability to explore architectures at a high level clearly points to the importance of a flow that enables fast, accurate evaluation of multiple solutions based on different hard macros and on different locations, orientations, and aspect ratios. The relative importance of this part of the flow versus the more traditional standard cell place and route is the essence of this revolution.

Impact on cost and yield

As utilization decreases, die area mechanically increases. However, with most ICs aimed at an extremely price-sensitive consumer market, it is important to re-examine the links between area and yield. A classical model of IC cost analysis links the cost of the IC to the cost of the die, using the formula below:

    Die cost = Wafer cost / (Dies per wafer × Die yield)

Logically, the larger the die, the higher its cost, as the number of dies per wafer decreases. The other critical parameter in this equation is die yield. The following formula links die yield to die size:

    Die yield = Wafer yield × (1 + (Defect density × Die area) / alpha)^(-alpha)

The parameter 'alpha' corresponds to process complexity, which is directly linked to the number of process steps and is typically around 3.0. Note that despite the recent focus on design for manufacturing (DFM), area continues to have a much larger impact on yield than DFM efforts can possibly compensate for. The important conclusion from this analysis is that the larger the die, the more costly the IC, in a relationship that is not linear but often close to quadratic.

The decreased utilization and larger die sizes resulting from the move to a sea of macros, combined with obsolete design approaches and design flows, have had a huge impact on costs. To further validate the model, we have observed the pricing curve of a large foundry currently manufacturing at 0.13 µm:

Table 2 — Area reduction versus cost

As shown above, the impact of area on cost is always more than linear. Translating this analysis into impact on margins, the table below shows that even at moderate production volumes (by consumer-market standards), the penalty for failing to achieve a smaller die size as a result of lower utilization is dramatic. A mere 10% area reduction on an IC shipping three million units would result in a margin increase of more than $6M.

Table 3 — Die cost reduction

Consequences of the "sea of hard macros" revolution

Having demonstrated the magnitude of the revolution and its dramatic impact on the cost of designs, we need to consider the critical factors that immediately emerge from an understanding of this situation. An understanding of the problem, as usual, contains the seeds of the solution.
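To make the cost relationship above concrete, the sketch below evaluates the two formulas for a range of die sizes. The wafer cost, wafer diameter, defect density, and the gross-dies-per-wafer approximation are illustrative assumptions rather than foundry data; only the alpha value of 3.0 is taken from the discussion above.

    import math

    def dies_per_wafer(die_area_mm2, wafer_diameter_mm=300.0):
        # Common approximation: wafer area over die area, minus an edge-loss term.
        radius = wafer_diameter_mm / 2.0
        gross = (math.pi * radius ** 2) / die_area_mm2
        edge_loss = (math.pi * wafer_diameter_mm) / math.sqrt(2.0 * die_area_mm2)
        return gross - edge_loss

    def die_yield(die_area_mm2, defect_density_per_mm2=0.005, alpha=3.0):
        # Classical yield model: (1 + defect density * area / alpha)^(-alpha).
        # Wafer yield is assumed to be 1.0 for simplicity.
        return (1.0 + defect_density_per_mm2 * die_area_mm2 / alpha) ** (-alpha)

    def die_cost(die_area_mm2, wafer_cost=3000.0):
        # Die cost = wafer cost / (dies per wafer * die yield).
        return wafer_cost / (dies_per_wafer(die_area_mm2) * die_yield(die_area_mm2))

    for area in (50, 100, 150, 200):  # die area in mm^2
        print(f"{area:>3} mm^2 -> ~${die_cost(area):.2f} per good die")

With these assumed numbers, doubling the die area from 100 mm^2 to 200 mm^2 roughly triples the cost per good die, consistent with the faster-than-linear relationship described above.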
Since we are now living in a world where ICs are dominated by hard macros, their selection and location become the most critical components of an efficient IC physical implementation flow. The two most important factors that will affect designs are as follows:
- The ability to choose from a large set of flexible hard macro implementations, with emphasis on memory compilers.
- The ability to place hard macros on a chip so that congestion is minimized and utilization is maximized, resulting in the smallest die sizes.
These two factors will then drive the synthesis process, as well as standard cell placement and routing, with much greater efficacy. They will enable designers to continue to collapse the design cycle through design at a higher level of abstraction using IP and SoC methodologies, while maintaining the cost effectiveness necessary for the continued adoption of new technologies.

In his presentation to analysts at the Design Automation Conference, Synopsys CEO Aart de Geus demonstrated, using actual customer data, a significant profit pickup available from collapsing the design cycle through the use of floorplanning and third-party IP. For such profit to materialize, we must understand the quiet revolution resulting from the explosive growth in hard IP and the consequences it will have on chip design flows, utilization, area, and eventually on the margins of the IC industry.

Jacques Benkoski is president and CEO of Monterey Design Systems. Prior to Monterey, Benkoski founded and headed European operations for Epic Design Technology and held various research and management positions at Synopsys, STMicroelectronics in France and Italy, IMEC in Belgium, and IBM in Israel. He is currently a director of the EDA Consortium and a member of the Executive Council of the International Engineering Consortium (IEC).

Enno Wein is chief technology officer of Monterey Design Systems. Prior to joining Monterey as principal methodologist in 2002, Wein worked at LSI Logic and Infineon for over 10 years designing ASICs, defining and developing ASIC methodology, and managing chip design and software projects.