Clockless IC designs are ready to compete
By Andrew Lines, EE Times
June 6, 2003 (4:06 p.m. EST)
URL: http://www.eetimes.com/story/OEG20030606S0033
When semiconductor companies cite chip performance levels, the immediate follow-up question is increasingly, "But at what power consumption?" The performance/power efficiency debate is picking up as today's systems, from the data center to the shirt pocket, must be power efficient to save on batteries, cut heat output or conserve electricity.
But can clockless technologies be considered in this debate? While such technology has rightly been viewed as a low-power technology, several advancements have helped to give clockless chips the performance levels they need to be considered outside specialized low-power, low-performance applications.
Although synchronous logic is the dominant design style, clockless design was a forerunner to these chips and has remained a subject of extensive research and development at major corporations and universities for many years. This research has yielded such clockless design styles as bundled data mode (which provides high-speed pipelines within a synchronous chip), delay-insensitive mode (a design style that allows an arbitrary time delay for any logic block) and integrated pipelines mode (which combines domino logic with delay-insensitive technology for high performance).
The removal of the clock from an IC provides many power advantages. The average clocked semiconductor consumes about 30 percent of its maximum power draw at zero activity. Average power consumption rises linearly from there but is dependent upon both data activity levels and the mix of bits being processed by the chip (see Fig. 1). The average power consumption of a chip must assume a random mix of 0 and 1 bits, with 0 bits requiring no power to process. Power usage can increase over this amount with a higher concentration of 1 bits, which require power to raise a signal between logic blocks. Given a higher concentration of 0 bits, it is also possible for power consumption to be lower than the average.
Several things change when you compare this power consumption with clockless circuits. First is that at zero activity clockless chips consume only nominal leakage current, compared with roughly 30 percent power consumption in a synchronous chip for clock signal distribution. Second, processing both 0 and 1 bits consumes the same amount of power because of the four-phase handshaking process used in delay-insensitive clockless designs.
In its simplest form, when a bit is present, it is encoded on one of two rails to be passed from one logic block to the next. For example, Logic Block A will raise its 0 or 1 rail high when a bit is present and Logic Block B will then raise its acknowledgment high. Logic Block A lowers its signal and Logic Block B follows suit with its acknowledgment.
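The handshake described above can be sketched in a few lines of Python. This is only an illustrative model, not a description of any particular chip: the `Channel` class, the `send_bit` function and the wire names `rail0`, `rail1` and `ack` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    """Three wires between Logic Block A (sender) and Logic Block B (receiver)."""
    rail0: int = 0  # raised by A to signal a 0 bit
    rail1: int = 0  # raised by A to signal a 1 bit
    ack: int = 0    # raised by B to acknowledge receipt


def send_bit(ch: Channel, bit: int) -> list:
    """Run one four-phase handshake and return (rail0, rail1, ack) after each phase."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    trace = []

    # Phase 1: A raises the rail that encodes the bit.
    if bit:
        ch.rail1 = 1
    else:
        ch.rail0 = 1
    trace.append((ch.rail0, ch.rail1, ch.ack))

    # Phase 2: B sees a rail high and raises its acknowledgment.
    ch.ack = 1
    trace.append((ch.rail0, ch.rail1, ch.ack))

    # Phase 3: A lowers its signal.
    ch.rail0 = 0
    ch.rail1 = 0
    trace.append((ch.rail0, ch.rail1, ch.ack))

    # Phase 4: B lowers its acknowledgment; the channel is idle again.
    ch.ack = 0
    trace.append((ch.rail0, ch.rail1, ch.ack))
    return trace


if __name__ == "__main__":
    for bit in (0, 1):
        print(bit, send_bit(Channel(), bit))
```

Each phase toggles exactly one wire, so in this model a 0 bit and a 1 bit both cost four signal transitions, which is consistent with the point that both values consume the same power.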
Power, performance gains
While this handshaking seems to double the power required, clockless technology features built-in circuit-level clock gating, which dramatically offsets the handshaking effect. In a clockless chip each circuit powers up only when it is used, then immediately powers down. Aggressive clock gating in the synchronous world does bring power consumption down, but with current design technology it can be done only at a logic block level with a design effort that is extensive for most ASICs.
The net result of this comparison is that at 40 percent of activity, a typical usage level, a clockless semiconductor will consume about 50 percent less power than a synchronous chip. Other power-related benefits from clockless design include reduced EMI, built-in voltage scaling and "data gating."
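The arithmetic behind this comparison can be made explicit with a rough Python sketch. It rests on assumptions that are not stated in the article: power is normalized so that 1.0 is the clocked chip's maximum, the clocked chip's line runs straight from 30 percent at idle to 100 percent at full activity, and the clockless chip's leakage is treated as zero with power proportional to activity. All function names are hypothetical.

```python
IDLE_FRACTION = 0.30  # clocked chip's share of maximum power at zero activity


def clocked_power(activity: float) -> float:
    """Normalized clocked power: assumed linear from 0.30 at idle to 1.0 at full activity."""
    if not 0.0 <= activity <= 1.0:
        raise ValueError("activity must be between 0 and 1")
    return IDLE_FRACTION + (1.0 - IDLE_FRACTION) * activity


def clockless_power(activity: float, slope: float) -> float:
    """Normalized clockless power: assumed zero at idle (leakage ignored) and linear."""
    if not 0.0 <= activity <= 1.0:
        raise ValueError("activity must be between 0 and 1")
    return slope * activity


def implied_clockless_slope(activity: float = 0.40, saving: float = 0.50) -> float:
    """Slope the clockless line needs in order to show the quoted saving at the given activity."""
    return (1.0 - saving) * clocked_power(activity) / activity


if __name__ == "__main__":
    slope = implied_clockless_slope()
    print("clocked at 40% activity:  ", round(clocked_power(0.40), 3))
    print("clockless at 40% activity:", round(clockless_power(0.40, slope), 3))
    print("implied clockless slope:  ", round(slope, 3))
```

Under these assumptions the clocked chip draws about 0.58 of its maximum at 40 percent activity and the clockless chip about 0.29. The implied clockless slope (about 0.725) is steeper than the clocked slope of 0.70, which fits the observation that handshaking adds switching cost while the absence of a clock removes the idle floor.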
Until recently, clockless technology didn't have a great reputation for performance. There is an inherent performance efficiency in clockless chips because without a clock, there is no need for added timing margin to accommodate slow logic blocks, clock distribution, clock skew or manufacturing process variation. But traditionally, the logic on clockless chips has been conventional logic, which uses mostly p transistors, resulting in a large feature size and slow performance.
In the ASIC world, clockless chips have competed successfully at the sub-100-MHz end of the performance spectrum, in applications where aggressive power management is more important than performance. Typically, any application needing more performance must turn to a synchronous design. But recent technology advances mean that system designers can now look at clockless chips for applications with performance requirements higher than 400 MHz.
With the new integrated pipeline design style, very fast domino logic is integrated into a delay-insensitive clockless foundation. The result combines the power-reducing capabilities of clockless design, robustness over power and process variations (due to the delay-insensitive characteristics of the logic) and performance that rivals a full-custom synchronous chip.
Domino logic has traditionally been leveraged only in the most advanced high-performance synchronous designs. Its speed stems from the use of a precharging phase during which all of the transistors in a circuit are loaded. The transistors are then ready to discharge very quickly, like dominos toppling, resulting in a very fast forward latency.
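The precharge-then-discharge behavior can be illustrated with a toy Python model of a single two-input domino AND gate. This is a behavioral sketch under simplifying assumptions (no timing, no charge leakage, an ideal output inverter), and the class and method names are hypothetical.

```python
class DominoAnd:
    """Toy two-input domino AND gate: a dynamic node followed by an inverter."""

    def __init__(self) -> None:
        self.node = 1  # dynamic node, modeled as starting out precharged

    @property
    def output(self) -> int:
        """The output inverter drives the complement of the dynamic node."""
        return 1 - self.node

    def precharge(self) -> None:
        """Charge the dynamic node high, which forces the output low."""
        self.node = 1

    def evaluate(self, a: int, b: int) -> int:
        """Discharge the node if the pull-down stack (a AND b) conducts."""
        if a and b:
            self.node = 0
        # Once discharged, the node stays low until the next precharge,
        # so the output can only rise during evaluation, never fall.
        return self.output


if __name__ == "__main__":
    gate = DominoAnd()
    for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        gate.precharge()
        print(a, b, gate.evaluate(a, b))
```

Because the output can only move from low to high during evaluation, each gate in a chain can trigger the next as soon as it fires, which is the toppling-dominos effect.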
But in the delay-insensitive design, the precharge phase occurs during the handshaking between logic blocks so that when a bit arrives the transistors are set up to operate on it quickly. Delay-insensitive domino logic has proved to offer the same high performance as its synchronous counterpart, yet it has the robustness and power efficiency of clockless circuits. In 130-nanometer process validation chips, clockless performance has reached 1.4 GHz at nominal voltage and over 2 GHz at higher voltage.
Andrew Lines is co-founder and CTO at Fulcrum Microsystems Inc. (Calabasas Hills, Calif.).
http://www.eet.com