Ultra Low Power Designs Using Asynchronous Design Techniques (Welcome to the World Without Clocks)
Mohit Arora, Freescale Semiconductor
Noida, India

Abstract: Wire delay is beginning to dominate gate delay in current CMOS technologies. According to Moore’s Law, by 2016 CMOS feature size should be on the order of 22 nm, with clock frequencies reaching around 28.7 GHz. Essentially, bus-based interconnects are being stretched to the point where they cannot be scaled further. This paper presents the challenges of synchronous (clocked) designs and describes techniques for overcoming them with an asynchronous (clockless) design methodology. The paper proposes redesigning the synchronous interconnect as an asynchronous interconnect that can cater to tomorrow’s needs for high speed and low power. These circuits work on handshaking techniques. If not today, the SoC industry will be driven to this methodology tomorrow.

CHALLENGES WITH THE CLOCKED DESIGN

Most digital circuits are synchronous, which means that their operation is controlled by a clock. Although the use of a clock has certain advantages in the design of a digital circuit, it also introduces a number of significant problems that are becoming more serious and more prevalent as technology becomes smaller and faster. Following are some of the challenges with clocked design:
With all of the problems caused by the clock, it is very tempting to simply remove it from the system. This is the fundamental idea behind asynchronous design. However, it is not as simple as just removing the clock, since the operation of the circuit must still be controlled somehow. Asynchronous circuits essentially govern themselves, and are therefore called self-timed circuits.

ASYNCHRONOUS DESIGN

Much of today’s logic design is based on two major assumptions: that all signals are binary, and that time is discrete.
However, as with many simplifying assumptions, a system that can operate without these assumptions has the potential to generate better results. Asynchronous circuits keep the assumption that signals are binary, but remove the assumption that time is discrete. The problems with synchronous designs discussed in the previous section can be avoided by asynchronous systems. Figure 1 shows an asynchronous system in which the blocks communicate without any global clock.

Figure 1: Asynchronous System

Highlights:
MOTIVATION FOR ASYNCHRONOUS DESIGN

This section lists several possible benefits of migrating to asynchronous designs and systems.
MULLER C ELEMENT: FUNDAMENTAL COMPONENT OF ASYNCHRONOUS CIRCUIT

In a synchronous circuit, the role of the clock is to define points in time where signals are stable and valid. In between the clock ticks, signals may exhibit hazards and may make multiple transitions as the combinational circuit stabilizes. In an asynchronous system, the situation is different. The absence of a clock means that signals are required to be valid at all times: every transition has a meaning, and consequently hazards and races must be avoided.

An OR gate indicates only when both inputs are LOW; when its output is HIGH, it does not indicate which input made a transition. Similarly, an AND gate indicates only when both inputs are HIGH; when its output is LOW, it does not indicate which input went LOW. Knowing which transition occurred is very important in asynchronous circuits, because an unobserved transition may have an adverse impact, such as a hazard or race condition, and should be avoided. A better circuit in this respect is the Muller C element shown in Figure 2.

Figure 2: Muller C Element and corresponding CMOS implementation

The Muller C element is a state-holding element, much like a set-reset latch. When both inputs are LOW, the output is LOW, and when both inputs are HIGH, the output is HIGH. For other input combinations the output does not change. An observer seeing the output change from LOW to HIGH may conclude that both inputs are now HIGH. Similarly, an observer seeing the output change from HIGH to LOW may conclude that both inputs are now LOW. Figure 3 shows the truth table for the Muller C element.

Figure 3: Truth Table for Muller C Element

The C-element is a fundamental building block of many asynchronous circuits. It can be thought of as an AND gate for events. It is also the state-holding element of the asynchronous world.

DUAL RAIL ENCODING FOR HANDSHAKE COMMUNICATION

In an asynchronous circuit, the clock signal is replaced by some form of handshaking between neighboring registers.
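Handshake circuits of this kind are commonly built from C-elements, so it may help to see the state-holding behavior of Figure 3 as a minimal Python sketch. The class name `CElement` and its `update` method are illustrative assumptions, not part of any tool or library, and the model ignores gate delays: it captures only the logical behavior.

```python
class CElement:
    """Behavioral model of a two-input Muller C-element (no timing)."""

    def __init__(self, initial=0):
        # The element holds state, so it needs an initial output value.
        self.out = initial

    def update(self, a, b):
        # The output follows the inputs only when they agree;
        # for a != b the previous output is held.
        if a == b:
            self.out = a
        return self.out


c = CElement()
# Apply a sequence of input pairs and record the output after each one.
trace = [c.update(a, b) for a, b in [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]]
print(trace)  # [0, 0, 1, 1, 0]
```

The output rises only after both inputs have risen and falls only after both have fallen, which is why the element can be read as an AND gate for events.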
Several different protocols exist for implementing handshaking communication, such as 2-phase dual-rail encoding, 4-phase dual-rail encoding and bundled data. Figure 4 shows an example of 4-phase dual-rail encoding.

Figure 4: 4-Phase Dual Rail Encoding

Two parties can communicate reliably regardless of the delays in the wires connecting them, and hence the protocol is also called delay-insensitive encoding.

Highlights:
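The 4-phase dual-rail protocol can be sketched in Python as follows. The sketch assumes the usual dual-rail convention, in which each bit travels on two wires (a true rail and a false rail), (0, 0) is the empty spacer between data items, and (1, 1) is never used. The function names `encode`, `decode` and `transfer` are hypothetical, and the four phases are executed one after another rather than as concurrent hardware.

```python
EMPTY = (0, 0)  # spacer: neither rail asserted


def encode(bit):
    """Return the (true_rail, false_rail) codeword for one data bit."""
    return (1, 0) if bit else (0, 1)


def decode(pair):
    """Return the bit carried by a valid codeword, or None for the spacer."""
    if pair == EMPTY:
        return None
    if pair == (1, 1):
        raise ValueError("(1, 1) is not a legal dual-rail codeword")
    return 1 if pair == (1, 0) else 0


def transfer(bits):
    """Send bits one at a time; log (wires, ack) after each of the 4 phases."""
    log, received = [], []
    wires, ack = EMPTY, 0
    for bit in bits:
        wires = encode(bit)   # phase 1: sender drives a valid codeword
        log.append((wires, ack))
        received.append(decode(wires))
        ack = 1               # phase 2: receiver takes the data, raises ack
        log.append((wires, ack))
        wires = EMPTY         # phase 3: sender returns to the spacer
        log.append((wires, ack))
        ack = 0               # phase 4: receiver lowers ack
        log.append((wires, ack))
    return received, log


received, log = transfer([1, 0])
print(received)  # [1, 0]
```

Because validity is carried by the data wires themselves, the receiver never depends on how long a wire takes to settle: it simply waits until one of the two rails goes high.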
LEGACY SYNCHRONOUS INTERCONNECT

The initial approach of SoC designers was to select the IP blocks needed to meet application requirements, place them on silicon and connect them with a standard on-chip bus. As was the case with multimillion-gate ASICs containing many connected IP blocks, today’s SoC cannot be built around a single bus. Instead, complex hierarchies of buses are used, with sophisticated protocols and multiple bridges between them (Figure 5).

Figure 5: Synchronous IP Interconnect

Communication between any two IP blocks may pass through several buses, which makes timing requirements much harder to meet. Essentially, bus-based interconnects are being stretched to the point where they cannot be scaled further.

SoC designers face a basic paradox in today's environment: rather than enjoying significant time savings by using acquired IP blocks, they spend additional time learning the function of the blocks in order to build the logic and test vectors for them. Except for the vendors of processor cores, IP vendors typically provide little of the detailed documentation designers need. Consequently, designers find they have to acquire some level of application expertise or use consulting resources to understand the IP well enough to complete these tasks. This additional design and verification burden currently adds months to SoC design projects. Besides imposing a drain on resource-strapped projects, the additional logic inevitably degrades performance and increases chip area, while the additional test requirements further complicate final test stages.

As CMOS feature size continues to decrease in line with Moore’s Law, it is clear that interconnect speed is not keeping up with the increase in transistor speed. This means that in future circuits wire delay will no longer be negligible, but will play a major role in determining the maximum frequency at which a circuit can operate.
In line with clocking trends, global clock skew is becoming an increasing fraction of the clock period. Examining all these issues makes it clear that a new interconnect strategy is required to bring design risks back under control: large high-speed integrated circuits will eventually need to be designed without global clocking.

Per the International Technology Roadmap for Semiconductors (ITRS, ex. SIA), 1999 edition: “With clock speed possibly exceeding 5 GHz, and across-chip communication taking upwards of 5 to 20 clock cycles, an approach is needed to building a hierarchy of clock speeds with locally synchronous and globally asynchronous interconnects. Tools to handle asynchronous, multi-cycle interconnect as well as locally synchronous, high performance near neighbor communication are needed.”

HIGH SPEED ASYNCHRONOUS INTERCONNECT: THE CONCEPT

Figure 6 shows the concept for a system designed around an asynchronous interconnect bus.

Figure 6: High Speed Asynchronous Interconnect (The Concept)

The goal is to design a high-throughput, flexible and low-power digital crossbar. Asynchronous circuits can interconnect multiple synchronous cores in an SoC design, eliminating global clock distribution and simplifying clock domain crossing. Following are some of the highlights:
With the asynchronous design methodology, the IP is completely decoupled from the interconnect bus. This makes it possible to integrate asynchronous communication within an existing synchronous system. Because of the delay-insensitive encoding, the wires supporting the communication at the physical level do not have to be balanced.

Unlike the “legacy” technique for IP integration, asynchronous communication does not require large clock tree buffers, because the IP is decoupled from the interconnect bus. This saves a considerable amount of power, which can be extremely important for handheld devices that operate on battery power. The asynchronous approach also means that the interconnect bus can run at a much higher frequency, thus increasing overall system performance.

Last but not least, the asynchronous approach can simplify system-level verification of the IP block. With the IP block completely decoupled from the interconnect bus, verification can be performed at the asynchronous dividing point. In the case of third-party pre-verified IP, IP-level verification can be completely eliminated.

DESIGN TOOLS FOR ASYNCHRONOUS DESIGN METHODOLOGY

A few commercially available CAD tools for asynchronous design implementation:
“Balsa” is built around the Handshake Circuits methodology and can generate gate-level netlists from high-level descriptions in the Balsa language. Both dual-rail (QDI) and single-rail (bundled data) circuits can be generated. The approach adopted by Balsa is syntax-directed compilation into communicating handshaking components, and it closely follows the Tangram system of Philips.

CONCLUSION

Asynchronous design is a rich area of research, with many different approaches to circuit design. This paper has described the limitations and challenges of synchronous (clocked) design and the motivation for migrating to asynchronous design for higher performance and power efficiency. It has also proposed replacing the existing synchronous interconnect with an asynchronous interconnect that caters to tomorrow’s needs for high speed and low power. If not today, the SoC industry will be driven to this methodology tomorrow.