Tensilica Unveils Groundbreaking Next-Generation Xtensa LX Processor Core

Update: Cadence Completes Acquisition of Tensilica (Apr 24, 2013)

Industry’s Highest Performance Core Will Replace RTL in SOC Designs

SANTA CLARA, Calif. – May 18, 2004 – Tensilica, Inc. today unveiled its next-generation Xtensa® LX configurable processor, the highest performance processor core on the market, featuring both higher computational throughput and dramatically higher I/O (input/output) bandwidth. This record-breaking performance, combined with Tensilica’s patented automated design and development environment, makes Xtensa LX the only processor fast and flexible enough to replace register transfer logic (RTL) design methodologies in system-on-chip (SOC) designs, leading to reduced development time and risk along with dramatic increases in ROI (return on investment) for semiconductor and systems companies.

Xtensa LX is also ideally suited as a traditional control processor in embedded applications. Tensilica expects that most of its customers will use multiple Xtensa LX cores in each SOC design, each tailored to speed a different part of the customer’s application.

“With chip development costs now surging past $10 million, SOC development teams need to reduce project development time, risk and cost,” said Chris Rowen, president and CEO of Tensilica. “With the Xtensa LX processor, designers can configure optimized processors specifically tuned to their application in a fraction of the time that it takes to design and verify RTL, with comparable computational and I/O performance. The inherent programmability of the processor gives designers the flexibility to fix bugs and add features purely in software at any point – late in the design cycle or long after first shipment. This is impossible with hard-coded RTL.”

The Xtensa LX processor core features significant innovations in four key areas:

Tensilica supports these technical innovations with a patented development environment that automatically and simultaneously generates an optimized hardware implementation, a corresponding tailored software tool chain, and a complete set of EDA models and scripts. Configuration and extension choices made by the designer to address requirements for a given application are immediately and automatically reflected in the entire software tool chain. With alternative approaches, this is typically a manual, error-prone task that requires extensive verification.

Lower Power Consumption

The Xtensa LX processor’s new architecture dramatically lowers power consumption in large configurations with many designer-defined functions. But even without designer modification, the Xtensa LX processor is designed to use power very efficiently. The minimum configuration of the Xtensa LX processor dissipates a miserly 0.05 mW/MHz in a representative 130 nm process technology. By comparison, the smallest member of the ARM synthesizable processor family, the ARM7TDMI-S, burns 0.11 mW/MHz in 130 nm technology – more than twice the power consumption of the Xtensa LX.

I/O Throughput Improved By Three Orders of Magnitude

Designers using the Xtensa LX processor can choose one or two 128-bit wide load/store units. Most standard embedded processors have only a single narrow (32- or 64-bit) load/store unit. However, many applications benefit from two load/store units for data-intensive inner loops – a standard feature of many high-end DSP processors. The Xtensa LX processor’s optional second load/store unit provides greater sustained general-purpose I/O bandwidth and an XY-style memory access for DSP applications. Additionally, at 128 bits, each unit is much wider and can move much more data per access than standard load/store units.

The true breakthrough in I/O is the capability to add designer-defined ports and queues, which allow the Xtensa LX processor to communicate as fast and as flexibly as RTL blocks.

Ports are wires that directly connect two Xtensa LX processors or an Xtensa LX processor to external RTL. Port connections can be arbitrarily wide, allowing wide data types to be transferred easily without the need for multiple load/store operations. As many as one million signals (1,024 ports, each 1,024 bits wide) can be used. This is an outrageous number, far exceeding the performance demands of real systems today (providing 350 terabits/sec of direct data flow per processor in a 130 nm CMOS process), but it clearly demonstrates that old notions of the I/O bottlenecks inherent in a processor-based solution are now obsolete.

While ports are ideal to quickly convey control and status information, queues provide a high-speed mechanism to transfer streaming data. Input queues and output queues look to the programmer like traditional processor registers – with the notable exception that data is always available without the need to load or store the data before and after computation. Queues can sustain data rates as high as one transfer every clock cycle, or over 350 Gbits/sec for each queue added to an Xtensa LX processor. Custom instructions can perform multiple queue operations per cycle, perhaps combining inputs from two input queues with local data and sending the computed values to two output queues. The high bandwidth and low control overhead of queues allow the Xtensa LX processor to be used in applications with extreme data rates.

Ports and queues specified by the designer are automatically added to the Xtensa LX processor and are fully modeled by Tensilica’s Xtensa Processor Generator. The full behavior of the port or queue, just like any other modification made to the Xtensa LX processor, is automatically reflected in the custom software development tools, instruction set simulator, bus functional model and EDA scripts – within about an hour.
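
The port and queue bandwidth figures quoted above can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the release does not state the clock rate behind these figures, so `ASSUMED_CLOCK_HZ` (350 MHz) is an assumption chosen to be consistent with the "over 350 Gbits/sec" queue figure, and `QUEUE_WIDTH_BITS` assumes a queue as wide as the widest port. The helper name `bits_per_second` is hypothetical.

```python
# Rough check of the port and queue bandwidth figures quoted in the release.
# ASSUMPTION: the release gives no clock rate for these numbers; 350 MHz is a
# hypothetical value, picked because a 1024-bit transfer once per cycle at
# that rate lands just over 350 Gbit/s.
ASSUMED_CLOCK_HZ = 350e6

MAX_PORTS = 1024            # from the release: up to 1024 ports
MAX_PORT_WIDTH_BITS = 1024  # from the release: each up to 1024 bits wide
QUEUE_WIDTH_BITS = 1024     # ASSUMPTION: queue as wide as the widest port


def bits_per_second(width_bits, clock_hz, transfers_per_cycle=1):
    """Peak rate for an interface moving width_bits per transfer."""
    return width_bits * transfers_per_cycle * clock_hz


total_port_signals = MAX_PORTS * MAX_PORT_WIDTH_BITS
port_bw = bits_per_second(total_port_signals, ASSUMED_CLOCK_HZ)
queue_bw = bits_per_second(QUEUE_WIDTH_BITS, ASSUMED_CLOCK_HZ)

print(f"port signals:             {total_port_signals:,}")
print(f"aggregate port bandwidth: {port_bw / 1e12:.0f} Tbit/s")
print(f"per-queue bandwidth:      {queue_bw / 1e9:.0f} Gbit/s")
```

Under these assumptions the totals come out at roughly 1.05 million signals, about 367 Tbit/s across all ports, and about 358 Gbit/s per queue, in line with the rounded "one million signals", "350 terabits/sec" and "over 350 Gbits/sec" figures in the text. A different clock rate scales both bandwidth numbers linearly.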
Because this process is automated using Tensilica’s patented technology, the result is pre-verified and correct by construction, with no need to re-verify the processor.

Improved Compute Performance

Better Interfaces to On-Chip Memories

Leading Benchmark Scores

The EEMBC Consumer benchmark “out of the box” score was 171.6 @ 330 MHz (0.51997 per MHz), nearly a 9X performance advantage over the ARM1020E. See the separate press release issued today titled “Tensilica’s Xtensa LX Processor Beats All Other 32- and 64-bit Processor Cores on EEMBC Consumer ‘Out of the Box’ Scores.”

The Xtensa LX BDTIsimMark2000 score of 6150 for a 370 MHz configuration is 70% faster than the score for the next-fastest licensable core benchmarked by BDTI, the CEVA-X1620.* See the separate press release issued today titled “Tensilica’s New Xtensa LX Processor Earns Top BDTIsimMark2000™ Score.”

Specifications

Pricing and Availability

Xtensa LX is an addition to the Tensilica processor family, which includes the proven Xtensa V configurable processor. Customers will be able to continue to license the Xtensa V processor. The Xtensa V processor and the Xtensa LX processor both implement the common core Xtensa instruction set.

About Tensilica

* The BDTIsimMark2000™ provides a summary measure of DSP speed. For more information and scores see www.BDTI.com. Scores © 2004 BDTI. The Xtensa LX score includes use of 12 custom TIE instructions that expand the area of the core by 16%. Licensees may require greater or lesser degrees of customization. The scores for all other cores assume that no coprocessors or other customizations were used. The scores for the Xtensa LX and all other cores are for worst case operating conditions in a commercially available 130 nm process. Contact info@BDTI.com for more information.

# # #

Editors’ Notes: