DSPs tread many paths to raise performance
By Jeff Child, EE Times
November 15, 2001 (2:08 p.m. EST)
URL: http://www.eetimes.com/story/OEG20011115S0044

Makers of digital signal processor chips continue to push the limits of floating-point and fixed-point performance. But there's more to real performance metrics than simple clock-speed ratings on a data sheet. Advanced DSPs are tuned to specific applications, often with compute engines tailored to perform certain types of algorithms particularly well.

Meanwhile, system-level issues grow in importance as DSPs race to higher speeds. A DSP that crunches numbers at lightning speed is less than useful if the data can't be fed to it fast enough. Board-level DSP vendors are tackling that issue with a variety of bus and interconnect schemes, some employing sophisticated switch fabrics. Articles in this section explore the many issues, trade-offs and solutions associated with high-performance DSP.

The four major DSP chip vendors, Agere (formerly Lucent Microelectronics), Analog Devices, Motorola and Texas Instruments, continue to dominate the market for merchant DSP silicon. That said, a flurry of recent activity revolves around vendors of licensable DSP cores. The growing list includes a mix of old and new core vendors including 3DSP Corp., Equator Technologies Inc., LSI Logic Corp., Silicon Spice and Siroyan Ltd.

In fact, more than 80 companies are offering core products with DSP functionality, said Will Strauss, president of DSP market research firm Forward Concepts. "DSP may be the heart of these core products, but they're not always sold as a DSP," said Strauss. "Typically those 80 companies each specialize in one or two areas." In contrast, the big-four DSP IC vendors offer comprehensive lines of DSP products targeting diverse applications.

At the big-four DSP vendors, the idea of cores is also key. The latest trend is to offer DSP chips that embed multiple CPU cores. For years they've made incremental additions outside the core, such as coprocessors, special extensions and addressing units. Now they are racing to squeeze several CPU cores and arithmetic logic units onto one device.

Via their StarCore Alliance, Motorola and Agere are doing just that with the SC140 core, an architecture that supports scaling up the number of ALUs. The SC140, a version with four ALUs, is today used in two Motorola-announced products, the MSC8101 and the MSC8102. The SC140 core is also used in the Agere StarPro product. Agere has plans to spin versions of its StarPro DSP with three SC140 cores on-chip.

But multicore devices add complications, Strauss cautioned. "Everyone's decided to replicate lots of DSPs on a chip because they're so small," he said. "The real problem is how do you get them all talking to each other, and how do you parcel out the tasks to the various DSPs? There have been vendors with multiple DSPs on a chip who have folded their tent."

According to Strauss, applications that need to cram in lots of DSP functionality include wireless basestations, mobile switching centers, digital subscriber line access multiplexer racks, media gateways and similar equipment. "Such systems need banks of boards with multiple DSP chips crammed onto boards. Making those DSPs work well together is very much dependent on the software linking them together," he said.

David Baczewski, strategic marketing manager of Motorola's Wireless Infrastructure Systems Division, said this trend toward multicore devices makes it harder for designers to make choices. "When you've got multiple execution units, some of them specialized, some of them special purpose, it's hard to compare performance," said Baczewski, who is also a member of the StarCore Alliance Marketing Council. "As an industry, we haven't come up with anything better than the traditional metrics like megahertz, megaflops and Mips."

With that in mind, vendors are now more focused on how DSPs perform in a particular application. "In media gateway applications, for example, DSP comparisons are made where it's no longer a matter of Mips or multiply-accumulates per second, but rather on proving how many channels of a particular speech coder your DSP can handle," Baczewski said.
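To make the channel-count metric concrete, the short Python sketch below estimates how many speech-coder channels fit on one DSP. It is an illustration only: the function name `channels_per_dsp`, the overhead parameter and every number used with it are hypothetical, not vendor figures, and a real density figure would also depend on memory, I/O and worst-case coder load.

```python
def channels_per_dsp(dsp_mips: int, coder_mips_per_channel: int,
                     overhead_percent: int = 0) -> int:
    """Estimate how many whole coder channels fit in a DSP's Mips budget.

    All parameters are hypothetical inputs chosen by the caller:
      dsp_mips               -- total instruction budget of the device
      coder_mips_per_channel -- cost of one channel of the speech coder
      overhead_percent       -- share of the budget reserved for the OS,
                                I/O handling and other non-coder work
    """
    if dsp_mips < 0 or coder_mips_per_channel <= 0:
        raise ValueError("budget must be >= 0 and per-channel cost > 0")
    if not 0 <= overhead_percent < 100:
        raise ValueError("overhead_percent must be in the range 0..99")

    # Integer arithmetic keeps the result exact and rounds down, because
    # a partially served channel is of no use to the system designer.
    usable_mips = dsp_mips * (100 - overhead_percent) // 100
    return usable_mips // coder_mips_per_channel
```

With made-up inputs of a 1,000-Mips device and a coder costing 20 Mips per channel, the estimate is 50 channels; reserving 20 percent of the budget for overhead drops it to 40. The point of the metric is that two devices with the same Mips rating can yield different channel counts once the per-channel cost on each architecture is measured.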

Expanding on that same point, Ray Simar, a Texas Instruments Inc. fellow and manager of the company's Advanced DSP Architecture Development Group, said that one of the questions DSP architects face is not chip performance alone, but how system designers are going to use that increased performance.

Along those lines, Simar separated the types of advances needed in DSPs into three broad categories: channel stacking, function stacking and pure, raw-performance demand.

Channel stacking is where applications just do more of the same thing. "We see that a lot in wireless basestations, where designers want to know how many more channels they can do per processor," he said. In contrast, function stacking is where applications do a mix of processing, Simar said. This includes applications that must, for example, run multiple vocoders with multiple standards.

Finally, pure performance-enabled applications include designs like DSL systems. There, designers are replacing prototype system architectures composed of gate arrays that performed a hardwired implementation. Once they have a DSP with high enough performance that can still manage power well, the DSL function can move off the gate arrays into code running on a programmable DSP.

TI's C64x DSP architecture, for its part, incorporates features that target emerging applications such as video transcoding. Video transcoding is done in communications infrastructure systems where video content has to be converted to different formats depending on how it's going to be displayed. To support that, the 600-MHz, very long instruction word C64x offers either eight 8-bit multipliers or four 16-bit multipliers.
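As rough arithmetic on those figures, the Python sketch below multiplies the quoted clock rate by the number of multipliers. It assumes, for illustration only, that every multiplier completes one multiply per clock cycle; the function name is hypothetical, and the result is a theoretical peak rather than a measured rate.

```python
def peak_multiplies_per_second(clock_hz: int, multipliers: int) -> int:
    """Theoretical peak multiply rate.

    Assumption (not from the article): each multiplier finishes exactly
    one multiply every clock cycle, with no stalls.
    """
    if clock_hz < 0 or multipliers < 0:
        raise ValueError("clock_hz and multipliers must be non-negative")
    return clock_hz * multipliers


CLOCK_HZ = 600_000_000  # the 600-MHz clock rate quoted in the article

# Eight 8-bit multipliers versus four 16-bit multipliers, as quoted.
peak_8bit = peak_multiplies_per_second(CLOCK_HZ, 8)
peak_16bit = peak_multiplies_per_second(CLOCK_HZ, 4)
```

Under that assumption the peak works out to 4.8 billion 8-bit multiplies or 2.4 billion 16-bit multiplies per second, which shows the trade being offered: lower-precision data, common in video, gets twice the multiply throughput.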

The C64x also integrates Viterbi and turbo coprocessors on-chip. To meet the needs of third-generation cellular basestations, the system typically streams data in through those coprocessors to do the error correction, while the DSP core does the more typical DSP operations such as coding, filtering and echo cancellation. Also on board the C64x is high-performance direct memory access that lets these different peripherals run in parallel with the DSP CPU.

"The whole idea is to make it possible to keep the overall performance of the machine as high as you can," said Simar.
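One way to see why overlapping data movement with computation helps is the simple timing model sketched below in Python. It is not a model of the C64x itself: it assumes fixed per-block transfer and compute times, two buffers so that one block can be moved while the previous one is processed, and hypothetical function names.

```python
def serial_time(blocks: int, transfer: int, compute: int) -> int:
    """Total time when the CPU itself moves each block, then processes it."""
    if blocks < 0 or transfer < 0 or compute < 0:
        raise ValueError("all arguments must be non-negative")
    return blocks * (transfer + compute)


def overlapped_time(blocks: int, transfer: int, compute: int) -> int:
    """Total time when a DMA engine fills one buffer while the CPU
    processes the other (double buffering).

    Only the first transfer and the last compute are exposed; every
    step in between costs whichever of the two activities is slower.
    """
    if blocks < 0 or transfer < 0 or compute < 0:
        raise ValueError("all arguments must be non-negative")
    if blocks == 0:
        return 0
    return transfer + (blocks - 1) * max(transfer, compute) + compute
```

With made-up values of four blocks, 3 time units per transfer and 5 per compute, the serial scheme takes 32 units and the overlapped scheme 23. The model also shows the limit of the technique: once transfers take longer than computation, the data path, not the CPU, sets the pace.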
