MARGAUX, France - While CMOS scaling will continue to reduce costs, system-on-chip (SoC) performance improvements will depend on innovations that produce integrated design systems, said IBM's Lisa Su in a keynote speech at the Multiprocessor SoC (MPSoC) forum here Monday (July 11).
Su, vice president of technology development and alliances at IBM's Systems and Technology Group, pointed to the Cell architecture developed by IBM, Sony, and Toshiba as an example of an SoC platform that can bring supercomputing power to the desktop. In a second keynote speech, Alain Artieri, director of engineering for STMicroelectronics' application processor and portable platforms group, presented that company's Nomadik technology as an MPSoC solution for advanced multimedia.
Su noted that today's top supercomputers are in the 10-100 teraflop range, about the equivalent of a rat's brain. By 2015, she said, supercomputers may reach 10,000 teraflops or more, approaching the human brain. But high-end supercomputers serve small application niches. What's possible in 5 to 10 years, Su said, are low-cost desktop systems and consumer devices that extend into the gigaflop range.
What has allowed all this to happen is an exponential change in technology, Su said. But she noted that CMOS device performance is getting harder and harder to maintain. "A lot of physical phenomena are limiting scaled technologies," she said. "There will be new technology, but we will do a lot more with integration of circuits and systems."
"Traditional scaling drove performance and cost, but today we're in an era where innovation drives performance," Su said. At 130 nm, she said, traditional scaling accounted for over 80 percent of transistor performance improvements, with innovation making up the rest. But at 90 nm, Su said, scaling accounts for only around 40 percent of the performance improvement, with innovation supplying the remainder.
One result is that "scheduled invention" is now the major component in IBM's technology plans.
Su also noted that power has become the major limiting factor in processor design. While active power increases are "fairly well behaved," she said, the real problem is the dramatic rise in passive power through gate and source-drain leakage. One solution to this problem is strained silicon. Su also said that new materials, such as hafnium silicon oxide, will reduce gate leakage current.
With all these limitations on silicon, more efficient architectures become key. Su noted that increased integration is driving processors to take on many aspects previously associated with systems.
Su presented Cell as a flexible architecture aimed at digital media applications, especially games and movies. It includes a 64-bit Power processor for control elements, and 8 synergistic processor elements (SPEs) for data elements. The observed clock speed is over 4 GHz, and peak single-precision performance is over 256 Gflops.
Further, she noted, the Power architecture is an open platform for innovation and collaboration, with a community evolving around the power.org web site. "There's a lot of work to do with the programming model, so an open community is important," she said.
According to Artieri, the Nomadik architecture attempts to combine the best of two worlds: consumer products and personal computers. Aimed at providing high-bandwidth multimedia content for applications such as cell phones, it also emphasizes a low-power design approach.
Nomadik is a heterogeneous, multi-processor SoC with a general-purpose ARM CPU, DSP subsystems, and hardware accelerators. The current 90 nm version includes audio and video DSP subsystems, and a 3D hardware accelerator. ST envisions 4 subsystems in 65 nm technology, and 6 at 45 nm, Artieri said.
There are many benefits to multi-processor SoCs, Artieri said.
He noted that high computing performance comes from "multiple, non-interfering domains of intense activity," each with its own processor and accelerators. Fine-grain power management is possible at the subsystem level, and leakage can be reduced by switching subsystems off and on. Multi-processor SoCs also offer software flexibility, he said.
The Nomadik has three "views," Artieri said. The monolithic CPU view has maximum flexibility, but limited performance. Adding symmetric DSPs still provides high flexibility, but requires a DSP tool chain. Adding accelerators gives maximum performance and the lowest power, but needs an advanced programming model.
But multi-processor SoCs face a tough bottleneck. "Memory hierarchy and bus design are becoming a major design challenge," Artieri said. "Smart caching in embedded memory is key." The Nomadik SoC uses software-controlled caching to resolve latency problems.
The programming model includes the Nomadik kernel, a set of system services and an API for drivers and firmware. A component manager provides a gateway to all subsystems. The DSP subsystems are programmed entirely in C.
Comparisons between the Nomadik and Cell architectures aren't really meaningful, IBM's Su noted, because they target different markets. While the Nomadik targets embedded, mobile applications, the Cell is aimed more at standalone desktop systems.
Now in its fifth year, MPSoC is an international forum focused on application-specific multi-processor SoCs.
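As a rough check on the Cell figures quoted above (8 SPEs, a clock over 4 GHz, peak single-precision performance over 256 Gflops), the arithmetic can be sketched as follows. The article does not give a per-SPE breakdown, so the lane count and the flops-per-lane figure below are assumptions: each SPE is taken to process 4 single-precision values per cycle, with a fused multiply-add counted as 2 floating-point operations. The clock is rounded down to 4.0 GHz, and the function name is illustrative only.

```python
# Back-of-envelope peak throughput for Cell's SPEs.
SPE_COUNT = 8        # from the article
CLOCK_GHZ = 4.0      # article says "over 4 GHz"; 4.0 is a round lower bound
SIMD_LANES = 4       # assumption: 4 single-precision lanes per SPE
FLOPS_PER_LANE = 2   # assumption: one fused multiply-add counts as 2 flops


def peak_gflops(units, clock_ghz, lanes, flops_per_lane):
    """Peak Gflops = units x clock (GHz) x lanes x flops per lane per cycle."""
    return units * clock_ghz * lanes * flops_per_lane


cell_peak = peak_gflops(SPE_COUNT, CLOCK_GHZ, SIMD_LANES, FLOPS_PER_LANE)
print(f"Estimated SPE peak: {cell_peak:.0f} Gflops")
```

Under these assumptions the estimate comes to 256 Gflops at exactly 4 GHz, which is consistent with "over 256 Gflops" at a clock "over 4 GHz." It is a theoretical ceiling, not a measured result.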