John Bourgoin, chairman and CEO of MIPS Technologies, keynoted an analyst event recently. He outlined the history of RISC architectures in silicon and cited some of the problems CPU designers face. It became apparent in his talk that the RISC revolution has been a mistake. The premise was right, but most of what followed was wrong.

To find the premise, go back to the man chiefly responsible: John Cocke of IBM. Working in the twilight of the minicomputer age, Cocke recognized that the speed of memory systems was close to the minimum cycle time of execution pipelines. This meant that, if memory cost was not an issue, compilers could render code into large numbers of short, simple instructions, each executed in a single cycle. That both simplified the compiler's job and fully used the otherwise-excessive memory bandwidth.

Unfortunately for the microprocessor industry, this balance between CPU and memory speeds was transient. By the time Hennessy and Patterson got around to writing the book that enshrined RISC as the way to build microprocessors, they were already pointing out that memory speeds were falling behind processor speeds, and that this was creating a problem. That observation should have led to the realization that a basic premise of RISC had changed. Instead, what commenced was a crusade to find more memory bandwidth. Rather than stay true to the RISC philosophy of letting the compiler explicitly manage resources, architects opted for a statistically based hardware approach: caches. The result has been an unequal arms race between spiraling CPU clock frequencies and the ponderous growth in the size and sophistication of cache structures. Raw memory bandwidth has been left in the dust.

What should have happened is a recognition that, as CPU clocks grew faster, more of the workload should have been transferred to the CPU. Instead of massive numbers of simple instructions, we should have been learning how to make the representation of the algorithm in main memory as dense as possible, so that we had some hope of getting it into the CPU in a timely way. Complex instructions were well explored in the 1960s mainframe world, then forgotten as the orthodoxy of RISC crushed all before it.

Forgotten, but not lost. It is exactly these techniques that are being rediscovered by developers of hardware accelerators. Living beyond the pale of RISC, accelerator developers have been free to start with the question instead of the answer. Ironically, the result has been that in most systems the real work is done by purpose-built accelerators, while the mighty RISC engine is left with the tasks that aren't time-critical. The accelerator people remind us that it may be CISC architectures and related techniques that let CPUs go forward. But even they can't make up for two decades of compiler research that didn't happen.

Ron Wilson is semiconductors editor for EE Times. Feedback and suggestions are always welcome at rwilson@cmp.com.
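
The density argument is easiest to see in a single line of code. The sketch below is illustrative only: the instruction sequences in the comments are generic RISC-style and CISC-style renderings assumed for the sake of the example, not the output of any particular compiler or the encoding of any real ISA.

```c
/*
 * A minimal sketch of the code-density tradeoff, assuming generic
 * RISC-style and CISC-style instruction sets (not any real ISA).
 */
#include <stdio.h>

int a = 2, b = 3, c;   /* operands live in main memory, not registers */

int main(void)
{
    /*
     * A load/store RISC compiler renders the statement below as a
     * string of short, single-cycle instructions, each of which must
     * itself be fetched from memory:
     *
     *     lw   r1, a        ; load a
     *     lw   r2, b        ; load b
     *     add  r3, r1, r2   ; one-cycle ALU operation
     *     sw   r3, c        ; store c
     *
     * A memory-to-memory CISC machine in the 1960s mainframe style
     * could encode the same work far more densely, for example:
     *
     *     add  c, a, b      ; one instruction, three memory operands
     *
     * Four instruction fetches versus one. The denser encoding is
     * what keeps the algorithm's representation in main memory small
     * enough to move into the CPU in a timely way.
     */
    c = a + b;

    printf("c = %d\n", c);
    return 0;
}
```

When memory was as fast as the pipeline, the four-fetch version cost nothing extra; once memory fell behind, every one of those fetches competed for the bandwidth that caches were invented to paper over.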