Software-defined radio infrastructure taps DSP
By Argy Krikelis, Chief Technology Officer, Azibananye Mengot, Application Engineer for Wireless Technology, Aspex Technology Ltd., Uxbridge, U.K., EE Times
January 4, 2002 (12:39 p.m. EST)
URL: http://www.eetimes.com/story/OEG20011115S0063
The announced delays in the rollout of third-generation (3G) wireless communication infrastructure and services, and the increased need to update the existing wireless infrastructure with General Packet Radio Service, Enhanced Data Rates for GSM Evolution and other intermediate solutions are underscoring the need for a flexible basestation architecture. The most promising approach to upgrading existing basestation equipment, as well as introducing new equipment, is the use of software-defined radio, where software can be used to change parameters quickly, even instantaneously, according to traffic, transmission standards and air interference conditions. That approach, in effect, makes the basestation whatever the wireless operator wants it to be.
The flexibility of a software-defined radio system resides in its capability to operate in multiservice environments without being constrained to a particular standard. In theory, software-defined radio should be able to offer services for any already standardized system or future ones on any radio frequency band. The most attractive property of a software-defined radio system is its ability to adapt itself according to environmental conditions and traffic requirements, especially in the support of multimedia traffic. For example, a mobile operator would have the opportunity to configure the network to support the video, data or voice traffic streams that will maximize its income.
Software-defined radio implies two things: that the boundary between the analog and digital worlds in basestations moves as far as possible toward the radio frequency stage, by adopting wideband analog-to-digital and digital-to-analog conversion as close as possible to the antenna; and that fixed-function dedicated hardware is replaced with technologies that can support as many radio functions as possible in software.
Since the late '90s, mobile communication networks have increasingly deployed code-division multiple access (CDMA) systems with spread-spectrum and wideband receiver techniques. The spread-spectrum approach is ideal for secure communications. Since the signal is spread over a wide band, it is difficult to intercept or jam. Its inherent immunity to interference is also useful for commercial wireless systems that are often subject to noise from external sources. Good noise immunity ensures maximum system performance. Spread-spectrum radio systems, however, require extremely high processing rates (several billion operations per second) and data throughput.
The receiver de-spreads each radio channel in a way that depends on the air interference conditions. Inside the receiver, a number of receiver "fingers" recover separate multipath elements of the original signal. These elements include signals delayed by reflections of the RF signal in the transmission path. This type of receiver is often referred to as a Rake receiver because of its ability to track multiple paths. Within the fingers, data is convolved with a bank of filters, where each filter maps the data to the transmission path of a separate multipath. This function forms the basis of the de-spreading process. The individual elements of the signal are then re-assembled, maintaining maximum energy and thus signal integrity.
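The de-spreading and recombination steps described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not a basestation implementation: the function names, the 7-chip spreading code and the path delays and gains are all hypothetical, and the receiver is assumed to know each path's delay and gain exactly.

```python
# Illustrative sketch only: all names and parameters are assumptions,
# not taken from any real CDMA standard or product.

def spread(bits, code):
    """Map bits {0, 1} to symbols {+1, -1} and multiply each by the chip code."""
    chips = []
    for b in bits:
        symbol = 1 if b == 0 else -1
        chips.extend(symbol * c for c in code)
    return chips

def multipath_channel(chips, paths):
    """Sum delayed, scaled copies of the chip stream.

    paths is a list of (delay_in_chips, gain) tuples, one per reflection.
    """
    max_delay = max(delay for delay, _ in paths)
    rx = [0.0] * (len(chips) + max_delay)
    for delay, gain in paths:
        for i, c in enumerate(chips):
            rx[i + delay] += gain * c
    return rx

def rake_receive(rx, code, paths, n_bits):
    """One finger per path: correlate at that path's delay, weight, and sum."""
    n = len(code)
    bits = []
    for k in range(n_bits):
        combined = 0.0
        for delay, gain in paths:
            start = k * n + delay
            # De-spreading: correlate the received chips with the code.
            corr = sum(rx[start + j] * code[j] for j in range(n))
            # Weight each finger by its path gain before combining.
            combined += gain * corr
        bits.append(0 if combined >= 0 else 1)
    return bits
```

With a code such as `[1, 1, 1, -1, -1, 1, -1]` and paths such as `[(0, 1.0), (2, 0.5), (4, 0.25)]`, calling `rake_receive(multipath_channel(spread(bits, code), paths), code, paths, len(bits))` returns the transmitted bits. Even this toy version performs one multiply-accumulate per chip, per finger, per user, which hints at why real systems need billions of operations per second.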
Unlike a fixed-circuit switched network, any wireless system is an unreliable network that is subject to transmission loss and/or interruptions. For that reason, error correction is a key element within wireless systems. Forward error correction plays a large part in the signal recovery process and, not surprisingly, can be a huge processing burden, since it potentially requires several billion operations per second per channel.
Software-defined radio systems are expected to support turbo coding for data, Viterbi for voice, and possibly both turbo and Viterbi coding for video information. Generating codes for data error correction is extremely processor intensive and data dependent (where the sequence of operations cannot be determined in advance), with turbo coding being the most demanding.
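To give a feel for the work involved, the following is a minimal sketch of a convolutional encoder and a hard-decision Viterbi decoder. It assumes a small rate-1/2 code with constraint length 3 and generator polynomials 7 and 5 (octal), chosen purely for brevity; deployed systems use longer codes, and none of the names below come from the article.

```python
# Illustrative sketch: rate-1/2, constraint-length-3 code (generators 7, 5 octal).
# The code parameters and function names are assumptions made for this example.

G = (0b111, 0b101)

def parity(x):
    """Return 1 if x has an odd number of set bits, else 0."""
    return bin(x).count("1") & 1

def conv_encode(bits):
    """Encode bits, appending two zero tail bits to return the encoder to state 0."""
    state = 0  # the two most recent input bits
    out = []
    for b in list(bits) + [0, 0]:
        reg = (b << 2) | state
        out.extend(parity(reg & g) for g in G)
        state = reg >> 1
    return out

def viterbi_decode(received, n_bits):
    """Hard-decision Viterbi decoding over the 4-state trellis."""
    inf = float("inf")
    metrics = [0, inf, inf, inf]  # the encoder is known to start in state 0
    paths = [[], [], [], []]
    for t in range(0, len(received), 2):
        pair = received[t:t + 2]
        new_metrics = [inf] * 4
        new_paths = [None] * 4
        for state in range(4):
            if metrics[state] == inf:
                continue  # state not reachable yet
            for b in (0, 1):
                reg = (b << 2) | state
                expected = [parity(reg & g) for g in G]
                nxt = reg >> 1
                # Branch metric: Hamming distance to the received pair.
                m = metrics[state] + sum(e != r for e, r in zip(expected, pair))
                if m < new_metrics[nxt]:  # keep only the survivor path
                    new_metrics[nxt] = m
                    new_paths[nxt] = paths[state] + [b]
        metrics, paths = new_metrics, new_paths
    # The tail bits force the true path to end in state 0.
    return paths[0][:n_bits]
```

Every received bit pair triggers a compare-and-select over all trellis branches, and which survivor wins depends on the data itself. Here the trellis has only 4 states; the state count doubles with each extra bit of constraint length, which is one reason the decoding load grows so quickly.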
Prior to connecting to a network, the digital signal needs to be coded again in an operation that is often referred to as transcoding. In data-rich environments where software-defined radio systems are expected to operate, the requirement is to accommodate many alternative network formats for inclusion of Internet Protocol (IP), video, Public Switched Telephone Network (PSTN), asynchronous transfer mode (ATM) or other packet-switched networks.
Depending on the flexibility, future-proofing requirements and principal reasons for the transition to 3G, a number of options exist. If the radio system is configured by the operator for subscriber capacity reasons, as is the case in the current 3G wireless trials in Japan, then voice may be the primary traffic. In this case, the transcode task may only need to link to the PSTN. Alternatively, the radio system may be configured as a multimedia gateway providing a dynamically reconfigurable resource with links to ATM, IP, PSTN or alternative packet networks as required. One thing is certain: Software-defined radio systems will need to support transmission over a packet-based network for data traffic as well as for routing voice calls over IP to reduce call costs and, therefore, to increase margins for the service providers.
A number of enabling technologies can be used in the development of platforms for software-defined radio systems. One of the most challenging requirements for those technologies is the support for scalable systems.
Scalability in software-defined radio systems is the ability to independently vary the number and size of resources (memory, processing and I/O bandwidth) that are used to support the radio infrastructure. Scalable high performance is an intrinsic characteristic of software-defined radio: the ability to scale the architectural components to meet evolving standards, traffic and service requirements, without the need to introduce new architectural components or to change the underlying infrastructure.
Those requirements call for a modular approach comprising a number of identical "processing channels." In such an architecture, each channel supports its own external I/O, which can be implemented to support any standardized or customized external interface. If a single-channel interface can cope with the external data bandwidth required by the media-processing application, then there is no requirement for additional processing channels. However, if a single interface is not adequate, then an appropriate number of processing channels can be included in the system to help balance the data bandwidth by evenly distributing the data stream amongst the channels.
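The sizing rule in the paragraph above can be sketched as follows. This is a simplified illustration with hypothetical names and numbers: it assumes every channel offers the same I/O bandwidth and that the data stream can be split into equal blocks handed out round-robin.

```python
import math

def channels_needed(required_mbps, per_channel_mbps):
    """Smallest number of identical channels whose combined I/O covers the load."""
    return max(1, math.ceil(required_mbps / per_channel_mbps))

def distribute(blocks, n_channels):
    """Spread data blocks evenly across the channels in round-robin order."""
    queues = [[] for _ in range(n_channels)]
    for i, block in enumerate(blocks):
        queues[i % n_channels].append(block)
    return queues
```

For example, a 250 Mbit/s stream over channels that each sustain 100 Mbit/s (both figures are made up for illustration) gives `channels_needed(250, 100) == 3`, and `distribute` then assigns every third block to the same channel.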
Each processing channel comprises storage and processing power in the form of a storage module and a processing module, respectively. The storage module in each processing channel is used to store media data for processing or results of media-related processing. The processing module implementation needs to support software-programmable, high-performance processing, which can scale linearly to match the continuously growing processing requirements of basestations used in software-defined radio infrastructure.
The combination of the high performance required for CDMA processing, the need to support hundreds of users per basestation and the lack of software-programmable devices that can support tens of billions of operations per second led to the extensive use of ASIC components in the early implementations of 2.5G and 3G infrastructure. Such components perform fixed functions such as correlation for Rake receiver applications and Viterbi decoding for forward error correction.
Although an approach based on the use of ASICs seems to be a simple and direct one for solving the performance and power consumption concerns of software-defined radio systems, it has serious inherent problems. As the performance and function support requirements of the infrastructure scale, the number of ASIC components increases linearly. This results in large silicon area; costs that include time-to-market associated with the development and debugging of new software; and power consumption difficulties.
An ASIC-based approach also is limited in its ability to support systems that need to adapt to air interfaces and traffic requirements while in the field. In general, an ASIC-based infrastructure for software-defined radio demands significant resources for development and support.
The potential use of reconfigurable logic like FPGAs in wireless infrastructure is an alternative to the use of ASICs.
FPGAs usually are used for fast ASIC prototyping. In addition to the traditional logic blocks and interconnection resources, some new FPGA devices, like XtremeDSP by Xilinx Inc., integrate fixed-function units like multipliers. By designing systems to use the full capability of dynamically reconfigurable FPGAs, it is possible to create systems where silicon area is no longer a function of the number of modes supported in a basestation, while providing high flexibility in the field.
Configuration of FPGAs is typically performed when the system power is turned on. During operation the FPGA configuration is usually fixed, so the FPGAs perform a fixed operation until the system power is turned off. However, recent FPGAs allow dynamic reconfiguration, where a portion of the device or the entire device is configured on the fly while it performs processing functions. FPGAs appear to be better suited than ASICs for software-defined radio, since they support upgrades to future requirements much more efficiently. However, they take seconds to reconfigure, which is a very long time for wireless infrastructure, where reconfiguration times on the order of a few tens of milliseconds are likely to be needed.
Finally, problems associated with the costs of using FPGAs have limited their use mostly to system prototyping. Among these costs are very high device costs; lack of efficient tools or knowledgeable personnel to accelerate engineering development; and intellectual-property concerns where efficient hardware block implementations are patented and their use results in royalty payments.
The deficiencies of software-defined wireless infrastructure solutions based on pure ASICs or FPGAs led to a hybrid system architecture, where ASICs and/or FPGAs are used in combination with software-programmable, high-performance digital signal processors. The high-performance operations with regular throughput rates, such as filtering, are handled by ASIC or FPGA devices, while DSPs handle algorithms characterized by irregular throughput rates.
Modern DSP architectures, available from traditional DSP manufacturers such as Agere, Analog Devices, Motorola and Texas Instruments, rely on high clock speeds (predicted to reach 1 GHz in the next two to three years) and parallel processing in the form of either very long instruction word (VLIW) or single-instruction, multiple-data (SIMD) approaches, where a small number of large-grain, rather powerful processing units (up to eight in current implementations) are integrated on a single device.
Both the wireless-infrastructure industry and the manufacturers of DSP devices are beginning to acknowledge that such devices cannot address the requirements of software-defined radio infrastructure. The degree of parallelism that is offered in DSP devices is neither adequate to deliver the high performance nor usable in nondeterministic types of operations characterized by frequent conditional execution and irregular control flow.
A problem associated with large-grain, multiprocessing systems is the overhead associated with keeping the units coordinated. This leads to very low-level program development if the application needs to be implemented efficiently. However, there are not enough programmers who are confident programming at such a low level, and the resulting code is very hard to debug. In addition, the temporal approach (very high clock frequency) that current DSP designs use for high performance cannot address the processing requirements of the long filtering operations that are very common in CDMA systems.
The hybrid approach of software-programmable DSP chips and ASIC/FPGA devices results in a heterogeneous system architecture, which is very complex, difficult to scale and difficult to maintain and upgrade. In an attempt to address some of these issues, some DSP manufacturers are introducing programmable DSP cores and fixed-function units on the same device. This is the case with the latest TMS320C6416 device announced by TI. In addition to the programmable VLIW core, it also integrates Viterbi and turbo decoding blocks. This type of approach was first introduced by Lucent with its 16xx series DSP, which also integrated a Viterbi decoding block. Such integration only decreases the number of components, rather than addressing the most important issues of scalability and adaptability.
Finally, there is always a key question associated with the traditional high-performance programmable DSP architectures: future proofing. Traditional DSP companies introduce new architectures every three to four years at a great research and development expense, which in many cases is in excess of $100 million. These new architectures often have little resemblance to their previous versions. As a result, long-term development work becomes obsolete, and systems that are deployed for a long time, as is the case with software-defined radio infrastructure, will require expensive upgrades.
Another type of a hybrid approach is the integration of a reconfigurable interconnection fabric that is based on the same principle as the FPGA reconfiguration, with arithmetic units that are interconnected using the interconnection fabric. The principle of this approach is that the interconnection can be programmed on the fly to interconnect the arithmetic units according to the processing requirements. The CS2112 device by Chameleon Systems is an example of this approach. Despite the potential high performance of such devices, achieved by the density of arithmetic units that can be integrated on a single device, their main disadvantage is the reconfiguration overhead.
Although downloading the reconfiguration information can be overlapped with data processing, updating the interconnection fabric requires a few tens of milliseconds, which is too long for wireless-infrastructure applications where the frame time is 10 to 20 ms. Furthermore, the reconfiguration information for the interconnection fabric needs to be known well in advance in order to compute the required vectors. That goes against the concept of software-defined radio, where air interface and traffic conditions are expected to change dynamically (and unpredictably).
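The timing argument can be made concrete with a small calculation. The sketch below simply counts how many whole frame periods a reconfiguration occupies; the 30 ms figure used in the example is an assumed stand-in for "a few tens of milliseconds," not a measured value for any device.

```python
import math

def frames_spanned(reconfig_ms, frame_ms):
    """Number of frame periods that elapse while the fabric is being updated."""
    return math.ceil(reconfig_ms / frame_ms)
```

Under that assumption, `frames_spanned(30, 10)` is 3 and `frames_spanned(30, 20)` is 2, so an update of this length cannot fit inside a single 10 to 20 ms frame. A reconfiguration that takes seconds, say 2,000 ms, would span 200 frames at a 10 ms frame time.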
Aspex Technology Ltd. has developed a completely software-programmable technology that is most appropriate for the high-performance, scalable requirements of software-defined radio infrastructure. Aspex's Associative String Processor (ASP) architecture comprises an SIMD parallel-processor core incorporating a string of identical processing units, a reconfigurable intercommunication network and a vector data buffer for fully overlapped data input-output.
The ASP architecture is different from mainstream high-performance architectures in its use of a fine-grain, bit-serial implementation for each processing unit, and the interconnection, which is reconfigurable through application software rather than through previously determined information.
In addition, each processing unit can perform logical and relational operations by employing associative processing techniques, also known as content-addressable processing. Most important, the support for associative processing offers deterministic performance for data dependent processing, a feature that is unique to ASP amongst the high-performance processing architectures. Although each processing unit is only capable of performing bit-serial operations, they are very simple to implement, requiring approximately 2,500 transistors each. Consequently, thousands of them can be implemented on a single device, thus providing very high performance.
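The general idea of associative, bit-serial processing can be modeled in software as shown below. This is a conceptual sketch only: it is not Aspex's instruction set or programming model, the function names are hypothetical, and the Python list comprehension stands in for a compare that every processing unit would perform in the same clock cycle.

```python
def associative_match(pe_data, key, width):
    """Compare a broadcast key against every processing unit's word, bit by bit.

    Returns (tags, steps): tags[i] is True where unit i holds the key, and
    steps is the number of broadcast steps taken.
    """
    tags = [True] * len(pe_data)
    steps = 0
    for bit in range(width):  # one broadcast step per bit position
        key_bit = (key >> bit) & 1
        # In hardware, all units would evaluate this compare simultaneously.
        tags = [t and ((d >> bit) & 1) == key_bit for t, d in zip(tags, pe_data)]
        steps += 1
    return tags, steps

def add_where_tagged(pe_data, tags, value):
    """Apply an operation only in tagged units; untagged units sit idle."""
    return [d + value if t else d for d, t in zip(pe_data, tags)]
```

In this model the step count depends only on the word width, never on the data values or on how many units match, which illustrates the sense in which associative processing gives deterministic timing for data-dependent work. Conditional execution becomes a matter of tagging units rather than branching.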
The latest implementation of the ASP architecture, called Linedancer, implements 4,000 processing units that can deliver in excess of 100 Giga operations per second, operating at 266 MHz. The ASP's SIMD structure makes it suitable for supporting processing for software-defined radio infrastructures, since the available processing resources can be used to process either long filter sequences or long bit sequences of data decoding for one or more users simultaneously. Indeed, ASP implementations like the Linedancer device can deliver processing power that is typically associated with ASICs and performance flexibility that is characteristic of microprocessors. A single Linedancer device is capable of processing in true software-programmable fashion tens of CDMA users simultaneously.