Using FPGAs to Improve the Performance of Radar, Navigation and Guidance Systems

By AccelChip

Radar, navigation and guidance systems process data that is acquired using arrays of sensors. The energy delta from sensor to sensor over time holds the key to information such as targets, position or course. This two-dimensional array of data, often referred to as an "observation matrix," must be solved as a set of linear equations to extract the desired information. Solution methods include matrix inverse, factorization, adaptive filtering and singular value decomposition, and are typically performed using floating-point arithmetic to allow for sufficient dynamic range and precision of the input data. Doing so, however, limits the performance of a system.

Today's DSP-oriented FPGAs such as Xilinx Virtex 4 and Altera Stratix II provide far greater performance than a floating-point DSP processor for this class of applications and offer the flexibility to extend the dynamic range of a fixed-point implementation significantly beyond the limitations of a fixed-point DSP processor. Singular value decomposition (SVD) for an 8x8 matrix can run over 50 times faster in fixed-point arithmetic on an FPGA than a floating-point implementation running on a TI TMS320C67x DSP processor. Achieving this performance requires a hardware architecture that utilizes 261 of the Virtex 4 DSP48 multipliers running in parallel at 200 MHz.

These are challenging applications to design on any hardware platform. Determining an FPGA architecture that effectively utilizes the DSP blocks to achieve a worthwhile performance advantage adds significantly to this design complexity. The type of architectural tradeoff analysis necessary to determine an optimal solution, however, is well-suited to a high-level DSP design methodology.
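As one illustration of the factorization approach mentioned above, the following minimal Python sketch solves a small set of linear equations with a QR decomposition built from Givens rotations. It is a floating-point reference model only, not AccelChip's implementation and not the fixed-point architecture discussed in this article; the function names `givens_qr` and `solve_givens` are hypothetical and chosen for illustration.

```python
import math


def givens_qr(a):
    """Return (q_t, r) such that q_t * a == r, with r upper triangular.

    Each Givens rotation mixes two adjacent rows to zero one element
    below the diagonal. The same rotations are applied to q_t, which
    starts as the identity and ends up as Q transposed.
    """
    n = len(a)
    r = [row[:] for row in a]
    q_t = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for col in range(n):
        # Work upward from the bottom row, zeroing r[row][col].
        for row in range(n - 1, col, -1):
            x, y = r[row - 1][col], r[row][col]
            h = math.hypot(x, y)
            if h == 0.0:
                continue  # Nothing to rotate.
            c, s = x / h, y / h
            for m in (r, q_t):
                upper, lower = m[row - 1], m[row]
                for k in range(n):
                    up, lo = upper[k], lower[k]
                    upper[k] = c * up + s * lo
                    lower[k] = -s * up + c * lo
    return q_t, r


def solve_givens(a, b):
    """Solve a * x = b by QR factorization and back-substitution."""
    n = len(a)
    q_t, r = givens_qr(a)
    # Rotate the right-hand side: y = Q^T * b.
    y = [sum(q_t[i][k] * b[k] for k in range(n)) for i in range(n)]
    # Back-substitute through the upper-triangular R.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = y[i] - sum(r[i][k] * x[k] for k in range(i + 1, n))
        x[i] = acc / r[i][i]
    return x
```

Calling `solve_givens(a, b)` with a square, non-singular list-of-lists `a` and a list `b` returns the solution vector. The divisions by the diagonal of R in the back-substitution are where a hardware implementation needs a divider or a precomputed reciprocal.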
The Fixed-Point Dynamic Range "Issue"

MATLAB Example of QRD Matrix Inverse:

This algorithm, when implemented with Givens rotations, requires five cascaded multiplies and one divide operation to be performed on the input data. Fixed-point arithmetic dictates that the number of output bits of a multiply operation be equal to the sum of the bit widths of the two input operands if all precision is to be maintained. If left untruncated, bit widths can grow quickly, as shown in Figure 1.

Figure 1 - Fixed-Point Bit Growth of Multiplies

FPGA logic is not limited to specified bit widths for internal busses and may grow as needed to meet the demands of the application. This bit growth comes at the expense of added hardware which, if left unbounded, can be significant. Reasonable internal bit growth beyond 16 bits, however, can improve the dynamic range of a fixed-point implementation enough to provide a viable hardware solution for systems with input data of up to 16 bits.

Exploring the bit growth requirements of the QRD matrix inverse shows that quantizing the inputs to 16-bit signed values offers an integer dynamic range of -32,768 to +32,767. Figure 2 shows the AccelChip DSP Synthesis tool's "Fixed-Point Report," which lists the quantizations used for the QRD matrix inverse. In Figure 2 the "Quantizer" column nomenclature is as follows: "fixed" means signed twos-complement, "ufixed" means unsigned binary, "floor" is the rounding mode applied at the LSB and "wrap" is the overflow mode applied at the MSB. The numbers in square brackets represent the word length and the binary point location (fraction length), respectively. For more information on this nomenclature refer to the MATLAB help for the command "quantizer."

Figure 2 - AccelChip Fixed-Point Report

The variable "RDiagInv," which is the result of a divide operation, is quantized using 32 total bits with 17 integer bits. Maintaining an adequate number of integer bits here is critical to maintaining an acceptable response of the inverse function.
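The bit-growth rule and the quantizer settings described in this section can be sketched in a few lines of Python. This is a rough model under stated assumptions, not the AccelChip tool's behavior: it assumes every cascaded stage multiplies the running result by a fresh operand of fixed width, and it models a signed quantizer that uses floor as the rounding mode and wrap as the overflow mode, following the MATLAB "quantizer" convention. The function names are hypothetical.

```python
import math


def cascaded_widths(input_width, operand_width, stages):
    """Full-precision bit width after each of `stages` cascaded multiplies.

    Assumption: each stage multiplies the running result by a new
    operand of `operand_width` bits, and nothing is truncated, so the
    width grows by `operand_width` bits per stage.
    """
    widths = []
    width = input_width
    for _ in range(stages):
        width = width + operand_width  # product width = sum of operand widths
        widths.append(width)
    return widths


def quantize(value, word_length, fraction_length):
    """Quantize to signed twos-complement with floor rounding and wrap overflow.

    `word_length` is the total number of bits and `fraction_length` is
    the number of bits to the right of the binary point.
    """
    scale = 1 << fraction_length
    raw = math.floor(value * scale)      # floor: round toward minus infinity
    modulus = 1 << word_length
    half = modulus // 2
    raw = (raw + half) % modulus - half  # wrap: discard bits above the MSB
    return raw / scale
```

With 16-bit inputs, `cascaded_widths(16, 16, 5)` gives 32, 48, 64, 80 and 96 bits, which is why some trimming is needed in practice. `quantize(32768, 16, 0)` wraps to -32768.0, showing what happens when a value exceeds the 16-bit signed range. A 32-bit word with 17 integer bits, as quoted for "RDiagInv," corresponds to `quantize(x, 32, 15)` if the sign bit is counted among the integer bits, which is an assumption here.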
The flexibility offered by the Virtex 4 FPGA allows for the necessary bit growth of the integer bits to occur while some reasonable trimming of the fractional bits may take place.

Multiplying Operands Greater Than 16-Bits

Figure 3 - 32-bit multiply implemented in Virtex 4 DSP48 Blocks

The FPGA Performance Advantage

Figure 4 - Sensor Array Processing Block Diagram

Implementing this application in a floating-point processor requires either multiple chips or a significant compromise in performance to allow resource sharing of the limited multiplier resources between the multiple DSP operations. A single FPGA, however, supports the entire operation, providing a performance advantage.

The 500 Multiplier Advantage!

Design Challenges

AccelChip® provides a high-level design methodology that greatly simplifies this process. Radar, navigation and guidance systems can be described in MATLAB using loops and vector and matrix multiplies. These operations can be automatically "unrolled" during the algorithmic synthesis process, providing designers a rapid way to explore the impact of parallelism on different blocks of the system without modifying their golden source. By using an automated flow, the final solution can be easily tailored to maximize the available resources of the target FPGA. Table 1 provides an example of how design exploration can be used to tailor the performance of a QRD-RLS adaptive filter.
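The effect of unrolling on performance can be approximated with a deliberately simplified model. The Python sketch below is not AccelChip's synthesis flow and does not reproduce the figures in Table 1; it assumes that each hardware multiplier completes one multiply per clock cycle, that the work divides evenly across multipliers, and that pipeline latency, additions and memory bandwidth can be ignored. The function names are hypothetical.

```python
import math


def matvec_cycles(n, multipliers):
    """Clock cycles for an n x n matrix-vector product.

    Assumption: the n * n multiplies are spread over `multipliers`
    parallel units, each finishing one multiply per cycle.
    """
    total_multiplies = n * n
    return math.ceil(total_multiplies / multipliers)


def explore(n, clock_hz, multiplier_options):
    """Tabulate (multipliers, cycles, products per second) per unroll factor."""
    rows = []
    for multipliers in multiplier_options:
        cycles = matvec_cycles(n, multipliers)
        rows.append((multipliers, cycles, clock_hz / cycles))
    return rows
```

For an 8x8 matrix at 200 MHz, `explore(8, 200e6, [1, 8, 64])` reports 64, 8 and 1 cycles per product respectively. The model only captures the first-order trade between multiplier count and cycle count; a real design exploration also has to account for routing, storage and the achievable clock rate.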
AccelChip Solutions

Figure 5 - AccelWare IP Generation Form for QR Inverse

AccelChip® DSP Synthesis provides complete flexibility to define and implement custom architectures for radar, navigation and guidance systems using floating-point MATLAB. AccelChip provides automated floating- to fixed-point conversion to assist in solving the complex quantization issues resulting from the cascaded multiply and divide operations used in matrix inversion and factorization. Once an acceptable fixed-point model is determined, users can rapidly explore performance versus hardware tradeoffs using algorithmic synthesis. Here the number of dedicated hardware multipliers used in the design can be quickly increased to improve performance and take full advantage of the flexibility of the FPGA architecture.

Summary

[1] TMS320C67x Floating Point DSP Performance
[2] Report on NAG Benchmark Tests for SUN SMPs, The University of Liverpool
[3] Comparing Fixed- and Floating-Point DSPs, Texas Instruments
[4] A BDTI Analysis of the Texas Instruments TMS320C67x, BDTI
All material on this site Copyright © 2017 Design And Reuse S.A. All rights reserved.