Implementing DSP Functions Within FPGAs
EE Times: Design News
Cofer and Harding discuss how to implement DSP functionality, covering when FPGAs are a good fit for DSP algorithm implementation and important design decisions and considerations.
By: RC Cofer and Ben Harding (09/07/2005 4:39 PM EDT)
URL: http://www.eetimes.com/showArticle.jhtml?articleID=170701252
DSP processors have conventionally moved to higher levels of performance through a combination of the following techniques:
Each of these enhancements has contributed to DSP processor performance, and performance increases will continue. Ultimately, however, each of these design enhancements seeks to increase the parallel processing capability of an inherently serial process. The traditional approach for achieving performance beyond the current level of DSP processor performance was to transition the design to an ASIC implementation. The disadvantages of ASIC implementations include relatively large Non-Recurring Engineering (NRE) costs, relatively high unit volume requirements, limited design modification options, and extended development schedules.
Higher performance implementations of specific DSP algorithms are increasingly available through implementation within FPGAs. Ongoing architectural enhancements, development tool flow advances, speed increases, and cost reductions are making FPGA implementation attractive for an increasing range of DSP-dependent applications. FPGA technology advances have increased clock speeds and available logic resources into and beyond the range required to implement many DSP algorithms effectively at an attractive price point. FPGA implementation provides the added benefits of reduced NRE costs along with design flexibility and future design modification options. If further performance improvements are required beyond the capabilities of current FPGA technology, a reduced-risk path is available to transition the design into an ASIC implementation.
Ultimately, FPGAs provide a design platform which offers the flexibility of a general purpose DSP processor implementation with some of the performance increases available with ASIC technology.
When to use FPGAs for DSP
Algorithm performance improvement in an FPGA-based implementation over the performance in a conventional DSP processor is usually based on a combination of factors. The most common are an increased data path width and/or an increased operational speed, resulting in higher overall performance. Another performance improvement is the ability to separate the data stream into multiple parallel blocks of data which have limited interdependence. Each data block can then be operated on independently and the results combined, resulting in higher relative performance.
Taking advantage of any architectural opportunity for maximizing the number or speed of operations is essential to maximizing the performance achievable within an FPGA. The critical architectural transformation is translating every serial operation or group of operations into the most parallel implementation possible, up to the limits imposed by the resources available within the target FPGA device for implementing a specific function.
A further performance advantage can be gained if the FPGA can perform operations on multiple channels or streams of data. Example applications include Time Division Multiple Access (TDMA) multiplexing, multiple channel communication protocols, and I/Q math based algorithms. Since each channel can be processed in parallel, the performance advantage associated with each channel can be multiplied by the number of channels implemented. Designs which require signal pre-processing can also benefit, since filtering and signal conditioning algorithms are generally relatively straightforward to implement within an FPGA architecture.
When an algorithm is implemented in a structure which takes advantage of the flexibility of a target FPGA architecture, the benefits can be significant. Algorithms can be customized to adjust to system requirements on the fly.
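The block-parallel decomposition described above can be sketched in software. The following is a minimal illustration rather than the authors' method: the sum-of-squares (signal energy) computation, the function names, and the interleaved split are all assumptions chosen for brevity. The blocks are evaluated one after another here, whereas in an FPGA each block would map to its own multiply-accumulate unit operating concurrently.

```python
def serial_sum_of_squares(samples):
    """Process every sample in turn, as a single serial loop would."""
    total = 0
    for x in samples:
        total += x * x
    return total


def block_parallel_sum_of_squares(samples, num_blocks):
    """Split the stream into independent blocks, process each, then combine.

    Returns (result, steps), where steps is the number of sequential
    multiply-accumulate operations the busiest block performs. This is a
    hypothetical model: in hardware the blocks would run at the same time.
    """
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    # Interleave samples across blocks; the blocks share no data.
    blocks = [samples[i::num_blocks] for i in range(num_blocks)]
    partial_sums = [serial_sum_of_squares(block) for block in blocks]
    steps = max(len(block) for block in blocks)
    return sum(partial_sums), steps


if __name__ == "__main__":
    data = list(range(16))
    print(serial_sum_of_squares(data))             # one unit, 16 steps
    print(block_parallel_sum_of_squares(data, 4))  # four units, 4 steps each
```

Both versions produce the same result; only the number of sequential steps per processing unit changes, which is the source of the speedup when the blocks have limited interdependence.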
Filter coefficients, implementations, and architectures can be updated to reflect changing system conditions and user requirements.
The implementation of an algorithm within an FPGA also provides a range of implementation options. The design team must determine and prioritize their design objectives. It is possible to implement an algorithm as a maximally parallelized architecture, or as a highly serial architecture using a single structure which is fed sequential data elements, with the functionality of a loop counter implemented within a hardware counter. A hybrid architecture can also be implemented, which is a parallel implementation of serial structures or a serial chain of parallel architectures. Each of these design options will have its own set of characteristics, including the number of devices required to implement a function, resource requirements within a device, maximum speed, and cost of implementation. The design team has the flexibility to optimize for size, speed, cost, or a target combination of these factors.
An FPGA device also provides a platform for integrating multiple design functions into a single package or group of packages. Integration of functionality can result in higher performance, reduced real-estate requirements, and reduced power requirements. Resources integrated into the I/O circuitry of FPGAs can further improve system performance by allowing control of drive strength, signal slew rate, and implementation of on-board matching, resulting in fewer required system-level components. Further design integration can be achieved by incorporating hard or soft processor cores within an FPGA to implement required control and processing functionality. Pre-verified Intellectual Property (IP) can also be used to implement and incorporate common functionality.
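The serial, parallel, and hybrid options discussed above can be compared with a rough resource-versus-throughput estimate. This is a simplified sketch under stated assumptions, not a figure from the article: it models an N-tap FIR filter where each multiply-accumulate (MAC) unit completes one tap per clock cycle, and it ignores pipeline latency, adder trees, and routing limits. The 64-tap filter, the 100 MHz clock, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FirArchitecture:
    multipliers: int        # MAC units instantiated in the fabric
    cycles_per_sample: int  # clock cycles needed to produce one output


def fir_architecture(num_taps, multipliers):
    """Estimate cycles per output when num_taps multiplies share the MAC units."""
    if not 1 <= multipliers <= num_taps:
        raise ValueError("multipliers must be between 1 and num_taps")
    cycles = -(-num_taps // multipliers)  # ceiling division
    return FirArchitecture(multipliers, cycles)


def max_sample_rate_hz(arch, clock_hz):
    """Upper bound on output sample rate for a given fabric clock."""
    return clock_hz / arch.cycles_per_sample


if __name__ == "__main__":
    clock_hz = 100e6  # assumed fabric clock
    for name, mults in [("serial", 1), ("hybrid", 8), ("parallel", 64)]:
        arch = fir_architecture(64, mults)
        rate = max_sample_rate_hz(arch, clock_hz)
        print(f"{name:8s} {arch.multipliers:3d} MACs, "
              f"{arch.cycles_per_sample:2d} cycles/sample, {rate / 1e6:.4f} MS/s")
```

Under these assumptions the fully serial design uses the fewest multipliers but sustains the lowest sample rate, the fully parallel design is the reverse, and the hybrid sits between them, which is the size-versus-speed trade the design team has to prioritize.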
The ability to incorporate multiple system-level components and design functionality within fewer components can potentially reduce risk, cost, and schedule.
Critical Design Considerations
Clock Sourcing and Distribution
Some of the advanced FPGA design issues which must be carefully evaluated and implemented from the earliest design stages include:
Synchronous Design
Clock Boundary Transitions
Pipelining
Pipelining is an essential element of implementing DSP algorithms within FPGAs. This is a result of the register-rich architecture of conventional FPGA fabrics and the need to register data between operations to maximize speed and performance. FPGA device fabrics have been developed to support this tradeoff between heavy utilization of flip-flop resources to implement registers and the ability to handle wide data streams at increasingly high data rates.
Critical Internal Signal Routing
Numerical Representation
Fixed-point numbers can be represented as unsigned or signed values. The two most popular signed representations are two's complement and one's complement. Two's complement is the more popular of the two formats due to its simplified implementation when considering arithmetic overflow conditions. In one's complement, a negative number is the bitwise inverse of its positive counterpart. Sign-magnitude representation, by contrast, has the characteristic that negative and positive numbers have identical bit patterns with the exception of the leading sign bit. There are also more advanced numeric representations and modified implementations of existing standards. Where possible, it is desirable to implement designs based on existing defined standards. This makes the design easier to understand for engineers new to a project and simplifies future modifications or enhancements.
DSP-Oriented Architectural Features
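Looking back at the Numerical Representation discussion, the differences between signed fixed-point encodings are easiest to see on a concrete value. The sketch below is illustrative only; the function names and the 8-bit word width are assumptions, and sign-magnitude is included alongside one's and two's complement for comparison.

```python
def sign_magnitude(value, bits):
    """Sign bit followed by the magnitude; +x and -x differ only in the sign bit."""
    limit = (1 << (bits - 1)) - 1
    if not -limit <= value <= limit:
        raise ValueError("value out of range")
    sign = (1 << (bits - 1)) if value < 0 else 0
    return sign | abs(value)


def ones_complement(value, bits):
    """A negative value is the bitwise inverse of its positive counterpart."""
    limit = (1 << (bits - 1)) - 1
    if not -limit <= value <= limit:
        raise ValueError("value out of range")
    mask = (1 << bits) - 1
    return value if value >= 0 else (~(-value)) & mask


def twos_complement(value, bits):
    """A negative value is the bitwise inverse plus one (value modulo 2**bits)."""
    if not -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1:
        raise ValueError("value out of range")
    return value & ((1 << bits) - 1)


def as_bits(code, bits):
    """Format an encoded word as a zero-padded binary string."""
    return format(code, f"0{bits}b")


if __name__ == "__main__":
    for encode in (sign_magnitude, ones_complement, twos_complement):
        print(f"{encode.__name__:16s} +5 = {as_bits(encode(5, 8), 8)}  "
              f"-5 = {as_bits(encode(-5, 8), 8)}")
```

All three encodings agree on positive values; they differ only in how the negative value is formed, which in turn affects how simply addition and overflow handling can be implemented in logic.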
DSP Intellectual Property (IP)
Table 1: IP categories
Design Verification and Debug
Simulation at the earliest phases of the design, before significant effort has been expended on integrating design blocks, can help avoid extended design debug and testing later in the design cycle. Further schedule gains can be achieved by implementing modular testbench blocks which can be scaled with the design to verify each subsequent design phase.
Conclusion
By developing an understanding of the overall design cycle and the available development tools, and by performing trade studies for critical design decisions, a design team can avoid common design implementation missteps. By implementing a system which takes maximum advantage of the resources available within an FPGA, a higher level of performance can be achieved and aggressive schedules can be maintained. By taking advantage of the ability of FPGAs to implement an integrated, modular, high-performance design, an adaptable, efficient design with a lower system cost can be achieved.
About the Authors
Ben Harding has 15+ years of hardware design experience including high-speed design with DSPs, network processors, and programmable logic. He also has embedded software development experience in areas including voice and signal processing, algorithm development, and board support package development for numerous Real-Time Operating Systems. Ben has a BSEE from University of Alabama-Huntsville with post-graduate studies in Digital Signal Processing, parallel processing, and digital hardware design. Ben has presented on FPGA and processor design at several conferences and is the co-author of the book Rapid System Prototyping with FPGAs published by Elsevier.
All material on this site Copyright © 2005 CMP Media LLC. All rights reserved.