Built-In DMA Engines Unleash Power of PCI Express Switches
dspdesignline.com (November 11, 2008)
Direct memory access (DMA) technology has been around for more than 20 years. It has been used principally to offload memory reads and writes from the CPU, so that the processor can focus on computational tasks and the overall performance of embedded and other system designs improves.
Traditionally, many components have included a DMA engine, among them microprocessors (CPUs), disk drive controllers, graphics processors and various end-points. In all of these devices, the DMA engine transfers data between memory and I/O devices without involving the CPU core.
DMA is also used for intra-chip data transfer in the increasingly popular multi-core processors, especially in multiprocessor system-on-chip applications. As will be shown later, each processing element in such a design is equipped with a local memory (often called scratchpad memory), and DMA transfers data between that local memory and the main memory.
DMA is crucial for system input/output (I/O). Without it, the CPU must use programmed input/output (PIO) mode to communicate with peripheral devices, or load/store instructions in the case of multi-core chips. In either case the CPU is typically fully occupied for the entire duration of the read or write operation and is thus unavailable to perform other, more crucial computational tasks.
With DMA, however, the CPU initiates the transfer, performs other operations while the transfer is in progress, and receives an interrupt from the DMA controller once the operation is complete. This is especially useful in real-time computing applications, in which it is critical that the processor's primary job does not stall behind concurrent operations.
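The contrast between PIO and DMA described above can be illustrated with a small software model. This is only a sketch under stated assumptions: `SimulatedDmaEngine`, `pio_copy` and `run_demo` are hypothetical names invented for illustration, a worker thread stands in for the DMA hardware, and a callback stands in for the completion interrupt. It shows the control flow (initiate, keep working, get notified), not real hardware behavior or timing.

```python
import threading


def pio_copy(src, dst, dst_offset=0):
    """PIO-style copy: the 'CPU' moves every byte itself and can do nothing else meanwhile."""
    for i, value in enumerate(src):
        dst[dst_offset + i] = value


class SimulatedDmaEngine:
    """Toy stand-in for a DMA controller (hypothetical; a thread plays the engine)."""

    def start_transfer(self, src, dst, dst_offset, on_complete):
        if dst_offset + len(src) > len(dst):
            raise ValueError("destination buffer too small")

        def run():
            # The copy that the CPU no longer has to perform itself.
            dst[dst_offset:dst_offset + len(src)] = src
            # Models the completion interrupt raised by the DMA controller.
            on_complete()

        worker = threading.Thread(target=run)
        worker.start()
        return worker


def run_demo():
    src = bytes(range(256)) * 16
    dst = bytearray(len(src))
    transfer_done = threading.Event()

    # 1. The CPU initiates the transfer and returns immediately.
    engine = SimulatedDmaEngine()
    worker = engine.start_transfer(src, dst, 0, on_complete=transfer_done.set)

    # 2. The CPU performs unrelated computation while the transfer is in progress.
    checksum = sum(i * i for i in range(10_000))

    # 3. The CPU consumes the data only after the "interrupt" has fired.
    transfer_done.wait()
    worker.join()
    return bytes(dst) == src, checksum
```

Calling `run_demo()` returns a pair: whether the destination buffer matches the source, and the result of the computation that overlapped the transfer. In a real system the second step would be the processor's primary job, and the third step would live in an interrupt handler rather than a blocking wait.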
Another related application area is stream processing in its various forms, where data processing and data transfers must run in parallel to achieve sufficient throughput.
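One common way to overlap processing with transfers in a streaming workload is double buffering: while one chunk is being processed, the next chunk is already being moved into a local buffer. The sketch below is an assumption-laden illustration, not a description of any particular device: `fetch_chunk`, `process_chunk` and `stream_sum` are hypothetical names, a single-worker thread pool stands in for the DMA engine, and summing bytes stands in for the real compute kernel.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_chunk(main_memory, start, size):
    """Stand-in for a DMA transfer from main memory into a local (scratchpad) buffer."""
    return bytes(main_memory[start:start + size])


def process_chunk(chunk):
    """Placeholder compute kernel (hypothetical): sum the bytes of one chunk."""
    return sum(chunk)


def stream_sum(main_memory, chunk_size):
    """Process a buffer chunk by chunk, overlapping each transfer with the previous compute."""
    total = 0
    with ThreadPoolExecutor(max_workers=1) as dma:  # one worker models one DMA channel
        pending = None
        for start in range(0, len(main_memory), chunk_size):
            # Kick off the transfer of the next chunk...
            next_transfer = dma.submit(fetch_chunk, main_memory, start, chunk_size)
            # ...and process the previously fetched chunk while it is in flight.
            if pending is not None:
                total += process_chunk(pending.result())
            pending = next_transfer
        # Drain the last outstanding chunk.
        if pending is not None:
            total += process_chunk(pending.result())
    return total
```

For example, `stream_sum(data, 100)` gives the same result as `sum(data)` for any bytes object `data`; the difference is only in how the work is scheduled. On real hardware the two local buffers would be fixed regions of scratchpad memory that alternate roles on every iteration.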
Related Articles
- DSPs with PCI Express interface extend connectivity while improving performance and power efficiency
- Using nextgen PCI Express switches to eliminate network I/O bottlenecks
- Pushing the Frontier in Managing Power in Embedded ASIC or SoC Design with PCI Express
- PCI Express 3.0 needs reliable timing design