Built-In DMA Engines Unleash Power of PCI Express Switches

By Krishna Mallampati, PLX Technology
dspdesignline.com (November 11, 2008)

Direct memory access (DMA) technology has been around for more than 20 years. DMA has been used principally to offload memory accesses (reading and/or writing) from the CPU in order to enable the processor to focus on computational tasks and increase the performance of embedded and other system designs.

Traditionally, there have been many components with a DMA engine inside including microprocessors (CPUs), disk drive controllers, graphics processors and various end-points. The DMA engine in all of these devices is used to transfer data between memory and I/O devices without the involvement of the core central processing unit.

DMA is also used for intra-chip data transfer in the increasingly popular and widely used multi-core processors, especially in multiprocessor systems-on-chip applications. As will be shown later, its processing element is equipped with a local memory (often called scratchpad memory) and DMA is used for transferring data between the local memory and the main memory.

DMA is crucial for system input/output (I/O); without it, using programmed input/output (PIO) mode for communication with peripheral devices, or load/store instructions in the case of multi-core chips, the CPU typically is fully occupied for the entire duration of the read or write operation, and is thus unavailable to perform other, more crucial computational tasks.

With DMA, however, the CPU would initiate the transfer, do other operations while the transfer is in progress, and receive an interrupt from the DMA controller once the operation has been done. This is especially useful in real-time computing applications, in which it's critical the processor's primary job doesn't stall behind concurrent operations.

Another related application area can be found within various forms of stream processing where it is essential to have data processing and transfers in parallel, in order to achieve sufficient throughput.

Click here to read more ...