How to write DSP device drivers
As digital signal processors pick up peripherals, you'll need to write new device drivers. Here are some time-saving tips for writing them for platforms based on DSPs.

Digital signal processors (DSPs) are now often integrated on-chip with numerous peripheral devices, such as serial ports, UARTs, PCI, or USB ports. As a result, developing device drivers for DSPs requires significantly more time and effort than ever before. In this article, we'll show you a DSP device-driver architecture that reduces overall driver development time by reusing code across multiple devices. We'll also look in-depth at an audio codec driver created using this architecture. The design and code examples are based on drivers developed for the Texas Instruments' DSP/BIOS operating system, though the same approach will work in any system.

How DSPs differ

Some microprocessors execute all code from external memory via an instruction cache (I-cache). I/O peripheral registers are memory-mapped and accessed like any other program data. In contrast, many DSPs don't provide I-cache, but do include high-speed, on-chip memory that supports efficient program execution. Even with the latest DSPs that do provide I-cache, the memory is configurable as either a cache or directly addressable memory or a combination of the two. It's a common practice to dedicate some of this memory to real-time I/O, critical code loops, and data to avoid potential nondeterministic behavior caused by cache misses.

To meet real-time I/O needs, DSPs provide dedicated serial ports that connect to streaming peripherals, such as codecs and other data converters. The interaction between the codec and the serial port is synchronous and handled entirely in hardware, though initial configuration of the frame sync, transfer rate, sample rate, and sample size must be done by the DSP.
Although the serial ports can interact directly with the DSP core, software developers avoid this approach for real-time I/O because of the frequency of interrupts. DSPs generally provide one or more direct memory access (DMA) controllers, the channels of which can be used to buffer multiple samples from the serial port and then transfer the full buffer into the on-chip memory without DSP involvement. The DSP is only interrupted once the buffer is full rather than on every data sample. After initial configuration by the DSP, the DMA and serial port interact without any processor intervention. Because of the efficiency gains, drivers for most DSP peripherals use DMA.

Writing a codec driver for a DSP actually involves programming three different peripherals: the codec itself, the serial port, and the DMA controller. Figure 1 shows the data flow between the different peripherals, the DSP, and the DSP's internal memory. Later, we'll show a much more detailed implementation of a DSP codec driver, but first let's discuss some driver-architectural issues that enable better code reuse both at the application and driver levels.

Device-driver architecture

Device configuration is, by definition, specific to a particular device. Data movement, on the other hand, is more generic. In the case of a streaming data peripheral like a codec, the application ultimately expects to send or receive a stream of buffers. The application shouldn't have to worry about how the buffers are managed or what type of codec is being used, beyond issues such as data precision (number of bits in the sample).

Class drivers

As Figure 2 illustrates, any driver can be divided into two parts: a class driver that handles the application interface and OS-specifics and a mini-driver that addresses the hardware specifics of a particular device.
Because of large differences between devices like codecs and UARTs, you'll typically need to implement several class drivers to support all the peripherals used with a DSP. When designing the driver model for DSP/BIOS, two of the class drivers we implemented were:

As can be seen from these definitions, the class drivers define the I/O models used by the application. Since the class driver is responsible for synchronization between the application and driver, it will determine whether the I/O is synchronous (where the application thread blocks on an I/O transaction so that another thread can run) or asynchronous (where the application thread continues to run and relies on a notification mechanism that will inform the application when the I/O transaction is complete). Although class drivers are device-independent, they're intimately associated with the operating system since they use operating system services, such as semaphores.

Mini-drivers

The most device-specific routines are those that initialize device control registers. These operations require the calculation of specific bit patterns to set the appropriate flag values in each control register. Although the device initialization routines will never be portable, implementation and maintenance can be made much simpler by the development of a basic hardware abstraction layer (HAL) for the device registers.

Example: Codec driver

To avoid getting lost in the DMA setup code, we'll demonstrate some of the driver concepts using a simple sample-by-sample audio codec driver that processes samples once per interrupt. This type of driver is somewhat simplistic because you'd almost always use a DMA to enable the DSP to process on frames of data rather than having to service an interrupt for every data point.

The codec's class driver

Listing 1: Application startup

The code starts, as all C programs do, in main().
The application uses the DSP/BIOS memory manager to allocate four buffers from the system's heap space and then two SIO objects are created to stream data to and from the device driver. The application uses the SIO_create() call to create the channels.

The SIO_create() function arguments indicate some of the design decisions to be made when implementing this class driver. For instance, we'll decide here that an SIO stream can be opened for either reading or writing, but not both. If bidirectional communication is required, the application simply opens two channels (as in this example). This unidirectional channel implementation is more efficient, and many data converters operate in only one direction. In addition, because most codecs operate on fixed-sized frames of data, we'll program the class driver to optimally support this and avoid the overhead incurred if variable-sized buffers are assumed.

We'll use the attributes field to specify the stream object's configuration parameters such as the number of buffers used. For the purposes of this example, we'll choose the default attributes, which specify that the application will use two buffers (in other words, double buffering).

Once main() terminates, the DSP/BIOS scheduler will then activate and allow any tasks to start running. The task called echo will start to execute once main() has completed and will continue to run until the application is terminated, as shown in Listing 2.

Listing 2: A simple application task

The SIO class driver uses an issue/reclaim model of buffer submission. This means that the buffers that are used to transmit and receive data are all "owned" by the application, and so the creator of the stream is expected to supply all of the necessary buffers.
Calls to SIO_issue() are all asynchronous (nonblocking), which enables an application thread to submit multiple buffers to the driver for either reading or writing data while continuing to execute if necessary. By contrast, calls to SIO_reclaim() are synchronous (blocking), so if no buffer is ready to be given back to the application, DSP/BIOS will perform a context switch to the next highest-priority task until a buffer becomes ready.

Another important concept in the SIO class driver design is buffer exchange. To provide efficient I/O operations with low overhead, you should avoid having data copied from one place to another during certain I/O operations in favor of recycling pointers to buffers passed between the application and the device.

Before we can start interacting with the driver in a steady state, we'll need to prime the driver with an initial set of buffers for both the input and output streams. Once this is done, our application can run in an infinite loop, reclaiming the buffers, copying data to them, and issuing them once again to the driver.

The codec's mini-driver

We won't go through the details of how to implement each of these functions, but to illustrate the concept, we'll show a simple implementation of a mini-driver's channel object structure and mdSubmitChan() function.

The channel object structure is initialized by the mini-driver's mdCreateChan() function and is shown in Listing 3. The actual design of this structure is completely up to the driver writer, but many implementations look similar to this one. Elements of this structure can include channel state information, such as information about the current I/O packet being processed, a linked list of packets queued for processing, and the callback function that's to be used to notify the class driver that a packet's processing is complete.
Listing 3: Channel object data structure

Listing 4: mdSubmitChan() function

The mdSubmitChan() function, shown in Listing 4, receives an I/O packet from the class driver and either queues the packet, if the channel is already working on a previously submitted job, or starts working on the packet right away. All of the driver state information required to accomplish this is contained in the channel-object structure. Notice that interrupts are disabled in this function to maintain the coherency of the channel state; in a well-designed driver, you should keep this period as short as possible.

A modular mini-driver architecture

A codec driver requires the programming of three peripherals: the codec itself, a serial port, and the DMA controller. Since the DMA controller and serial port for a given DSP will always be the same, we can partition the DMA and serial-port driver code from the codec code.

As we discussed earlier, a driver's functions can be divided into configuration and data movement. Since the codec and serial port communicate synchronously without software intervention, the driver code need only address moving data from the DMA to the DSP's memory. This enables us to split the codec mini-driver functions into two discrete modules. Only the mdBindDev() and mdCreateChan() functions need to be rewritten for a new codec because these functions perform initialization. The remaining functions are implemented in a generic serial-port/DMA data mover that you can use across many different mini-driver implementations. Figure 3 shows how a mini-driver can be split into generic data mover and device-specific portions.
Device-driver buffer flow

Time to modularize

In our experience, the modular approach reduces the development effort for a new codec driver by 90%. Because the driver reuses existing debugged modules, development time is also more predictable, since it's typically much harder to debug a driver than to write the initial code.

Nick Lethaby is the technology product manager for the DSP/BIOS operating system at Texas Instruments. He has over 16 years of experience in embedded and real-time software applications and has a special interest in real-time kernels. Nick has a BS in computer science from the University of London. You can reach him at nlethaby@ti.com.

David Friedland is a senior applications engineer and project manager for Texas Instruments. He has developed device drivers and other embedded systems software and currently manages the development of device drivers for DSP peripherals. He has over 17 years of experience in embedded and DSP software and has a BS in electronic engineering from San Diego State. He can be reached at dfriedland@ti.com.

Copyright 2005 © CMP Media LLC
By Nick Lethaby and David Friedland, Courtesy of Embedded Systems Programming
Dec 15 2003 (17:00 PM)
URL: http://www.embedded.com/showArticle.jhtml?articleID=16700665
Whereas microprocessors are mainly used for general purpose control, DSPs almost invariably do hard real-time data-path processing, where data samples are input in a continuous stream. DSPs are optimized to move data quickly from a peripheral to the DSP core, leading to several architectural differences from microprocessors.
Figure 1: A DSP codec driver often involves configuring more than just the codec
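To make the buffering idea concrete, here is a minimal sketch in portable C that simulates a DMA channel filling two alternating (ping-pong) buffers and raising one "interrupt" per full buffer rather than one per sample. Everything in it (SimDma, FRAME_LEN, sim_dma_init(), sim_dma_on_sample()) is a hypothetical stand-in written for this illustration; it doesn't touch real DMA or serial-port registers, which differ from one DSP to the next.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_LEN 4u            /* samples per buffer (illustrative value) */

/* Hypothetical stand-in for a DMA channel with two (ping-pong) buffers. */
typedef struct {
    int16_t        buf[2][FRAME_LEN]; /* ping and pong buffers */
    unsigned       active;            /* buffer the "DMA" is filling now */
    size_t         fill;              /* samples written to that buffer */
    unsigned       frame_irqs;        /* frame-complete interrupts so far */
    const int16_t *ready;             /* last completed buffer, or NULL */
} SimDma;

void sim_dma_init(SimDma *d)
{
    assert(d != NULL);
    d->active = 0;
    d->fill = 0;
    d->frame_irqs = 0;
    d->ready = NULL;
}

/* Called once per sample by the simulated serial port. The CPU is only
 * "interrupted" (frame_irqs incremented) when a whole buffer is full. */
void sim_dma_on_sample(SimDma *d, int16_t sample)
{
    assert(d != NULL);
    d->buf[d->active][d->fill++] = sample;
    if (d->fill == FRAME_LEN) {
        d->ready = d->buf[d->active]; /* hand the full buffer to the CPU */
        d->active ^= 1u;              /* keep filling the other buffer */
        d->fill = 0;
        d->frame_irqs++;
    }
}
```

With FRAME_LEN set to 4, feeding eight samples into the simulation raises two frame interrupts instead of eight sample interrupts, which is the saving the DMA approach is after.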
A device driver performs two main functions: device configuration and data movement.
By providing a clear abstraction between the driver and the application, you can essentially free the application of the specifics of a certain peripheral and port it more easily to new hardware. You can further apply these concepts of abstraction and reuse to the driver itself. A number of driver functions are independent of the underlying device, such as synchronization between the driver and the application. You can provide these services in a driver module that's specific to a particular class of devices but independent of any individual device. Such a module is often referred to as the "upper half" of a driver. For convenience, we'll use the term class driver to describe this module.
Figure 2: Any driver can be divided into two parts: a class driver and a mini-driver
The specifics of a peripheral are addressed in the lower half of the driver, for which we'll use the term mini-driver. The mini-driver is responsible for all device-specific initialization and control and for passing a buffer of data to (or receiving a buffer from) the class driver. The mini-driver must define a standard interface to the class driver since it enables a class driver to work with multiple mini-drivers or vice versa. For example, in a system with multiple codecs of different types, you can save code space by having just one instance of the class driver code work with all the different codec mini-drivers.
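One common way to express such a standard interface in C is a table of function pointers that each mini-driver fills in. The sketch below assumes this technique; the names (MiniDriverFxns, ClassChan, class_open(), class_write(), and the two toy codecs) are hypothetical and are not the DSP/BIOS definitions. It shows a single copy of the class-driver code working with two different mini-drivers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mini-driver interface: one function table per device type. */
typedef struct {
    int (*init)(void *dev);                             /* device-specific setup */
    int (*submit)(void *dev, void *buf, size_t nwords); /* start one transfer */
} MiniDriverFxns;

/* Class-driver channel: the device-independent half only sees the table. */
typedef struct {
    const MiniDriverFxns *fxns;
    void *dev;
} ClassChan;

int class_open(ClassChan *ch, const MiniDriverFxns *fxns, void *dev)
{
    assert(ch != NULL && fxns != NULL);
    ch->fxns = fxns;
    ch->dev = dev;
    return fxns->init(dev);
}

int class_write(ClassChan *ch, void *buf, size_t nwords)
{
    assert(ch != NULL && ch->fxns != NULL);
    return ch->fxns->submit(ch->dev, buf, nwords);
}

/* Toy device state shared by the two illustrative mini-drivers. */
typedef struct { int ready; size_t words_sent; } ToyCodec;

static int toy_init(void *dev)
{
    ToyCodec *c = (ToyCodec *)dev;
    c->ready = 1;
    c->words_sent = 0;
    return 0;
}

static int mono_submit(void *dev, void *buf, size_t nwords)
{
    ToyCodec *c = (ToyCodec *)dev;
    (void)buf;                       /* a real driver would start I/O here */
    if (!c->ready) return -1;
    c->words_sent += nwords;
    return 0;
}

/* A second device type that only accepts whole left/right pairs. */
static int stereo_submit(void *dev, void *buf, size_t nwords)
{
    if (nwords % 2 != 0) return -1;
    return mono_submit(dev, buf, nwords);
}

const MiniDriverFxns MONO_CODEC_FXNS   = { toy_init, mono_submit };
const MiniDriverFxns STEREO_CODEC_FXNS = { toy_init, stereo_submit };
```

Opening one channel with MONO_CODEC_FXNS and another with STEREO_CODEC_FXNS exercises the same class_open() and class_write() code for both devices; only the table differs.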
To illustrate this DSP device-driver architecture, let's look at an example that reveals the design decisions and implementation of the class driver and the mini-driver. We'll use the codec device driver for Texas Instruments' TMS320C5402 DSP starter kit board. The code is similar to other DSP/codec combinations, so you can adapt it as necessary.
Our example uses the SIO class driver. As with most software modules, the best way to understand the SIO class driver is to look at some actual code. The example, shown in Listing 1, simply reads sound data from the audio codec device driver, copies the data to another buffer, and then transmits it back out to the codec so that it can be heard through a speaker.
void echo(void);                    /* task function, see Listing 2 */
SIO_Handle inStream, outStream;     /* file scope: also used by echo() */
void *buf0, *buf1, *buf2, *buf3;    /* file scope: also used by echo() */

void main()
{
    /*
     * Allocate buffers for the SIO buffer exchange
     */
    buf0 = (void*) MEM_calloc(0, BUFSIZE, BUFALIGN);
    buf1 = (void*) MEM_calloc(0, BUFSIZE, BUFALIGN);
    buf2 = (void*) MEM_calloc(0, BUFSIZE, BUFALIGN);
    buf3 = (void*) MEM_calloc(0, BUFSIZE, BUFALIGN);
    /*
     * Create the task and open the I/O streams
     * (NULL selects the default attributes)
     */
    TSK_create((Fxn)echo, NULL);
    inStream = SIO_create("/codec", SIO_INPUT, BUFSIZE, NULL);
    outStream = SIO_create("/codec", SIO_OUTPUT, BUFSIZE, NULL);
    /*
     * Start the DSP/BIOS scheduler when main () exits
     */
}
void echo()
{
    int sizeRead; // Number of buffer units read
    unsigned short *inbuf, *outbuf;
    /*
     * Issue the first & second empty buffers to input stream.
     */
    SIO_issue(inStream, buf0, SIO_bufsize(inStream), NULL);
    SIO_issue(inStream, buf1, SIO_bufsize(inStream), NULL);
    /*
     * Issue the first & second empty buffers to output stream.
     */
    SIO_issue(outStream, buf2, SIO_bufsize(outStream), NULL);
    SIO_issue(outStream, buf3, SIO_bufsize(outStream), NULL);
    /*
     * Echo buffers ad infinitum.
     */
    for (;;)
    {
        /*
         * Reclaim full buffer from input stream
         * and empty from output stream.
         */
        sizeRead = SIO_reclaim(inStream, (void**)&inbuf, NULL);
        SIO_reclaim(outStream, (void**)&outbuf, NULL);
        /*
         * Copy data from input buffer to output buffer.
         */
        for (int i = 0; i < sizeRead; i++)
        {
            outbuf[i] = inbuf[i];
        }
        /*
         * Issue full buffer to output stream
         * and empty to input stream.
         */
        SIO_issue(outStream, outbuf, sizeRead, NULL);
        SIO_issue(inStream, inbuf, SIO_bufsize(inStream), NULL);
    }
}
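The issue/reclaim model and the buffer exchange it relies on can be tried out away from the target with a small, portable sketch. The code below is not the SIO implementation; Stream, stream_issue(), stream_reclaim(), and stream_device_complete() are hypothetical names, and the reclaim call returns -1 at the point where a real class driver would block the calling task. What it illustrates is that only buffer pointers move between the application and the stream, never the data itself.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BUFS 4u

/* Hypothetical stream: a FIFO of buffer pointers, oldest first. */
typedef struct {
    void  *bufs[MAX_BUFS];
    size_t head;   /* index of the oldest issued buffer */
    size_t count;  /* buffers currently held by the stream */
    size_t done;   /* how many of those the "device" has finished */
} Stream;

void stream_init(Stream *s)
{
    assert(s != NULL);
    s->head = 0;
    s->count = 0;
    s->done = 0;
}

/* Nonblocking: hand a buffer pointer to the stream. Returns -1 if full. */
int stream_issue(Stream *s, void *buf)
{
    assert(s != NULL && buf != NULL);
    if (s->count == MAX_BUFS) return -1;
    s->bufs[(s->head + s->count) % MAX_BUFS] = buf;
    s->count++;
    return 0;
}

/* Stand-in for the device finishing the oldest unfinished buffer. */
void stream_device_complete(Stream *s)
{
    assert(s != NULL);
    if (s->done < s->count) s->done++;
}

/* Give back the oldest completed buffer. Returns -1 where a real class
 * driver would block the caller until a buffer is ready. */
int stream_reclaim(Stream *s, void **buf)
{
    assert(s != NULL && buf != NULL);
    if (s->done == 0) return -1;
    *buf = s->bufs[s->head];
    s->head = (s->head + 1) % MAX_BUFS;
    s->count--;
    s->done--;
    return 0;
}
```

Issuing two buffers, completing one, and reclaiming returns the first pointer that was issued; re-issuing that same pointer puts it back at the end of the queue, which mirrors the steady-state loop in the echo task.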
As we discussed previously, the mini-driver is the lower half of the device driver and handles the device-specific chores of the driver, namely device initialization and data I/O. Because we need to support a range of DSP peripherals, including codecs, UARTs, and PCI controllers, we'll begin by defining a standard mini-driver API to support all required devices. We'll then use the standard mini-driver as a basis for the codec mini-driver. The mini-driver interface functions are defined as follows:
typedef struct {
    bool inuse;              // TRUE => channel has been opened
    int imode;               // IOM_INPUT or IOM_OUTPUT
    IOM_Packet *dataPacket;  // current active I/O packet
    QUE_Obj pendList;        // list of packets for I/O
    unsigned int *bufptr;    // pointer *within* current buffer
    unsigned int bufcnt;     // remaining samples to be handled
    IOM_TiomCallback cbFxn;  // used to notify client when complete
    void* cbArg;             // arg passed with callback function
} ChanObj, *ChanHandle;
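static int mdSubmitChan(void* chanp, IOM_Packet *packet)
{
    ChanHandle chan = (ChanHandle) chanp;
    unsigned int imask;
    imask = HWI_disable(); // disable interrupts
    if (chan->dataPacket == NULL)
    {
        /*
         * Start I/O job.
         */
        chan->bufptr = (unsigned int *)packet->addr;
        chan->bufcnt = packet->size;
        // dataPacket must be set last, to synchronize with ISR.
        chan->dataPacket = packet;
    }
    else
    {
        /*
         * There is an I/O job already pending; queue packet.
         */
        QUE_put(&chan->pendList, packet);
    }
    HWI_restore(imask); // restore interrupts
    return (IOM_PENDING);
}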
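The submit function is only half of the story: something has to consume the packet and tell the class driver when it's done. The article doesn't show that interrupt service routine, so the following is a hedged, single-threaded sketch of what the other half might look like. Packet, Chan, chan_submit(), chan_isr(), and count_done() are hypothetical simplifications that mirror the channel object's fields; interrupt masking and the DSP/BIOS QUE and IOM types are deliberately left out so the code runs anywhere.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Packet {
    unsigned int  *addr;   /* buffer to fill */
    unsigned int   size;   /* number of samples in the buffer */
    struct Packet *next;   /* link used while the packet is queued */
} Packet;

typedef void (*DoneFxn)(void *arg, Packet *packet);

typedef struct {
    Packet       *dataPacket;  /* packet being filled, NULL when idle */
    Packet       *pendHead;    /* FIFO of queued packets */
    Packet       *pendTail;
    unsigned int *bufptr;      /* next free slot in the current buffer */
    unsigned int  bufcnt;      /* samples still to be stored */
    DoneFxn       cbFxn;       /* notifies the class driver */
    void         *cbArg;
} Chan;

static void chan_start(Chan *chan, Packet *packet)
{
    chan->bufptr = packet->addr;
    chan->bufcnt = packet->size;
    chan->dataPacket = packet;   /* set last, as in the submit function */
}

/* Simplified submit: start the packet if idle, otherwise queue it. */
void chan_submit(Chan *chan, Packet *packet)
{
    assert(chan != NULL && packet != NULL && packet->size > 0);
    packet->next = NULL;
    if (chan->dataPacket == NULL) {
        chan_start(chan, packet);
    } else if (chan->pendTail == NULL) {
        chan->pendHead = chan->pendTail = packet;
    } else {
        chan->pendTail->next = packet;
        chan->pendTail = packet;
    }
}

/* Body of a hypothetical receive interrupt: store one sample and, when the
 * packet is full, start the next queued packet and run the callback. */
void chan_isr(Chan *chan, unsigned int sample)
{
    Packet *done;
    assert(chan != NULL);
    if (chan->dataPacket == NULL) return;   /* no buffer: sample dropped */
    *chan->bufptr++ = sample;
    if (--chan->bufcnt > 0) return;
    done = chan->dataPacket;
    chan->dataPacket = NULL;
    if (chan->pendHead != NULL) {
        Packet *next = chan->pendHead;
        chan->pendHead = next->next;
        if (chan->pendHead == NULL) chan->pendTail = NULL;
        chan_start(chan, next);
    }
    if (chan->cbFxn != NULL) chan->cbFxn(chan->cbArg, done);
}

/* Example callback: counts completed packets through cbArg. */
void count_done(void *arg, Packet *packet)
{
    (void)packet;
    ++*(unsigned *)arg;
}
```

Starting the next queued packet before running the callback keeps the channel busy while the class driver handles the completion. In a real driver, chan_submit() would also need the interrupt masking shown in Listing 4, because the ISR and the submit path share the channel state.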
Since DSPs usually have multiple serial ports, a DSP may interface to several different data-converter devices. Since companies often use essentially the same application across different hardware platforms, which may have a mix of different peripherals, we can look for opportunities to make the mini-driver code more reusable across devices.
Figure 3: Only the mdBindDev() and mdCreateChan() functions need to be specific to a particular codec
To better understand the flow of data through the driver and to map out the interactions between the application, device driver and device, take a look at Figure 4, which shows a step-by-step breakdown.
Figure 4: The flow of data through the driver
You can simplify DSP driver design by abstracting driver functionality into different modules that isolate device-specific code from more generic functions. Although using a modular approach to driver development requires more up-front design time and effort, you'll see significant benefit when porting the application to a new hardware configuration.