Highly integrated embedded systems are becoming so complex that component integration in these systems, as well as the resulting system test and diagnosis, is becoming a significant challenge. The difficulty of ensuring that the end system operates as intended increases the development cost and lengthens the time-to-market of that system. It can also introduce faults and code bugs that can slow acceptance of a consumer device, or make a more specialized system unsuitable for its intended use.

Make no mistake: Debugging tools for embedded systems hardware and software have made significant strides during the past decade. The ready availability of technologies such as source code debuggers, probes and JTAG access for on-chip debugging on the software side, combined with RTL verification and system-on-chip integration verification, brings unprecedented power to bear on fault identification and isolation.

But is that enough? In some cases, it can be. Many engineers meet demanding schedules with existing tools. But with increasingly complex and customized systems such as cell phones, automotive electronics/telematics and medical instruments, isolating a fault in the application software, real-time OS, programmable logic or the interactions between them can overwhelm even the most modern tools and techniques.

The most challenging problems are those in which each layer of the stack checks out but the layers in combination produce a fault or other error. Such an event is often timing-related, a familiar issue for logic designers but one that software developers are just starting to see as they take advantage of multithreading combined with hardware interrupts and response times. And these race conditions are some of the most technically difficult problems to find and fix.
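To make the race-condition point concrete, here is a minimal sketch in C. It assumes a POSIX threads environment rather than any particular RTOS, and every name in it (shared_count, unsafe_worker, locked_worker, run_workers) is hypothetical and chosen only for illustration. Several threads increment one shared counter, first without protection and then under a mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4
#define INCREMENTS_PER_THREAD 100000

/* Shared state, standing in for something like a driver's event counter. */
long shared_count;
pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;

/* Unprotected read-modify-write: two threads can read the same old value,
 * so one of the two increments is silently lost. */
void *unsafe_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
        shared_count = shared_count + 1;
    }
    return NULL;
}

/* The same loop with the update serialized by a mutex. */
void *locked_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
        pthread_mutex_lock(&count_lock);
        shared_count = shared_count + 1;
        pthread_mutex_unlock(&count_lock);
    }
    return NULL;
}

/* Runs NUM_THREADS copies of worker and returns the final counter value,
 * or -1 if a thread could not be started. */
long run_workers(void *(*worker)(void *))
{
    pthread_t threads[NUM_THREADS];
    int started = 0;
    long result;

    assert(worker != NULL);
    shared_count = 0;

    while (started < NUM_THREADS) {
        if (pthread_create(&threads[started], NULL, worker, NULL) != 0) {
            break;
        }
        started++;
    }
    /* Wait for every thread that was actually started. */
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
    }

    result = (started == NUM_THREADS) ? shared_count : -1;
    return result;
}
```

Calling run_workers(locked_worker) should return NUM_THREADS * INCREMENTS_PER_THREAD every time. Calling run_workers(unsafe_worker) may return that value on one run and a smaller one on the next, depending on how the threads happen to interleave, which is what makes such faults so hard to reproduce on the bench. On an actual RTOS the same idea would be expressed with that system's own mutex, or by briefly masking interrupts when the other party is an interrupt handler.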
Integrating across the stack

A typical stack for an Internet-enabled device, for example, might consist of a microprocessor, D/A converter, Ethernet and perhaps other hardware intellectual property encapsulated in programmable logic, as well as an RTOS and application software. Few vendors make multiple levels of this stack, and none make the entire stack. That forces designers to acquire components separately and perform custom integration after design.

It might not seem that difficult at first. Most microprocessors, programmable logic devices and RTOSes are known quantities. But that belies the growing complexity inside each component needed to support a high-end embedded system. Programmable logic devices with 100,000 gates and microprocessors with 10 million transistors serve as the platform for multitasking RTOSes that support several large running software applications. The potential for failure increases in each component, and the many interactions between these components create still more opportunities for error. Those errors show up only when the components are integrated.

Figure: Analyzing system events using a tool such as Green Hills' Event Analyzer enables system developers to track down the sources of faults that manifest themselves as system events. (Source: Green Hills Software)

So if there is a failure or less severe fault, isolating that fault can prove challenging, especially if it can logically occur at any one of several places in that stack. The flaw can be either in design or in implementation and can manifest itself in any number of ways, including system failure, application or RTOS hang, or failure of a specific feature. Such failures can't easily be traced.

Reduce integration risk

Depending on the nature of the problem, several preventative measures are possible. Board support packages make fault analysis and debugging easier because the vendor has already done a certain level of integration and testing.
That Wind River's VxWorks has been certified with the Freescale MPC 850 family, or that Green Hills' Integrity supports integrated development on an XScale-configured board, makes it possible to pair an operating system with a microprocessor and board configuration. Virtually every major RTOS vendor offers board-support packages for its software. The packages tend to offer limited integration, usually drivers and other software interfaces, but can help reduce some of the risks surrounding integration.

Modern integrated development environments (IDEs) can often be set up for specific processor/board/RTOS configurations. Green Hills' Multi IDE is especially capable in this regard, providing source debugging, event analysis and error checking entirely on-chip. Event analysis in particular provides a means to identify events such as context switches, interrupts, semaphores and even user-defined events to scrutinize software-hardware interactions.

Others are using the open-source Eclipse framework as the foundation for a comprehensive IDE. RTOS vendors like QNX and Wind River use Eclipse as an integrated tools platform that spans both software and hardware. The advantage of Eclipse lies in its universal availability (it is written in Java and freely downloadable in binary or source format) and standard plug-in architecture. Embedded designers can take any plug-in on the market, or an internal tool, and integrate it into their analysis and debugging environment.

Another solution is to obtain as much of the hardware, IP and software stack as possible from a single vendor and hold that vendor responsible for localizing and diagnosing faults and errors. This can prevent finger pointing when problems arise. Few if any vendors have their own stack from silicon to software, but several perform a level of integration for the engineer across multiple levels.

For example, the NetSilicon division of Digi International provides solutions that encompass much of this stack.
An ARM processor is incorporated as a part of an ASIC that also includes on-chip Ethernet networking and JTAG port. Depending on the configuration, it's also possible to get peripheral support, DSP, DMA channels and other options. The software stack, called NET+Works, includes networking software, an RTOS, development tools, a development board and a JTAG debugger.

This approach provides a means for the designer to have a single vendor responsible for key components, and it often requires the vendor to work closely with designers on specific implementations. But while these specialized options cover a number of potential embedded-system uses, they still represent a foundation for only a small fraction of complex embedded systems. This stack can be very useful to a small number of designers but not at all useful to the vast majority.

Mentor Graphics can offer a comparable stack but one that comprises a combination of programmable logic tools for design verification and design-for-test, combined with the Accelerated Technology Nucleus RTOS and development tools. Through its Code|Lab suite, the company provides software development tools, a compiler and debugger, and hardware connectivity to a variety of microprocessors and boards. This approach is less specialized than NetSilicon's.

The Synopsys approach focuses on formally capturing system assumptions through assertions. The assertions help formally verify the individual modules while also serving as monitors in a full-system simulation. Synopsys uses a reference verification methodology that helps designers execute on the recommended verification flow.

Open-source improvements

The open-source software movement may also be contributing toward improving the quality of complex systems, at least at the software level.
The value brought by using open-source software and tools, such as Linux, the GNU compilers, the GNU debugger and even code frameworks, is that bugs in the OS, tools or frameworks can be found by contributors to that project. If these bugs cause a system fault, patches or workarounds may already exist on the Internet. At the very least, the problem may already have been encountered by others, and debugging can be a joint yet distributed effort.

But those solutions presume a purely software flaw, which can often be found through traditional debugging techniques. Fortunately, embedded designers and developers are joining the open-source community in greater numbers, making it possible to tap expertise across the range of technologies. Online gathering places such as LinuxDevices.com (www.linuxdevices.com), as well as commercial open-source specialists like MontaVista and LynuxWorks, make it possible to leverage device expertise to address complex system integration needs.

None of those alternatives represents a complete solution to fault analysis and debugging at the system level. And depending on the requirements of a specific embedded project, some or even all of them may not be available for use. As a system design uses fewer off-the-shelf components, the likelihood of an integration fault becomes greater. Unfortunately, custom assemblies also make finding that problem more difficult.

Ultimately, the best way to avoid the complexities of fault isolation and system-level debugging might be to create a design using standard subassemblies and software. This represents a trade-off between technical perfection and ease of debugging. Increasingly complex systems make it harder for any engineer to have the background needed to encompass the details at every level, but paradoxically, that complexity makes it possible to have a better sense for the cause of a fault anywhere in the system.
Peter Varhol (peter@petervarhol.com) is a technology practitioner and writer based in Nashua, New Hampshire.