Software synthesis for embedded systems
Synthetic operating systems might mean never having to port software again. Software can be automatically generated (synthesized) to meet the demands of a changing system.
For decades hardware design began with a schematic, a graphical representation of the hardware that could be turned into components and wires, a printed circuit-board layout, or a semiconductor chip. At first the process was done manually with hand-drawn schematics that were translated to layouts by highly trained draftsmen. The process was later automated with schematic-capture tools, automatic netlist generation, and physical-layout programs, though the concepts remained the same. Eventually, chip designs grew very large and a new design method was needed. Thus was hardware synthesis born. Hardware-description languages (HDLs) similar to programming languages were developed and now all chip designs begin as lines of code. The code is written at a high level that hides much of the complexity from the designer, and then is synthesized into a low-level description for layout and analysis.
I'm proposing that the time is right for a similar evolution in embedded systems software design. Synthesis is the process of taking a high-level description and turning it into a lower-level description that, in the case of software, can be compiled directly. By working at a higher level, the user is kept uninvolved with implementation details. Synthesis employs automatic code generation (ACG) but there's more to it than that.
Some popular and useful code-generation tools already exist. Microsoft Visual Basic and HTML-layout tools like Macromedia's Dreamweaver allow users to create graphics and buttons with little or no knowledge of the underlying code. These kinds of ACG tools are indispensable for creating a user interface. For generating code to perform calculations or create control functions, however, these tools are inadequate.
Trouble with graphical tools for software
If you use these graphical ACG tools, there are ways to debug the resulting code, none of which are ideal for code maintenance and all of which increase, rather than decrease, the development effort. You can update the original graphic description during debugging, but it's unlikely that many software engineers will want to continually go back to these graphical descriptions, modify them, regenerate source code, compile the code, and run it each time they discover a bug. This slows down development significantly.
You can also modify the generated source code during debugging and later attempt to modify the graphical description to generate nearly identical code. This solution is very difficult, if possible at all, and not practical in most cases.
You can also just ignore the original graphical description once the code has been generated and only work with the resulting code. This approach means the original high-level graphical description is no longer correct or reusable. In this case the ACG tool is a great starting point for a software project but provides little utility once the development process is underway.
Another major problem with graphical-entry ACG tools is interoperability. These tools use their own formats for representing algorithms and code blocks. A common data representation format could be developed, but that would reduce many of the competitive advantages of each system and so the tool vendors are unlikely to adopt anything but a simple format for sharing information. Schematic-capture vendors tackled the same problem in the 1980s; the result was a long string of standards committee meetings that eventually produced the electronic design interchange format (EDIF), an unwieldy and complex netlist format that is barely human readable for practical purposes.
Hardware description languages
Listing 1: State machine HDL source code
Software development should take a similar route to the one that has worked so well for hardware development. What we need are higher-level constructs that tell an ACG tool what kind of code needs to be generated. I call these constructs primitives. Using these primitives to handle complex functions means that the programmer does not need specialized knowledge in areas that aren't directly related to the features of the program. Suppose, for example, that you need to send a file over an Ethernet connection using FTP. You could use a primitive such as:
ethernet_send(filename, FTP);
You wouldn't need any specialized knowledge of Ethernet drivers, the FTP protocol, available APIs, or what kind of network hardware is attached to the system. Perhaps more important, you wouldn't care about the library functions available or even the operating system on which the software will run. All of this information can be given to the synthesis program, which will replace the primitive with code to send the file. That code might include direct calls to Ethernet hardware, calls to specific library routines, or a system call to the operating system, whichever is appropriate.
Listing 2: Concurrent tasks in Linux
To see how this would work, consider multithreading tasks. Look at Listing 2, which shows a main() routine in Linux that spawns two simple threads. Now look at the same operation in Windows, shown in Listing 3. Even in this simple example, the two routines are very different. Linux requires a thread structure to be created for each task. Windows requires a stack size parameter to be specified when creating a new thread. The code in Listing 4 is simpler to write and easier to understand. The software synthesis tool recognizes that SynthOS_call() is a primitive used to "call" a task in its own thread.
If the target operating system of the synthesis tool is Linux, the thread data structures and the appropriate syntax are inserted into the code to create Listing 2. If the target operating system is Windows, the code in Listing 3 is created. What is not shown in these examples is the code in the valPrint() routines, which must use mutexes to ensure that different threads of the same routine do not corrupt each other's data. Mutexes in Linux and Windows have very different structures. The valPrint() routine can be written without mutexes and the software synthesis tool would insert the code for the appropriate mutexes at the appropriate points. Software synthesis increases the usability of code. Since code is written at a higher level, it can easily be synthesized for different hardware or integrated into other programs. The synthesis tool, not the programmer, needs to understand the system in detail. In the example above, the code could be moved from a Windows machine to a Linux machine without any modifications. The synthesis program would be responsible for converting Windows system calls to Linux system calls. Software synthesis can also be tuned for different processors to optimize code. For example, if a processor has a floating-point unit, software synthesis can insert source code that accesses the floating-point unit. If the processor does not have a floating-point unit, software synthesis can insert the source code routines for performing the specific floating-point calculations required by the application. In particular, software synthesis can insert source code for only those floating-point routines that are actually used in the application. This is much more efficient than linking to a library that contains all possible floating-point routines including ones that the application never actually invokes. 
Synthesize operating system
The synthesis tool gleans much of the information about the requirements of the operating system from the application source code and driver source code. The remaining information must be supplied by the user in the form of a configuration file, an example of which you can see in Listing 5. The file is divided into sections, each specified by a keyword in square brackets. The tool section simply specifies the version of the synthesis tool to use. The project section includes the project name. It also includes the target processor in case there are some hardware-dependent optimizations that the synthesis tool can perform. This section also includes the processor word size, the input source language, the scheduling algorithm to use, descriptive comments about the project, and compiler directives to be used when compiling the code.
Listing 5: Configuration file
The source section contains the names of all the source code files in the system. The lib section contains the list of object libraries used in the system. The interrupt_global section provides the names of functions for enabling and disabling interrupts so that the synthesis tool can protect critical sections of code. Each interrupt has its own section of the configuration file for more detailed information, if required by the synthesis tool. Each task has its own section of the configuration file that includes information that allows the synthesis tool to recognize each task, insert the appropriate code and data structures into each task, and maintain the appropriate priorities, frequencies, and task latencies.
One added benefit to synthesizing the operating system is that software synthesis can detect most potential race conditions and eliminate them. Race conditions occur in global variables that are being modified by two or more tasks that are running simultaneously. One task might modify a global variable and, before it can check its value, another task has modified it again.
Software synthesis is aware of all global variables in the system and can protect them from modification by another task.
Tradeoffs
Software synthesis creates C data structures to save each task's state during a task switch. While dedicated hardware for task switching is relatively fast, this hardware must swap many registers into and out of memory for each task switch. If a processor has 256 registers, all of them must be swapped into and out of memory during a hardware-assisted task switch. Having a mechanism to swap out all of the registers can be useful for a desktop system where it's unknown at compile time which applications will be running and when they'll be running. For an embedded system, all of the applications are known at compile time. For example, software synthesis examines all the source code and knows that a certain task uses only four registers while another task uses 16 registers. A synthesized operating system may use slower software mechanisms to swap out task states but its detailed knowledge of the tasks helps it minimize the amount of state information to swap. Because of this knowledge, the synthesized operating system will perform task switching much faster than an off-the-shelf operating system in many cases.
By relying on the compiler, software synthesis tools can immediately support any processor for which a C compiler exists. Like any operating system, software synthesis tools need to have some knowledge of particular processor architectures and features, such as how to enable and disable interrupts during critical sections of code. The software synthesis tool needs to know about the board support package (BSP), which is the code used to initialize the processor after power-up and get the operating system running. These kinds of hardware dependencies are introduced to the software synthesis tool using the configuration file.
And using software synthesis to generate an operating system eliminates the need for a team of engineers to port an operating system to each new processor architecture.
An operating system created using software synthesis has no unnecessary functions. For example, many smaller, simpler embedded systems don't need hardware to perform memory management, context switching, or stack maintenance. Almost every off-the-shelf operating system includes these features and must therefore run on complex processors that support these functions. Software synthesis can generate an operating system that can be run on smaller, simpler processors that don't support these complex functions. Such processors are typically less expensive and less power hungry than the larger, complex alternatives (see the sidebar for an example).
Using FPGAs
These new FPGAs are resource-limited. While they're large in terms of gate count, they're not as large as a PC board full of chips. Memory, in particular, is a scarce commodity. Any on-chip memory that's used for data takes away from its use to implement logic. Also, memory on an FPGA is not as cost-effective as memory in a dedicated device. Discrete memory chips add cost to a design, so any tool that reduces the amount of memory needed for code and data can save precious resources and lower costs for systems based on this type of FPGA.
In order to test software synthesis on this system, we added tasks to turn the single-task system into a multitask system. One task allowed a user to press buttons to alternately set or clear any of four bits in a register used to drive LEDs that displayed the number in binary. The register drove two seven-segment LEDs showing the contents in hexadecimal. Another task rotated the value of the single LEDs from left to right like an electronic billboard. Another task was added so that when a user clicked on a link in a Web page, a short description of the page appeared on the attached LCD display.
To make these changes we had only to create the new tasks and insert several software-synthesis primitives. The resulting operating system was synthesized quickly and worked as desired, taking a mere 3KB of memory. Because the resulting code was all in C, we "ported" the resulting code to a Cypress CY8C21123 PSoC containing only 256 bytes of RAM and 4KB of flash memory. The porting consisted mainly of recompiling the code. Although the hardware-specific accesses were wrong for the Cypress part, the entire multitasking system, including the RTOS, fit easily into this tiny part, requiring only 192 bytes of RAM and less than 1KB of ROM.
Some of these FPGAs have multiple processors in a single chip. These can be "hard" processors that are permanently part of the silicon or "soft" processors in the form of HDL netlists that can be implemented in the programmable logic. One processor might control packet flow at a network port while another processor performs encryption and decryption. Soft processors can take over the function of complex state machines in many systems because software is much easier to document and change than a hardwired state machine. It's impractical to run multiple copies of off-the-shelf operating systems for each processor; the memory and logic resources of the chip would be quickly used up. However, software synthesis can create very small, relatively simple, reliable operating systems for each processor on the chip.
FPGA users have traditionally been hardware designers, not software designers. The adoption process for new, more-complex FPGAs has been controlled by the hardware engineers. These engineers have a detailed knowledge of logic design and understand the tools involved with such a design, including hardware synthesis tools. They don't, in general, have a detailed knowledge of operating system software, software race conditions, semaphores, deadlocks, priorities, mutexes, scheduling algorithms, and so on.
Software synthesis handles these aspects of the design automatically, letting the hardware engineers create embedded system software without any expert knowledge or the learning curve they would otherwise face.
Future capabilities
Because a synthesized embedded system consists of source code, including the operating system, analysis tools that work with the software synthesis tools can be created that find best-case and worst-case timing for all tasks using timing information supplied by microprocessor vendors. For exact timing, this analysis can be done after compiling the system code. The timing numbers would then be plugged into the code and a timing analysis would be performed to determine such things as worst-case latency for task switching or interrupt handling. This kind of "static code analysis" is analogous to static timing analysis for digital circuits. Other kinds of static code analysis are possible too, including memory usage analysis and resource usage analysis, before the code is ever run on real hardware.
Software synthesis can optimize the resulting source code for particular hardware platforms. For example, if the target processor has hardware for assisting with task switching, software synthesis can introduce code to take advantage of it. This optimization is only one example of how synthesis can create code that works best with hardware specified by the user. Because of this ability to target different hardware, the user isn't tied to specific hardware when writing code. The resulting code can then be synthesized for different hardware and performance can be evaluated before any hardware system has been designed.
Software synthesis can also be used to find which tasks are sharing resources. Sometimes this situation is obvious, such as when two tasks both use a shared hardware device. Sometimes this situation is more complex, as when tasks invoke other tasks that use a shared hardware device.
The synthesis tool can examine all call trees that share resources. This is not possible with compiled libraries or off-the-shelf operating systems. In a typical embedded system today, hazard conditions are detected at run-time. Many conditions can be missed because the run-time tests don't cover all possibilities. With software synthesis, hazard detection becomes a deterministic static analysis of the system source code that can potentially find all possible problems before the code is even executed.
Software synthesis enables users to experiment with different algorithms and see their effects during static code analysis or at run time. For embedded systems, different algorithms can be specified for the operating-system scheduling algorithm or for assigning task priorities. The appropriate source code can then be synthesized to implement the required algorithms. A static code analysis can show how the different algorithms affect performance. And of course, the different algorithms can be tried at run time to determine real-world performance numbers.
For decades the holy grail of system design has been hardware/software codesign where a system-level description is automatically partitioned into hardware and software. Software source code is then generated in a language like C and a hardware description is generated in an HDL like Verilog. Rather than starting with a system-level description, however, a simpler and more practical approach might be more feasible. Software synthesis, combined with hardware synthesis and a flexible system like a platform FPGA, can come much closer to this elusive goal.
Bob Zeidman is the president and founder of Zeidman Technologies, a company that develops software tools for hardware/software codesign. He is the author of the books Designing with FPGAs and CPLDs, Verilog Designer's Library, and Introduction to Verilog. Bob holds an MSEE degree from Stanford and a BSEE and BA in physics from Cornell.
His e-mail address is bob@zeidman.biz.
ACKNOWLEDGEMENTS
Copyright 2005 © CMP Media LLC
By Bob Zeidman, Courtesy of Embedded Systems Programming
Jan 20 2005 (14:32 PM)
URL: http://www.embedded.com/showArticle.jhtml?articleID=57702593
ACG tools based on the Unified Modeling Language (UML) allow programmers to create control and calculation programs graphically by using boxes to represent input, output, and processing algorithms. Some UML tools generate code based on these diagrams. The limitations of these graphical-entry ACG tools are the same ones that limited the use of schematic capture tools for hardware design. Namely, graphical-entry tools hide important information from the user. You can't look at a function block and know what the relevant parameters are. You need to be at a computer, running the ACG tools, to examine the properties of each block.
Software engineers can instead look to hardware engineers for a better solution. Nearly all chip designs are now done by coding at a high functional level in an HDL such as Verilog or VHDL. This code is then synthesized into a lower-level description that can be directly mapped to hardware. The HDL code can be read and understood by anyone who knows that particular HDL and can be edited with any text editor. All the HDL vendors support the same standard HDLs, so interoperability isn't a problem. HDL code can be reused from design to design and, unlike a graphical schematic representation, it's easy to document: the designer can add comments as needed to describe a code section. The compilation, simulation, and synthesis tools all work on the same HDL code and can produce all sorts of optimizations while leaving a complete trace back to the original source code. For comparison, Figure 1 shows a schematic design of the same circuit described in Listing 1 using an HDL (in this case, Verilog).
Figure 1: State machine schematic
// This module implements the
// memory control state machine
module mcsm(sysclk, input1, input2, state3);

/* INPUTS */
input sysclk;   // system clock
input input1;   // input from adder
input input2;   // input from adder

/* OUTPUTS */
output state3;  // the write signal

/* DECLARATIONS */
wire ps1;       // input to state1 ff
wire ps2;       // input to state2 ff
wire ps3;       // input to state3 ff
reg state1;     // state bit
reg state2;     // state bit
reg state3;     // state bit

assign ps1 = ~state2 & ~state3;
assign ps2 = state1 & input1 & input2;
assign ps3 = state2 | (state3 & input1);

// clock in the new state on
// the rising edge of sysclk
always @(posedge sysclk) begin
    state1 <= #3 ps1;
    state2 <= #3 ps2;
    state3 <= #3 ps3;
end
endmodule
void main()
{
    // Create two threads for the valPrint() task

    // Create a new thread
    Thread *the_thread1 = new Thread("child1");

    // Start the new thread running valPrint()
    the_thread1->Fork(valPrint, 1);

    // Create a new thread
    Thread *the_thread2 = new Thread("child2");

    // Start the new thread running valPrint()
    the_thread2->Fork(valPrint, 1);
}
Listing 3: Concurrent tasks in Windows
void main()
{
    // Create two threads for the valPrint() task

    // Start a new thread running valPrint()
    _beginthread(valPrint, 0, 1);

    // Start a new thread running valPrint()
    _beginthread(valPrint, 0, 1);
}
Listing 4: Concurrent tasks using software synthesis
void main()
{
    // Create two threads for the valPrint() task

    // Start a new thread running valPrint()
    SynthOS_call(valPrint(1));

    // Start a new thread running valPrint()
    SynthOS_call(valPrint(1));
}
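As a rough illustration of what a synthesis tool might emit for Listing 4 on a Linux target, the sketch below uses POSIX threads. It is not the output of any actual tool. The names run_two_tasks(), valPrint_thread(), print_count, and print_lock are hypothetical, and the body of valPrint() is assumed, since the article does not show it. The mutex stands in for the locking that the tool would insert around shared data.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical shared state; the article does not show valPrint(). */
static int print_count = 0;
static pthread_mutex_t print_lock = PTHREAD_MUTEX_INITIALIZER;

/* Assumed task body: prints its argument and counts the call. */
static void valPrint(int value)
{
    pthread_mutex_lock(&print_lock);    /* lock a tool might insert */
    printf("value = %d\n", value);
    print_count++;
    pthread_mutex_unlock(&print_lock);
}

/* Adapter with the signature that pthread_create() expects. */
static void *valPrint_thread(void *arg)
{
    valPrint((int)(intptr_t)arg);
    return NULL;
}

/* One possible expansion of two SynthOS_call(valPrint(1)) primitives.
   Returns the number of completed valPrint() calls, or -1 on failure. */
int run_two_tasks(void)
{
    pthread_t t1, t2;

    if (pthread_create(&t1, NULL, valPrint_thread, (void *)(intptr_t)1) != 0)
        return -1;
    if (pthread_create(&t2, NULL, valPrint_thread, (void *)(intptr_t)1) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }

    /* Wait for both tasks before reading the shared counter. */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return print_count;
}
```

Calling run_two_tasks() from main() and building with a pthread-enabled toolchain (for example, the -pthread option of GCC) is enough to try it. The application-level source would still contain only the two primitives; everything else here is the kind of detail the tool is meant to hide.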
Up until now we have examined how software synthesis can be used to create complex functions without the need for the programmer to understand the underlying code of the functions. We have also discussed how software synthesis can be used to create portable programs that are independent of the processor or hardware platform or operating system on which they are running.
Software synthesis also automates the process of creating the operating system itself. The operating system has evolved into a huge program in and of itself when really its function is simply to support and schedule the tasks that are running on top of it. Software synthesis can examine the source code of each task and, with some user input, create an ideal operating system in source code. The resulting operating system is optimized because features that aren't needed aren't created. This is different from modular off-the-shelf operating systems. Functions can be included or eliminated in a synthesized operating system at the granularity of the C statement rather than the predefined module level. A synthesized operating system can therefore be much more finely optimized for size and feature set than an off-the-shelf operating system.
# SynthOS Project File
[tool]
version = 1.00
# This is the start of the project section.
[project]
project_name = Project X
target = 68HC05
processor_size = 32
language = C
scheduler = round_robin
contact = Vladimir Nabokov
company = PaleFire Corporation
website = www.palefire.com
email = vlad@palefire.com
# Description of the project. Use a "\" to continue on the next line.
description = Mobile Phone Prototype \
    This is the first version. \
    I hope you'll like it.
# The compiler directives. Use a "\" to continue on the next line.
compiler_directives = $explicit \
    $base10 \
    $xyz

# The source code files are listed in the source section
[source]
file = ConfirmDialog.c
file = AboutDialog.c
file = ../BigTask.c
file = F:/SynthOS/Code Development/SmallTask.c

# The library object files are listed in the lib section
[lib]
file = iolib.o
file = ../lib/mathlib.o
file = C:/Zeidman/libraries/TCPIP.o

[interrupt_global]
enable = ON
maxEnable = 3
getMask = int intGetIntMask(void);
setMask = void setIntMask(int Mask);
enableInt = void intEnable(int Vector);
disableInt = void intDisable(int Vector);
enableAll = void intEnableAll(void);
disableAll = void intDisableAll(void);
setMaxEnable = void setMaxInterrupt(int maxInt);
checkInt = bool isEnabled(int Vector);

# This defines an individual interrupt vector
[interrupt]
interrupt = 1
vector = clockTimer

# This defines an individual interrupt vector
[interrupt]
interrupt = 2
vector = Keypad

# Each task has an associated [task] section.
# Task1
[task]
entry = Task1_routine
period = 1
priority = 0
TCBQ_depth = 10
type = call

# Task2
[task]
entry = Task2_routine
period = 3
priority = 2
TCBQ_depth = 3
type = loop
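To make the [task] sections above more concrete, here is a minimal sketch of a task table and round-robin loop that a synthesis tool could plausibly generate from them. It is an illustration only, not actual SynthOS output. The struct layout, scheduler_run(), and task_run_count() are hypothetical names, and the reading of "period" as "run once every N scheduler ticks" is an assumption, since the article does not define the key. The priority, TCBQ_depth, and type keys are carried along or omitted without being interpreted.

```c
#include <stddef.h>

/* Hypothetical task control block; fields mirror the [task] keys. */
typedef struct {
    void (*entry)(void);  /* entry = ...                              */
    unsigned period;      /* period = ... (assumed: every N ticks)    */
    unsigned priority;    /* priority = ... (unused by round robin)   */
    unsigned run_count;   /* bookkeeping for this sketch only         */
} task_t;

/* Placeholder task bodies; application code would go here. */
static void Task1_routine(void) { }
static void Task2_routine(void) { }

/* Table built from the two [task] sections of the configuration file. */
static task_t task_table[] = {
    { Task1_routine, 1, 0, 0 },
    { Task2_routine, 3, 2, 0 },
};

#define NUM_TASKS (sizeof task_table / sizeof task_table[0])

/* Visit every task in order on each tick and run those that are due. */
void scheduler_run(unsigned ticks)
{
    for (unsigned tick = 0; tick < ticks; tick++) {
        for (size_t i = 0; i < NUM_TASKS; i++) {
            task_t *t = &task_table[i];
            if (tick % t->period == 0) {
                t->entry();
                t->run_count++;
            }
        }
    }
}

/* Report how many times a task has run so far. */
unsigned task_run_count(size_t index)
{
    return task_table[index].run_count;
}
```

Because the table holds only the tasks named in the configuration file, nothing is linked in for features the application never uses, which is the size argument the article makes for a synthesized operating system.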
When a software synthesis tool synthesizes an operating system, it writes code in a high-level language like C. At this level, the code has no direct control over placing values into registers or accessing hardware directly. As with any high-level code, it's the responsibility of the compiler to optimize the resulting low-level machine code. Thus one tradeoff with regard to a synthesized operating system is that it relies on the compiler to optimize things like task switching rather than using any specialized processor hardware.
Many FPGA vendors now have products that consist of programmable logic surrounding an embedded microprocessor and interface logic. Some examples include Altera's Excalibur, QuickMIPS from QuickLogic, and Xilinx's Virtex II. I believe chips like these represent the future of embedded system design, but they need a new type of tool set for software development, and software synthesis should be part of that new tool set.
Sample implementation
To demonstrate the effectiveness of software synthesis, we applied software synthesis to an Altera development board that contained the Altera Cyclone EP1C20 FPGA with a NIOS 32-bit soft processor. The development kit came with C source code for a simple Web server. The system was a single task with no operating system. The code included a TCP/IP stack so that a computer could be connected to the board using Ethernet. The computer could access a set of Web pages, some static and some dynamic.
Software synthesis has much exciting potential for future development, particularly in the area of embedded system development. Because source code is produced for the entire system, it's possible for synthesis tools to perform analysis, optimization, and experimentation to an extent not currently possible.
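One of the analyses discussed in this section, finding tasks that share a resource through their call trees, can be sketched in a few lines of C. The call graph below is invented for illustration (two tasks, a helper, and two drivers), and the names calls, reaches(), and tasks_using() are hypothetical. A real tool would have to build the graph from the parsed source code, which this sketch does not attempt.

```c
#include <stdbool.h>

#define NUM_FUNCS 5

/* Hypothetical call graph: calls[a][b] is true when function a calls b.
   Index 0 = task A, 1 = task B, 2 = helper, 3 = UART driver,
   4 = LED driver. */
static const bool calls[NUM_FUNCS][NUM_FUNCS] = {
    /* task A */ { false, false, true,  false, false },
    /* task B */ { false, false, false, true,  true  },
    /* helper */ { false, false, false, true,  false },
    /* UART   */ { false, false, false, false, false },
    /* LED    */ { false, false, false, false, false },
};

/* Depth-first search: can 'from' reach 'target' through the graph? */
static bool reaches(int from, int target, bool visited[NUM_FUNCS])
{
    if (from == target)
        return true;
    if (visited[from])
        return false;
    visited[from] = true;

    for (int next = 0; next < NUM_FUNCS; next++) {
        if (calls[from][next] && reaches(next, target, visited))
            return true;
    }
    return false;
}

/* Count how many of the given tasks can reach the resource's driver.
   A result of two or more flags a resource that needs protection. */
int tasks_using(int resource, const int *tasks, int num_tasks)
{
    int count = 0;

    for (int i = 0; i < num_tasks; i++) {
        bool visited[NUM_FUNCS] = { false };
        if (reaches(tasks[i], resource, visited))
            count++;
    }
    return count;
}
```

In this made-up graph, task A reaches the UART driver only indirectly through the helper, which is the less obvious kind of sharing the article describes; the LED driver is reached by task B alone.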
I would like to thank a number of people who contributed to the development of software synthesis and to the writing of this article. These people include Huong Le, Loc Le, Dan Hafeman, and Michael Barr.