Linux scaled from desk to handheld
By Tim Bird, Chief Technology Officer, Lineo Inc., Lindon, Utah, EE Times
April 6, 2001 (2:24 p.m. EST)
URL: http://www.eetimes.com/story/OEG20010406S0055
Linux is being deployed in a range of connected "smart" embedded devices, from set-top boxes to home gateways, routers and industrial control devices. The trend bucks Linux's reputation as a large, full-featured operating system.
In common desktop use, that reputation is well-deserved. But in the embedded arena, a fundamentally different set of criteria exists for determining the requirements and features of the software, allowing it to be reduced in size.
The important point to remember when reconfiguring Linux for the smart, connected devices typical of the embedded market is that many of the features that have traditionally made Linux a great desktop OS can be removed or altered without affecting the quality of Linux itself. That can be accomplished using four main techniques:
- eliminating files;
- reducing libraries;
- using alternate implementations; and
- altering the source code.
One obvious and important benefit of using a Linux-based solution in embedded devices is the availability of the source code. Because source code is available, a number of techniques can be used to reduce the size of embedded Linux solutions. They are techniques that would not be available with proprietary, binary-only software. The techniques include:
- recompiling using existing and available compilation options;
- finding and removing library dependencies;
- increasing the modularity of open-source packages; and
- adding configuration points to the Linux kernel.
One easy route to modifying open-source packages is to use the existing configuration and compilation options that come with many pieces of software. People are often surprised to discover the large number of compilation options available with many of the familiar desktop packages. In general, the GNU project has encouraged software developers to use a fairly consistent set of mechanisms for providing configuration and compilation options for software. Those include the use of configure scripts, make-file options and header files used for source configuration.
With many open-source software packages, a configure script is provided that automatically configures the software to be compiled for a certain platform and to use certain user-defined options. That script can be called with numerous options to define how the package will eventually be compiled and linked. For example, one of the most common features available with GNU-sponsored programs is the ability to link in the readline library, which provides basic command-line-editing capabilities. When a program is compiled without support for that library, it is usually a bit smaller and does not require that the library be present on the system in order for the program to run.
There are many other options that can have similar effects on the size of the binary programs produced by the software build process. To see the list of options supported for an individual package, locate its configure script (usually in the top directory of the source code tree for the package), and invoke it with the "--help" option.
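As a rough illustration of sifting through that help output, the sketch below pulls out the options that switch something off, since those are the usual candidates for shrinking a build. The function name and the sample help text are hypothetical; a real package prints its own list, and some packages list only the "enable" or "with" form of an option.

```python
import re

# Options of the form --disable-X or --without-X at the start of a help line.
OPTION = re.compile(r"^\s*(--(?:disable|without)-[A-Za-z0-9_.+-]+)", re.MULTILINE)

def size_related_options(help_text):
    """Return the sorted --disable-*/--without-* options found in the help text."""
    return sorted(set(OPTION.findall(help_text)))

# Hypothetical excerpt standing in for the output of "./configure --help".
SAMPLE_HELP = """\
Optional Features:
  --disable-nls           do not use Native Language Support
  --enable-debug          build with debugging symbols
Optional Packages:
  --without-readline      do not link the readline library
"""

if __name__ == "__main__":
    for option in size_related_options(SAMPLE_HELP):
        print(option)
```

Against a real source tree, the help text could be captured with subprocess.run(["./configure", "--help"], capture_output=True, text=True).stdout and passed to the same function.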
Other places where compilation options are sometimes made available are the package make files and source header files. To find those options, it may be necessary to read the source code, although in many instances the online documentation for the software will describe the available options in a README or INSTALL file. In many cases, the make files and header files are automatically generated during software configuration. In such a case, any desired changes need to be made to the files from which the make files and header files are generated. Usually, those files end in the suffix ".in", as in Makefile.in or config.h.in.
If you are ready to roll up your sleeves and start modifying the actual source code for a package, the best place to start is to analyze library dependencies and eliminate areas of the code that are dependent on unique libraries or library modules. Often, with small changes to the source code for a program, it is possible to remove dependencies on large pieces of library code. The result of making such changes can be the elimination of whole libraries, or better library reduction when static or dynamic library reduction techniques are used.
Breaking dependency
To break a library dependency, you must first determine the set of libraries (and modules and symbols) being used by a program. The utility ldd can be used to find shared libraries that are required by a program. Other programs, such as objdump or nm, can be used to determine individual symbols required by a program. All of the utilities must be run on an existing executable image. For some programs (where the final binary may not be available or executable on the host development system), it is necessary to resort to reading the make file or the source code for the program itself.
There are several techniques for removing library dependencies in the code, aside from the obvious one of ripping code out. One is to use an alternate library instead of the most common one.
For example, both ncurses and slang are libraries of character-oriented screen manipulation APIs. However, slang includes a small ncurses compatibility API in addition to supporting the routines for its own APIs. Thus, many programs that use the ncurses shared library can use the slang library without modification. Depending on the library requirements of the rest of the embedded solution, it may be possible to remove the requirement for the ncurses library, merely by substituting use of the slang library. That is done by editing the make file of the offending package and changing the library link specifications.
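After such a substitution, the ldd output for the rebuilt program can be checked to confirm that the old dependency is really gone. The following is a minimal sketch of that check. The helper names, the sample output and the library version numbers are made up for illustration, and the parsing assumes the usual "name => path (address)" line layout; messages such as "statically linked" are not handled.

```python
def shared_libraries(ldd_output):
    """Return the library name at the start of each non-empty ldd output line."""
    names = []
    for line in ldd_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Typical line: "libc.so.6 => /lib/libc.so.6 (0x40060000)".
        # Some lines (such as the dynamic loader) may have no "=>" part.
        names.append(line.split("=>")[0].split("(")[0].strip())
    return names

def still_depends_on(ldd_output, prefix):
    """True if any listed library name starts with the given prefix."""
    return any(name.startswith(prefix) for name in shared_libraries(ldd_output))

# Hypothetical ldd output before and after editing the link line.
BEFORE = """\
    libncurses.so.5 => /lib/libncurses.so.5 (0x4001e000)
    libc.so.6 => /lib/libc.so.6 (0x40060000)
    /lib/ld-linux.so.2 (0x40000000)
"""
AFTER = """\
    libslang.so.1 => /lib/libslang.so.1 (0x4001e000)
    libc.so.6 => /lib/libc.so.6 (0x40060000)
    /lib/ld-linux.so.2 (0x40000000)
"""

if __name__ == "__main__":
    print("before:", still_depends_on(BEFORE, "libncurses"))
    print("after: ", still_depends_on(AFTER, "libncurses"))
```

Only when no program left in the embedded image reports the ncurses dependency can the library itself be dropped from the target file system.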
A final technique of source code modification is increasing the modularity of the code. That technique applies to all kinds of software, including libraries, user space programs and even the Linux kernel itself. In many cases, instead of just tearing out undesirable code from a piece of software, it is better to modify the software so that a particular piece of it can be made optional. Then the code can be compiled either with or without the code that should be left out.
This is more desirable than just removing the code, for several reasons. First, if the changes you have made are not disruptive and do not make the code overly complex and hard to maintain, the original author of the code may accept your changes as an improvement and incorporate them into future releases of the software. In the open-source world, this is important because it increases the number of reviewers of your code (which increases its quality), and decreases your maintenance burden over time.
Second, this is usually the least risky way to modify the code. If the changes that were made end up damaging the code, then the code reduction can be sacrificed and everything returned to normal using simple configuration changes. Finally, you might use the modified software in multiple environments, and the feature that you wish to remove for a certain project may turn out to be useful for a different project in the future.
With any source changes, it is critical to avoid breaking any of the remaining functionality of the software you are modifying. One of the best tools available to make sure that the program flow in the remaining code is consistent with what it was before you made your changes is the strace utility. This tool shows the sequence of system calls made by a program. When strace is used both before and after a program has been modified, the dumps can help identify areas of code where the fundamental sequence of actions has been disrupted by a code change.
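One way to make that before-and-after comparison less tedious is to reduce each strace dump to its sequence of system call names and compare the two sequences. The sketch below does this with Python's standard difflib module. The function names and the two sample traces are invented for illustration, and arguments and return values are deliberately ignored, so it catches only calls that were added, dropped or reordered.

```python
import difflib
import re

# A system call name at the start of a line, optionally preceded by a pid
# (as strace prints when it follows child processes).
SYSCALL = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")

def syscall_names(trace_text):
    """Return the system call names in the order they appear in a trace."""
    names = []
    for line in trace_text.splitlines():
        match = SYSCALL.match(line)
        if match:  # signal and exit lines do not match and are skipped
            names.append(match.group(1))
    return names

def sequence_changes(before, after):
    """Return '-name' / '+name' entries for calls removed from or added to the sequence."""
    diff = difflib.unified_diff(syscall_names(before), syscall_names(after),
                                "before", "after", n=0, lineterm="")
    return [entry for entry in diff
            if entry.startswith(("+", "-")) and not entry.startswith(("+++", "---"))]

# Hypothetical traces of the same program before and after a source change.
BEFORE = """\
open("/etc/app.conf", O_RDONLY) = 3
read(3, "mode=1", 4096) = 6
close(3) = 0
write(1, "ok", 2) = 2
+++ exited with 0 +++
"""
AFTER = """\
open("/etc/app.conf", O_RDONLY) = 3
read(3, "mode=1", 4096) = 6
write(1, "ok", 2) = 2
+++ exited with 0 +++
"""

if __name__ == "__main__":
    for change in sequence_changes(BEFORE, AFTER):
        print(change)
```

In this made-up case the comparison reports that the close call disappeared, which would be a hint that the code change removed more than intended.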
One area of increased modularity that Lineo has been working on is in the Linux kernel itself. The x86 version of the 2.2.13 Linux kernel has about 1,300 configuration points. At Lineo we believe this is too few. Although the kernel may be difficult to configure for a desktop user, we believe that adding configuration points that increase the modularity of the Linux kernel will leave Linux better suited for the wider variety of embedded uses for which it might be employed.
Although there are already a large number of kernel configuration points, many of the parts of the Linux kernel are not modular enough. This means that the granularity of selection is not fine enough to allow only the strictly required pieces to be selected.
For example, in the Linux kernel, there is a single option that controls whether System V IPC mechanisms are included in the kernel. However, System V IPC actually includes three services: shared memory, semaphores and message queues. Rarely does a single application use all three services. The services share a small amount of code, but other than that they are largely independent sets of routines.
By modifying the source and adding configuration points for those services as separate services, the modularity of the kernel can be increased. Another area of the kernel that has a large amount of code controlled by a single option is the proc file system. When the proc file system is enabled, a mass of source code is included.