Optimizing Compilers for Low-Power Operation
By: Neil Putoff, Field Applications Engineer, Green Hills Software
Contents
The compiler as a low-power component
Accessibility
Performance benchmarks
A more meaningful benchmark test
Benchmark test the compiler/vendor
Power usage can be a critical factor in many portable wireless products, forcing designers to constantly add new power-saving tricks to their arsenal. As more complex wireless applications demand greater processing resources, power efficiency is a growing challenge.
Battery-powered devices such as cell phones and personal digital assistants (PDAs) are particularly sensitive to power consumption, as it translates directly into battery life, battery size, and the overall weight of the product. These are very important selling features for portable devices—sometimes determining the success or failure of a product.
The traditional methods of reducing the power consumption of an embedded wireless system have been rooted in hardware: reduced voltages, improved processes, and greater control over power use. These approaches have yielded substantial power savings, especially when the processor is in a suspended state.
Despite all their efforts, wireless engineers are still searching for new ways to reduce power consumption in portable designs. There is a method that is often overlooked by portable system designers. This method imposes no per-unit cost penalty, requires no changes to the circuitry, and can increase the amount of available microprocessor processing power. The secret: use the best optimizing compiler you can buy.
The compiler as a low-power component
In its simplest terms, every instruction executed by an embedded microprocessor represents battery drain. Therefore, if an embedded program can be run using fewer instruction cycles, the processor can be suspended sooner, resulting in less battery use, which equals longer battery life.
In many portable applications, the embedded software can be structured to periodically execute code, then suspend the microprocessor until the start of the next period. The power consumption of such an application will follow this general pattern (Figure 1):

In this application, the microprocessor will execute a set of functions, and then go into a low power suspend mode. If the functions can be optimized to execute using fewer instructions, then the microprocessor could be suspended sooner and battery life could be extended.
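As a rough illustration of this pattern, the average current of a periodic application can be estimated from the fraction of each period the processor spends active. The sketch below is not based on any measurement: the function names are hypothetical, and it assumes a simple two-state model (fully active or fully suspended) that ignores wake-up overhead and battery self-discharge.

```c
#include <assert.h>

/* Average current (mA) over one period for a processor that is active for
 * active_ms and suspended for the remainder of period_ms.
 * Assumed two-state model; wake-up and transition costs are ignored. */
double average_current_ma(double active_ms, double period_ms,
                          double active_ma, double suspend_ma)
{
    double duty;

    assert(period_ms > 0.0);
    assert(active_ms >= 0.0 && active_ms <= period_ms);

    duty = active_ms / period_ms;   /* fraction of the period spent executing */
    return duty * active_ma + (1.0 - duty) * suspend_ma;
}

/* Estimated battery life in hours for a given capacity (mAh) and average
 * current (mA). Ignores self-discharge and capacity derating. */
double battery_life_hours(double capacity_mah, double avg_ma)
{
    assert(avg_ma > 0.0);
    return capacity_mah / avg_ma;
}
```

With assumed figures of 50 mA active, 1 mA suspended, and a 100 ms period, trimming the active time from 10 ms to 8 ms lowers the estimated average from about 5.9 mA to about 4.92 mA.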
One way engineers try to execute fewer instructions is by accelerating execution speed. To obtain the fastest execution speed, it's hard to beat hand-coded assembly from a skilled software engineer. Unfortunately, coding in assembly is one of the most inefficient and error-prone means of software development, usually leading to prohibitively long product development cycles.
High-level languages such as C, C++, and Embedded C++ enable much more efficient and maintainable software development processes. But the burden of achieving fast-executing code is placed on the compiler, and to a large extent on the software engineer's knowledge of the compiler and related tools.
Optimizing compilers, such as Green Hills Software's C, C++, and EC++ compilers, generate embedded software that runs faster than code produced by compilers that do not optimize. To achieve fast code, the compiler can employ many different strategies, sometimes at the expense of a larger program size in memory. The number of optimization strategies and options a compiler offers is an important consideration when choosing one.
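Loop unrolling is one widely known strategy that trades program size for speed. The sketch below writes the transformation out by hand to show the idea; it is an illustration only, not the output of any particular compiler, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward loop: one addition plus one loop test and branch
 * per element. */
long sum_simple(const int *data, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Unrolled by four: the loop test and branch are paid once per four
 * elements, at the cost of a larger loop body in program memory. */
long sum_unrolled(const int *data, size_t n)
{
    long total = 0;
    size_t i = 0;

    assert(data != NULL || n == 0);

    for (; i + 4 <= n; i += 4) {
        total += data[i];
        total += data[i + 1];
        total += data[i + 2];
        total += data[i + 3];
    }
    for (; i < n; i++)          /* leftover elements when n is not a multiple of 4 */
        total += data[i];
    return total;
}
```

Both functions return the same result; an optimizing compiler may apply this kind of transformation automatically when asked to favor speed over size.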
Accessibility
Almost as important as the compiler's skill at generating optimized code is its accessibility. The smartest optimizations in the world are of no use if they can't be easily accessed and understood by the wireless engineer.
To provide accessibility, most compilers support a set of command-line switches for setting optimization options. Additionally, some tools, such as Green Hills' MULTI integrated development environment (IDE), provide an easier-to-use graphical interface for setting compiler options (Figure 2).

Performance benchmarks
The traditional way to determine the execution efficiency of a microprocessor or compiler is with benchmark software. Synthetic benchmarks such as Dhrystone have been so severely scrutinized by microprocessor and compiler vendors as to become almost irrelevant. Others are targeted toward desktop applications. Neither kind translates well into embedded application performance.
More recently, there have been efforts at benchmarking microprocessors and compilers using embedded application code. This approach uses a variety of real-world applications, and provides meaningful embedded performance numbers.
There are (at the time of writing) only a few published embedded benchmark scores. These benchmarks are initially directed at comparing the performance of competing microprocessors, which might not help you choose an optimizing compiler for a given microprocessor type and application.
Consider Figure 3, which shows the execution speed of embedded code for a variety of applications. These applications were run on the same target platform, but compiled for fastest execution speed with two different premium compilers:

Clearly, an engineer's choice of compiler will have a significant impact on the overall performance of his/her product as well as its battery life.
A more meaningful benchmark test
The most meaningful measure of a compiler's ability to produce fast-executing code for a specific product comes from running that product's own embedded software. However, porting an entire embedded application from one compiler to another can represent a nontrivial effort, especially if many compiler-specific pragmas, assembler and linker directives, and non-ANSI coding practices were used.
A better approach is to create a specific benchmark, using specific application code, and the following guidelines:
1. Use ANSI-C for the entire benchmark test.
2. Use most-often-executed functions for the test, in the same proportions as the actual application.
3. Remove all hardware dependencies so this code can be run entirely from RAM or in a simulator.
4. Wrap the called functions with a repeatable set of initialized variables and data.
5. Make the application self-terminating, with no endless loops.
6. Optionally remove comments and change descriptive names.
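The guidelines above can be sketched as a small ANSI C harness. Everything in it is a hypothetical stand-in: `smooth` and `fold` represent frequently and less frequently executed application functions, and the 4:1 call mix, the sample count, and the data pattern are assumptions chosen only for illustration.

```c
#include <assert.h>

#define BENCH_SAMPLES 64

/* Guideline 4: fixed, repeatable input data (a deterministic pattern). */
static void init_samples(int *samples)
{
    int i;

    for (i = 0; i < BENCH_SAMPLES; i++)
        samples[i] = (i * 37) % 101;
}

/* Hypothetical stand-in for a most-often-executed application function. */
static int smooth(const int *samples, int i)
{
    int prev = samples[(i + BENCH_SAMPLES - 1) % BENCH_SAMPLES];
    int next = samples[(i + 1) % BENCH_SAMPLES];

    return (prev + 2 * samples[i] + next) / 4;
}

/* Hypothetical stand-in for a less frequently executed function. */
static unsigned long fold(unsigned long acc, int value)
{
    return (acc * 31UL + (unsigned long)value) & 0xFFFFFFFFUL;
}

/* Guidelines 2, 3, and 5: assumed 4:1 call mix, no hardware access, and a
 * fixed pass count so the run always terminates. The returned checksum
 * keeps the work from being optimized away and lets results be compared
 * across compilers. */
unsigned long run_benchmark(int passes)
{
    int samples[BENCH_SAMPLES];
    unsigned long acc = 0;
    int filtered = 0;
    int p, i;

    assert(passes > 0);
    init_samples(samples);

    for (p = 0; p < passes; p++) {
        for (i = 0; i < BENCH_SAMPLES; i++) {
            filtered += smooth(samples, i);     /* called on every sample */
            if (i % 4 == 3) {                   /* called on every fourth */
                acc = fold(acc, filtered);
                filtered = 0;
            }
        }
    }
    return acc;
}
```

A `main` that calls `run_benchmark` with a fixed pass count and then returns keeps the program self-terminating. Timing is left to whatever the target offers, such as a simulator's cycle count or a hardware timer read outside the harness.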
Such a benchmark has a number of advantages:
- It can be relatively easy to construct;
- It represents the performance requirements of your application;
- It is easily tested on any number of compilers;
- The testing can be done by the compiler vendor; and
- It provides a convenient means of evaluating the performance of different microprocessor types. (This can be very handy when selecting a microprocessor for a new design.)
Benchmark test the compiler/vendor
Traditional benchmarks are focused on testing one quality of a compiler's code: execution speed. More correctly, they are a measure of the potential speed of a compiler's code, because achieving top speed from a compiler often requires detailed knowledge of the options and operation of the compiler.
This benchmark can be used to test for three important qualities of a compiler:
- Execution speed of the resulting compiled code;
- Ease of use of the compiler in achieving fast-running code; and
- Support of the compiler vendor.
Here's a scenario for selecting the best compiler for a specific application:
1. Create a portable benchmark source file.
2. Request an evaluation compiler and toolset from several reputable vendors.
3. Work with that compiler for a limited amount of time. This provides a measure of the ease-of-use of the compiler, and of the level of support of the vendor. Record the top execution speed achieved during this time period.
4. Provide access to your benchmark source code to the compiler vendor, to see what the top speed can be when compiled by an expert. This is another way of measuring their commitment to supporting their customers. It also gives you a strategy for optimizing your entire application with that vendor's compiler, should you decide to purchase. Record the top execution speed achieved by the vendor.
5. Compare the execution speed you achieved during your trial evaluation with the speed achieved by the vendor. This is another measure of the ease-of-use of the product.
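The comparison in step 5 can be reduced to a single number. The sketch below is one possible way to express it, not a standard metric; the function name is hypothetical, and it assumes both results are run times for the same benchmark on the same target.

```c
#include <assert.h>

/* Percentage by which the vendor's best run time beats your own best run
 * time. A small gap suggests the compiler's top speed is easy to reach
 * without expert help; a large gap suggests it is not. */
double tuning_gap_percent(double your_seconds, double vendor_seconds)
{
    assert(your_seconds > 0.0 && vendor_seconds > 0.0);
    return 100.0 * (your_seconds - vendor_seconds) / your_seconds;
}
```

For example, with assumed run times of 2.0 seconds in your own trial and 1.5 seconds from the vendor, the gap is 25 percent.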
By comparing the top speed, support, and ease of use of each compiler, an engineer should be able to make a well-informed decision about which toolset will help him/her achieve the longest battery life from a portable product, without compromising a development schedule.
Wrap-up
A custom benchmarking test is an ideal way to select a compiler that can squeeze out extra battery life from your products. It requires only a minimal engineering investment, and allows your software engineers to stay abreast of the latest in development tools. It is suitable for products in production as well as new designs, and is transferable across different product variations. Most importantly, it provides the competitive advantage of longer battery life—or smaller batteries—to help your portable design stand out above your competitors.
About the author:
Neil Putoff, Field Applications Engineer, Green Hills Software, Inc., 30 West Sola Street, Santa Barbara, CA 93101. Phone: 805-965-6044; Fax: 805-965-6343.
Edited by Robert Keenan