Constant folding

Constant folding and constant propagation are related compiler optimizations used by many modern compilers. An advanced form of constant propagation known as sparse conditional constant propagation can more accurately propagate constants and simultaneously remove dead code.

Constant folding

Constant folding is the process of recognizing and evaluating constant expressions at compile time rather than computing them at runtime. Terms in constant expressions are typically simple literals, such as the integer literal 2, but they may also be variables whose values are known at compile time. Consider the statement:

  i = 320 * 200 * 32;

Most modern compilers would not actually generate two multiply instructions and a store for this statement. Instead, they identify constructs such as these and substitute the computed values at compile time (in this case, 2,048,000). The resulting code would load the computed value and store it rather than loading and multiplying several values.

Constant folding can even use arithmetic identities. When x is an integer type, the value of 0 * x is zero even if the compiler does not know the value of x.

Constant folding may apply to more than just numbers. Concatenation of string literals and constant strings can be constant folded. Code such as "abc" + "def" may be replaced with "abcdef".

Constant folding can be done in a compiler's front end on the IR tree that represents the high-level source language, before it is translated into three-address code, or in the back end, as an adjunct to constant propagation.

Constant folding and cross compilation

In implementing a cross compiler, care must be taken to ensure that the behaviour of the arithmetic operations on the host architecture matches that on the target architecture, as otherwise enabling constant folding will change the behaviour of the program. This is of particular importance in the case of floating point operations, whose precise implementation may vary widely.

Constant propagation

Constant propagation is the process of substituting the values of known constants in expressions at compile time. Such constants include those defined above, as well as intrinsic functions applied to constant values. Consider the following pseudocode:

  int x = 14;
  int y = 7 - x / 2;
  return y * (28 / x + 2);

Propagating x yields:

  int x = 14;
  int y = 7 - 14 / 2;
  return y * (28 / 14 + 2);

Continuing to propagate yields the following (which would likely be further optimized by dead code elimination of both x and y.)

  int x = 14;
  int y = 0;
  return 0;

Constant propagation is implemented in compilers using reaching definition analysis results. If all a variable's reaching definitions are the same assignment which assigns a same constant to the variable, then the variable has a constant value and can be replaced with the constant.

Constant propagation can also cause conditional branches to simplify to one or more unconditional statements, when the conditional expression can be evaluated to true or false at compile time to determine the only possible outcome.

The optimizations in action

Constant folding and propagation are typically used together to achieve many simplifications and reductions, by interleaving them iteratively until no more changes occur. Consider this pseudocode, for example:

  int a = 30;
  int b = 9 - (a / 5);
  int c;

  c = b * 4;
  if (c > 10) {
     c = c - 10;
  }
  return c * (60 / a);

Applying constant propagation once, followed by constant folding, yields:

  int a = 30;
  int b = 3;
  int c;

  c = b * 4;
  if (c > 10) {
     c = c - 10;
  }
  return c * 2;

Repeating both steps twice results in:

  int a = 30;
  int b = 3;
  int c;

  c = 12;
  if (true) {
     c = 2;
  }
  return c * 2;

As a and b have been simplified to constants and their values substituted everywhere they occurred, the compiler now applies dead code elimination to discard them, reducing the code further:

  int c;

  if (true) {
     c = 2;
  }
  return c * 2;

In above code, instead of True it could be 1 or any other Boolean construct depending on compiler framework. With traditional constant propagation we will get only this much optimization. It can't change structure of the program.

There is another similar optimization, called sparse conditional constant propagation, which selects the appropriate branch on the basis of if-condition.[1] The compiler can now detect that the if statement will always evaluate to true, c itself can be eliminated, shrinking the code even further:

  return 4;

If this pseudocode constituted the body of a function, the compiler could further take advantage of the knowledge that it evaluates to the constant integer 4 to eliminate unnecessary calls to the function, producing further performance gains.

See also

References

  1. Wegman, Mark N; Zadeck, F. Kenneth (April 1991), "Constant Propagation with Conditional Branches", ACM Transactions on Programming Languages and Systems, 13 (2): 181–210, doi:10.1145/103135.103136

Further reading

This article is issued from Wikipedia - version of the 11/8/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.