Baseline Compiler

General Architecture

The goal of the baseline compiler is to efficiently generate code that is "obviously correct." It also needs to be easy to port to a new platform and to be self-contained: the entire baseline compiler must be included in all Jikes RVM boot images to support dynamically loading other compilers.

Roughly two thirds of the baseline compiler is machine-independent. The main machine-independent files are BaselineCompiler and its parent, TemplateCompilerFramework. The main platform-dependent file is BaselineCompilerImpl.

Baseline compilation consists of two main steps: GC map computation (discussed below) and code generation. Code generation is straightforward, consisting of a single pass through the bytecodes of the method being compiled. The compiler does not try to optimize register usage; instead, the bytecode operand stack is held in memory. As a result, even a bytecode that pushes a constant onto the stack produces a memory write in the generated machine code. The number of memory accesses in baseline-compiled code therefore corresponds directly to the number of bytecodes. TemplateCompilerFramework contains the main code generation switch statement, which invokes the appropriate emit<bytecode>_ method of BaselineCompilerImpl.
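The single-pass, switch-driven structure described above can be illustrated with a small sketch. This is not Jikes RVM source: the class TemplateSketch, the Op enum, the emit helpers, and the pseudo machine instructions are all hypothetical names chosen for illustration. It assumes a toy instruction set of four bytecodes and shows how each bytecode expands to a fixed template that reads and writes an in-memory operand stack.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of template-style code generation; not Jikes RVM code. */
final class TemplateSketch {
    /** A toy subset of bytecodes (assumption: real compilers handle the full set). */
    enum Op { ICONST_1, ILOAD_0, IADD, IRETURN }

    /** Single pass over the bytecodes; each one expands to a fixed template. */
    static List<String> compile(Op[] bytecodes) {
        List<String> out = new ArrayList<>();
        for (Op op : bytecodes) {
            switch (op) {
                case ICONST_1: emitIconst(out, 1); break;
                case ILOAD_0:  emitIload(out, 0);  break;
                case IADD:     emitIadd(out);      break;
                case IRETURN:  emitIreturn(out);   break;
            }
        }
        return out;
    }

    // Even a constant push becomes a write to the in-memory operand stack.
    static void emitIconst(List<String> out, int value) {
        out.add("push " + value);
    }

    // Load a local from memory, then write it to the operand stack.
    static void emitIload(List<String> out, int index) {
        out.add("push [local" + index + "]");
    }

    // Operands come from memory and the result goes back to memory.
    static void emitIadd(List<String> out) {
        out.add("pop T0");
        out.add("pop T1");
        out.add("add T0, T1");
        out.add("push T0");
    }

    static void emitIreturn(List<String> out) {
        out.add("pop T0");
        out.add("ret");
    }
}
```

Calling TemplateSketch.compile on the sequence ILOAD_0, ICONST_1, IADD, IRETURN yields eight pseudo instructions, most of which touch the in-memory operand stack. No state is carried between templates, which is what keeps this style of compiler simple to verify and to port.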

GC Maps

The baseline compiler computes GC maps by abstractly interpreting the bytecodes to determine which expression stack slots and local variables contain references at the start of each bytecode. There are additional complications to handle JSRs; see the source code for details. This strategy of computing a single GC map that applies to all the internal GC points for each bytecode slightly constrains code generation. The code generator must ensure that the GC map remains valid at all GC points (including implicit GC points introduced by null pointer exceptions). It also forces the baseline compiler to report reference parameters for the various invoke bytecodes as live in the GC map for the call, because the GC map also needs to cover the various internal GC points that occur before the call is actually performed. Note that this is not an issue for the optimizing compiler, which computes GC maps for each machine code instruction that is a GC point.
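The idea of abstract interpretation for GC maps can be sketched as follows. This is a simplified illustration, not the Jikes RVM implementation: the class GcMapSketch, its Op enum, and the string encoding of a map are hypothetical, and the sketch assumes straight-line code only (no branches, exception handlers, or JSRs). Instead of tracking values, it tracks one bit per slot, reference or non-reference, and records the state at the start of each bytecode.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch of GC map computation for straight-line code only. */
final class GcMapSketch {
    /** A toy subset of bytecodes (assumption: no control flow). */
    enum Op { ALOAD_0, ILOAD_1, ASTORE_2, POP }

    /** Returns one map per bytecode, describing the state at its START. */
    static List<String> computeMaps(Op[] code, boolean[] localsAtEntry) {
        boolean[] locals = localsAtEntry.clone();   // true = slot holds a reference
        Deque<Boolean> stack = new ArrayDeque<>();  // abstract operand stack
        List<String> maps = new ArrayList<>();
        for (Op op : code) {
            maps.add(describe(locals, stack));
            switch (op) {
                case ALOAD_0:  stack.push(true);         break; // pushes a reference
                case ILOAD_1:  stack.push(false);        break; // pushes an int
                case ASTORE_2: locals[2] = stack.pop();  break; // local 2 takes the popped kind
                case POP:      stack.pop();              break;
            }
        }
        return maps;
    }

    /** Encodes a map as text: 'R' for a reference slot, '-' otherwise. */
    static String describe(boolean[] locals, Deque<Boolean> stack) {
        StringBuilder sb = new StringBuilder("locals=");
        for (boolean isRef : locals) sb.append(isRef ? 'R' : '-');
        sb.append(" stack=");
        // Print the stack from bottom to top.
        Iterator<Boolean> it = stack.descendingIterator();
        while (it.hasNext()) sb.append(it.next() ? 'R' : '-');
        return sb.toString();
    }
}
```

For the sequence ALOAD_0, ILOAD_1, POP, ASTORE_2, ALOAD_0 with only local 0 holding a reference at entry, the map recorded before the final ALOAD_0 shows local 2 as a reference and an empty stack. Because only one map is recorded per bytecode, any code emitted for that bytecode would have to keep the map valid at every internal GC point, which is the constraint the paragraph above describes.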

Command-Line Options

The command-line options to the baseline compiler are stored as fields in an object of type BaselineOptions; the source file for this class is mechanically generated by the build process. To add or modify the baseline compiler's command-line options, you must modify either BooleanOptions.dat or ValueOptions.dat. Describe your desired command-line option in the format described below in the appendix, where you will also find the details for the optimizing compiler's command-line options. Some options are common to both the baseline compiler and the optimizing compiler. They are defined by the SharedBooleanOptions.dat and SharedValueOptions.dat files found in the rvm/src-generated/options directory.