14 Core Runtime Services

The Jikes RVM runtime environment implements a variety of services which a Java application relies upon for correct execution. The services include:

Object Model: The way objects are represented in storage.
Class and Code Management: The mechanism for loading and representing classes from class files, and the mechanism that triggers compilation and linking of methods and subsequent storage of generated code.
Thread Management: thread creation, scheduling and synchronization/exclusion
JNI: Native interface for writing native methods and invoking the virtual machine from native code.
Calling Conventions: calling conventions used for invoking methods in Jikes RVM
Exception Management: hardware exception trapping and software exception delivery.
Bootstrap: getting an initial Java application running in a fully functional Java execution environment

The requirement for many of these runtime services is clearly visible in language primitives such as new and throw and in java.lang and java.io APIs such as Thread.run(), System.out.println(), File.open() etc. Unlike conventional Java APIs, which merely modify the state of Java objects created by the Java application, implementation of these primitives requires interaction with and modification of the platform (hardware and system software) on which the Java application is being executed.

In addition to the services described above, Jikes RVM also provides some services that are speciﬁc to its purpose as a research tool:

VM Callbacks: Notifications about potentially interesting events in the VM.

14.1 Object Model

An object model dictates how to represent objects in storage; the best object model will maximize eﬃciency of frequent language operations while minimizing storage overhead. Jikes RVM’s object model is deﬁned by ObjectModel.

14.1.1 Overview

Values in the Java™ programming language are either primitive (e.g. int, double, etc.) or they are references (that is, pointers) to objects. Objects are either arrays having elements or scalar objects having ﬁelds. Objects are logically composed of two primary sections: an object header (described in more detail below) and the object’s instance ﬁelds (or array elements).

The following non-functional requirements govern the Jikes RVM object model:

instance ﬁeld and array accesses should be as fast as possible,
null-pointer checks should be performed by the hardware if possible,
method dispatch and other frequent runtime services should be fast,
other (less frequent) Java operations should not be prohibitively slow, and
per-object storage overhead (i.e. object header size) should be as small as possible.

Assuming the reference to an object resides in a register, compiled code can access the object’s ﬁelds at a ﬁxed displacement in a single instruction. To facilitate array access, the reference to an array points to the ﬁrst (zeroth) element of an array and the remaining elements are laid out in ascending order. The number of elements in an array, its length, resides just before its ﬁrst element. Thus, compiled code can access array elements via base + scaled index addressing.
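
The address arithmetic described above can be sketched as follows. This is an illustration only, not code taken from Jikes RVM: the class and method names are hypothetical, addresses are modelled as plain long values, and the 4-byte width of the length field is an assumption made for the example.

```java
// Illustrative model of array addressing; not Jikes RVM source code.
final class ArrayAddressingSketch {
    // Assumption for this sketch: the length is a 4-byte int stored
    // immediately before element 0.
    static final int LENGTH_FIELD_BYTES = 4;

    /** Address of element `index`, given a reference that points at element 0. */
    static long elementAddress(long arrayRef, int index, int log2ElementSize) {
        // base + scaled index: the index is shifted by log2 of the element size
        return arrayRef + ((long) index << log2ElementSize);
    }

    /** Address of the length field, which sits just before element 0. */
    static long lengthAddress(long arrayRef) {
        return arrayRef - LENGTH_FIELD_BYTES;
    }
}
```

For an int array whose reference is 0x1000, element 3 would sit at 0x1000 + (3 << 2) = 0x100C under these assumptions.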

The Java programming language requires that an attempt to access an object through a null object reference generates a NullPointerException. In Jikes RVM, references are machine addresses, and null is represented by address 0. On Linux, accesses to both very low and very high memory can be trapped by the hardware, thus all null checks can be made implicit.

14.1.2 Object Header

Logically, every object header contains the following components:

TIB Pointer: The TIB (Type Information Block) holds information that applies to all objects of a type. The structure of the TIB is defined by TIBLayoutConstants. A TIB includes the virtual method table, a pointer to an object representing the type, and pointers to a few data structures to facilitate efficient interface invocation and dynamic type checking.
Hash Code: Each Java object has an identity hash code. This can be read by Object.hashCode() or, if that method has been overridden, by System.identityHashCode(). The default hash code is usually the location in memory of the object; however, with some garbage collectors objects can move. So that the hash code remains the same, space in the object header may be used to hold the original hash code value.
Lock: Each Java object has an associated lock state. This could be a pointer to a lock object or a direct representation of the lock.
Array Length: Every array object provides a length field that contains the length (number of elements) of the array.
Garbage Collection Information: Each Java object has associated information used by the memory management system. Usually this consists of one or two mark bits, but this could also include some combination of a reference count, forwarding pointer, etc.
Misc Fields: In experimental configurations, the object header can be expanded to add additional fields to every object, typically to support profiling.

An implementation of this abstract header is deﬁned by two ﬁles:

JavaHeader, which supports TIB access, default hash codes, and locking. It also provides a few bits for use by the memory management subsystem.
MiscHeader, which supports adding additional ﬁelds to all objects.

Information that is speciﬁc to garbage collection uses the available bits from the Java header. Depending on the chosen garbage collector, the available bits can be accessed via an appropriate class, e.g.:

HeaderByte which provides access to methods for logging and unlogging for various collectors
RCHeader for reference counting garbage collectors
ForwardingWord which provides methods for object forwarding which is used by some copying collectors

14.1.3 Field Layout

Fields tend to be recorded in the Java class ﬁle in the order they are declared in the Java source ﬁle. We lay out ﬁelds in the order they are declared with some exceptions to improve alignment and pack the ﬁelds in the object.

Fields of type double and long beneﬁt from being 8 byte aligned. Every RVMClass records the preferred alignment of the object as a whole. We lay out double and long ﬁelds ﬁrst (and object references if these are 8 bytes long) so that we can avoid making holes in the ﬁeld layout for alignment. We don’t do this for smaller ﬁelds as all objects need to be a multiple of 4 bytes in size.

When we lay out fields we may create holes to improve alignment. For example, when an int follows a byte, we create a 3-byte hole after the byte to keep the int 4-byte aligned. Holes in the field layout can be 1, 2 or 4 bytes in size. As fields are laid out, holes are used to avoid increasing the size of the object. Subclasses inherit the hole information of their parent, so holes in the parent object can be reused by their children.
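
The hole-reuse idea can be sketched as follows. This is a simplified model written for this guide, not the layout code used by Jikes RVM: the class FieldLayoutSketch and its methods are hypothetical, fields are assumed to be naturally aligned to their own size, and a gap is recorded as one hole of whatever size results rather than being split into 1, 2 and 4 byte pieces.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of field layout with hole reuse; not Jikes RVM source code.
final class FieldLayoutSketch {
    /** A free gap of `size` bytes starting at `offset`. */
    private static final class Hole {
        final int offset, size;
        Hole(int offset, int size) { this.offset = offset; this.size = size; }
    }

    private final List<Hole> holes = new ArrayList<>();
    private int end = 0; // first unused byte after the fields laid out so far

    private static int alignUp(int value, int alignment) {
        return (value + alignment - 1) & -alignment; // alignment is a power of two
    }

    /** Places a field of 1, 2, 4 or 8 bytes and returns its offset. */
    int place(int size) {
        // First try to reuse an existing hole, so the object does not grow.
        for (int i = 0; i < holes.size(); i++) {
            Hole h = holes.get(i);
            int start = alignUp(h.offset, size);
            int holeEnd = h.offset + h.size;
            if (start + size <= holeEnd) {
                holes.remove(i);
                // Keep whatever is left of the hole before and after the field.
                if (start > h.offset) holes.add(new Hole(h.offset, start - h.offset));
                if (start + size < holeEnd) holes.add(new Hole(start + size, holeEnd - (start + size)));
                return start;
            }
        }
        // Otherwise append, recording any alignment padding as a new hole.
        int start = alignUp(end, size);
        if (start > end) holes.add(new Hole(end, start - end));
        end = start + size;
        return start;
    }

    /** Total bytes occupied by the fields, including any unused holes. */
    int size() { return end; }
}
```

Placing a byte, an int, a short and another byte in that order yields offsets 0, 4, 2 and 1 in this model: the short and the second byte fill the gap left after the first byte, and the total stays at 8 bytes. A subclass layout could start from a copy of the parent's holes and end offset, which is how inherited holes would be reused; the sketch omits that step.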

14.2 Class and Code Management

The runtime maintains a database of Java instances which identifies the currently loaded class and method base. The classloader class base enables the runtime to identify and dynamically load undefined classes as they are required during execution. All the classes, methods and compiled code arrays required to enable the runtime to operate are pre-installed in the initial boot image. Other runtime classes and application classes are loaded dynamically as they are needed during execution and have their methods compiled lazily. The runtime can also identify the latest compiled code array (and, on occasion, previously generated versions of compiled code) of any given method via this class base and recompile it dynamically should it wish to do so.

Lazy method compilation postpones compilation of a dynamically loaded class’ methods at load-time, enabling partial loading of the class base to occur. Immediate compilation of all methods would require loading of all classes mentioned in the bytecode in order to verify that they were being used correctly. Immediate compilation of these class’ methods would require yet more loading and so on until the whole classbase was installed. Lazy compilation delays this recursive class loading process by postponing compilation of a method until it is ﬁrst called.

Lazy compilation works by generating a stub for each of a class’ methods when the class is loaded. If the method is an instance method this stub is installed in the appropriate TIB slot. If the method is static it is placed in a linker table located in the JTOC (linker table slots are allocated for each static method when a class is dynamically loaded). When the stub is invoked it calls the compiler to compile the method for real and then jumps into the relevant code to complete the call. The compiler ensures that the relevant TIB slot/linker table slot is updated with the new compiled code array. It also handles any race conditions caused by concurrent calls to the dummy method code ensuring that only one caller proceeds with the compilation and other callers wait for the resulting compiled code.
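
The stub mechanism can be sketched in plain Java as follows. This is only a model of the idea: LazySlotSketch is a hypothetical class, an IntUnaryOperator stands in for both the method's bytecode and its compiled code, and a synchronized method stands in for whatever the compiler does to ensure that only one caller compiles.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntUnaryOperator;

// Model of a lazily compiled method slot; not Jikes RVM source code.
final class LazySlotSketch {
    private final IntUnaryOperator methodSource;   // stands in for the method's bytecode
    private volatile IntUnaryOperator slot;        // stands in for a TIB or JTOC slot
    private boolean compiled = false;              // guarded by "this"
    final AtomicInteger compilations = new AtomicInteger();

    LazySlotSketch(IntUnaryOperator methodSource) {
        this.methodSource = methodSource;
        this.slot = this::stub;                    // the slot starts out holding the stub
    }

    /** Every call goes through the slot, whatever it currently holds. */
    int invoke(int arg) {
        return slot.applyAsInt(arg);
    }

    // The stub: make sure the method is compiled, then complete the original call.
    private int stub(int arg) {
        compileIfNeeded();
        return slot.applyAsInt(arg);
    }

    // Only the first caller "compiles" and patches the slot; concurrent
    // callers wait on the monitor and then find the work already done.
    private synchronized void compileIfNeeded() {
        if (!compiled) {
            compilations.incrementAndGet();        // pretend to compile
            slot = methodSource;                   // install the "compiled code"
            compiled = true;
        }
    }
}
```

After the first call the slot no longer refers to the stub, so later calls go straight to the installed code and the compilation counter stays at one.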

14.2.1 Class Loading

Jikes™ RVM implements the Java™ programming language’s dynamic class loading. While a class is being loaded it can be in one of seven states. These are:

vacant: The RVMClass object for this class has been created and registered and is in the process of being loaded.
loaded: The class’s bytecode ﬁle has been read and parsed successfully. The modiﬁers and attributes for the ﬁelds have been loaded and the constant pool has been constructed. The class’s superclass (if any) and superinterfaces have been loaded as well.
resolved: The superclass and superinterfaces of this class have been resolved. The offsets (whether in the object itself, the JTOC, or the class’s TIB) of its fields and methods have been calculated.
instantiated: The superclass has been instantiated and pointers to the compiled methods or lazy compilation stubs have been inserted into the JTOC (for static methods) and the TIB (for virtual methods).
initializing: The superclass has been initialized and the class initializer is being run.
initialized: The superclass has been initialized and the class initializer has been run.
class initializer has failed: There was an exception during execution of the <clinit> method so the class cannot be initialized successfully.

14.2.2 Code Management

A compiled method body is an array of machine instructions (stored as ints on PowerPC™ and bytes on x86-32). The Jikes RVM Table of Contents (JTOC) stores pointers to static fields and methods. However, pointers for instance fields and instance methods are stored in the receiver class’s TIB. Consequently, the dispatch mechanism differs between static methods and instance methods.

The JTOC

The JTOC holds pointers to each of Jikes™ RVM’s global data structures, as well as literals, numeric constants and references to String constants. The JTOC also contains references to the TIB for each class in the system. Since these structures can have many types and the JTOC is declared to be an array of ints, Jikes RVM uses a descriptor array, co-indexed with the JTOC, to identify the entries containing references. The JTOC is depicted in the figure below.

Virtual Methods

A TIB contains pointers to the compiled method bodies (executable code) for the virtual methods and other instance methods of its class. Thus, the TIB serves as Jikes RVM’s virtual method table. A virtual method dispatch entails loading the TIB pointer from the object reference, loading the address of the method body at a given oﬀset oﬀ the TIB pointer, and making an indirect branch and link to it. A virtual method is dispatched to with the invokevirtual bytecode; other instance methods are invoked by the invokespecial bytecode.
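
The dispatch sequence can be modelled in Java as two loads followed by an indirect call. The classes below are hypothetical stand-ins invented for this sketch (Obj for an object with a header, Tib for its type information block, MethodBody for compiled code), and the slot number is an arbitrary choice.

```java
// Model of virtual dispatch through a TIB; not Jikes RVM source code.
final class TibDispatchSketch {
    /** Stand-in for a compiled method body. */
    interface MethodBody { String run(Obj receiver); }

    /** Stand-in for a Type Information Block: one slot per virtual method. */
    static final class Tib {
        final MethodBody[] virtualMethods;
        Tib(MethodBody... virtualMethods) { this.virtualMethods = virtualMethods; }
    }

    /** Stand-in for an object whose header holds a TIB pointer. */
    static final class Obj {
        final Tib tib;
        Obj(Tib tib) { this.tib = tib; }
    }

    // A virtual method keeps the same slot in its defining class and in every subclass.
    static final int DESCRIBE_SLOT = 0;

    static String invokeVirtual(Obj receiver, int slot) {
        Tib tib = receiver.tib;                      // load the TIB pointer from the object
        MethodBody body = tib.virtualMethods[slot];  // load the method body at a fixed offset
        return body.run(receiver);                   // indirect call
    }
}
```

An overriding subclass would simply install a different body in DESCRIBE_SLOT of its own Tib; the call site does not change.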

Static Fields and Methods

Static fields and pointers to static method bodies are stored in the JTOC. Static method dispatch is simpler than virtual dispatch, since a well-known JTOC entry holds the address of the compiled method body.

Instance Initialization Methods

Pointers to the bodies of instance initialization methods, <init>, are stored in the JTOC. (They are always dispatched to with the invokespecial bytecode.)

Lazy Method Compilation

Method slots in a TIB or the JTOC may hold either a pointer to the compiled code, or a pointer to the compiled code of the lazy method invocation stub. When invoked, the lazy method invocation stub compiles the method, installs a pointer to the compiled code in the appropriate TIB or the JTOC slot, then jumps to the start of the compiled code.

Interface Methods

Regardless of whether or not a virtual method is overridden, virtual method dispatch is still simple since the method will occupy the same TIB offset in its defining class and in every subclass. However, a method invoked through an invokeinterface call, rather than an invokevirtual call, will not occupy the same TIB offset in every class that implements its interface. This complicates dispatch for invokeinterface.

The simplest, and least efficient, way of locating an interface method is to search all the virtual method entries in the TIB to find a match. Instead, Jikes RVM uses an Interface Method Table (IMT) which resembles a virtual method table for interface methods. Any method that could be an interface method has a fixed offset into the IMT just as with the TIB. However, unlike in the TIB, two different methods may share the same offset into the IMT. In this case, a conflict resolution stub is inserted in the IMT. Conflict resolution stubs are custom-generated machine code sequences that test the value of a hidden parameter to dispatch to the desired interface method. For more details, see InterfaceInvocation.
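
A minimal sketch of the IMT idea follows. It is not the InterfaceInvocation implementation: the class ImtSketch, the table size of 4, the use of "method id modulo table size" as the slot function, and the map lookup that stands in for a generated conflict resolution stub are all assumptions made for illustration. Method ids are assumed to be non-negative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Model of an Interface Method Table with conflict resolution; not Jikes RVM source code.
final class ImtSketch {
    static final int IMT_SIZE = 4; // assumption: a tiny table, so conflicts are easy to provoke

    /** An IMT entry receives the interface method id as the "hidden parameter". */
    interface Entry { String invoke(int hiddenMethodId); }

    private final Entry[] slots = new Entry[IMT_SIZE];
    private final List<Map<Integer, Supplier<String>>> targets = new ArrayList<>();

    ImtSketch() {
        for (int i = 0; i < IMT_SIZE; i++) targets.add(new HashMap<>());
    }

    /** Registers the body of the interface method with the given id. */
    void add(int methodId, Supplier<String> body) {
        int slot = methodId % IMT_SIZE;
        Map<Integer, Supplier<String>> inSlot = targets.get(slot);
        inSlot.put(methodId, body);
        if (inSlot.size() == 1) {
            // No conflict: the slot points directly at the method body.
            slots[slot] = hidden -> body.get();
        } else {
            // Conflict: install a stub that tests the hidden parameter.
            slots[slot] = hidden -> inSlot.get(hidden).get();
        }
    }

    /** Dispatch: index the IMT at a fixed offset and pass the id along. */
    String invokeInterface(int methodId) {
        return slots[methodId % IMT_SIZE].invoke(methodId);
    }

    boolean conflictsIn(int slot) {
        return targets.get(slot).size() > 1;
    }
}
```

With this table size, method ids 1 and 5 share slot 1 and are told apart by the stub, while id 2 has slot 2 to itself and is reached directly.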

14.3 Thread Management

This section provides some explanation of how Java™ threads are scheduled and synchronized by Jikes™ RVM.

All Java threads (application threads, garbage collector threads, etc.) derive from RVMThread. Each RVMThread maps directly to one native thread, which may be implemented using whichever C/C++ threading library is in use (currently pthreads). Unless -X:availableProcessors or -X:gc:threads is used, native threads are allowed to be arbitrarily scheduled by the OS using whatever processor resources are available; Jikes™ RVM does not attempt to control the thread-processor mapping at all.

Using native threading gives Jikes™ RVM better compatibility with existing JNI code, as well as improved performance and greater infrastructure simplicity. Scheduling is offloaded entirely to the operating system; this is both what native code would expect and what maximizes the OS scheduler’s ability to optimally schedule Java™ threads. As well, the resulting VM infrastructure is both simpler and more robust, since instead of focusing on scheduling decisions it can take a “hands-off” approach except when Java threads have to be preempted for sampling, on-stack-replacement, garbage collection, Thread.suspend(), or locking. The main task of RVMThread and other code in org.jikesrvm.scheduler is thus to override OS scheduling decisions when the VM demands it.

The remainder of this section is organized as follows. The management of a thread’s state is discussed in detail. Mechanisms for blocking and handshaking threads are described. The VM’s internal locking mechanism, the Monitor, is described. Finally, the locking implementation is discussed.

14.3.1 Tracking the Thread State

The state of a thread is broken down into two elements:

Should the thread yield at a safe point?
Is the thread running Java code right now?

The ﬁrst mechanism is provided by the RVMThread.takeYieldpoint ﬁeld, which is 0 if the thread should not yield, or non-zero if it should yield at the next safe point. Negative versus positive values indicate the type of safe point to yield at (epilogue/prologue, or any, respectively).

But this alone is insufficient to manage threads, as it relies on all threads being able to reach a safe point in a timely fashion. New Java threads may be started at any time, including at the exact moment that the garbage collector is starting; a starting-but-not-yet-started thread may not reach a safe point if the thread that was starting it is already blocked. Java threads may terminate at any time; terminated threads will never again reach a safe point. Any Java thread may call into arbitrary JNI code, which is outside of the VM’s control, and may run for an arbitrary amount of time without reaching a Java safe point. As well, other mechanisms of RVMThread may cause a thread to block, thereby making it incapable of reaching a safe point in a timely fashion. However, in each of these cases, the Java thread is “effectively safe” - it is not running Java code that would interfere with the garbage collector, on-stack-replacement, locking, or any other Java runtime mechanism. Thus, a state management system is needed that would notify these runtime services when a thread is “effectively safe” and does not need to be waited on.

RVMThread provides for the following thread states, which describe to other runtime services the state of a Java thread. These states are designed with extreme care to support the following features:

Allow Java threads to execute either Java code, which periodically reaches safe points, or native code, which is “effectively safe” by virtue of not having access to VM services.
Allow other threads (either Java threads or VM threads) to asynchronously request a Java thread to block. This overlaps with the takeYieldpoint mechanism, but adds the following feature: a thread that is “effectively safe” does not have to block.
Prevent race conditions on state changes. In particular, if a thread running native code transitions back to running Java code while some other thread expects it to be either “effectively safe” or blocked at a safe point, then it should block. As well, if we are waiting on some Java thread to reach a safe point but it instead escapes into running native code, then we would like to be notified that even though it is not at a safe point, it is now effectively safe, and thus, we do not have to wait for it anymore.

The states used to put these features into eﬀect are listed below.

NEW. This means that the thread has been created but is not started, and hence is not yet running. NEW threads are always effectively safe, provided that they do not transition to any of the other states.
IN_JAVA. The thread is running Java code. This almost always corresponds to the OS “runnable” state - i.e. the thread has no reason to be blocked, is on the runnable queue, and if a processor becomes available it will execute, if it is not already executing. An IN_JAVA thread will periodically reach safe points at which the takeYieldpoint field will be tested. Hence, setting this field will ensure that the thread will yield in a timely fashion, unless it transitions into one of the other states in the meantime.
IN_NATIVE. The thread is running either native C code, or internal VM code (which, by virtue of Jikes™ RVM’s metacircularity, may be written in Java). IN_NATIVE threads are “effectively safe” in that they will not do anything that interferes with runtime services, at least until they transition into some other state. The IN_NATIVE state is most often used to denote threads that are blocked, for example on a lock.
IN_JNI. The thread has called into JNI code. This is identical to the IN_NATIVE state in all ways except one: IN_JNI threads have a JNIEnvironment that stores more information about the thread’s execution state (stack information, etc), while IN_NATIVE threads save only the minimum set of information required for the GC to perform stack scanning.
IN_JAVA_TO_BLOCK. This represents a thread that is running Java code, as in IN_JAVA, but has been requested to yield. In most cases, when you set takeYieldpoint to non-zero, you will also change the state of the thread from IN_JAVA to IN_JAVA_TO_BLOCK. If you don’t intend on waiting for the thread (for example, in the case of sampling, where you’re opportunistically requesting a yield), then this step may be omitted; but in the cases of locking and garbage collection, when a thread is requested to yield using takeYieldpoint, its state will also be changed.
BLOCKED_IN_NATIVE. BLOCKED_IN_NATIVE is to IN_NATIVE as IN_JAVA_TO_BLOCK is to IN_JAVA. When requesting a thread to yield, we check its state; if it’s IN_NATIVE, we set it to be BLOCKED_IN_NATIVE.
BLOCKED_IN_JNI. Same as BLOCKED_IN_NATIVE, but for IN_JNI.
TERMINATED. The thread has died. It is “effectively safe”, but will never again reach a safe point.

The states are stored in RVMThread.execStatus, an integer ﬁeld that may be rapidly manipulated using compare-and-swap. This ﬁeld uses a hybrid synchronization protocol, which includes both compare-and-swap and conventional locking (using the thread’s Monitor, accessible via the RVMThread.monitor() method). The rules are as follows:

All state changes except for IN_JAVA to IN_NATIVE or IN_JNI, and IN_NATIVE or IN_JNI back to IN_JAVA, must be done while holding the lock.
Only the thread itself can change its own state without holding the lock.
The only asynchronous state changes (changes to the state not done by the thread that owns it) that are allowed are IN_JAVA to IN_JAVA_TO_BLOCK, IN_NATIVE to BLOCKED_IN_NATIVE, and IN_JNI to BLOCKED_IN_JNI.

The typical algorithm for requesting a thread to block looks as follows:

thread.monitor().lockNoHandshake();
if (thread is running) {
   thread.takeYieldpoint=1;

   // transitions IN_JAVA -> IN_JAVA_TO_BLOCK, IN_NATIVE->BLOCKED_IN_NATIVE, etc.
   thread.setBlockedExecStatus();

   if (thread.isInJava()) {
     // Thread will reach safe point soon, or else notify
     // us that it left to native code.
     // In either case, since we are holding the lock,
     // the thread will effectively block on either the safe point
     // or on the attempt to go to native code, since performing
     // either state transition requires acquiring the lock,
     // which we are now holding.
   } else {
     // Thread is in native code, and thus is "effectively safe",
     // and cannot go back to running Java code so long as we hold
     // the lock, since that state transition requires
     // acquiring the lock.
   }
}
thread.monitor().unlock();

Most of the time, you do not have to write such code, as the cases of blocking threads are already implemented. For examples of how to utilize these mechanisms, see RVMThread.block(), RVMThread.hardHandshakeSuspend(), and RVMThread.softHandshake(). A discussion of how to use these methods follows in the section below.
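
To make the role of compare-and-swap concrete, the following sketch shows one way a method like setBlockedExecStatus() could perform the three permitted asynchronous transitions. The method name comes from the listing above, but the body, the class BlockRequestSketch and the numeric state values are assumptions made for this example and should not be read as the RVMThread implementation.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Model of the "request to block" state change; not Jikes RVM source code.
final class BlockRequestSketch {
    // Hypothetical encodings; the real values of these constants are not given here.
    static final int IN_JAVA = 0, IN_NATIVE = 1, IN_JNI = 2,
                     IN_JAVA_TO_BLOCK = 3, BLOCKED_IN_NATIVE = 4, BLOCKED_IN_JNI = 5;

    final AtomicInteger execStatus = new AtomicInteger(IN_JAVA);

    /**
     * Performs one of the asynchronous transitions allowed by the rules above
     * and returns the resulting status. The caller is assumed to hold the
     * thread's monitor. The compare-and-swap loop is still needed because the
     * target thread may move between IN_JAVA and IN_NATIVE or IN_JNI on its
     * own, without taking the lock.
     */
    int setBlockedExecStatus() {
        while (true) {
            int old = execStatus.get();
            int next;
            if (old == IN_JAVA) {
                next = IN_JAVA_TO_BLOCK;
            } else if (old == IN_NATIVE) {
                next = BLOCKED_IN_NATIVE;
            } else if (old == IN_JNI) {
                next = BLOCKED_IN_JNI;
            } else {
                return old; // already asked to block, or in a state we must not change
            }
            if (execStatus.compareAndSet(old, next)) {
                return next;
            }
            // The thread changed state under us; re-read and try again.
        }
    }
}
```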

Finally, the valid state transitions are as follows.

NEW to IN_JAVA: occurs when the thread is actually started. At this point it is safe to expect that the thread will reach a safe point in some bounded amount of time, at which point it will have a complete execution context, and thus will be able to have its stack traced by GC.
IN_JAVA to IN_JAVA_TO_BLOCK: occurs when an asynchronous request is made, for example to stop for GC, do a mutator ﬂush, or do an isync on PPC.
IN_JAVA to IN_NATIVE: occurs when the code opts to run in privileged mode, without synchronizing with GC. This state transition is only performed by Monitor, in cases where the thread is about to go idle while waiting for notiﬁcations (such as in the case of park, wait, or sleep), and by org.jikesrvm.runtime.FileSystem, as an optimization to allow I/O operations to be performed without a full JNI transition.
IN_JAVA to IN_JNI: occurs in response to a JNI downcall, or return from a JNI upcall.
IN_JAVA_TO_BLOCK to BLOCKED_IN_NATIVE: occurs when a thread that had been asked to perform an async activity decides to go to privileged mode instead. This state always corresponds to a notification being sent to other threads, letting them know that this thread is idle. When the thread is idle, any asynchronous requests (such as mutator flushes) can instead be performed on behalf of this thread by other threads, since this thread is guaranteed not to be running any user Java code, and will not be able to return to running Java code without first blocking, and waiting to be unblocked (see the BLOCKED_IN_NATIVE to IN_JAVA transition).
IN_JAVA_TO_BLOCK to BLOCKED_IN_JNI: occurs when a thread that had been asked to perform an async activity decides to make a JNI downcall, or return from a JNI upcall, instead. In all other regards, this is identical to the IN_JAVA_TO_BLOCK to BLOCKED_IN_NATIVE transition.
IN_NATIVE to IN_JAVA: occurs when a thread returns from idling or running privileged code to running Java code.
BLOCKED_IN_NATIVE to IN_JAVA: occurs when a thread that had been asked to perform an async activity while running privileged code or idling decides to go back to running Java code. The actual transition is preceded by the thread ﬁrst performing any requested actions (such as mutator ﬂushes) and waiting for a notiﬁcation that it is safe to continue running (for example, the thread may wait until GC is ﬁnished).
IN_JNI to IN_JAVA: occurs when a thread returns from a JNI downcall, or makes a JNI upcall.
BLOCKED_IN_JNI to IN_JAVA: same as BLOCKED_IN_NATIVE to IN_JAVA, except that this occurs in response to a return from a JNI downcall, or as the thread makes a JNI upcall.
IN_JAVA to TERMINATED: the thread has terminated, and will never reach any more safe points, and thus will not be able to respond to any more requests for async activities.
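
The transitions listed above can be collected into a small lookup table, which is convenient for assertions or for reasoning about the state machine. The enum below is a hypothetical sketch written for this guide (RVMThread stores the state in an integer field, not in an enum). It also includes the two asynchronous transitions IN_NATIVE to BLOCKED_IN_NATIVE and IN_JNI to BLOCKED_IN_JNI, which appear in the rules for asynchronous state changes rather than in the list above.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

// Transition table for the thread states described in this section; illustrative only.
enum ExecStatusSketch {
    NEW, IN_JAVA, IN_NATIVE, IN_JNI, IN_JAVA_TO_BLOCK,
    BLOCKED_IN_NATIVE, BLOCKED_IN_JNI, TERMINATED;

    private static final Map<ExecStatusSketch, EnumSet<ExecStatusSketch>> VALID =
        new EnumMap<>(ExecStatusSketch.class);

    private static void allow(ExecStatusSketch from, ExecStatusSketch to) {
        VALID.computeIfAbsent(from, k -> EnumSet.noneOf(ExecStatusSketch.class)).add(to);
    }

    static {
        // Transitions from the list above.
        allow(NEW, IN_JAVA);
        allow(IN_JAVA, IN_JAVA_TO_BLOCK);
        allow(IN_JAVA, IN_NATIVE);
        allow(IN_JAVA, IN_JNI);
        allow(IN_JAVA_TO_BLOCK, BLOCKED_IN_NATIVE);
        allow(IN_JAVA_TO_BLOCK, BLOCKED_IN_JNI);
        allow(IN_NATIVE, IN_JAVA);
        allow(BLOCKED_IN_NATIVE, IN_JAVA);
        allow(IN_JNI, IN_JAVA);
        allow(BLOCKED_IN_JNI, IN_JAVA);
        allow(IN_JAVA, TERMINATED);
        // Asynchronous transitions named in the state-change rules.
        allow(IN_NATIVE, BLOCKED_IN_NATIVE);
        allow(IN_JNI, BLOCKED_IN_JNI);
    }

    /** Returns true if the table permits moving from `from` to `to`. */
    static boolean canTransition(ExecStatusSketch from, ExecStatusSketch to) {
        EnumSet<ExecStatusSketch> targets = VALID.get(from);
        return targets != null && targets.contains(to);
    }
}
```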

14.3.2 Blocking and Handshaking

Various VM services, such as the garbage collector and locking, may wish to request a thread to block. In some cases, we want to block all threads except for the thread that makes the request. As well, some VM services may only wish for a “soft handshake”, where we wait for each non-collector thread to perform some action exactly once and then continue (in this case, the only thread that blocks is the thread requesting the soft handshake, but all other non-collector threads must “yield” in order to perform the requested action; in most cases that action is non-blocking). A unified facility for performing all of these requests is provided by RVMThread.

Four types of thread blocking and handshaking are supported:

RVMThread.block(). This is a low-level facility for requesting that a particular thread blocks. It is inherently unsafe to use this facility directly - for example, if thread A calls B.block() while thread B calls A.block(), the two threads may mutually deadlock.
RVMThread.beginPairHandshake(). This implements a safe pair-handshaking mechanism, in which two threads become bound to each other for a short time. The thread requesting the pair handshake waits until the other thread is at a safe point or else is “effectively safe”, and prevents it from going back to executing Java code. Note that at this point, neither thread will respond to any other handshake requests until RVMThread.endPairHandshake() is called. This is useful for implementing biased locking, but it has general utility anytime one thread needs to manipulate something in another thread’s execution state.
RVMThread.softHandshake(). This implements soft handshakes. In a soft handshake, the requesting thread waits for all non-collector threads to perform some action exactly once, and then returns. If any of those threads are eﬀectively safe, then the requesting thread performs the action on their behalf. softHandshake() is invoked with a SoftHandshakeVisitor that determines which threads are to be aﬀected, and what the requested action is. An example of how this is used is found in org.jikesrvm.mm.mmtk.Collection and org.jikesrvm.compilers.opt.runtimesupport.OptCompiledMethod.
RVMThread.hardHandshakeSuspend(). This stops all threads except for the garbage collector threads and the thread making the request. It returns once all Java threads are stopped. This is used by the garbage collector itself, but may be of utility elsewhere (for example, dynamic software updating). To resume all stopped threads, call RVMThread.hardHandshakeResume(). Note that this mechanism is carefully designed so that even after the world is stopped, it is safe to request a garbage collection (in that case, the garbage collector will itself call a variant of hardHandshakeSuspend(), but it will only aﬀect the one remaining running Java thread).

14.3.3 The Monitor API

The VM internally uses an OS-based locking implementation, augmented with support for safe lock recursion and awareness of handshakes. The Monitor API provides locking and notiﬁcation, similar to a Java lock, and is implemented using a pthread_mutex and a pthread_cond.

Acquiring a Monitor lock, or awaiting notiﬁcation, may cause the calling RVMThread to block. This prevents the calling thread from acknowledging handshakes until the blocking call returns. In some cases, this is desirable. For example:

In the implementation of handshakes, the code already takes special care to use the RVMThread state machine to notify other threads that the caller may block. As such, acquiring a lock or waiting for a notiﬁcation is safe.
If acquiring a lock that may only be held for a short, guaranteed-bounded length of time, the fact that the thread will ignore handshake requests while blocking is safe - the lock acquisition request will return in bounded time, allowing the thread to acknowledge any pending handshake requests.

But in all other cases, the calling thread must ensure that the handshake mechanism is notified that the thread will block. Hence, all blocking Monitor methods have both a “NoHandshake” and a “WithHandshake” version. Consider the following code:

someMonitor.lockNoHandshake();
// perform fast, bounded-time critical section
someMonitor.unlock(); // non-blocking

In this code, lock acquisition is done without notifying handshakes. This makes the acquisition faster. In this case, it is safe because the critical section is bounded-time. As well, we require that in this case, any other critical sections protected by someMonitor are bounded-time as well. If, on the other hand, the critical section was not bounded-time, we would do:

someMonitor.lockWithHandshake();
// perform potentially long critical section
someMonitor.unlock();

In this case, the lockWithHandshake() operation will transition the calling thread to the IN_NATIVE state before acquiring the lock, and then transition it back to IN_JAVA once the lock is acquired. This may cause the thread to block, if a handshake is in progress. As an added safety provision, if the lockWithHandshake() operation blocks due to a handshake, it will ensure that it does so without holding the someMonitor lock.
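
The behaviour of lockWithHandshake() described above can be approximated with standard java.util.concurrent primitives. The sketch below is not the Monitor implementation: HandshakeLockSketch, ThreadState and all of their methods are hypothetical, a ReentrantLock stands in for the underlying pthread mutex, and only the three states needed for the example are modelled.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Model of a handshake-aware lock acquisition; not Jikes RVM source code.
final class HandshakeLockSketch {
    enum State { IN_JAVA, IN_NATIVE, BLOCKED_IN_NATIVE }

    /** Hypothetical stand-in for the per-thread state machine. */
    static final class ThreadState {
        private final ReentrantLock guard = new ReentrantLock();
        private final Condition unblocked = guard.newCondition();
        private State state = State.IN_JAVA;

        State get() {
            guard.lock();
            try { return state; } finally { guard.unlock(); }
        }

        void enterNative() {
            guard.lock();
            try { state = State.IN_NATIVE; } finally { guard.unlock(); }
        }

        /** A handshake requester marks a thread that is in native code as blocked. */
        void requestBlock() {
            guard.lock();
            try {
                if (state == State.IN_NATIVE) state = State.BLOCKED_IN_NATIVE;
            } finally { guard.unlock(); }
        }

        /** The requester releases the thread once the handshake is over. */
        void unblock() {
            guard.lock();
            try {
                if (state == State.BLOCKED_IN_NATIVE) state = State.IN_NATIVE;
                unblocked.signalAll();
            } finally { guard.unlock(); }
        }

        /** Moves IN_NATIVE to IN_JAVA; fails if a handshake has blocked the thread. */
        boolean tryLeaveNative() {
            guard.lock();
            try {
                if (state != State.IN_NATIVE) return false;
                state = State.IN_JAVA;
                return true;
            } finally { guard.unlock(); }
        }

        void awaitUnblocked() {
            guard.lock();
            try {
                while (state == State.BLOCKED_IN_NATIVE) unblocked.awaitUninterruptibly();
            } finally { guard.unlock(); }
        }
    }

    private final ReentrantLock lock = new ReentrantLock();

    void lockWithHandshake(ThreadState me) {
        me.enterNative();                 // "effectively safe" while waiting for the lock
        lock.lock();
        while (!me.tryLeaveNative()) {    // a handshake is in progress
            lock.unlock();                // never wait for a handshake while holding the lock
            me.awaitUnblocked();
            lock.lock();
        }
    }

    void unlock() { lock.unlock(); }

    boolean heldByCurrentThread() { return lock.isHeldByCurrentThread(); }
}
```

The loop in lockWithHandshake() is the part that corresponds to the safety provision: if returning to IN_JAVA fails because a handshake has blocked the thread, the lock is released before the thread waits, and is re-acquired afterwards.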

A special Monitor is provided with each thread. This monitor is of the type NoYieldpointsMonitor and will also ensure that yieldpoints (safe points) are disabled while the lock is held. This is necessary because any safe point may release the Monitor lock by waiting on it, thereby breaking atomicity of the critical section. The NoYieldpointsMonitor for any RVMThread may be accessed using the RVMThread.monitor() method.

Additional information about how to use this API is found in the following section, which discusses the implementation of Java locking.

14.3.4 Thin and Biased Locking

Jikes™ RVM uses a hybrid thin/biased locking implementation that is designed for very high performance under any of the following loads:

Locks only ever acquired by one thread. In this case, biased locking is used, and no atomic operations (like compare-and-swap) need to be used to acquire and release locks.
Locks acquired by multiple threads but rarely under contention. In this case, thin locking is used; acquiring and releasing the lock involves a fast inlined compare-and-swap operation. It is not as fast as biased locking on most architectures.
Contended locks. Under sustained contention, the lock is “inflated” - the lock will now consist of data structures used to implement a fast barging FIFO mutex. A barging FIFO mutex allows threads to immediately acquire the lock as soon as it is available, or otherwise enqueue themselves on a FIFO and await its availability.

Thin locking has a relatively simple implementation; roughly 20 bits in the object header are used to represent the current lock state, and compare-and-swap is used to manipulate it. Biased locking and contended locking are more complicated, and are described below.
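The thin-lock fast path can be sketched as follows. This is a minimal illustration only: the class ThinLockSketch, the use of an AtomicInteger as a stand-in for the header lock bits, and the bit layout (owner id above a 6-bit recursion count) are all assumptions made for the example, not the layout Jikes RVM actually uses.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative thin lock. The bit layout is hypothetical, not Jikes RVM's header layout. */
final class ThinLockSketch {
  // Assumed layout: low 6 bits = recursion count, remaining bits = owner id (0 means unlocked).
  private static final int COUNT_BITS = 6;
  private static final int COUNT_MASK = (1 << COUNT_BITS) - 1;

  // Stand-in for the lock bits of an object header.
  private final AtomicInteger lockWord = new AtomicInteger(0);

  /** Fast path. ownerId must be positive. Returns false when a slow path would be needed. */
  boolean tryLock(int ownerId) {
    int old = lockWord.get();
    if (old == 0) {
      // Unlocked: one compare-and-swap installs the owner with a recursion count of zero.
      return lockWord.compareAndSet(0, ownerId << COUNT_BITS);
    }
    boolean ownedByCaller = (old >>> COUNT_BITS) == ownerId;
    if (ownedByCaller && (old & COUNT_MASK) < COUNT_MASK) {
      // Recursive acquisition by the current owner: bump the count.
      return lockWord.compareAndSet(old, old + 1);
    }
    return false; // held by another thread, or the recursion count would overflow
  }

  /** Releases one level of ownership; throws if the caller is not the owner. */
  void unlock(int ownerId) {
    int old = lockWord.get();
    if ((old >>> COUNT_BITS) != ownerId) {
      throw new IllegalMonitorStateException("not the owner");
    }
    int next = (old & COUNT_MASK) == 0 ? 0 : old - 1;
    lockWord.compareAndSet(old, next);
  }

  public static void main(String[] args) {
    ThinLockSketch lock = new ThinLockSketch();
    System.out.println("owner 1 acquires: " + lock.tryLock(1));
    System.out.println("owner 2 acquires: " + lock.tryLock(2));
    lock.unlock(1);
    System.out.println("owner 2 retries:  " + lock.tryLock(2));
  }
}
```

A false return from tryLock() corresponds to the point where a real implementation would fall back to the contended-lock mechanism described below.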

Biased locking makes the optimistic assumption that only one thread will ever want to acquire the lock. So long as this assumption holds, acquisition of the lock is a simple non-atomic increment/decrement. However, if the assumption is violated (a thread other than the one to which the lock is biased attempts to acquire the lock), a fallback mechanism is used to turn the lock into either a thin or contended lock. This works by using RVMThread.beginPairHandshake() to bring both the thread that is requesting the lock and the thread to which the lock is biased to a safe point. No other threads are aﬀected; hence this system is very scalable. Once the pair handshake begins, the thread requesting the lock changes the lock into either a thin or contended lock, and then ends the pair handshake, allowing the thread to which the lock was biased to resume execution, while the thread requesting the lock may now contend on it using normal thin/contended mechanisms.

Contended locks, or “fat locks”, consist of three mechanisms:

A spin lock to protect the data structures.
A queue of threads blocked on the lock.
A mechanism for blocked threads to go to sleep until awoken by being dequeued.

The spin lock is an org.jikesrvm.scheduler.SpinLock. The queue is implemented in org.jikesrvm.scheduler.ThreadQueue. The blocking/unblocking mechanism leverages org.jikesrvm.scheduler.Monitor; in particular, it uses the Monitor that is attached to each thread, accessible via RVMThread.monitor(). The basic algorithm for lock acquisition is:

spinLock.lock();
while (true) {
   if (lock available) {
     acquire the lock;
     break;
   } else {
     queue.enqueue(me);
     spinLock.unlock();

     me.monitor().lockNoHandshake();
     while (queue.isQueued(me)) {
        // put this thread to sleep waiting to be dequeued,
        // and do so while the thread is IN_NATIVE to ensure
        // that other threads don’t wait on this one for
        // handshakes while we’re blocked.
        me.monitor().waitWithHandshake();
     }
     me.monitor().unlock();
     spinLock.lock();
   }
}
spinLock.unlock();

The algorithm for unlocking dequeues the thread at the head of the queue (if there is one) and notiﬁes its Monitor using the lockedBroadcastNoHandshake() method. Note that these algorithms span multiple methods in org.jikesrvm.scheduler.ThinLock and org.jikesrvm.scheduler.Lock; in particular, lockHeavy(), lockHeavyLocked(), unlockHeavy(), lock(), and unlock().
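The acquisition and release algorithms above can be approximated outside the VM with standard library primitives. The sketch below is an assumption-laden stand-in, not the Jikes RVM code: BargingFifoMutexSketch is a hypothetical class, an AtomicBoolean plays the role of SpinLock, an ArrayDeque plays the role of ThreadQueue, and LockSupport.park()/unpark() replace the per-thread Monitor. Handshakes and thread states are not modelled.

```java
import java.util.ArrayDeque;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

/** Illustrative barging FIFO mutex built from a spin lock, a queue, and park/unpark. */
final class BargingFifoMutexSketch {
  private final AtomicBoolean spin = new AtomicBoolean(false); // guards 'held' and 'queue'
  private final ArrayDeque<Thread> queue = new ArrayDeque<>();
  private boolean held = false;

  private void spinLock() {
    while (!spin.compareAndSet(false, true)) {
      Thread.onSpinWait();
    }
  }

  private void spinUnlock() {
    spin.set(false);
  }

  private boolean isQueued(Thread t) {
    spinLock();
    try {
      return queue.contains(t);
    } finally {
      spinUnlock();
    }
  }

  void lock() {
    Thread me = Thread.currentThread();
    spinLock();
    while (true) {
      if (!held) {
        held = true; // barging: take the lock if it is free, even if others are queued
        break;
      }
      queue.addLast(me);
      spinUnlock();
      while (isQueued(me)) {
        LockSupport.park(this); // sleep until unlock() dequeues this thread
      }
      spinLock();
    }
    spinUnlock();
  }

  void unlock() {
    spinLock();
    held = false;
    Thread next = queue.pollFirst(); // dequeue the thread at the head, if any
    spinUnlock();
    if (next != null) {
      LockSupport.unpark(next);
    }
  }

  /** Runs 'threads' workers that each increment a shared counter 'iterations' times. */
  static int demo(int threads, int iterations) {
    BargingFifoMutexSketch mutex = new BargingFifoMutexSketch();
    int[] counter = {0};
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        for (int j = 0; j < iterations; j++) {
          mutex.lock();
          counter[0]++;
          mutex.unlock();
        }
      });
      workers[i].start();
    }
    for (Thread w : workers) {
      try {
        w.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return counter[0];
  }

  public static void main(String[] args) {
    System.out.println("final count: " + demo(4, 1000));
  }
}
```

A dequeued thread does not receive the lock directly; it retries and may find that another thread has barged in, in which case it enqueues itself again. This mirrors the loop structure of the pseudocode above.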

14.4 JNI

14.4.1 Overview

This section describes how Jikes RVM interfaces to native code. There are three major aspects of this support:

JNI Functions: This is the mechanism for transitioning from native code into Java code. Jikes RVM implements the 1.1 through 1.4 JNI speciﬁcations.
Native methods: This is the mechanism for transitioning from Java code to native code. In addition to the normal mechanism used to invoke a native method, Jikes RVM also supports a more restricted syscall mechanism that is used internally by low-level VM code to invoke native code.
Integration with threading: JNI may be freely used from any Java method. The mechanisms required to make this work are discussed in great detail in Thread Management, and to some extent in the sections that follow.

14.4.2 JNI Functions

All of the 1.1 through 1.4 JNIEnv interface functions are implemented.

The functions are deﬁned in the class JNIFunctions. Methods of this class are compiled with special prologues/epilogues that translate from native calling conventions to Java calling conventions and handle other details of the transition related to threading. Currently the optimizing compiler does not support these specialized prologue/epilogue sequences so all methods in this class are baseline compiled. The prologue/epilogue sequences are actually generated by the platform-speciﬁc JNICompiler.

Calling a JNI function results in the thread attempting to transition from IN_JNI to IN_JAVA using a compare-and-swap; if this fails, the thread may block to acknowledge a handshake. See Thread Management for more details.
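The compare-and-swap transition can be pictured with a small state machine. This is a sketch under stated assumptions: JniTransitionSketch is a hypothetical class, and the state name BLOCKED_IN_JNI is assumed here to stand for "a handshake was requested while the thread was in native code"; see Thread Management for the states the VM actually uses.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative model of the IN_JNI to IN_JAVA transition; not the RVMThread implementation. */
final class JniTransitionSketch {
  // BLOCKED_IN_JNI is an assumed name for "handshake pending while in native code".
  enum State { IN_JAVA, IN_JNI, BLOCKED_IN_JNI }

  private final AtomicReference<State> state = new AtomicReference<>(State.IN_JNI);

  /** Fast path taken on entry to a JNI function; false means the thread must block first. */
  boolean tryEnterJava() {
    return state.compareAndSet(State.IN_JNI, State.IN_JAVA);
  }

  /** Called by another thread that wants a handshake while this thread is in native code. */
  boolean requestHandshake() {
    return state.compareAndSet(State.IN_JNI, State.BLOCKED_IN_JNI);
  }

  /** Called once the handshake has been acknowledged, so the fast path can succeed again. */
  void acknowledgeHandshake() {
    state.compareAndSet(State.BLOCKED_IN_JNI, State.IN_JNI);
  }

  State current() {
    return state.get();
  }

  public static void main(String[] args) {
    JniTransitionSketch thread = new JniTransitionSketch();
    thread.requestHandshake();
    System.out.println("fast path while handshake pending: " + thread.tryEnterJava());
    thread.acknowledgeHandshake();
    System.out.println("fast path after acknowledgement:   " + thread.tryEnterJava());
  }
}
```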

14.4.3 Invoking Native Methods

There are two mechanisms whereby RVM may transition from Java code to native code.

The first mechanism is when RVM calls a method of the class SysCall. The native methods thus invoked are defined in one of the C and C++ files of the Jikes RVM executable. These native methods are non-blocking system calls or C library services. To implement a syscall, Jikes RVM compilers generate a call sequence consistent with the platform’s underlying calling convention. A syscall is not a GC-safe point, so syscalls may modify the Java heap (e.g. memcpy()). For more details on the mechanics of adding a new syscall to the system, see the header comments of SysCall.java. Note again that the syscall methods are NOT JNI methods, but an independent (more efficient) interface that is specific to Jikes RVM.

The second mechanism is JNI. Naturally, the user writes JNI code using the JNI interface. RVM implements a call to a native method by using the platform-speciﬁc JNICompiler to generate a stub routine that manages the transition between Java bytecode and native code. A JNI call is a GC-safe point, since JNI code cannot freely modify the Java heap.

14.4.4 Interactions with Threading

See the Thread Management subsection for more details on the thread system in Jikes RVM.

There are two ways to execute native code: syscalls and JNI. A Java thread that calls native code by either mechanism will never be preempted by Jikes RVM, but in the case of JNI, all of the VM’s services will know that the thread is “effectively safe” and thus may be ignored for most purposes. Additionally, threads executing JNI code may have handshake actions performed by other threads on their behalf, for example in the case of GC stack scanning. This is not the case with syscalls. As far as Jikes RVM is concerned, a Java thread that enters syscall native code is still executing Java code, but will appear to not reach a safe point until after it emerges from the syscall. This issue may be side-stepped by using the RVMThread.enterNative() and RVMThread.leaveNative() methods, as shown in org.jikesrvm.runtime.FileSystem.

14.4.5 Missing Features

Native Libraries: JNI 1.2 requires that the VM specially treat native libraries that contain exported functions named JNI_OnLoad and JNI_OnUnload. Only JNI_OnLoad is currently implemented.
JNICompiler: The only known deﬁciency in JNICompiler is that the prologue and epilogues only handle passing local references to functions that expect a jobject; they will not properly handle a jweak or a regular global reference. This would be fairly easy to implement.
JavaVM interface: The JavaVM interface has GetEnv fully implemented and AttachCurrentThread partly implemented, but DestroyJavaVM, DetachCurrentThread, and AttachCurrentThreadAsDaemon are just stubbed out and return error codes. There is no good reason why AttachCurrentThread and friends cannot be implemented; it just hasn’t been done yet, mostly because there was no easy way to support them prior to the introduction of native threads.
Directly-Exported Invocation Interface Functions: GetDefaultJavaVMInitArgs and JNI_CreateJavaVM are partly implemented, but JNI_GetCreatedJavaVMs is not implemented. This is because we do not provide a virtual machine library that can be linked against, nor do we support native applications that launch and use an embedded Java VM. There is no inherent reason why this could not be done, but we have not done so yet.

14.4.6 Things JNI Can’t Handle

atexit routines: Calling JNI code via a routine run at exit time means calling back into a VM that has been shut down. This will cause Jikes RVM to freeze on Intel architectures.

Contributions of any of the missing functionality (and/or associated tests) would be greatly appreciated.

14.5 Exception Management

The runtime has to deal with the relatively small number of hardware signals which can be generated during Java execution. On operating systems other than AIX, an attempt to dereference a null value (an access to a null value manifests as a read to a small negative address outside the mapped virtual memory address space) will generate a segmentation fault. This means that the Jikes RVM does not need to generate explicit tests guarding against dereferencing null values, which results in faster generated code on the non-excepting path.

The Jikes RVM handles the signal and reenters Java so that a suitable Java exception handler can be identified, the stack can be unwound (if necessary) and the handler entered in order to deal with the exception. If no handler can be located, the associated Java thread must be cleanly terminated.

The Jikes RVM actually employs software traps to generate hardware exceptions in a small number of other cases, for example to trap array bounds exceptions. Once again a software only solution would be feasible. However, since a mechanism is already in place to catch hardware exceptions and restore control to a suitable Java handler the use of software traps is relatively simple to support.

Use of a hardware handler enables the register state at the point of exception to be saved by the hardware exception catching routine. If a Java handler is registered in the call frame which generated the exception this register state can be restored before reentry, avoiding the need for the compiler to save register state around potentially excepting instructions. Register state for handlers in frames below the exception frame is automatically saved by the compiler before making a call and so can always be restored to the state at the point of call by the exception delivery code.

The bootloader registers signal handlers which catch SEGV and TRAP signals. These handlers save the current register state on the stack, create a special handler frame above the saved register state and return into this handler frame executing RuntimeEntrypoints.deliverHardwareException(). This method searches the stack from the excepting frame (or from the last Java frame if the exception occurs inside native code) looking for a suitable handler and unwinding frames which do not contain one. At each unwind the saved register state is reset to the state associated with the next frame. When a handler is found the delivery code installs the saved register state and returns into the handler frame at the start of the handler block.
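The search-and-unwind loop described above can be sketched in isolation. Everything named here is hypothetical and exists only for illustration: UnwindSketch, its Frame and Handler classes, and the "method@pc" result string. Register state, native frames, and the real handler tables used by RuntimeEntrypoints.deliverHardwareException() are not modelled.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/** Illustrative handler search: unwind frames until one has a matching handler. */
final class UnwindSketch {
  /** Hypothetical handler table entry covering program counters in [startPc, endPc). */
  static final class Handler {
    final int startPc;
    final int endPc;
    final Class<? extends Throwable> type;
    final int handlerPc;

    Handler(int startPc, int endPc, Class<? extends Throwable> type, int handlerPc) {
      this.startPc = startPc;
      this.endPc = endPc;
      this.type = type;
      this.handlerPc = handlerPc;
    }

    boolean covers(int pc, Throwable t) {
      return pc >= startPc && pc < endPc && type.isInstance(t);
    }
  }

  /** Hypothetical stack frame: method name, current pc, and that method's handlers. */
  static final class Frame {
    final String method;
    final int pc;
    final List<Handler> handlers;

    Frame(String method, int pc, List<Handler> handlers) {
      this.method = method;
      this.pc = pc;
      this.handlers = handlers;
    }
  }

  /** Returns "method@handlerPc" for the first matching handler, or null if none exists. */
  static String deliver(Deque<Frame> stack, Throwable t) {
    while (!stack.isEmpty()) {
      Frame top = stack.peek();
      for (Handler h : top.handlers) {
        if (h.covers(top.pc, t)) {
          return top.method + "@" + h.handlerPc;
        }
      }
      stack.pop(); // no handler in this frame: unwind to the caller
    }
    return null; // no handler anywhere: the thread would have to be terminated
  }

  static String demo() {
    Deque<Frame> stack = new ArrayDeque<>();
    Handler catchRuntime = new Handler(0, 20, RuntimeException.class, 30);
    stack.push(new Frame("main", 10, Collections.singletonList(catchRuntime)));
    stack.push(new Frame("callee", 5, Collections.<Handler>emptyList()));
    return deliver(stack, new NullPointerException());
  }

  public static void main(String[] args) {
    System.out.println("handler found at: " + demo());
  }
}
```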

The Jikes RVM employs some of the same code used by the hardware exception handler to implement the language primitive throw(). This primitive requires a handler to be located and the stack to be unwound so that the handler can be entered. A throw operation is always translated into a call to RuntimeEntrypoints.athrow() so the unwind can never happen in the handler frame. Hence the register state at the point of re-entry is always saved by the call mechanism and there is no need to generate a hardware exception.

14.6 Bootstrap

Jikes RVM is started up by a boot program written in C, the bootloader. The bootloader is responsible for

registering signal handlers to deal with the hardware errors generated by Jikes RVM
establishing the initial virtual memory map employed by Jikes RVM
mapping the Jikes RVM image ﬁles
installing the addresses of the C wrapper functions which are invoked by the runtime to interact with the underlying operating system into the boot record at the start of the Jikes RVM image area
setting up the JTOC and TR registers for its RVMThread/pthread
switching the pthread into the bootstrap Java stack running the bootstrap Java method in the bootstrap Java thread

At this point all further initialization of Jikes RVM is done either in Java or by employing the wrapper callbacks located in the boot record.

The initial bootstrap routine is VM.boot(). It sets up the initial thread environment so that it looks like any other thread created by a call to Thread.start() then performs a variety of Java boot operations, including initialising the memory manager subsystem, the runtime compiler, the system classloader and the time classes.

The bootstrap routine needs to rerun class initializers for a variety of the runtime and runtime library classes which are already loaded and compiled into the image file. This is necessary because some of the data generated by these initialization routines will not be valid in the Jikes RVM runtime. The data may be invalid because the host environment that generated the boot image may differ from the current environment.

The boot process then enables the Java scheduler and locking system, setting up the data structures necessary to launch additional threads. The scheduler also starts the FinalizerThread and multiple garbage collector threads (i.e. multiple instances of CollectorThread).

Next, the boot routine boots the JNI subsystem, which enables calls to native code to be compiled and executed, then re-initialises a few more classes whose init methods require a functional JNI (e.g. java.io.FileDescriptor).

Finally, the boot routine loads the boot application class supplied on the rvm command line, creates and schedules a Java main thread to execute this class’s main method, then exits, switching execution to the main thread. Execution continues until the application thread and all non-daemon threads have exited. Once there are no runnable threads (other than system threads such as the idle threads, collector threads etc) execution of the RVM runtime terminates and the rvm process exits.

14.6.1 Memory Map

Jikes RVM divides its available virtual memory space into various segments containing either code, or data or a combination of the two. The basic map is as follows:

Boot Segment

The bottom segment of the address space is left for the underlying platform to locate the boot program (including statically linked library code) and any dynamically allocated data and library code.

Jikes RVM Image Segment

The next area is the one initialized by the boot program to contain all the initial static data, instance data and compiled method code required in order for the runtime to be able to function. The required memory data is loaded from an image file created by an offline Java program, the boot image writer.

This image ﬁle is carefully constructed to contain data which, when loaded at the correct address, will populate the runtime data area with a memory image containing:

a JTOC
all the TIBs, static method code arrays and static ﬁeld data directly referenced from the JTOC
all the dynamic method code arrays indirectly referenced from the TIBs
all the classloader’s internal class and method instances indirectly referenced via the TIBs
ancillary structures attached to these class and method instances such as class bytecode arrays, compilation records, garbage collection maps etc
a single bootstrap Java thread instance in which Java execution commences
a single bootstrap thread stack used by the bootstrap thread.
a master boot record located at the start of the image load area containing references to all the other key objects in the image (such as the JTOC, the bootstrap thread etc) plus linkage slots in which the booter writes the addresses of its C callback functions.

Jikes RVM Heap Segment

The Jikes RVM heap segment is used to provide storage for code and data created during Java execution. Jikes RVM can be conﬁgured to employ various diﬀerent allocation managers taken from the MMTk memory management toolkit.

14.7 Calling Conventions

14.7.1 Architecture-independent concepts

Stackframe layout and calling conventions may evolve as our understanding of Jikes RVM’s performance improves. Where possible, APIs should be used to protect code against such changes.

Register conventions

Registers (general purpose, gp, and ﬂoating point, fp) can be roughly categorized into four types:

Scratch: Needed for method prologue/epilogue. Can be used by compiler between calls.
Dedicated: Reserved registers with known contents:
- JTOC - Jikes RVM Table Of Contents. Globally accessible data: constants, static ﬁelds and methods.
- FP - Frame Pointer. Current stack frame (thread specific).
- TR - Thread register. An object representing the current RVMThread instance (the one executing on the CPU containing these registers).
Volatile (“caller save”, or “parameter”): Like scratch registers, these can be used by the compiler as temporaries, but they are not preserved across calls. Volatile registers differ from scratch registers in that volatiles can be used to pass parameters and result(s) to and from methods.
Nonvolatile (“callee save”, or “preserved”): These can be used (and are preserved across calls), but they must be saved on method entry and restored at method exit. Highest numbered registers are to be used first. (At least initially, nonvolatile registers will not be used to pass parameters.)

Stack conventions

Stacks grow from high memory to low memory.

Method prologue responsibilities

(some of these can be omitted for leaf methods):

Execute a stackoverﬂow check, and grow the thread stack if necessary.
Save the caller’s next instruction pointer (callee’s return address)
Save any nonvolatile ﬂoating-point registers used by callee.
Save any nonvolatile general-purpose registers used by callee.
Store and update the frame pointer FP.
Store callee’s compiled method ID
Check to see if the Java™ thread must yield the processor (and yield if a thread switch was requested).

Method epilogue responsibilities

(some of these can be omitted for leaf methods):

Restore FP to point to caller’s stack frame.
Restore any nonvolatile general-purpose registers used by callee.
Restore any nonvolatile ﬂoating-point registers used by callee.
Branch to the return address in caller.

14.7.2 Architecture-speciﬁc calling conventions

The following architecture-speciﬁc classes are of interest for calling conventions:

StackframeLayoutConstants for layout of the stack frame
JNICompiler for transition to native code
RegisterConstants for register deﬁnitions
BaselineConstants for register usage by the baseline compiler
CallingConventions which expands the calling conventions shortly before register allocation in the optimizing compiler

14.8 VM Callbacks

Jikes™ RVM provides callbacks for many runtime events of interest to the Jikes RVM programmer, such as classloading, VM boot image creation, and VM exit. The callbacks allow arbitrary code to be executed on any of the supported events.

The callbacks are accessed through the nested interfaces deﬁned in the Callbacks class. There is one interface per event type. To be notiﬁed of an event, register an instance of a class that implements the corresponding interface with Callbacks by calling the corresponding add...() method. For example, to be notiﬁed when a class is instantiated, ﬁrst implement the Callbacks.ClassInstantiatedMonitor interface, and then call Callbacks.addClassInstantiatedMonitor() with an instance of your class. When any class is instantiated, the notifyClassInstantiated method in your instance will be invoked.

The appropriate interface names can be obtained by appending “Monitor” to the event names (e.g. the interface to implement for the MethodOverride event is Callbacks.MethodOverrideMonitor). Likewise, the method to register the callback is “add”, followed by the name of the interface (e.g. the register method for the above interface is Callbacks.addMethodOverrideMonitor()).
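The registration pattern and naming convention can be illustrated with a self-contained stand-in. CallbacksSketch below is a hypothetical class written for this example, not org.jikesrvm.Callbacks; in particular, the String parameter of notifyClassInstantiated() is an assumption, and the real interface may pass a different type.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in showing the callback pattern; NOT the real org.jikesrvm.Callbacks class. */
final class CallbacksSketch {
  /** One nested interface per event type, named event + "Monitor". */
  interface ClassInstantiatedMonitor {
    void notifyClassInstantiated(String className); // parameter type is an assumption
  }

  private static final List<ClassInstantiatedMonitor> monitors = new ArrayList<>();

  /** Registration method, named "add" + interface name. */
  static void addClassInstantiatedMonitor(ClassInstantiatedMonitor monitor) {
    monitors.add(monitor);
  }

  /** Invoked by the (simulated) VM when the event occurs; notifies every registered monitor. */
  static void notifyClassInstantiated(String className) {
    for (ClassInstantiatedMonitor monitor : monitors) {
      monitor.notifyClassInstantiated(className);
    }
  }

  /** Registers a monitor that records class names, fires one event, and returns the record. */
  static List<String> demo() {
    List<String> seen = new ArrayList<>();
    addClassInstantiatedMonitor(new ClassInstantiatedMonitor() {
      @Override
      public void notifyClassInstantiated(String className) {
        seen.add(className);
      }
    });
    notifyClassInstantiated("java.util.HashMap");
    return seen;
  }

  public static void main(String[] args) {
    System.out.println("instantiated: " + demo());
  }
}
```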

Since the events for which callbacks are available are internal to the VM, there are limitations on the behavior of the callback code. For example, as soon as the exit callback is invoked, all threads are considered daemon threads (i.e. the VM will not wait for any new threads created in the callbacks to complete before exiting). Thus, if the exit callback creates any threads, it has to join() with them before returning. These limitations may also produce some unexpected behavior. For example, while there is an elementary safeguard on any classloading callback that prevents recursive invocation (i.e. if the callback code itself causes classloading), there is no such safeguard across events, so, if there are callbacks registered for both ClassLoaded and ClassInstantiated events, and the ClassInstantiated callback code causes dynamic class loading, the ClassLoaded callback will be invoked for the new class, but not the ClassInstantiated callback.

Examples of callback use can be seen in the Controller class in the adaptive system.

14.8.1 An Example: Modifying SPECjvm98 to Report the End of a Run

The SPECjvm®98 benchmark suite is conﬁgured to run one or more benchmarks a particular number of times. For example, the following runs the compress benchmark for 5 iterations:

rvm SpecApplication -m5 -M5 -s100 -a _201_compress

It is sometimes useful to have the VM notified when the application has completed an iteration of the benchmark. This can be done by using the Callbacks interface, as follows:

Modify spec/harness/ProgramRunner.java:
1. add an import statement for the Callbacks class:
  import org.jikesrvm.Callbacks;
2. before the call to runOnce add the following:
  Callbacks.notifyAppRunStart(className, run);
3. after the call to runOnce add the following:
  Callbacks.notifyAppRunComplete(className, run);
Recompile the modiﬁed ﬁle:
javac -classpath .:$RVM_BUILD/RVM.classes:$RVM_BUILD/RVM.classes/rvmrt.jar spec/harness/ProgramRunner.java

or create a stub version of Callbacks.java and place it in the appropriate directory structure with your modified file, i.e.,
org/jikesrvm/Callbacks.java
Run Jikes RVM as you normally would using the SPECjvm98 benchmarks.

In the current system the Controller class will gain control when these callbacks are made and print a message into the AOS log ﬁle (by default, placed in Jikes RVM’s current working directory and called AOSLog.txt).

14.8.2 Another Example: Directing a Recompilation of All Methods During the Application’s Execution

Another callback of interest allows an application to direct the VM to recompile all executed methods at a certain point of the application’s execution by calling the recompileAllDynamicallyLoadedMethods method in the Callbacks class. This functionality can be useful to experiment with the performance eﬀects of when compilation occurs. This VM functionality can be disabled using the DISABLE_RECOMPILE_ALL_METHODS boolean ﬂag to the adaptive system.
