Chapter 15
Magic
Most Java runtimes rely upon the foreign language APIs of the underlying platform operating system to implement runtime behaviour that involves interaction with the underlying platform. Runtimes also occasionally employ small segments of machine code to provide access to platform hardware state. Note that this is expedient rather than mandatory. With a suitably smart Java bytecode compiler it would be quite possible to implement a full Java-in-Java runtime, i.e. one comprising only compiled Java code. The JNode project is an attempt to implement a runtime along these lines; the Xerox, MIT, Lambda and TI Explorer Lisp machine implementations and the Xerox Smalltalk implementation were highly successful attempts at fully compiled language runtimes.
This section provides information on magic, which is an escape hatch that Jikes™ RVM provides to implement functionality that is not possible using the pure Java™ programming language. For example, the Jikes RVM garbage collectors and runtime system must, on occasion, access memory or perform unsafe casts. The compiler will also translate a call to Magic.threadSwitch() into a sequence of machine code that swaps out the old thread’s registers and swaps in the new ones, switching execution to the new thread’s stack, resumed at its saved PC.
There are three mechanisms by which the Jikes RVM magic is implemented:
- Compiler Intrinsics: Most methods are within class libraries but some functions are built in (that is, intrinsic) to the compiler. These are referred to as intrinsic functions or intrinsics.
- Compiler Pragmas: Some intrinsics do not provide any behaviour but instead provide information to the compiler that modifies optimizations, calling conventions and activation frame layout. We refer to these mechanisms as compiler pragmas.
- Unboxed Types: Besides the primitive types, all Java values are boxed types. Conceptually, they are represented by a pointer to a heap object. However, an unboxed type is represented by the value itself. All methods on an unboxed type must be Compiler Intrinsics.
The mechanisms are used to implement the following functionality:
- Raw Memory Access: Unfettered access to memory.
- Uninterruptible Code: Declaring code to be uninterruptible.
- Alternative Calling Conventions: Declaring different calling conventions and activation frame layout. This is done via annotations; see the org.vmmagic.pragma package.
15.1 Compiler Intrinsics
A compiler intrinsic will usually generate a specific code sequence. The code sequence will usually be inlined and optimized as part of the compilation phase of the optimizing compiler.
Magic
All the methods in Magic are compiler intrinsics. Because these methods access raw memory or other machine state, perform unsafe casts, or are operating system calls, they cannot be implemented in Java code.
A Jikes™ RVM implementor must be extremely careful when writing code that uses Magic to circumvent the Java type system. The use of Magic.objectAsAddress to perform various forms of pointer arithmetic is especially hazardous, since it can result in pointers being “lost” during garbage collection. All such uses of magic must either occur in uninterruptible methods or be guarded by calls to VM.disableGC and VM.enableGC. The optimizing compiler performs aggressive inlining and code motion, so failing to mark such dangerous regions explicitly in one of these two ways will lead to disaster.
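The hazard described above can be illustrated with a toy model. The sketch below is not Jikes RVM code: LostPointerSketch, Ref, and the disableGC/enableGC methods on it are hypothetical stand-ins, and "addresses" are simply indices into a small array. It assumes a compacting collector that updates the references it knows about but cannot see a raw index held by the program.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a moving collector; a hypothetical illustration, not Jikes RVM code. */
final class LostPointerSketch {
    /** A reference the toy collector tracks and fixes up when it moves objects. */
    static final class Ref {
        int slot;
        Ref(int slot) { this.slot = slot; }
    }

    private final String[] heap = new String[8];
    private final List<Ref> roots = new ArrayList<>();
    private int next = 0;
    private int gcDisabledDepth = 0; // toy analogue of VM.disableGC / VM.enableGC

    Ref allocate(String payload) {
        heap[next] = payload;
        Ref r = new Ref(next++);
        roots.add(r);
        return r;
    }

    /** Makes the object unreachable so the next collection reclaims its slot. */
    void drop(Ref r) {
        roots.remove(r);
        heap[r.slot] = null;
    }

    void disableGC() { gcDisabledDepth++; }
    void enableGC() { gcDisabledDepth--; }

    /** Compacts live objects to the front; returns false if collection is disabled. */
    boolean collect() {
        if (gcDisabledDepth > 0) return false;
        String[] old = heap.clone();
        Arrays.fill(heap, null);
        int to = 0;
        for (Ref r : roots) {
            heap[to] = old[r.slot];
            r.slot = to++; // tracked references are updated; raw indices are not
        }
        next = to;
        return true;
    }

    /** Raw load through an untracked "address". */
    String load(int rawSlot) { return heap[rawSlot]; }

    public static void main(String[] args) {
        LostPointerSketch h = new LostPointerSketch();
        Ref a = h.allocate("a");
        Ref b = h.allocate("b");
        h.drop(a);
        int raw = b.slot; // analogous to taking a raw address of an object
        h.collect();      // b moves; raw still names the old location
        System.out.println("via tracked ref: " + h.load(b.slot));
        System.out.println("via raw index:   " + h.load(raw));
    }
}
```

In this model, bracketing the region that holds the raw index with disableGC and enableGC keeps the index valid, because the collection is refused until the region ends.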
Since magic is inexpressible in the Java programming language, it is unsurprising that the bodies of Magic methods are undefined. Instead, for each of these methods, the Java instructions that generate the code are stored in GenerateMagic and GenerateMachineSpecificMagic (to generate HIR) and in the baseline compilers (to generate assembly code). Note that the optimizing compiler always uses the set of instructions that generate HIR; the instructions that generate assembly code are only invoked by the baseline compiler. Whenever the compiler encounters a call to one of these magic methods, it inlines appropriate code for the magic method into the caller method.
sun.misc.Unsafe
The methods of sun.misc.Unsafe are not treated specially by the compilers. The Jikes RVM ships a custom sun.misc.Unsafe implementation that implements the operations with Jikes RVM magics and internal helper routines.
15.2 Unboxed Types
If a type is boxed, values of that type are represented by a pointer to a heap object. A value of an unboxed type is represented by the value itself, as is the case for int, double, float, byte, etc.
In the Jikes RVM terminology, an unboxed type is a custom unboxed type. Normal Java primitives such as int are never referred to as unboxed types.
The Jikes RVM also defines a number of unboxed types. Due to a limitation of the way the compiler generates code the Jikes RVM must define an unboxed array type for each unboxed type. The unboxed types are:
- org.vmmagic.unboxed.Address
- org.vmmagic.unboxed.Extent
- org.vmmagic.unboxed.ObjectReference
- org.vmmagic.unboxed.Offset
- org.vmmagic.unboxed.Word
- org.jikesrvm.compilers.common.Code
Values of unboxed types appear only in the virtual machine’s stack, registers, or as fields/elements of class/array instances.
Unboxed types may inherit from Object but they are not objects. As such there are some restrictions on the use of unboxed types:
- An unboxed type instance must not be passed where an Object is expected. This will type-check, but it is not what you want. A corollary is to avoid overloading a method where the two overloaded versions of the method can only be distinguished by operating on an Object versus an unboxed type. The optimizing compiler can detect some invalid uses of unboxed types.
- An unboxed type must not be synchronized on.
- They have no virtual methods.
- They do not support lock operations, generating hashcodes or any other method inherited from Object.
- All methods must be compiler intrinsics.
- Avoid making an array of an unboxed type. Instead, represent it by the array version of the unboxed type. For example, org.vmmagic.unboxed.Address[] should be replaced with org.vmmagic.unboxed.AddressArray, but org.vmmagic.unboxed.AddressArray[] is fine.
15.3 Raw Memory Access
The type org.vmmagic.unboxed.Address is used to represent a machine-dependent address type. org.vmmagic.unboxed.Address is an unboxed type. In the past, the base type int was used to represent addresses, but this approach had several shortcomings. First, the lack of abstraction makes porting nightmarish. Equally important, the Java type int is signed whereas addresses are more appropriately considered unsigned. The difference is problematic since Java's relational operators cannot directly express an unsigned comparison on int.
To overcome these problems, instances of org.vmmagic.unboxed.Address are used to represent addresses. The class supports the expected well-typed methods, such as adding an integer offset to an address to obtain another address, computing the difference of two addresses, and comparing addresses. Operations that make sense on int but not on addresses, such as multiplication of addresses, are excluded. Two methods deserve special attention: converting an address into an integer and the inverse. These methods should be avoided where possible.
Without special intervention, using a Java object to represent an address would be at best abysmally inefficient. Instead, when the Jikes RVM compiler encounters creation of an address object, it will return the primitive value that represents an address for that platform. Currently, the address type maps to either a 32-bit or 64-bit unsigned integer. Since an address is an unboxed type it must obey the rules outlined in Unboxed Types.
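To make the shape of such a type concrete, here is a minimal sketch. AddrSketch and all of its method names are hypothetical stand-ins, not the org.vmmagic.unboxed.Address API, and unlike the real unboxed type this class does allocate heap objects. It assumes a 64-bit address and uses Long.compareUnsigned (available since Java 8) for the unsigned comparison.

```java
/** Hypothetical stand-in for an address type; NOT the org.vmmagic.unboxed.Address API. */
final class AddrSketch {
    private final long bits; // raw address bits, interpreted as unsigned

    private AddrSketch(long bits) { this.bits = bits; }

    /** Conversion from a raw integer: the kind of escape hatch best avoided. */
    static AddrSketch fromLong(long bits) { return new AddrSketch(bits); }

    /** Conversion back to a raw integer: likewise best avoided. */
    long toLong() { return bits; }

    /** An address plus a signed byte offset yields another address. */
    AddrSketch plus(long offset) { return new AddrSketch(bits + offset); }

    /** The difference of two addresses is a signed offset, not an address. */
    long diff(AddrSketch other) { return bits - other.bits; }

    /** Unsigned less-than comparison. */
    boolean lessThan(AddrSketch other) {
        return Long.compareUnsigned(bits, other.bits) < 0;
    }

    // Deliberately no multiply or divide: they are meaningless for addresses.

    public static void main(String[] args) {
        AddrSketch low = fromLong(0x1000L);
        AddrSketch high = fromLong(0xFFFFFFFFFFFFF000L); // near the top of the address space
        System.out.println("unsigned low < high: " + low.lessThan(high));
        System.out.println("signed   low < high: " + (low.toLong() < high.toLong()));
    }
}
```

The two comparisons in main disagree, which is the signedness problem the text describes for raw integer addresses.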
15.4 Uninterruptible Code
Declaring a method uninterruptible enables a Jikes RVM developer to prevent the Jikes RVM compilers from inserting “hidden” thread switch points in the compiled code for the method. As a result, the code can be written assuming that it cannot involuntarily “lose control” while executing due to a timer-driven thread switch. In particular, neither yield points nor stack overflow checks will be generated for uninterruptible methods. When writing uninterruptible code, the programmer is restricted to a subset of the Java language. The following are the restrictions on uninterruptible code.
- Because a stack overflow check represents a potential yield point (if GC is triggered when the stack is grown), stack overflow checks are omitted from the prologues of uninterruptible code. As a result, all uninterruptible code must be able to execute in the stack space available to it when the first uninterruptible method on the call stack is invoked. This is typically about 8K for uninterruptible regions called from mutator code. The collector threads must preallocate enough stack space, since all collector code is uninterruptible. As a result, using recursive methods in the GC subsystem is a bad idea.
- Since no yield points are inserted in uninterruptible code, there will be no timer-driven thread switches while executing it. So, if possible, one should avoid “long running” uninterruptible methods outside of the GC subsystem.
- Certain bytecodes are forbidden in uninterruptible code, because Jikes RVM cannot implement them in a manner that ensures uninterruptibility. The forbidden bytecodes are:
- aastore
- invokeinterface
- new
- newarray
- anewarray
- athrow
- checkcast and instanceof unless the LHS type is a final class
- monitorenter
- monitorexit
- multianewarray
- Uninterruptible code cannot cause class loading and thus must not contain unresolved getstatic, putstatic, getfield, putfield, invokevirtual, or invokestatic bytecodes.
- Uninterruptible code cannot contain calls to interruptible code. As a consequence, it is illegal to override an uninterruptible virtual method with an interruptible method.
- Uninterruptible methods cannot be synchronized. If you need synchronization in an uninterruptible method, you must use one of the internal locks or synchronization primitives.
We have augmented the baseline compiler to print a warning message when one of these restrictions is violated. The optimizing compiler currently does not check for uninterruptibility violations. Consequently, it is a good idea to compile a boot image with the baseline compiler (e.g. using prototype-opt) after modifying uninterruptible code.
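The forbidden-bytecode restriction listed above lends itself to a simple mechanical check. The following is a toy sketch only: UninterruptibleCheckSketch and Insn are hypothetical names, the checker works on mnemonic strings rather than real class files, and it is not how the baseline compiler implements its warnings. It also assumes that the exemption for checkcast and instanceof refers to the type being tested against.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy checker over bytecode mnemonics; not the baseline compiler's actual check. */
final class UninterruptibleCheckSketch {
    /** Mnemonics the restrictions list as unconditionally forbidden. */
    private static final Set<String> FORBIDDEN = new HashSet<>(Arrays.asList(
        "aastore", "invokeinterface", "new", "newarray", "anewarray",
        "athrow", "monitorenter", "monitorexit", "multianewarray"));

    /** One instruction: a mnemonic plus whether its type operand names a final class. */
    static final class Insn {
        final String mnemonic;
        final boolean targetIsFinalClass; // only meaningful for checkcast/instanceof

        Insn(String mnemonic, boolean targetIsFinalClass) {
            this.mnemonic = mnemonic;
            this.targetIsFinalClass = targetIsFinalClass;
        }
    }

    /** Returns the mnemonics that violate the restrictions, in program order. */
    static List<String> violations(List<Insn> code) {
        List<String> out = new ArrayList<>();
        for (Insn i : code) {
            boolean typeTest = i.mnemonic.equals("checkcast")
                || i.mnemonic.equals("instanceof");
            if (FORBIDDEN.contains(i.mnemonic) || (typeTest && !i.targetIsFinalClass)) {
                out.add(i.mnemonic);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Insn> code = Arrays.asList(
            new Insn("aload_0", false),
            new Insn("new", false),
            new Insn("checkcast", true),
            new Insn("monitorenter", false));
        for (String v : violations(code)) {
            System.out.println("warning: forbidden bytecode in uninterruptible code: " + v);
        }
    }
}
```

A real check would also need to cover the other restrictions, such as unresolved field and method references and calls to interruptible code, which this sketch leaves out.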
If uninterruptible code were to raise a runtime exception such as NullPointerException, ArrayIndexOutOfBoundsException, or ClassCastException, then it could be interrupted. We assume that such conditions are a programming error (or VM bug) and do not flag bytecodes that might result in one of these exceptions being raised as a violation of uninterruptibility.
In a few cases it is necessary to modify the conditions of checking for uninterruptibility to avoid spurious warning messages. This should be done with extreme care. The checking conditions for a particular method can be modified by using one of the following annotations:
- org.vmmagic.pragma.UninterruptibleNoWarn - disables checking for uninterruptibility violations but behaves like org.vmmagic.pragma.Uninterruptible otherwise. Used for methods that need to be uninterruptible but are only executed when writing the boot image.
- org.vmmagic.pragma.Unpreemptible - instructs the JVM to avoid inserting operations that could trigger garbage collection or thread switching but does not disallow them. Calls to preemptible code will cause warnings. This is used for code that is involved in thread scheduling, locking or the creation of exception objects.
- org.vmmagic.pragma.UnpreemptibleNoWarn - used for unpreemptible code that calls interruptible code.
Do not use the annotation org.vmmagic.pragma.LogicallyUninterruptible. Its usage is being phased out.
The following rules determine whether or not a method is uninterruptible.
- All class initializers are interruptible, since they can only be invoked during class loading.
- All object constructors are interruptible, since they can only be invoked as part of the implementation of the new bytecode.
- If a method is annotated with org.vmmagic.pragma.Interruptible then it is interruptible.
- If none of the above rules apply and a method is annotated with org.vmmagic.pragma.Uninterruptible, then it is uninterruptible.
- If none of the above rules apply and the declaring class is annotated with org.vmmagic.pragma.Uninterruptible then it is uninterruptible.
Whether to annotate a class or a method with org.vmmagic.pragma.Uninterruptible is a matter of taste and mainly depends on the ratio of interruptible to uninterruptible methods in a class. If most methods of the class should be uninterruptible, then annotating the class is preferred.
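The precedence among the rules above can be sketched as a small decision function. This is an illustration under stated assumptions, not the Jikes RVM implementation: the nested Interruptible and Uninterruptible annotations are hypothetical stand-ins for the org.vmmagic.pragma ones, the lookup uses plain Java reflection, class initializers are omitted because reflection does not expose them, and a method matched by no rule is assumed to be interruptible.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Constructor;
import java.lang.reflect.Executable;

/** Sketch of the rule ordering; the annotations here are hypothetical stand-ins. */
final class InterruptibilityRulesSketch {
    @Retention(RetentionPolicy.RUNTIME) @interface Interruptible {}
    @Retention(RetentionPolicy.RUNTIME) @interface Uninterruptible {}

    /** Applies the rules in order; the first rule that matches wins. */
    static boolean isUninterruptible(Executable m) {
        if (m instanceof Constructor) return false;                     // constructors
        if (m.isAnnotationPresent(Interruptible.class)) return false;   // explicit opt-out
        if (m.isAnnotationPresent(Uninterruptible.class)) return true;  // method-level
        // Class-level annotation; otherwise assumed interruptible by default.
        return m.getDeclaringClass().isAnnotationPresent(Uninterruptible.class);
    }

    /** Helper: looks up a no-argument method by name and classifies it. */
    static boolean methodIsUninterruptible(Class<?> owner, String name) {
        try {
            return isUninterruptible(owner.getDeclaredMethod(name));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Helper: classifies the no-argument constructor of a class. */
    static boolean constructorIsUninterruptible(Class<?> owner) {
        try {
            return isUninterruptible(owner.getDeclaredConstructor());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Mostly uninterruptible class: annotate the class, opt out per method. */
    @Uninterruptible
    static class Collector {
        Collector() {}
        void scan() {}
        @Interruptible void report() {}
    }

    /** Mostly interruptible class: annotate only the exceptional method. */
    static class Mutator {
        Mutator() {}
        void run() {}
        @Uninterruptible void barrier() {}
    }

    public static void main(String[] args) {
        System.out.println("Collector.scan:   " + methodIsUninterruptible(Collector.class, "scan"));
        System.out.println("Collector.report: " + methodIsUninterruptible(Collector.class, "report"));
        System.out.println("Mutator.barrier:  " + methodIsUninterruptible(Mutator.class, "barrier"));
    }
}
```

The two example classes mirror the matter-of-taste choice: Collector annotates the class and marks the exception, while Mutator annotates only the one method that needs it.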