Chapter 10
Testing Jikes RVM
Jikes RVM includes provisions for running unit tests as well as functional and performance tests. It also ships with a number of actual tests, both unit and functional.
10.1 Unit Tests
Jikes RVM makes it easy to write simple unit tests. Simply give your JUnit 4 test classes a name ending in Test and place the test sources under rvm/test-src. The tests will be picked up automatically.
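For example, a minimal test class might look like the following sketch (the package and the behaviour being tested are invented for illustration; only the Test suffix and the location under rvm/test-src matter):

// rvm/test-src/org/jikesrvm/example/ExampleArithmeticTest.java
package org.jikesrvm.example;

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class ExampleArithmeticTest {

  // A trivial JUnit 4 test method; any public method annotated
  // with @Test will be executed by the test runner.
  @Test
  public void integerAdditionWorks() {
    assertEquals(4, 2 + 2);
  }
}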
The tests are then run on the bootstrap VM, i.e. the JVM used to build Jikes RVM. You can also configure the build to run unit tests on the newly built Jikes RVM. Note that this may significantly increase the build times of slow configurations (e.g. prototype and prototype-opt).
If you are developing new unit tests, it may be helpful to run them on an existing Jikes RVM image. This can be done by using the Ant target unit-tests-on-existing-image. The path for the image is determined by the usual properties of the Ant build.
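For example, assuming an image has already been built with the usual build properties (the host and configuration names below are placeholders for whatever image you have built; see Building Jikes RVM):

> ant unit-tests-on-existing-image -Dhost.name=x86_64-linux -Dconfig.name=prototype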
10.2 Functional and Performance Tests
See External Test Resources for details on downloading the prerequisites for the functional tests. The tests are executed using an Ant build file and produce results that conform to the definition below. The results are aggregated and processed to produce a high-level report describing the status of Jikes RVM.
The testing framework was designed to support continuous and periodic execution of tests. A "test-run" occurs every time the testing framework is invoked. Every "test-run" will execute one or more "test-configuration"s. A "test-configuration" defines a particular build "configuration" (see Configuring Jikes RVM for details) combined with a set of parameters that are passed to Jikes RVM during the execution of the tests. For example, a particular "test-configuration" may pass parameters such as -X:aos:enable_recompilation=false -X:aos:initial_compiler=opt -X:irc:O1 to test the Level 1 Opt compiler optimizations.
Every "test-configuration" will execute one or more "group"s of tests. Every "group" is defined by an Ant build.xml file in a separate sub-directory of $RVM_ROOT/testing/tests. Each "test" has a number of input parameters, such as the classname to execute and the parameters to pass to Jikes RVM or to the program. The "test" records a number of values such as execution time, exit code, result, and standard output, and may also record a number of statistics if it is a performance test.
The project includes several different types of test runs; a description of each test run and its purpose is given in Test Run Descriptions.
Note that the buildit script provides a fast and easy way to build and test the system. The script is simply a wrapper around the mechanisms described below.
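For example, a typical invocation might look like the following sketch (the build host localhost and the prototype configuration are placeholders; consult the script itself for its full set of options, including those that trigger test runs):

> bin/buildit localhost prototype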
10.2.1 Ant properties
There are a number of Ant properties that control the test process. Besides the properties that are already defined in Building Jikes RVM, the following test-specific properties may also be specified.
Property | Description | Default |
test-run.name | The name of the test-run. The name should match one of the files located in the build/test-runs/ directory, minus the '.properties' extension. | pre-commit |
results.dir | The directory where Ant stores the results of the test run. | ${jikesrvm.dir}/ |
results.archive | The directory where Ant gzips and archives a copy of the test run results and reports. | ${results.dir}/ |
send.reports | Define this property to send reports via email. | (Undefined) |
mail.from | The from address used when emailing reports. | jikesrvm-core@ |
mail.to | The to address used when emailing reports. | jikesrvm- |
mail.host | The host to connect to when sending mail. | localhost |
mail.port | The port to connect to when sending mail. | 25 |
<configuration>.skip.build | If set to true, the test process will skip the build step for the specified configuration. For the test process to work, the build must already be present. | (Undefined) |
skip.build | If defined, the test process will skip the build step for all configurations as well as the javadoc generation step. For the test process to work, the builds must already be present. | (Undefined) |
skip.javadoc | If defined, the test process will skip the javadoc generation step. | (Undefined) |
10.2.2 Defining a test-run
A test-run is defined by a number of properties in a property file in the build/test-runs/ directory.
The property test.configs is a whitespace-separated list of test-configuration "tags". Every tag uniquely identifies a particular test-configuration. Every test-configuration is defined by a number of properties in the property file that are prefixed with test.config.<tag>. See the following table for the possible properties.
Property | Description | Default |
tests | The names of the test groups to execute. | None |
name | The unique identifier for the test-configuration. | "" |
configuration | The name of the Jikes RVM build configuration to test. | <tag> |
target | The name of the Jikes RVM build target. This can be used to trigger compilation of a profiled image. | "main" |
mode | The test mode. May modify the way test groups execute. See individual groups for details. | "" |
extra.rvm.args | Extra arguments that are passed to Jikes RVM. These may be varied for different runs using the same image. | "" |
The order of the test-configurations in test.configs is the order in which the test-configurations are tested. The order of the groups in test.config.<tag>.tests is the order in which the tests are executed.
The simplest test-run is shown below. It will use the build configuration "prototype" and execute the tests in the "basic" group.
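test.configs=prototype
test.config.prototype.tests=basic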
The test process also expands properties in the property file, so it is possible to define a set of tests once and use them in multiple test-configurations, as in the following figure. The groups basic, optests and dacapo are executed in both the prototype and prototype-opt test-configurations.
test.set=basic optests dacapo
test.configs=prototype prototype-opt
test.config.prototype.tests=${test.set}
test.config.prototype-opt.tests=${test.set}
Each test can have additional parameters specified that will be used by the test infrastructure when starting the Jikes RVM instance that executes the test. These additional parameters are described in the following table of test-specific parameters.
Parameter | Description | Default Property | Default Value |
initial.heapsize | The initial size of the heap. | ${test.initial.heapsize} | ${config.default-heapsize.initial} |
max.heapsize | The maximum size of the heap. | ${test.max.heapsize} | ${config.default-heapsize.maximum} |
max.opt.level | The maximum optimization level for the tests, or an empty string to use the Jikes RVM default. | ${test.max.opt.level} | "" |
processors | The number of processors to use for garbage collection for the test, or 'all' to use all available processors. | ${test.processors} | all |
time.limit | The time limit for the test in seconds. After the time limit expires, the Jikes RVM instance will be forcefully terminated. | ${test.time.limit} | 1000 |
class.path | The class path for the test. | ${test.class.path} | |
extra.args | Extra arguments that are passed to Jikes RVM. | ${test.rvm.extra.args} | "" |
exclude | If set to true, the test will not be executed. | | "" |
To determine the value of a test-specific parameter, the following mechanism is used (an example follows the list):
- Search for the first of the following Ant properties that is defined, in order:
- test.config.<build-configuration>.<group>.<test>.<parameter>
- test.config.<build-configuration>.<group>.<parameter>
- test.config.<build-configuration>.<parameter>
- test.config.<group>.<test>.<parameter>
- test.config.<group>.<parameter>
- If none of the above properties is defined, then use the parameter that was passed to the <rvm> macro in the Ant build file.
- If no parameter was passed to the <rvm> macro, then use the default value, which is stored in the "Default Property" specified in the above table. By default the value of the "Default Property" is the "Default Value" given in the above table; however, a particular build file may specify a different "Default Value".
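For instance, the following hypothetical properties (the configuration, group and test names are invented for illustration) all target the time.limit parameter of a test named TestArithmetic in the basic group under the prototype configuration. Following the search order above, the first property found determines the value, here 300 seconds:

test.config.prototype.basic.TestArithmetic.time.limit=300
test.config.prototype.basic.time.limit=600
test.config.prototype.time.limit=900
test.config.basic.TestArithmetic.time.limit=1200
test.config.basic.time.limit=1500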
10.2.3 Excluding tests
Sometimes it is desirable to exclude tests. A test may be excluded because it is known to fail on a particular target platform or build configuration, or simply because it takes too long. To exclude a test, set the test-specific parameter "exclude" to true, either in .ant.properties or in the test-run properties file.
For example, at the time of writing Jikes RVM does not fully support volatile fields, and as a result the test named "TestVolatile" in the "basic" group will always fail. Rather than being notified of this failure, we can disable the test by adding a property such as test.config.basic.TestVolatile.exclude=true to the test-run properties file.
10.2.4 Executing a test-run
The tests are executed by the Ant driver script test.xml. The test-run.name property defines the particular test-run to execute and, if not set, defaults to "pre-commit". The command ant -f test.xml -Dtest-run.name=simple executes the test-run defined in build/test-runs/simple.properties. When this command completes you can point your browser at ${results.dir}/tests/${test-run.name}/Report.html for an overview of the test run, or at ${results.dir}/tests/${test-run.name}/Report.xml for an XML document describing the test results.
10.2.5 Jenkins integration
Executing a test-run on a recent version of Jikes RVM (later than 3.1.3) also produces a file called MinimalReport-JUnitFormat.xml. This file contains the test results in a format that is suitable for processing as JUnit results in Jenkins. To use this file in Jenkins, select the appropriate job and add a post-build step to publish JUnit test result reports. Ensure that the box "Retain long standard output/error" is not checked. If long standard output/error is retained, you will likely run into OutOfMemoryErrors when Jenkins parses the results of larger test runs. For example, parsing the results from the sanity test run (which contains more than 3000 test cases) exhausts 5 GB heaps.
10.3 External Test Resources
The tests included in the source tree are designed to test the correctness and performance of Jikes RVM. This section gives step-by-step instructions for setting up the external dependencies for these tests.
The first step is selecting the base directory where all the external code is to be located. The property external.lib.dir needs to be set to this location, e.g.
> mkdir -p /home/peter/Research/External
Then you need to follow the instructions below for the desired benchmarks. The instructions assume that the environment variable BENCHMARK_ROOT is set to the same location as the external.lib.dir property.
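For example, assuming the directory created above, the two settings could be made as follows (the paths are illustrative):

> echo "external.lib.dir=/home/peter/Research/External" >> .ant.properties
> export BENCHMARK_ROOT=/home/peter/Research/External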
10.3.1 Open Source Benchmarks
In the future other benchmarks such as BigInteger, Ashes or Volano may be included.
Dacapo
Dacapo describes itself as "This benchmark suite is intended as a tool for Java benchmarking by the programming language, memory management and computer architecture communities. It consists of a set of open source, real world applications with non-trivial memory loads. The suite is the culmination of over five years work at eight institutions, as part of the DaCapo research project, which was funded by a National Science Foundation ITR Grant, CCR-0085792."
The release needs to be downloaded and placed in the $BENCHMARK_ROOT/dacapo/ directory, e.g.
> mkdir -p $BENCHMARK_ROOT/dacapo
> cd $BENCHMARK_ROOT/dacapo
> wget -O dacapo-2006-10.jar "http://sourceforge.net/projects/dacapobench/files/archive/2006-10/dacapo-2006-10.jar/download?use_mirror=autoselect"
jBYTEmark
jBYTEmark is an old benchmark that was developed by Byte.com.
> mkdir -p $BENCHMARK_ROOT/jBYTEmark-0.9
> cd $BENCHMARK_ROOT/jBYTEmark-0.9
> wget http://img.byte.com/byte/bmark/jbyte.zip
> unzip -jo jbyte.zip 'app/class/*'
> unzip -jo jbyte.zip 'app/src/jBYTEmark.java'
> ... Edit jBYTEmark.java to delete "while (true) {}" at the end of main. ...
> javac jBYTEmark.java
> jar cf jBYTEmark-0.9.jar *.class
> rm -f *.class jBYTEmark.java
CaffeineMark
CaffeineMark describes itself as "The CaffeineMark is a series of tests that measure the speed of Java programs running in various hardware and software configurations. CaffeineMark scores roughly correlate with the number of Java instructions executed per second, and do not depend significantly on the amount of memory in the system or on the speed of a computer's disk drives or internet connection."
> mkdir -p $BENCHMARK_ROOT/CaffeineMark-3.0
> cd $BENCHMARK_ROOT/CaffeineMark-3.0
> wget http://www.benchmarkhq.ru/cm30/cmkit.zip
> unzip cmkit.zip
xerces
This test processes some large documents using the Xerces XML parser.
> cd $BENCHMARK_ROOT
> wget http://archive.apache.org/dist/xml/xerces-j/Xerces-J-bin.2.8.1.tar.gz
> tar xzf Xerces-J-bin.2.8.1.tar.gz
> mkdir -p $BENCHMARK_ROOT/xmlFiles
> cd $BENCHMARK_ROOT/xmlFiles
> wget http://www.ibiblio.org/pub/sun-info/standards/xml/eg/shakespeare.1.10.xml.zip
> unzip shakespeare.1.10.xml.zip
Soot
Soot describes itself as "Originally, Soot started off as a Java optimization framework. By now, researchers and practitioners from around the world use Soot to analyze, instrument, optimize and visualize Java and Android applications."
> mkdir -p $BENCHMARK_ROOT/soot-2.2.3
> cd $BENCHMARK_ROOT/soot-2.2.3
> wget http://www.sable.mcgill.ca/software/sootclasses-2.2.3.jar
> wget http://www.sable.mcgill.ca/software/jasminclasses-2.2.3.jar
Java Grande Forum Sequential Benchmarks
Java Grande Forum Sequential Benchmarks is a benchmark suite designed for single processor execution.
> mkdir -p $BENCHMARK_ROOT/JavaGrandeForum
> cd $BENCHMARK_ROOT/JavaGrandeForum
> wget http://www2.epcc.ed.ac.uk/javagrande/seq/jgf_v2.tar.gz
> tar xzf jgf_v2.tar.gz
Java Grande Forum Multi-threaded Benchmarks
Java Grande Forum Multi-threaded Benchmarks is a benchmark suite designed for parallel execution on shared memory multiprocessors.
> mkdir -p $BENCHMARK_ROOT/JavaGrandeForum
> cd $BENCHMARK_ROOT/JavaGrandeForum
> wget http://www2.epcc.ed.ac.uk/javagrande/threads/jgf_threadv1.0.tar.gz
> tar xzf jgf_threadv1.0.tar.gz
JLex Benchmark
JLex is a lexical analyzer generator, written for Java, in Java.
> mkdir -p $BENCHMARK_ROOT/JLex-1.2.6/classes/JLex
> cd $BENCHMARK_ROOT/JLex-1.2.6/classes/JLex
> wget http://www.cs.princeton.edu/~appel/modern/java/JLex/Archive/1.2.6/Main.java
> mkdir -p $BENCHMARK_ROOT/QBJC
> cd $BENCHMARK_ROOT/QBJC
> wget http://www.ocf.berkeley.edu/~horie/qbjlex.txt
> mv qbjlex.txt qb1.lex
10.3.2 Proprietary Benchmarks
SPECjbb2005
SPECjbb2005 describes itself as "SPECjbb2005 (Java Server Benchmark) is SPEC's benchmark for evaluating the performance of server side Java. Like its predecessor, SPECjbb2000, SPECjbb2005 evaluates the performance of server side Java by emulating a three-tier client/server system (with emphasis on the middle tier). The benchmark exercises the implementations of the JVM (Java Virtual Machine), JIT (Just-In-Time) compiler, garbage collection, threads and some aspects of the operating system. It also measures the performance of CPUs, caches, memory hierarchy and the scalability of shared memory processors (SMPs). SPECjbb2005 provides a new enhanced workload, implemented in a more object-oriented manner to reflect how real-world applications are designed and introduces new features such as XML processing and BigDecimal computations to make the benchmark a more realistic reflection of today's applications." SPECjbb2005 requires a license to download and use.
SPECjbb2005 can be run from the command line, and may also be run as part of the regression tests.
> cd $BENCHMARK_ROOT/SPECjbb2005
> ...Extract package here???...
SPECjbb2000
SPECjbb2000 describes itself as "SPECjbb2000 (Java Business Benchmark) is SPEC's first benchmark for evaluating the performance of server-side Java. Joining the client-side SPECjvm98, SPECjbb2000 continues the SPEC tradition of giving Java users the most objective and representative benchmark for measuring a system's ability to run Java applications." SPECjbb2000 requires a license to download and use. New benchmarking should no longer be performed with SPECjbb2000, as it has been superseded by SPECjbb2005 and the two benchmarks have very different characteristics.
> cd $BENCHMARK_ROOT/SPECjbb2000
> ...Extract package here???...
SPEC JVM98 Benchmarks
SPEC JVM98 describes its features as follows: "Measures performance of Java Virtual Machines. Applicable to networked and standalone Java client computers, either with disk (e.g., PC, workstation) or without disk (e.g., network computer) executing programs in an ordinary Java platform environment. Requires Java Virtual Machine compatible with JDK 1.1 API, or later." The SPEC JVM98 benchmarks require a license to download and use.
10.4 Test Run Descriptions
The Jikes RVM project contains several different test runs with different purposes. This section describes the purpose of each test run.
10.4.1 pre-commit
This test run MUST be run prior to committing code. It is relatively short and is designed to catch as many potential bugs as possible in the shortest possible time. The pre-commit test run is expected to take 7-15 minutes on modern Intel hardware.
10.4.2 core
There is a set of workloads we consider important (i.e. dacapo and SPEC*), and a set of build configurations we consider important (i.e. prototype, development, production). As a group, we wish to guarantee that all important workloads run correctly on all important build configurations; that is, we should NEVER regress. The core test run is designed to identify, as early as possible, any failures in this matrix of build configuration x workload. It is run continuously, 24 hours a day (or at least every time a change is made). The core test run is expected to take 2-6 hours to complete, depending on the environment.
The best way to identify failures is to stress test the system by forcing frequent garbage collections and compilation at specific optimization levels (and perhaps frequent thread switching and frequent OSR events in the future). It is critical that we have a stable research base, so intermittent failures are NOT acceptable. If we cannot pass a stress test, then there is no guarantee that we have a stable research base.
10.4.3 sanity
The sanity test runs cover a larger number of build configurations and workloads. They may not always pass: they exercise many of the less frequently used configurations (gctrace, gcspy, and individual stress tests) and less important workloads. Performance tests are also included in this test run. The sanity runs are something we use to gauge the health of the project as a whole and to track regressions. They are run once a day on the major platforms. Their time to complete can vary, but is expected to be at least several hours.
10.4.4 Other test runs
A set of test runs used for testing specific aspects of the system, such as performance, gcmap bug finding, and I/O hammering. There may also be a set of personal or site-specific test runs in this set that are not checked into the Git repository.
10.4.5 Summary
We must NEVER regress in the core test run. The pre-commit test run attempts to ensure that there are no core regressions while keeping running time reasonable. The sanity test run gives us an overall picture of the health of the code base, while the other test runs are used at different times for different purposes.