SPECjbb2000, A Java Business Benchmark

1.0 Overview

SPECjbb2000 is a software benchmark product developed by the Standard Performance Evaluation Corporation (SPEC), a non-profit group of computer vendors, system integrators, universities, research organizations, publishers, and consultants. It is designed to measure a system's ability to run Java server applications.

SPECjbb2000 is implemented as a Java program emulating a 3-tier system with emphasis on the middle tier. All three tiers are implemented within the same JVM. These tiers mimic a typical business application, where users in Tier 1 generate inputs that result in the execution of business logic in the middle tier (Tier 2), which in turn calls a database on the third tier. In SPECjbb2000, the user tier is implemented as random input selection. SPECjbb2000 fully implements the middle tier business logic. The third tier is represented by binary trees rather than a separate database.

SPECjbb2000 is entirely self-contained and self-driving: it generates its own data and its own multi-threaded operations, and it does not depend on any package beyond the JRE.

SPECjbb2000 is inspired by the TPC-C benchmark and loosely follows the TPC-C specification for its schema, input generation, and operation profile. SPECjbb2000 replaces database tables with Java classes and replaces data records with Java objects. The objects are held in memory by either BTrees (also Java objects) or other data objects. Therefore SPECjbb2000 does no disk I/O. Since there is no database, it does not support object persistence with ACID properties corresponding to an RDB implementation. SPECjbb2000 uses only Java synchronization to synchronize multi-threaded access to shared objects in the population. Since users do not reside on external client systems, there is no network I/O in SPECjbb2000.
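The idea of replacing tables with classes and records with objects can be illustrated with a minimal sketch. It is not the benchmark's source code: the class InMemoryTableSketch, the Customer record, and its fields are hypothetical, and java.util.TreeMap is used as a stand-in for the benchmark's own BTree class. Access is guarded only by Java synchronization, as described above.

```java
import java.util.TreeMap;

// Hypothetical illustration: a "table" held entirely in memory as Java objects.
final class InMemoryTableSketch {

    /** A data record represented as a plain Java object (fields are assumptions). */
    static final class Customer {
        final int id;
        final String name;
        long balanceCents;

        Customer(int id, String name, long balanceCents) {
            this.id = id;
            this.name = name;
            this.balanceCents = balanceCents;
        }
    }

    // TreeMap is a stand-in for the benchmark's BTree; no disk or database is involved.
    private final TreeMap<Integer, Customer> customersById = new TreeMap<>();

    // Every method is synchronized, so threads take the table's monitor
    // before touching shared objects.
    synchronized void insert(Customer c) {
        customersById.put(c.id, c);
    }

    synchronized long balanceOf(int id) {
        Customer c = customersById.get(id);
        if (c == null) {
            throw new IllegalArgumentException("unknown customer " + id);
        }
        return c.balanceCents;
    }

    /** Subtracts a payment from the balance; returns false if the customer is unknown. */
    synchronized boolean applyPayment(int id, long amountCents) {
        Customer c = customersById.get(id);
        if (c == null) {
            return false;
        }
        c.balanceCents -= amountCents;
        return true;
    }

    synchronized int size() {
        return customersById.size();
    }

    public static void main(String[] args) {
        InMemoryTableSketch table = new InMemoryTableSketch();
        table.insert(new Customer(7, "Ada", 1000L));
        table.applyPayment(7, 250L);
        System.out.println("balance in cents: " + table.balanceOf(7));
    }
}
```

Because nothing is written to disk, the data in such a structure is lost when the JVM exits, which is why no ACID persistence is claimed.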

While SPECjbb2000 is inspired by TPC-C, the two are in no way comparable. SPECjbb2000 is memory resident, uses totally different data set sizes and a different mix of workloads, performs no disk I/O, and has no think times. It has a different set of run and reporting rules, a different measure of throughput, and a different metric. Any such comparison would be a SERIOUS violation of SPEC's run and reporting rules and of TPC's "fair use policy." Violations are subject to penalties.

2.0 Internals

Since SPECjbb2000 is loosely based on the TPC-C specification, most of the same nomenclature applies. The system modeled is a wholesale company, with warehouses that serve a number of districts. There is a set of operations that customers (also known as users or terminals) initiate, such as placing new orders or requesting the status of an existing order. Additional operations are generated within the company, such as processing orders for delivery, entering customer payments, and checking stock levels.

In SPECjbb2000, there is only one terminal (or customer) active per warehouse. A warehouse is a unit of stored data. It contains roughly 25 MB of data stored in BTrees. Terminals map directly to Java threads. Each thread executes operations in sequence, with each operation selected from the operation mix using a probability distribution. As the number of warehouses increases during the full benchmark run, so does the number of threads.
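The mapping of one terminal thread per warehouse, with each thread drawing operations from a weighted mix, might be sketched as follows. This is an illustration under stated assumptions, not the benchmark's code: the class OperationMixSketch, the method names, and in particular the WEIGHTS values are hypothetical and are not taken from SPECjbb2000. The sketch also runs a fixed number of operations per terminal for simplicity, and it only counts the selected operations instead of executing business logic.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration of one terminal thread per warehouse.
final class OperationMixSketch {

    // The five operations named in the text above.
    enum Operation { NEW_ORDER, PAYMENT, ORDER_STATUS, DELIVERY, STOCK_LEVEL }

    // Assumed relative weights, one per Operation, for illustration only.
    private static final int[] WEIGHTS = {45, 43, 4, 4, 4};

    /** Picks one operation with probability proportional to its weight. */
    static Operation pick(Random rng) {
        int total = 0;
        for (int w : WEIGHTS) {
            total += w;
        }
        int roll = rng.nextInt(total); // uniform in [0, total)
        Operation[] ops = Operation.values();
        for (int i = 0; i < ops.length; i++) {
            roll -= WEIGHTS[i];
            if (roll < 0) {
                return ops[i];
            }
        }
        throw new AssertionError("unreachable: weights cover the whole range");
    }

    /**
     * Starts one thread per warehouse. Each thread selects opsPerTerminal
     * operations in sequence and records how often each one was chosen.
     * Returns counts[warehouse][operation ordinal].
     */
    static int[][] runPoint(int warehouses, int opsPerTerminal) {
        int[][] counts = new int[warehouses][Operation.values().length];
        Thread[] terminals = new Thread[warehouses];
        for (int w = 0; w < warehouses; w++) {
            final int id = w;
            terminals[w] = new Thread(() -> {
                Random rng = new Random(id); // each terminal has its own generator
                for (int i = 0; i < opsPerTerminal; i++) {
                    counts[id][pick(rng).ordinal()]++; // each thread writes only its own row
                }
            });
            terminals[w].start();
        }
        try {
            for (Thread t : terminals) {
                t.join(); // join makes each thread's writes visible to the caller
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for terminals", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        for (int[] row : runPoint(3, 10_000)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Raising the warehouses argument raises the thread count by the same amount, which mirrors how the number of threads grows with the number of warehouses during a full run.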

3.0 Code Structure

The benchmark code is shipped in jar files. In addition, source code is provided in the src subdirectory for reference. Note that section 2.2 of the run rules prohibits recompilation for published results.

The jar files contain the following code:

jbb.jar: code related to the infrastructure of the benchmark.
jbb_no_precompile.jar: code related to the customizable objects in the benchmark. Since this code represents classes that are dynamically loaded by the benchmark, nothing in this jar file may be precompiled by a static compiler. Refer to run rules section 2.1.1.
reporter.jar: code for postprocessing of results, metric computation, and result file generation.
check.jar: code for validity checking and verification.

4.0 Performance Metric

SPECjbb2000 measures the throughput of the underlying Java platform, expressed as the number of business operations performed per second.

A "point" represents a two-minute measurement of work done at a given number of warehouses. A full benchmark run consists of a sequence of measurement points with an increasing number of warehouses (and thus an increasing number of threads).

The SPECjbb2000 metric is calculated as follows:

All points (numbers of warehouses) are run, from 1 up to at least twice the number of warehouses expected to produce the peak throughput. At a minimum all points from 1 to 8 must be run.
The peak is observed to be at N warehouses.
The throughputs for all points from N to 2*N warehouses, inclusive, are averaged. This average is the SPECjbb2000 metric. As explained in section 2.3 of the run and reporting rules, results from systems that are unable to run all points up to 2*N warehouses are still considered valid. For all the missing points in the range N+1 to 2*N, the throughput is considered to be 0 ops/second in the metric computation.
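The calculation described in the steps above can be sketched in a few lines. This is an illustrative reading of the text, not the code in reporter.jar: the class JbbMetricSketch and its method names are hypothetical, and the choice to take the lowest warehouse count when two points tie for the peak is an assumption, since the text does not address ties.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the metric computation described above.
final class JbbMetricSketch {

    /**
     * Returns N, the warehouse count with the highest measured throughput.
     * On a tie, the lowest warehouse count wins (an assumption of this sketch).
     */
    static int peakWarehouses(Map<Integer, Double> throughputByWarehouses) {
        if (throughputByWarehouses.isEmpty()) {
            throw new IllegalArgumentException("no measurement points");
        }
        int peak = -1;
        double best = Double.NEGATIVE_INFINITY;
        // Iterate in ascending warehouse order so ties resolve deterministically.
        for (Map.Entry<Integer, Double> e : new TreeMap<>(throughputByWarehouses).entrySet()) {
            if (e.getValue() > best) {
                best = e.getValue();
                peak = e.getKey();
            }
        }
        return peak;
    }

    /**
     * Averages the throughputs of the points N through 2*N inclusive.
     * Points missing from the map count as 0 ops/second.
     */
    static double metric(Map<Integer, Double> throughputByWarehouses) {
        int n = peakWarehouses(throughputByWarehouses);
        double sum = 0.0;
        for (int w = n; w <= 2 * n; w++) {
            sum += throughputByWarehouses.getOrDefault(w, 0.0);
        }
        return sum / (n + 1); // N..2*N inclusive is N+1 points
    }

    public static void main(String[] args) {
        Map<Integer, Double> points = new TreeMap<>();
        points.put(1, 100.0);
        points.put(2, 200.0);
        points.put(3, 180.0);
        points.put(4, 160.0);
        System.out.println("peak at N = " + peakWarehouses(points));
        System.out.println("metric = " + metric(points));
    }
}
```

With the made-up sample values in main, the peak is at N = 2, so the points for 2, 3, and 4 warehouses are averaged: (200 + 180 + 160) / 3 = 180. If the point for 4 warehouses were missing, it would contribute 0 and the average would drop to (200 + 180 + 0) / 3.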

The reporting tool contained within SPECjbb2000 produces a graph of the throughput at all the measured points with warehouses on the horizontal axis and throughputs on the vertical axis. All points from 1 to the minimum of 8 or 2*N are required to be run and reported. Missing points in the range N+1 to 2*N will be reported to have a throughput of 0 ops/second. The points being averaged for the metric will be marked on the report.

5.0 Conclusions

While SPECjbb2000 is not a full-blown OLTP benchmark, it is a very good stand-in for a large business application. It is also a very good functional and performance test for Java platforms. It has been found to effectively exercise the implementation of the Java Virtual Machine (JVM), the just-in-time (JIT) compiler, garbage collection, threads, and some aspects of the operating system. The benchmark measures the performance of CPUs, caches, and the memory hierarchy, as well as the scalability of Shared Memory Processors (SMPs).