JXInsight/Simz – Near Real-Time Distributed JVM Load Balance Analysis
“The way that organizations and organisms anticipate the future is by taking signals from the past, most of the time. Knowing what has just happened in the past, an organism can bring that data forward to modify its future actions.” - Kevin Kelly
Anticipation is extremely difficult in a distributed context because to do so effectively we need to shorten and shrink both time (observational delay) and space (observational dimension). This is a problem for many load balancers that attempt to employ sophisticated scheduling algorithms (other than random or round robin), directing traffic based on past, present and predicted workload patterns, request performance and resource capacity availability.
There are many technical challenges to overcome in achieving optimal load balancing, such as being able to predict not just future workload patterns but the actual completion times of work already distributed, as well as the availability of additional capacity subsequent to work completion. Ideally a load balancer would like to estimate when capacity is likely to be sufficiently freed up on one or more application backends to handle current inbound traffic. Doing so accurately can be a challenge when application backends also employ queues and scheduling in front of service entry points. So whilst work might have been dispatched, it does not necessarily mean it has actually started execution, though we could probably include this wait time in the service time of a queueing model.
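As a rough illustration of folding queue wait into a single service time, Little's law (L = λW) relates the mean number of requests resident at a backend to its completion rate and the mean time each request spends there, queued or executing. The sketch below assumes steady-state averages; the class name, method name and figures are hypothetical and are not part of any JXInsight API.

```java
// Sketch only: Little's law (L = lambda * W) applied to a single backend.
// All names and figures here are illustrative assumptions.
final class ResidenceEstimate {

    /**
     * Mean time a request spends at the backend (queue wait plus execution),
     * in seconds, given the mean number of in-flight requests and the mean
     * completion rate.
     */
    static double residenceSeconds(double inFlight, double completionsPerSecond) {
        if (completionsPerSecond <= 0) {
            throw new IllegalArgumentException("completion rate must be positive");
        }
        return inFlight / completionsPerSecond;
    }

    public static void main(String[] args) {
        // Hypothetical figures: 40 requests dispatched but not yet completed,
        // with the backend completing 200 requests per second.
        System.out.println(residenceSeconds(40, 200)); // 0.2
    }
}
```

The estimate only holds for long-run averages, which is part of why shortening the observational delay matters: the averages need to be recent to be useful.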
This is a very interesting optimization problem to solve but before we can tackle this we must be able to detect an imbalance in the distribution of workload and its execution within application server/service backends.
The rest of this article demonstrates how effective an approach based on discrete event simulation and real-time replay is at detecting even the smallest of changes across multiple JVM instances under constant load conditions, and how fairly the service schedules the replay of homogeneous execution behavior. In doing so we achieve extreme levels of application activity monitoring.
To converge and compress both time and space we can use JXInsight/OpenCore and JXInsight/Simz, instrumenting each application backend instance with metering and feeding such events to a Simz service by enabling the optional simz metering extension. The service then replays, in near real-time, each metered activity's begin() and end() pair of instrumentation calls per thread, with a one-to-one mapping of application thread to simulated thread.
Because we are not interested in resource metering of the interval between the begin() and end() calls of each instrumented probe, we can install a metering interceptor within the Simz service itself that votes no (in the begin() callback method) to the metering of each firing probe, and instead creates and updates a metric counter for each application backend, tracking the execution of work performed remotely. The interceptor is completely oblivious to the fact that it is in a simulated environment, except for the code allowing for multiple UUIDs to be present within an environment context, though there would only ever be one in a real application runtime.
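The shape of such an interceptor can be sketched as follows. This is a minimal illustration, not the JXInsight interceptor API: the `SketchInterceptor` interface, its `begin` signature and the `CountingInterceptor` class are hypothetical stand-ins, and the only behavior taken from the description above is that the callback vetoes metering while counting firings per backend UUID.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a metering interceptor callback; NOT the JXInsight API.
interface SketchInterceptor {
    /** Returns true to allow metering of the firing probe, false to veto it. */
    boolean begin(UUID backend, String probeName);
}

final class CountingInterceptor implements SketchInterceptor {
    // One counter per application backend, keyed by the backend's UUID.
    private final ConcurrentMap<UUID, AtomicLong> counts =
            new ConcurrentHashMap<UUID, AtomicLong>();

    @Override
    public boolean begin(UUID backend, String probeName) {
        AtomicLong counter = counts.get(backend);
        if (counter == null) {
            // Create the counter on first sight of a backend; tolerate races.
            AtomicLong fresh = new AtomicLong();
            counter = counts.putIfAbsent(backend, fresh);
            if (counter == null) {
                counter = fresh;
            }
        }
        counter.incrementAndGet();
        return false; // vote no: count the firing but skip resource metering
    }

    /** Number of probe firings observed so far for the given backend. */
    long count(UUID backend) {
        AtomicLong counter = counts.get(backend);
        return counter == null ? 0L : counter.get();
    }
}
```

Keying the map by UUID is what lets a single interceptor instance serve several replayed backends at once; in a real application runtime the map would only ever hold one entry.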
The following code will be used to simulate an application backend, with the call() method representing the work performed by a typical web application server request processing loop. And yes, we want this to blaze along.
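For orientation, a minimal stand-in for a backend of this kind is sketched below. It is not the listing used in the tests: the class name, the trivial call() body and the bounded loop (so that the example terminates) are all assumptions made for illustration.

```java
// Illustrative stand-in for a backend request loop; not the original test code.
final class SketchBackend {
    private long handled;

    /** Represents one unit of request-processing work; intentionally trivial. */
    void call() {
        handled++;
    }

    /** Invokes call() the given number of times and returns the total handled so far. */
    long run(long iterations) {
        for (long i = 0; i < iterations; i++) {
            call();
        }
        return handled;
    }

    public static void main(String[] args) {
        SketchBackend backend = new SketchBackend();
        // A real load generator would loop indefinitely; this one is bounded.
        System.out.println(backend.run(1000000L));
    }
}
```

Keeping call() nearly empty means almost all of the measured cost comes from the instrumentation and the event feed rather than from the simulated work.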
Before running up the two test application backend instances the Simz service is started.
The following is the jxinsight.override.config file loaded by the service from the jxinsight.home directory. This instructs the metering runtime to install our interceptor and collect metric samples every 5 seconds.
To run up the instances of the dynamically instrumented application backend the following script was used.
Here is the jxinsight.override.config file loaded by the application backend from the ./turbo home directory.
Typically the simulation replay service is started on a remote machine, but in the testing reported here both application backends and the service were run on the same machine (2.6 GHz Core i7).
Here is a listing of the metrics sampling performed in the first 35 seconds. After the initial startup interval each application backend is executing approximately 14 million call() method invocations per second. The Simz service is replaying 28 million probes per second and 56 million remote metering feed events per second, at an average of 71 nanoseconds per call and 36 nanoseconds per event. Pretty impressive! We can see from the last metric sample collection that the OS scheduling is pretty fair, as is the event replay.
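One reading of these figures can be checked with a little arithmetic. The sketch below assumes that each call() produces exactly two feed events (one begin, one end) and that the quoted per-call and per-event averages are per application thread rather than across the whole service; both are interpretations, not statements from the test output.

```java
// Sketch: reproduces the quoted rates and averages under stated assumptions.
final class ReplayArithmetic {

    /** Average nanoseconds per operation at a given rate of operations per second. */
    static double nanosPerOperation(double operationsPerSecond) {
        return 1000000000.0 / operationsPerSecond;
    }

    public static void main(String[] args) {
        double callsPerBackend = 14000000.0; // approximate rate from the samples
        int backends = 2;
        int eventsPerCall = 2;               // assumption: one begin and one end event

        double probesPerSecond = callsPerBackend * backends;      // 28 million
        double eventsPerSecond = probesPerSecond * eventsPerCall; // 56 million

        System.out.println((long) probesPerSecond);
        System.out.println((long) eventsPerSecond);
        // Per thread: 14 million calls and 28 million events each second.
        System.out.println(Math.round(nanosPerOperation(callsPerBackend)));                 // 71
        System.out.println(Math.round(nanosPerOperation(callsPerBackend * eventsPerCall))); // 36
    }
}
```

Under these assumptions the 71 ns and 36 ns figures follow directly from the per-backend rate, while the 28 million and 56 million figures are the totals across both backends.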
We don’t necessarily need to instrument methods dynamically at class load-time using a javaagent library. Below is an alternative implementation of an application backend, this time explicitly using the Probes Open API.
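The general pattern of explicit instrumentation, pairing every begin() with an end() around the work, can be sketched without the library itself. The `SketchProbes` and `ManualBackend` types below are hypothetical stand-ins written for this illustration; they are not the Probes Open API and their names and signatures should not be read as such.

```java
// Hypothetical stand-ins illustrating the begin()/end() pattern; NOT the Probes Open API.
final class SketchProbes {
    static long begun; // number of begin() calls seen
    static long ended; // number of end() calls seen

    /** Handle returned when a probe fires; end() closes the metered interval. */
    static final class Probe {
        void end() {
            ended++;
        }
    }

    static Probe begin(String name) {
        begun++;
        return new Probe();
    }
}

final class ManualBackend {
    void call() {
        SketchProbes.Probe probe = SketchProbes.begin("backend.call");
        try {
            // Request-processing work would go here.
        } finally {
            probe.end(); // always pair end() with begin(), even if the work throws
        }
    }
}
```

The try/finally block is the important design choice: a replay service that maps each begin to an end per thread depends on the pair never being left open.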
Here is the script used to run this version of the application backend, which also uses the same jxinsight.override.config as above. Note the removal of the -javaagent argument and the addition to the classpath of the
For this test run we mixed a dynamically instrumented application backend with a manually instrumented backend. From the metric sample collections below there is a hint at this difference with one backend hovering in the 14.6 million probes per second range and the other at 14.1 million.
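To put a number on that hint, the gap between the two backends can be expressed as a relative spread. The sketch below is one simple way to do so; the `ImbalanceCheck` class and the 2% threshold are hypothetical choices for illustration, and only the 14.6 million and 14.1 million rates come from the samples.

```java
// Sketch: relative spread between per-backend rates; threshold is an assumption.
final class ImbalanceCheck {

    /** Relative spread of the rates: (max - min) / mean. */
    static double relativeSpread(double... ratesPerSecond) {
        if (ratesPerSecond.length == 0) {
            throw new IllegalArgumentException("at least one rate is required");
        }
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double sum = 0.0;
        for (double rate : ratesPerSecond) {
            min = Math.min(min, rate);
            max = Math.max(max, rate);
            sum += rate;
        }
        double mean = sum / ratesPerSecond.length;
        return (max - min) / mean;
    }

    public static void main(String[] args) {
        double spread = relativeSpread(14600000.0, 14100000.0);
        double threshold = 0.02; // hypothetical alerting threshold of 2%
        System.out.println(spread > threshold); // true: the spread is roughly 3.5%
    }
}
```

A spread of about 3.5% is small in absolute terms, but with samples this consistent it is enough to tell the two instrumentation approaches apart.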
Here is another test run, this time running two instances of the manually instrumented application backend with metric sample collection performed every second. The results are very consistent, even when throughput drops because a scheduled job starts up on the machine under test.
In a follow-up article we will look at how this amazing technology can go beyond answering “what is happening (at this moment) on server X”, answering such questions as “when will work likely complete on server X” and “how much capacity is likely to be available N seconds from now on server X”.
Java(TM) SE Runtime Environment (build 1.7.0_12-ea-b05)
Java HotSpot(TM) 64-Bit Server VM (build 24.0-b26, mixed mode)
OS X 10.8.2 2.6 GHz Core i7
8 GB 1600 MHz DDR3