Monkeying Around with Latency Injection at Service Points using QoS & ACE
As part of our test regime for many of the adaptive control in execution (ACE) metering extensions currently under development we have been looking at various techniques for injecting disturbance into the test environment to assess the effectiveness of the adaptation algorithms built into our controllers. With the initial metering extensions focused largely on response time and throughput, that disturbance manifests itself as latency (or delayed execution).
In the article Determining whether an Application Performance Hotspot is a Performance Limiter we showed how easily delay can be injected transparently into any JVM application for the purpose of confirming a bottleneck and measuring the degree to which changes in its performance behavior impact its callers – both direct and indirect. In this article we will cover many other ways this is possible with JXInsight/OpenCore without the need to change code or introduce an explicit compile-time dependency on an external library. We want to monkey around but not have to call into a monkey’s code base.
Let’s start with the case in which service A running in one JVM makes remote procedure calls into service B running in another JVM, and we would like to introduce delay into B to see how it impacts A. Whilst we may know the interfaces used in the service interaction and data communication between A and B, we might not be sure of the mapping to actual method implementations. Ideally we want to designate a particular part of the codebase in service B as the implementation of the service and then inject latency into entry point calls (all determined dynamically at runtime) representing the service call boundary. Initially we don’t want to inject latency into every single internal method invocation in processing a request, because that would make it harder to reason about how much delay was introduced and could call into question the impact of other system dynamics, especially if B makes outbound calls to other services which call re-entrantly back into A or B.
Quality of Service (QoS) for Apps
Assuming the JVM running service B has our metering instrumentation agent deployed, we can introduce delay by simply enabling our qos metering extension, defining a QoS service, mapping it to a QoS resource, and finally controlling service reservations using either rate limiting or a capacity fixed artificially low. The degree of delay is controlled via a timeout setting on the service, and its occurrence by the availability (more so the unavailability) of capacity at the resource.
The configuration below introduces a delay, up to a maximum of 10 seconds (10,000 ms), for all entry point calls into the service codebase beyond the first call within each 1 second (1,000 ms) interval, by replenishing the (non-renewal) resource every second with 1 unit of reservation capacity. We can increase the interval properties on the resource to vary the call sample size and the sequencing of the injected delay.
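The behavior this configuration describes can be sketched in plain Java using a semaphore as a stand-in for the QoS resource. This is only an illustrative model of the replenished, non-renewal reservation behavior, assuming the 1,000 ms interval and 10,000 ms timeout above; it is not the qos extension’s actual configuration or API:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Illustrative model (not the product's API): a non-renewal resource
// holding at most 1 unit of reservation capacity, replenished every
// 1,000 ms. The first call in each interval consumes the unit; any
// further call waits for the next replenishment, up to a 10,000 ms
// timeout, before proceeding regardless.
public class RateLimitedService {
    private static final Semaphore CAPACITY = new Semaphore(1);

    static {
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        }).scheduleAtFixedRate(() -> {
            // replenish a single unit, never accumulating beyond 1
            if (CAPACITY.availablePermits() == 0) {
                CAPACITY.release();
            }
        }, 1_000, 1_000, TimeUnit.MILLISECONDS);
    }

    public static void entryPoint(Runnable body) throws InterruptedException {
        // non-renewal: the unit is consumed and never handed back;
        // on timeout the call proceeds anyway after the maximum delay
        CAPACITY.tryAcquire(10_000, TimeUnit.MILLISECONDS);
        body.run();
    }
}
```

Note how the delay falls only on calls arriving after the interval’s unit has been consumed, which is what keeps the amount of injected latency easy to reason about.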
Here is an alternative configuration, this one introducing a similar delay as above but for all concurrent calls above a degree of one, by restricting the (renewal) resource reservation capacity to only 1. A benefit of this approach is that we can increase the capacity slightly to allow both A and B to ramp up and then introduce randomness into the call selection process by way of the typical fluctuating call concurrency.
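Again as a plain Java sketch, and again only a model rather than the extension’s actual configuration: the renewal variant behaves like a semaphore whose single permit is returned when the call completes, so only calls that overlap another in-flight call are delayed, up to the same assumed 10,000 ms timeout:

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Illustrative model (not the product's API): a renewal resource with
// reservation capacity 1. The permit is restored on call completion,
// so sequential calls pass untouched while concurrent calls wait up
// to 10,000 ms for capacity before proceeding.
public class ConcurrencyLimitedService {
    private static final Semaphore CAPACITY = new Semaphore(1);

    public static void entryPoint(Runnable body) throws InterruptedException {
        boolean acquired = CAPACITY.tryAcquire(10_000, TimeUnit.MILLISECONDS);
        try {
            body.run();
        } finally {
            if (acquired) {
                CAPACITY.release(); // renewal: capacity is restored
            }
        }
    }
}
```

Raising the semaphore’s initial permits models the slightly increased capacity mentioned above, so that only concurrency beyond the ramp-up level is penalized.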
There are many other possibilities with our QoS for Apps technology, including concurrency barriers and latches attached to resources, which are detailed in the article Achieving Required Workload Levels in Performance Testing using QoS for Apps.
Adaptive Control in Execution (ACE)
We can use the very same technology we are testing to introduce the delay itself, via the concurrency control mechanism within the adaptive valves.
The following configuration limits the concurrency of calls flowing through the tvalve, in executing methods within the com.acme package and its sub-packages, to 1, thereby introducing a 10 second delay for those additional concurrent methods. Just like in our last QoS configuration, randomness in the injection of delay can be piggybacked on the typical volatility of call concurrency by adjusting the concurrency setting.
If you really want to monkey around you can even define multiple valve configurations with different settings.
But we can do much better by introducing randomness into the delayed call selection by way of the adaptive capabilities within the valve itself, which use hill climbing (local search) to find the optimal concurrency level for throughput in the case of the tvalve metering extension, or for average response time in the case of the rvalve metering extension. We allow the adaptation to occur, but after 100 evaluations and concurrency adjustments we force the valve to reset itself back to its lower (and initial) concurrency setting and to relearn and re-adjust all over again.
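The reset-and-relearn cycle can be sketched as follows. The hill-climbing logic here is a deliberately simplified, hypothetical model of what an adaptive valve might do with a throughput signal, not the actual tvalve algorithm:

```java
// Hypothetical model of an adaptive valve: hill climbing (local search)
// over the concurrency level, scoring each level by an observed
// throughput measurement, with a forced reset back to the lower bound
// every 100 evaluations so the search must relearn from scratch.
public class HillClimbingValve {
    private final int lower;           // lower (and initial) concurrency
    private int concurrency;
    private int evaluations;
    private double bestScore = Double.NEGATIVE_INFINITY;
    private int step = 1;              // current search direction

    public HillClimbingValve(int lower) {
        this.lower = lower;
        this.concurrency = lower;
    }

    // feed one throughput measurement taken at the current concurrency
    // level; returns the next concurrency level to run with
    public int evaluate(double throughput) {
        if (++evaluations % 100 == 0) { // forced reset: relearn
            concurrency = lower;
            bestScore = Double.NEGATIVE_INFINITY;
            step = 1;
            return concurrency;
        }
        if (throughput > bestScore) {   // improving: keep climbing
            bestScore = throughput;
        } else {                        // worse: reverse direction
            step = -step;
        }
        concurrency = Math.max(lower, concurrency + step);
        return concurrency;
    }
}
```

Because every reset restarts the climb from the lower bound, the level at which calls get delayed keeps shifting, which is exactly the randomness we want in the injection.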
Here is a similar configuration using the rvalve metering extension.
Finally we can marry both QoS for Apps and Adaptive Control in Execution (ACE) technologies by enabling valves on QoS resources.
And if that is not enough, you can always create your very own metering extension that introduces delay based on application specifics, using the extensibility enabled via our interceptor metering extension.
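As a final sketch, here is what an application-specific delay rule might look like. The onEnter hook, the probe-name argument, and the com.acme.orders prefix are hypothetical stand-ins chosen for illustration; the interceptor metering extension’s actual SPI is not reproduced here:

```java
// Hypothetical stand-in for an interceptor-based metering extension:
// delay only those instrumented entry points whose probe name falls
// under a designated service package prefix.
public class DelayInterceptor {
    private final String servicePrefix; // hypothetical, e.g. "com.acme.orders"
    private final long delayMs;

    public DelayInterceptor(String servicePrefix, long delayMs) {
        this.servicePrefix = servicePrefix;
        this.delayMs = delayMs;
    }

    // hypothetical hook invoked on entry into an instrumented probe
    public void onEnter(String probeName) {
        if (probeName.startsWith(servicePrefix)) {
            try {
                Thread.sleep(delayMs); // inject the application-specific delay
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

The point is that the selection rule can consult anything the application exposes – a package, a tenant, a payload size – rather than being limited to capacity and concurrency.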