Activity Based Resource Management for Multitenant JVM Services
The growth of cloud computing and cloud-based applications over the last few years has spurred renewed interest in multi-tenancy architectures. Companies such as IBM are actively researching ways of bringing multi-tenancy to the JVM in the cloud, in which an application (and its context) serves as a tenant within a shared JVM runtime. Other companies are exploring ways to “innovate” in areas of service differentiation, recognizing different classes of customers in terms of cost and quality (performance, reliability, etc.), and multi-tenancy invariably plays a significant role in achieving this.
“Multitenancy is the fundamental technology that clouds use to share IT resources cost-efficiently and securely” - Salesforce.com
Whilst multi-tenancy offers many advantages to the service provider in reducing operational cost and improving resource utilization, it presents its own challenges, in particular how to manage the underlying resources effectively so that a certain level of quality is maintained across different execution contexts. Fortunately, various techniques, generally falling under the QoS heading, exist today to help meet such challenges by protecting, policing (and shaping), prioritizing and predicting consumption within a shared (execution and resource) environment.
As it currently stands, bringing multi-tenancy to the JVM by way of collocation of (diverse) application deployment units is fraught with risk, both to the hosted applications and to the runtime. It would be far better if such engineering effort instead shifted focus to facilitating, via standard interfaces and system mechanisms, a more temporal kind of tenancy such as outlined in Visions of Cloud Computing – PaaS Everywhere. Surely we can design and develop other, less risky approaches to bringing multi-tenancy to the Java platform without retrofitting the entire JVM runtime to support isolation of collocated applications? Such isolation is incredibly difficult to achieve once applications reach out to advertise themselves, discover their environment and communicate with it, never mind all the resource management involving memory heaps, thread pools, threads, classloaders, files, sockets, etc. We believe so.
Applications, and more so services, should become multi-tenant aware, not necessarily their runtimes. An application and its runtime should work in concert to ensure this can be done efficiently, effectively, securely and, more importantly, adaptively. This coordination, communication and cooperation should be contextual (within the application) and performed at the execution level of a request, task, activity or code block.
“To prevent malicious or unintentional monopolization of shared, multitenant system resources, Force.com has an extensive set of governors and resource limits associated with Force.com code execution.” – Salesforce.com
To put some of the above into a slightly more practical perspective, let’s imagine we are a cloud vendor offering a SaaS version of Apache Cassandra that allows users to insert data into column families (tables) within distinct and isolated keyspaces (databases). We will ignore how to ensure keyspaces are uniquely named per tenant (account) and instead focus on the resource management problems associated with concurrent processing of requests across different keyspaces (and tenants) within the same node (execution) point – a shared runtime.
Before we get into the resource management, we should first check that our service offers similar performance behavior across similar workloads against different keyspaces. To do this, a simple single-threaded thrift/rpc client application was used to continuously send insert requests for up to 100 seconds to a specified keyspace and column family. Below is the metering model obtained from a managed Cassandra server runtime that serviced two such clients running concurrently. One client issued insert requests against keyspace db_a targeting column family cf_a. The other client issued insert requests against keyspace db_b targeting column family cf_b. The two keyspaces show fairly similar throughput and average response times (in microseconds) over the 100 seconds of execution.
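The kind of per-keyspace figures discussed here (operation count and average response time in microseconds) can be approximated with a very small meter. The sketch below is only an illustration, not the metering engine used for the results in this article; the class name `KeyspaceMeter` and its methods are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical per-keyspace meter (an assumption for illustration only).
final class KeyspaceMeter {

    // Running totals for one keyspace.
    private static final class Stats {
        final LongAdder count = new LongAdder();
        final LongAdder totalMicros = new LongAdder();
    }

    private final Map<String, Stats> stats = new ConcurrentHashMap<>();

    // Record one completed operation and its elapsed time in microseconds.
    void record(String keyspace, long elapsedMicros) {
        Stats s = stats.computeIfAbsent(keyspace, k -> new Stats());
        s.count.increment();
        s.totalMicros.add(elapsedMicros);
    }

    // Number of operations recorded for the keyspace (0 if none).
    long count(String keyspace) {
        Stats s = stats.get(keyspace);
        return s == null ? 0 : s.count.sum();
    }

    // Average response time in microseconds (0.0 if nothing was recorded).
    double averageMicros(String keyspace) {
        Stats s = stats.get(keyspace);
        if (s == null) {
            return 0.0;
        }
        long n = s.count.sum();
        return n == 0 ? 0.0 : (double) s.totalMicros.sum() / n;
    }
}
```

A caller would wrap each insert with `System.nanoTime()` readings and pass the difference, divided by 1000, to `record`. Throughput over a run is then `count` divided by the run length in seconds.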
Running 3 clients in parallel, with two clients inserting into the db_a keyspace and one into the db_b keyspace, gives us the metering model below. With the additional db_a client, the throughput for db_b has dropped and its average response time has increased.
Note: Both client and server processes executed on the same host machine to induce more resource contention and starvation. Ideally we’d use the metering to predict and provision both current and future resource needs, then adaptively partition such resources across services, which you could view as a sort of customer class.
To address the workload imbalance across keyspaces we can place a rate limit on the number of org.apache.cassandra.db metered operations on a per keyspace (tenant) basis. This can be achieved by enabling the qos probes provider, defining different QoS service classes that map to the different contextual (activity) probes, and then regulating the execution of such probes by controlling the resource reservation process for any of the associated resource pools. No code changes are needed. This is barebones Cassandra with some monitoring and management capabilities dynamically injected at runtime by our runtime agent.
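To make the idea of a per-keyspace rate limit concrete, here is a minimal sketch of a fixed-window limiter that allows a set number of operations per keyspace in each one-second interval. It is an assumption-laden illustration, not the QoS provider described above: the class `KeyspaceRateLimiter` and the `tryAcquire` method are hypothetical, and the clock is passed in explicitly to keep the example deterministic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical fixed-window rate limiter keyed by keyspace (illustrative only).
final class KeyspaceRateLimiter {

    // State of the current one-second window for one keyspace.
    private static final class Window {
        long startMillis;
        int used;

        Window(long startMillis) {
            this.startMillis = startMillis;
        }
    }

    private final int limitPerSecond;
    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    KeyspaceRateLimiter(int limitPerSecond) {
        if (limitPerSecond <= 0) {
            throw new IllegalArgumentException("limitPerSecond must be positive");
        }
        this.limitPerSecond = limitPerSecond;
    }

    // Returns true if the keyspace may run one more operation in the
    // one-second window that contains nowMillis, false otherwise.
    boolean tryAcquire(String keyspace, long nowMillis) {
        Window w = windows.computeIfAbsent(keyspace, k -> new Window(nowMillis));
        synchronized (w) {
            if (nowMillis - w.startMillis >= 1000) {
                // A new interval has begun: reset the budget.
                w.startMillis = nowMillis;
                w.used = 0;
            }
            if (w.used < limitPerSecond) {
                w.used++;
                return true;
            }
            return false;
        }
    }
}
```

In this sketch a refused caller decides what to do next. A runtime that delays rather than rejects work, as the results in this article suggest, would suspend the calling thread until the next window opens.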
The metering results below show that the throughput rate for the two db_a clients has dropped in line with the rate limit policy defined above, and in doing so the throughput and average response time for the one db_b client have vastly improved and are now relatively close to the single client results. Evidence of the delay introduced at runtime by the rate limiting mechanism can be found by comparing the average performance time of the write operations across keyspaces.
We can very easily vary the rate limiting across tenant keyspaces if need be. We would do this if we had both paying and non-paying (trial) accounts.
With a lower limit set on the resource, associated with the service class that is mapped to the write probe, even better throughput and response times are achieved for our single db_b client competing for service (and resources) with the two other clients.
Up to now we have controlled the frequency of keyspace operations, over 1 second intervals, but we have not explicitly controlled the degree of concurrency in the processing of keyspace operations.
We start with a baseline involving 6 clients inserting into db_a and 1 client into db_b. With no resource management in place, the increased workload on the db_a keyspace has severely impacted the db_b keyspace throughput. Whilst the response time hit is evenly spread across all insert requests irrespective of keyspace, we would like to investigate whether we can improve the experience for the db_b client without completely routing requests to a separate server instance.
We can limit the degree of concurrent execution against each keyspace by setting the maximum capacity for each resource pool associated with a QoS (tenant) service. The following configuration limits the concurrency to 1 thread per keyspace at the thrift/rpc entry points into the managed runtime. Again, this control requires no code changes.
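The effect of a per-keyspace capacity can be sketched with one counting semaphore per keyspace. This is only an approximation of the idea under stated assumptions, not the configuration-driven mechanism used in the article; `KeyspaceConcurrencyLimiter` and its methods are hypothetical names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical per-keyspace concurrency limiter (illustrative only).
final class KeyspaceConcurrencyLimiter {

    private final int capacity;
    private final Map<String, Semaphore> pools = new ConcurrentHashMap<>();

    KeyspaceConcurrencyLimiter(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Runs the operation once a unit of capacity is free for the keyspace.
    // Callers beyond the capacity are suspended until a unit is released.
    <T> T execute(String keyspace, Supplier<T> operation) {
        // Fair semaphore so that waiting threads are served in arrival order.
        Semaphore pool = pools.computeIfAbsent(keyspace, k -> new Semaphore(capacity, true));
        pool.acquireUninterruptibly();
        try {
            return operation.get();
        } finally {
            pool.release();
        }
    }

    // Units of capacity currently free for the keyspace.
    int available(String keyspace) {
        Semaphore pool = pools.get(keyspace);
        return pool == null ? capacity : pool.availablePermits();
    }
}
```

With a capacity of 1, at most one thread at a time executes inside `execute` for a given keyspace, while other keyspaces proceed independently. The time spent waiting in `acquireUninterruptibly` corresponds to the managed delay that shows up as a higher average response time.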
With the new resource management policy the throughput for db_b has improved, as has the average response time. But the enforcement of the resource management policy has caused the average response time for db_a keyspace operations to nearly triple as managed delay (suspension) is introduced into the execution.
A drawback of the previous policy is the possibility of resources being underutilized. This can be addressed to some extent by introducing a single shared resource pool and then having each service hold a reserve that, when summed across all services, is less than the maximum capacity of the resource pool.
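One way to picture a shared pool with per-service reserves is sketched below. Each service first draws on its own reserve; once that is depleted it may borrow from whatever capacity is left over after all reserves are accounted for. The class `SharedPool` and its methods are hypothetical and greatly simplified compared to a real reservation engine.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical shared pool with per-service reserves (illustrative only).
final class SharedPool {

    private final int capacity;
    private final Map<String, Integer> reserves = new HashMap<>();
    private final Map<String, Integer> inUse = new HashMap<>();
    private int reservedTotal;   // sum of all service reserves
    private int spareInUse;      // unreserved units currently borrowed

    SharedPool(int capacity) {
        this.capacity = capacity;
    }

    // Declares a reserve for a service; the sum of all reserves must stay
    // below the pool capacity so that some shared capacity remains.
    synchronized void setReserve(String service, int units) {
        int newTotal = reservedTotal - reserves.getOrDefault(service, 0) + units;
        if (units < 0 || newTotal >= capacity) {
            throw new IllegalArgumentException("reserves must sum to less than capacity");
        }
        reserves.put(service, units);
        reservedTotal = newTotal;
    }

    // Takes one unit: from the service's reserve if any is left,
    // otherwise from the shared spare capacity. Returns false if neither is free.
    synchronized boolean tryReserve(String service) {
        int used = inUse.getOrDefault(service, 0);
        if (used < reserves.getOrDefault(service, 0)) {
            inUse.put(service, used + 1);
            return true;
        }
        if (spareInUse < capacity - reservedTotal) {
            spareInUse++;
            inUse.put(service, used + 1);
            return true;
        }
        return false;
    }

    // Returns one unit; units held beyond the reserve are spare units.
    synchronized void release(String service) {
        int used = inUse.getOrDefault(service, 0);
        if (used == 0) {
            throw new IllegalStateException("nothing to release for " + service);
        }
        if (used > reserves.getOrDefault(service, 0)) {
            spareInUse--;
        }
        inUse.put(service, used - 1);
    }
}
```

For example, with a capacity of 3 and a reserve of 1 for each of two services, one unit remains shared: a busy service can reach 2 concurrent units, but never at the cost of the other service's reserved unit.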
The additional unit of capacity available when the local service reserve is depleted has allowed the db_a clients to increase throughput at some expense to the db_b client, though the impact is not as significant as that experienced without any resource management policy set.
Note: One reason for not obtaining even greater control over the throughput for the db_b keyspace is that there is further processing occurring prior to the insert operation that is not currently instrumented and hence not under the control of the metering engine.
In this article we have demonstrated how easy it is to control any type of software, even software whose source code you cannot access, using a resource management model that consists of the virtual resources and virtual services that make up a system, and whose dynamics are controlled by way of resource reservation policies. Control over the actual software execution is by way of this proxy management model centered around resource flows. All this is defined by operations, external to the application and its code, though there is no reason why this metadata cannot be modeled and managed dynamically within its environment.
We have only looked at a few control techniques in the context of resource management. There are many more options in terms of modeling and configuration we could utilize, including reservation lanes, priority queues, adaptive valves, reservation timeouts and quotas. The point is that we don’t have to re-engineer entire systems to exert some influence over them and encourage their execution behavior to align with our needs and the environment they operate in.
“I view this novel approach as a scalable and secure form of ‘mind control’ over the machine.” – William Louth, JINSPIRED