Prediction is the Future in Application Performance Management & Software Optimization
There are a few general strategies for improving the performance (speed) of a software application.
1. Not Working. You can’t beat this in terms of speed. Always try to avoid doing (or generating) work in the first place. Unfortunately, most of the time we must do something to deliver a service, but we should always keep this strategy in mind when tackling any performance problem. This is one area in which computers could learn from some of the “best” of us. The computer says No.
2. Working Harder. Getting more out of the hardware and resources available to the software, possibly by upgrading, improving, or aligning the hardware components with the nature of the workload. At present we have reached some physical limits here.
3. Working Smarter. Improving the efficiency of the software in terms of the algorithmic work that needs to be done, and possibly its ordering (coalescing) and operation (collocation). Sometimes in applying this we need to do more work, but of a different nature, in order to reduce the cost of current or imminent, more expensive work (e.g. sorting => searching). The first strategy is this taken to its extreme.
4. Working in Parallel. Getting more work done in a much shorter time window by adding computing capacity, along with consumers of that capacity which process (map & reduce, split & stitch) smaller parts of the overall execution concurrently (simultaneously). A lot of the current focus is here, in both distributed and non-distributed forms.
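The first strategy, avoiding work altogether, is most often realized through caching or memoization: remembering an answer so that a repeat request costs nothing. A minimal sketch in Java (the class and method names here are illustrative, not from any particular library):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Strategy 1, "Not Working": the fastest computation is the one we never
// repeat. Results are cached so repeat calls do no work at all.
public class Memoizer {
    public static <T, R> Function<T, R> memoize(Function<T, R> f) {
        Map<T, R> cache = new ConcurrentHashMap<>();
        return input -> cache.computeIfAbsent(input, f);
    }

    public static void main(String[] args) {
        Function<Integer, Integer> square = Memoizer.memoize(x -> x * x);
        System.out.println(square.apply(7)); // computed once
        System.out.println(square.apply(7)); // served from cache: no work done
    }
}
```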
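The sorting => searching example from strategy 3 can be sketched concretely: pay an up-front O(n log n) sort, which is extra work of a different nature, so that every subsequent lookup costs O(log n) instead of an O(n) scan:

```java
import java.util.Arrays;

// Strategy 3, "Working Smarter": pay for a sort once so that each
// subsequent lookup is a cheap binary search instead of a linear scan.
public class SmartSearch {
    static boolean contains(int[] sortedData, int value) {
        return Arrays.binarySearch(sortedData, value) >= 0; // O(log n) per lookup
    }

    public static void main(String[] args) {
        int[] data = {42, 7, 19, 3, 88, 61};
        Arrays.sort(data); // O(n log n), paid once, enables binary search
        System.out.println(contains(data, 61)); // true
        System.out.println(contains(data, 5));  // false
    }
}
```

The up-front sort only pays off when lookups are repeated; that trade-off is exactly the "more work of a different nature" the strategy describes.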
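Strategy 4's map & reduce (split & stitch) pattern is directly expressible on the JVM with parallel streams, which fork a computation across the available cores and join the partial results:

```java
import java.util.stream.LongStream;

// Strategy 4, "Working in Parallel": split the range across cores (map),
// then stitch the partial sums back together (reduce).
public class ParallelSum {
    public static long sum(long n) {
        return LongStream.rangeClosed(1, n)
                         .parallel() // fork the work across available cores
                         .sum();     // join (reduce) the partial results
    }

    public static void main(String[] args) {
        System.out.println(sum(1_000_000)); // 500000500000
    }
}
```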
Even with the best programming models (message/actor based), languages and runtimes (the Java virtual machine) in the world, it’s still pretty hard to achieve the speedups required (or anticipated) whilst fully utilizing the ever-growing computing capacity (100 -> 1000 cores), due to the sequential nature of our thought process (not the internal workings of the mind) in such endeavors and the obvious physical limits (bottlenecks) that lie elsewhere (IO) in the processing pipeline. Once we have hit the limits of parallelizing an activity, task, job, request, interaction, etc…, we are left with prediction and dynamic adaptation in our performance arsenal.
So what can be done differently to move beyond such limits? We believe that to make application software faster we need to combine and apply all of the above strategies, but not within different phases of the application lifecycle, not at different places in the application architecture, not at a particular (static or reference) point in time, and not with an expected performance model in mind. No, these strategies need to be applied all the time and just in time by the software itself (in parallel).

To make things faster (in terms of the critical path) we need to have the software work far, far harder than it does today, but at work that is very different from its primary function. This “harder” secondary work will be done in parallel, possibly even replicated (duplicated) across many processing units (cores). It will be much smarter in predicting the near-realtime (immediate) needs (data, resources) of the primary function’s execution path, with the optimal result being that very little or no work is done along the primary critical performance path (a path that is very likely to be adaptive, with such adaptation done in parallel).

For every primary processing unit (or worker) there will be one or more secondary collocated processing units (or supervisors) observing the primary function and predicting its execution path, behavior and resource needs, and using such predictions to reserve capacity, allocate and load resources, pre-compute intermediary results, make online adaptations to the software’s algorithms and execution paths, and prioritize work. Eventually we will get to the point where such secondary work, in the form of profiling, protecting, policing, prioritizing, predicting and provisioning, will dominate the actual capacity consumption profile of an application as users demand much lower response times. Strangely enough, there is going to be more waste and cost in meeting growing expectations of lower response times.
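The worker/supervisor pairing described above can be sketched minimally: a secondary task observes the current request and, in parallel with the primary work, pre-loads what the primary is predicted to need next. All names here are illustrative, and a real runtime would predict from observed execution paths rather than this trivial "next key" heuristic:

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// A minimal sketch of the primary worker / secondary supervisor pairing.
public class SupervisedWorker {
    static final Map<Integer, String> prefetched = new ConcurrentHashMap<>();

    static String loadExpensive(int key) { // stands in for IO / heavy work
        return "resource-" + key;
    }

    static String serve(int key) {
        // Primary critical path: ideally finds its input already prepared,
        // so little or no work remains on the critical path itself.
        String hit = prefetched.remove(key);
        return (hit != null) ? hit : loadExpensive(key);
    }

    public static void main(String[] args) {
        int current = 1;
        // Secondary supervisor: runs in parallel, predicting and pre-loading
        // the resource the primary is likely to need next.
        CompletableFuture<Void> supervisor = CompletableFuture.runAsync(
            () -> prefetched.put(current + 1, loadExpensive(current + 1)));
        System.out.println(serve(current));     // primary does its own work
        supervisor.join();
        System.out.println(serve(current + 1)); // served from the prediction
    }
}
```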
Note: We can have multiple primary processing units fulfilling a single service request, and for each such unit multiple secondary processing units observing, supervising, controlling, predicting, prioritizing,….
Curling Concurrent Computation
A simple sport analogy would be curling “in which players slide stones across a sheet of ice towards a target area…The curler can induce a curved path by causing the stone to slowly turn as it slides, and the path of the rock may be further influenced by two sweepers with brooms who accompany it as it slides down the sheet, using the brooms to alter the state of the ice in front of the stone. A great deal of strategy and teamwork goes into choosing the ideal path and placement of a stone for each situation, and the skills of the curlers determine how close to the desired result the stone will achieve” – From Wikipedia, the free encyclopedia
“Curling is the only sport where the trajectory of the projectile can be influenced after release. Players “sweep” the ice directly in front of the curling stone to decrease the friction of the stone with the ice” – The Physics Of Curling
Another example of a sweeper, this one trailing rather than leading, is garbage collection. We now need to push such self-regulating mechanisms further up the application stack, as well as into its abstractions and models.
Note: There is another way to address speed that does not actually lower the response times of requests themselves, and that is for each interaction to do more “valued” work for the user in the same amount of time, playing to our ever-increasing parallel capacity and capability.
Real(& In Time) Management of Application Performance
To make such a future possible, primary workers and secondary supervisors need a common collocated supporting runtime that exposes an observation (measurement) model and a control (feedback) system. We believe a realization of such a runtime is found in JXInsight/OpenCore’s 3 key technologies: Probes (metering), Metrics (monitoring) and Signals (fingerprinting).
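Without reproducing the actual OpenCore API, the general shape of such an observation (measurement) interface can be sketched: the supervisor meters the cost of a unit of primary work and accumulates it for later feedback. Every name below is a hypothetical illustration, not the JXInsight/OpenCore Probes API:

```java
import java.util.function.Supplier;

// A generic sketch of a metering probe: wrap a unit of primary work,
// measure its cost, and accumulate the measurement for the feedback loop.
// Names are illustrative only; this is NOT the OpenCore Probes API.
public class MeterSketch {
    static long totalNanos = 0;

    static <T> T metered(String probeName, Supplier<T> work) {
        long begin = System.nanoTime();
        try {
            return work.get(); // the primary work being observed
        } finally {
            totalNanos += System.nanoTime() - begin; // metered cost, fed back
        }
    }

    public static void main(String[] args) {
        int result = metered("compute", () -> 6 * 7);
        System.out.println(result); // 42, with its cost metered as a side effect
    }
}
```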
Today’s legacy application performance monitoring (APM) products add no value to the software they attempt to manage. Their “management” occurs in the form of changes over many generations (revisions) of the application, but never within the lifetime of a single incarnation of the application itself. Tomorrow’s application performance management solutions (which we are selling today ;-)) will optimize, control, coordinate, prioritize and even adapt the application being managed. There won’t be a management dashboard, at least not one with big RED and GREEN circles representing some condition on an observation of an application measurement (or metric). What will be communicated to operators instead is the effectiveness of the secondary processing in continually predicting and adapting the software. In terms of operations, the secondary work becomes the primary concern.