Introduction
I recently conducted an onsite performance investigation for one of our Fortune 100 customers. The first problem that needed urgent attention was the difference in client application performance across geographical regions. The client applications, running on the Microsoft Windows platform, connected to a cluster of 20 Java application servers centrally located in the US. The database supporting the cluster was colocated at the same site. Users outside the US had recorded slow response times when opening frequently accessed forms in the application. The customer suspected that network latency was impacting the performance of the remote clients, but the test staff were not convinced that network costs could account for such large differences, especially given the network capacity available.
Client Testing
The first performance engineering task was to create an isolated test environment that could be monitored separately from the user population. Once this was done, two users, one at a local and one at a remote geographical location, logged into the isolated application server and opened one of the forms that was deemed "terribly" slow when executed remotely. JXInsight timeline snapshots were captured for each use case test run. Because opening the form did not involve any additional client interaction following the menu selection, we could safely discount any possible differences in user application experience and task speed from the results.
Local Client Application Timeline
The following timeline graphic from the JXInsight management console shows the number of server request traces (light blue rectangles), JDBC transactions (horizontal brackets) and SQL statements (yellow rectangles) received during the opening of the form. Each row in the timeline grid represents a server request processing thread. Each cell in the grid represents a period of 10 milliseconds. The number of requests (traces and transactions) processed during the test execution was 13. The first request started at 18:59:24 520ms and the last finished at 18:59:25 380ms, equating to an overall duration of 860 milliseconds. The first cluster of server-side transactional activity is the workflow processing prior to UI form construction. The middle section in the timeline is the period taken by the client application to construct the UI form. The second cluster of activity is the workflow processing after the UI form construction.
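The overall duration quoted above is simply the difference between the two timestamps read off the timeline. As a minimal sketch (the class and method names are illustrative, not part of JXInsight), the arithmetic can be checked with the standard java.time classes:

```java
import java.time.Duration;
import java.time.LocalTime;

public class TimelineDuration {

    // Elapsed milliseconds between two wall-clock times on the same day,
    // given in ISO form (HH:mm:ss.SSS).
    static long elapsedMillis(String start, String end) {
        return Duration.between(LocalTime.parse(start), LocalTime.parse(end)).toMillis();
    }

    public static void main(String[] args) {
        // Local client: start of the first request and finish of the last request.
        System.out.println(elapsedMillis("18:59:24.520", "18:59:25.380")); // prints 860
    }
}
```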
Remote Client Timeline (Truncated)
Performing the exact same test execution from the remote client resulted in the following timeline of server request traces, JDBC transactions and SQL statements. The timeline cell resolution of 10 milliseconds is the same as in the graphic above. Because of the longer duration, the complete transaction timeline is only partially displayed. In fact, the view presented below shows just the first 5 transactions of the use case, which consists of 13 individual traces and transactions. The actual transaction and trace activity is roughly the same as for the local client because both clients are accessing the same application server and backend database.
What is strikingly different is the (non-activity) gap between each consecutive server request trace and transaction. Because all of the transaction activity is the result of a single user-initiated operation, open form, the time between each server trace and transaction is the actual time spent sending, receiving and processing the request data. Since the data volume is constant across both clients, the increase in time for the non-activity periods is the network latency overhead of the client-to-server roundtrips.
The start time for the first server request is 19:43:40 300ms and the finish time for the last request is 19:43:42 150ms. The total duration of the use case from a server-side perspective is thus 1850 ms, more than twice the time of the local client. The total clock time for all recorded traces was 175ms, which means the network latency costs (from a client-to-server perspective) constitute 90% of the overall performance costs. In fact the percentage is much closer to 95% if we discount the UI form construction (the middle period of non-activity).
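The 90% figure follows directly from the two numbers given above: the 1850 ms total duration and the 175 ms of recorded trace time. A minimal sketch of that calculation is shown here; the class and method names are illustrative, and the sketch treats all time outside the recorded traces as non-server time, as the text does.

```java
import java.util.Locale;

public class LatencyShare {

    // Fraction of the total elapsed time that is not accounted for
    // by the clock time of the recorded server-side traces.
    static double nonServerShare(long totalMillis, long traceMillis) {
        return (double) (totalMillis - traceMillis) / totalMillis;
    }

    public static void main(String[] args) {
        long totalMillis = 1850; // remote client, first request start to last request finish
        long traceMillis = 175;  // total clock time of all recorded traces
        double share = nonServerShare(totalMillis, traceMillis);
        // Locale.ROOT keeps the decimal separator a dot on every machine.
        System.out.println(String.format(Locale.ROOT, "%.1f%%", 100 * share)); // prints 90.5%
    }
}
```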
When the development staff were informed of the poorly performing use case, their initial reaction was that (1) the client machine was slow, (2) the database was slow, (3) the SQL was not tuned, or (4) the application server was slow. Improving the execution of the SQL statements was never going to have much impact for users when a 10% or 20% gain in that component's performance constituted less than 0.5% overall performance improvement.
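The reasoning here is that a gain in one component only improves the overall time in proportion to that component's share of it. The SQL execution time on its own is not given in this article, so the sketch below uses the full 175 ms of recorded trace time as a generous upper bound for any server-side component; the class and method names are illustrative. Even on that assumption the overall gain stays below 2%, and the SQL statements account for only part of those 175 ms.

```java
import java.util.Locale;

public class OverallGain {

    // Overall improvement, as a fraction of the total time, when a component
    // that takes componentMillis out of totalMillis becomes faster by
    // componentGain (0.10 means 10% faster).
    static double overallGain(double totalMillis, double componentMillis, double componentGain) {
        return componentMillis * componentGain / totalMillis;
    }

    public static void main(String[] args) {
        double totalMillis = 1850; // remote client use case duration
        double upperBound = 175;   // ASSUMPTION: all recorded trace time, used as an upper bound
        for (double gain : new double[] {0.10, 0.20}) {
            double overall = overallGain(totalMillis, upperBound, gain);
            System.out.println(String.format(Locale.ROOT,
                    "%.0f%% component gain -> %.2f%% overall", 100 * gain, 100 * overall));
        }
    }
}
```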
Once the development staff started using JXInsight, it became all too obvious that the number of roundtrips had to be reduced.
The following graphic shows the complete remote client timeline with a cell resolution of 25 ms. Note the UI construction period between 19:43:41 600ms and 19:43:42 25ms.
William Louth, JXInsight's Product Architect