Software Performance Workload Modelling
Vijay Datla
Abstract: The debate between performance engineers and business stakeholders over non-functional requirements is probably as old
as the performance discipline itself. "What set of transactions is enough to represent my system?", "Why do we not load test every
transaction?" and "Our volumes are much higher than what the targets show" are some of the common questions that need to be answered.
From a technical perspective, the benefits of load testing every transaction are not enough to justify the effort involved in the exercise.
However, for a business, even a small risk of one untested low-volume transaction affecting the others or bringing down the entire
system is high enough to raise a flag. This paper is an attempt to balance these concerns by discussing how to create workload models
that are closer representations of real-world enterprise applications. It answers common requirement-gathering questions such as where
to look for information, on what basis to include and exclude use cases from workloads, and how to derive a complete and convincing
workload model. This paper highlights the risks associated with selective modelling and the possible mitigations. It also brings to the
table tips and tricks of the trade, including some lessons learnt the hard way.
www.ijcat.com 13
International Journal of Computer Applications Technology and Research
Volume 6, Issue 1, 13-18, 2017, ISSN: 2319-8656
What are the most common use cases or transactions that happen on the system most frequently?
Do all users arrive into the system over a small window or are they spread across the day?
What are the peak periods of access to the system?
Are there periodic tasks that the system is designed to accomplish? E.g. end-of-month/quarter reports, close of business, seasonal sales, year-end closing and so on.

1.2. Picking the right resources:
On enterprise-level projects there can be several sources of information when gathering the non-functional requirements. Common sources include business analysts, RFPs, business volume reports, audit reports, inputs from legacy systems, capacity sizing documents, web server access logs, data warehouses and Google Analytics.

Business Analysts (BAs)
BAs are always the first source of information for non-functional requirements. They may or may not have all the information required, but they will be able to make the connection to the right business contacts.

RFPs
RFPs usually contain a non-functional requirements section. The requirements specific to performance may be few and not elaborated, but they will still contain response times, customer base, transaction volumes etc.

Business Reports
The business maintains several reports, such as volume, accounting and auditing reports, that can provide insight into business statistics.

Legacy Systems
In the case of legacy modernization projects, there already is a system, maybe a mainframe, that is still serving the business. Running simple select queries on this system can help in studying the real-world transaction volumes and load patterns.

Hardware Sizing Documents
In the initial stages of the SDLC, enterprise projects go through the process of determining the hardware required to support the solution. This sizing is based on the throughput that the system is expected to achieve. So, either at a high level or in detail, some study has already been done at this stage and can often be reused rather than reinventing the wheel.

Google Analytics
For enterprise applications with already existing websites, Google Analytics is a web-analytics solution that provides detailed insights into the website traffic. It reports traffic patterns, sources of incoming load, navigation patterns, detailed load patterns over a period of time and much more.
The below graph shows the hourly distribution of load. This kind of information helps determine the peak hours of the day and the % increase in load during the peak.

Data Warehouse
Most enterprise projects maintain data warehouses for storing archived information that can be accessed to obtain non-functional details.

Domain Research
In most cases there is existing market research on various kinds of applications catering to several domains. If there is absolutely no information available in house, this research can be a good place to start.

Log Parsers
In the case of implementations with an existing system in place, server access logs are excellent sources of real-time information. There are several tools in the market that parse web server access logs into comprehensible, meaningful information. AWStats is one such open source log parsing tool and is used here as an example. This parser extracts data from a web server access log and converts it into meaningful server statistics. It is much like a web analytics tool, except that it works offline. It produces graphs that provide insight into load patterns in terms of user visits, page visits, bandwidth etc.
The tool lists the most commonly accessed pages, which helps in determining the high-volume transactions. It also indicates the browser most commonly used to access the application website. With the advancement of browser features and the variety of browsers in the market, this information is useful in deciding what browser to use when simulating load on the application.
The most important use of the tool is in studying the user arrival and load pattern. The hourly graphs outline the user arrival pattern, and the weekly, monthly and yearly graphs help in determining the peak periods.
The below charts show the website usage patterns in terms of top accessed URLs, top downloads, average user visit durations and distribution of browsers for incoming requests.
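The kind of statistics described under Log Parsers (most accessed pages, hourly arrival pattern, peak hour) can also be approximated with a few lines of script. The following is a minimal sketch only, and not how AWStats itself works: it assumes the access log follows the Common Log Format, and the function names and sample log lines are hypothetical, made up for illustration.

```python
import re
from collections import Counter

# Assumed log layout (Common Log Format), for example:
# 10.0.0.1 - - [10/Oct/2016:13:55:36 +0000] "GET /login HTTP/1.1" 200 2326
LOG_LINE = re.compile(
    r'\[\d{2}/\w{3}/\d{4}:(?P<hour>\d{2}):\d{2}:\d{2} [^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)'
)

def summarize_access_log(lines):
    """Return (hits per hour of day, hits per URL path) for the given log lines."""
    hits_per_hour = Counter()
    hits_per_path = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that do not follow the assumed format
        hits_per_hour[int(match.group("hour"))] += 1
        # Drop the query string so /accounts?id=1 and /accounts?id=2 count as one page
        hits_per_path[match.group("path").split("?", 1)[0]] += 1
    return hits_per_hour, hits_per_path

def peak_hour(hits_per_hour):
    """Return (hour, hits, % increase over the average hour of a 24-hour day)."""
    hour, hits = max(hits_per_hour.items(), key=lambda item: item[1])
    average = sum(hits_per_hour.values()) / 24
    return hour, hits, (hits - average) / average * 100

# Hypothetical sample data, not taken from any real system
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Oct/2016:09:12:01 +0000] "GET /login HTTP/1.1" 200 512',
    '10.0.0.2 - - [10/Oct/2016:13:05:44 +0000] "GET /accounts?id=1 HTTP/1.1" 200 2048',
    '10.0.0.3 - - [10/Oct/2016:13:20:10 +0000] "GET /accounts?id=2 HTTP/1.1" 200 2048',
    '10.0.0.2 - - [10/Oct/2016:13:48:59 +0000] "POST /transfer HTTP/1.1" 200 128',
]

hits_per_hour, hits_per_path = summarize_access_log(SAMPLE_LOG)
print(hits_per_path.most_common(3))  # candidate high-volume transactions
print(peak_hour(hits_per_hour))      # peak hour and its load relative to the average
```

On a real log, the lines would come from iterating over the opened log file instead of `SAMPLE_LOG`. A dedicated parser remains the better choice for anything beyond a quick look, since it handles log rotation, robots and multiple log formats.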
The Excel sheet above is a sample workload model for our example. Please note that the values are mere assumptions and in no way represent the actual volumes of a bank.
The information at hand was the distribution of a concurrency of 604 users across the three modules: Teller Banking, Online Banking and Phone Banking. Also known were the target volumes for each of the shortlisted use cases. A study of the use case navigation and call flow helped determine the length (number of pages) of each use case. Applying the above formulae to the given information, the targeted throughput (page views per second) per use case and a concurrency distribution within each module were calculated.

Once created, it is important to get a sign-off on a workload model before starting execution. This ensures that the requirements set forth for the performance testing exercise are correct and validates the assumptions made.
Since a workload model relates more to the business, it is important to represent the information well. A pictorial representation of information is more likely to be noticed and understood than an Excel sheet containing a whole lot of numbers.

4. WORKLOAD VALIDATION:
To avoid having to redesign the complete workload model, it is recommended to do an early validation such as a reverse calculation. The quantities involved are the think time between pages (TT), the average response time for each page (RT), the time to complete execution of a business transaction x, and the achieved throughput.

Requirement analysis, market research and solution design are based on a series of assumptions, and it is important to ensure that the assumptions are correct by validating that the goals are achievable. This validation can be done without having to execute the load tests, just by doing some reverse calculations.
For example, let's assume that the average think time between pages, TT, is set at 10 seconds and the response time target for each web page, RT, is 4 seconds.
Hence the throughput of a business transaction x that can be achieved by the derived concurrency Cx is
Vx = Cx * 3600 / (Lx * (TT + RT))
where Lx is the length of x, i.e. its number of pages. With TT and RT in seconds, Vx is the number of transactions per hour. If the achieved Vx is in line with the targeted business transaction volumes, then it is safe to say that the assumed think times and the response time requirements are correct.
Validation can also be done post-execution at either the front end or the back end. At the front end, there are load generation tools that report counts of execution of the transactions under test. Let's take the example of the IBM Rational Performance Tester load test tool. In the test report, as one of the metrics, you can see the number of hits made to each page in the test suite. This number is a count of how many transactions were successfully completed on the system.
At the back end, after every test run a simple query on the database can give a count of the volumes achieved during the test run.

5. THE BIRDS EYE VIEW:
For performance testing to reveal accurate characteristics of a system, the workload model should be a close representation of the real-world production load pattern. For complex enterprise applications, the user interface is just one entry point into the system. There are several other interfaces, WebServices, scheduled jobs etc. that share the system resources. To simulate a real-world production load pattern it is essential to look at the complete picture and account at least for incoming load from all possible sources.
With the increasing complexity of business models and interdependence on business partners and service providers, interaction with external subsystems through interfaces, exposed WebServices and messaging interfaces is one primary source of incoming load.
Other sources are inter-module communications between modules under test and those that are out of scope of the performance test exercise.
Another activity to account for is the daily batch jobs and schedulers that run during regular business hours. For those that run during off-peak hours, it's important to test and ensure that all scheduled batch jobs complete during the designated window and do not overflow into regular business hours. Along similar lines, there are regular backup and archival activities that need to be allocated resources.
One other consideration that needs to go into completing a workload is the recurring business activities that take place over and above the regular tasks, for example close-of-business, end-of-month reporting, quarterly reports etc.

6. SEASONAL WORKLOAD MODELS:
These models are business critical. There are a few domains that, every so often, experience a substantial variation in their load pattern. These are called seasonal workloads. For applications that cater to these domains, ensuring performance and stability during such seasonal workloads also becomes the responsibility of the performance test exercise. Some examples of such seasonal workloads are:
eCommerce Applications for Retailers: End of Season Sales, Holidays like Christmas and Thanksgiving
Banking and Financial Applications: End of Year Closing
Job Portals: Graduation Period
Human Resource Management Systems: Appraisals etc.
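The reverse calculation from Section 4, Vx = Cx * 3600 / (Lx * (TT + RT)), can be sketched in a few lines. This is an illustrative sketch only: the think time of 10 seconds and response time of 4 seconds come from the example in the text, while the concurrency of 70 users, the length of 5 pages, the target volume, the 5% tolerance and all function names are assumptions made up for the illustration. The second function is simply the same formula rearranged for Cx.

```python
def achievable_volume_per_hour(concurrency, pages, think_time_s, response_time_s):
    """Vx = Cx * 3600 / (Lx * (TT + RT)): transactions per hour for one use case."""
    seconds_per_transaction = pages * (think_time_s + response_time_s)
    return concurrency * 3600 / seconds_per_transaction

def required_concurrency(target_volume_per_hour, pages, think_time_s, response_time_s):
    """The same formula rearranged for Cx, given a target Vx."""
    return target_volume_per_hour * pages * (think_time_s + response_time_s) / 3600

def is_target_achievable(target_volume_per_hour, concurrency, pages,
                         think_time_s, response_time_s, tolerance=0.05):
    """True if the achievable Vx is within the (assumed) tolerance of the target."""
    achieved = achievable_volume_per_hour(
        concurrency, pages, think_time_s, response_time_s)
    return achieved >= target_volume_per_hour * (1 - tolerance)

TT = 10  # think time between pages in seconds (from the example in the text)
RT = 4   # response time target per page in seconds (from the example in the text)

# Hypothetical use case: 70 concurrent users, 5 pages long
vx = achievable_volume_per_hour(concurrency=70, pages=5, think_time_s=TT, response_time_s=RT)
print(vx)                                   # transactions per hour
print(required_concurrency(3600, 5, TT, RT))  # users needed for 3600 transactions per hour
print(is_target_achievable(4000, 70, 5, TT, RT))
```

If the check fails, either the concurrency, the think time assumption or the response time requirement needs to be revisited before any load test is executed.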
Tips:
These tips are lessons learnt from requirement-gathering processes with several customers, and hence are generic and applicable to all domains such as banking, insurance, retail, telecommunication etc.:
Make sure you have a complete understanding of the business that is being served by the application: what major functionalities it caters to and what external systems it interacts with. Try to relate that to the solution design.
If and when possible, visit the business on-site to understand the system usage and study the load patterns.
Always set targets at peak volumes, not average volumes.
Account for growth rates by targeting the volumes projected for the rollout timeline.
For a new system with no existing data, derive the data volumes. During execution, load test with databases holding at least a near-production volume of data.
Ensure that there is room for server maintenance activities at average loads.
Last but most important, get a sign-off on the requirements set forth for performance before starting execution.
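Two of the tips above, targeting peaks rather than averages and accounting for growth up to the rollout date, amount to simple arithmetic. The sketch below is one possible way to combine them and is not a method prescribed by this paper: it assumes growth compounds annually, and the function name, hourly volumes and growth rate are hypothetical.

```python
def peak_target(hourly_volumes, annual_growth_rate, years_to_rollout):
    """Project the observed peak hourly volume forward to the rollout date.

    Assumes compound annual growth, which is an assumption of this sketch.
    """
    peak = max(hourly_volumes)
    return peak * (1 + annual_growth_rate) ** years_to_rollout

# Hypothetical transactions per hour observed across a business day
observed = [200, 450, 1000, 800, 300]

average = sum(observed) / len(observed)
target = peak_target(observed, annual_growth_rate=0.20, years_to_rollout=2)

print(average)  # what an average-based target would be
print(target)   # peak-based target including projected growth
```

The gap between the two printed values illustrates why a workload model sized on averages can understate the load the system has to survive.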
CONCLUSION:
In this session we have gone over the process of
gathering and defining requirements for performance