
International Journal of Computer Applications Technology and Research
Volume 6, Issue 1, 13-18, 2017, ISSN: 2319-8656

Software Performance Workload Modelling

Vijay Datla

Abstract: The debate between performance engineers and business stakeholders over non-functional requirements is probably as old as the performance discipline itself. "What set of transactions is enough to represent my system?", "Why do we not load test every transaction?", "Our volumes are much higher than what the targets show" are some of the common questions that need to be answered. From a technical perspective, the benefits of load testing every transaction are not enough to justify the effort involved in the exercise. However, for a business, even a small risk of one untested low volume transaction affecting the others or bringing down the entire system is high enough to raise a flag. This paper is an attempt to balance these concerns by discussing how to create workload models that are closer representations of real world enterprise applications. It answers common requirement gathering questions such as where to look for information, on what basis to include and exclude use cases from workloads, and how to derive a complete and convincing workload model. The paper highlights the risks associated with selective modelling and the possible mitigations. It also brings to the table tips and tricks of the trade, some lessons learnt the hard way.

Keywords: Performance, Modelling

1. REQUIREMENT ANALYSIS:
Just like any Software Development Lifecycle (SDLC), a performance lifecycle also begins with requirements analysis, with the difference that the requirements are purely non-functional in nature. A non-functional requirement specifies criteria that can be used to judge the operation of a system rather than a functional behavior. There are several kinds of non-functional requirements, such as Security, Maintainability and Usability, but the ones we are interested in are Performance, Scalability and, to a certain extent, Availability. Requirements gathering forms the foundation for all future performance engineering activities on a project. Mistakes made in understanding the business requirements translate into setting the wrong goals and take all the performance efforts in the wrong direction. Requirements gathering is therefore the key to a successful Performance Engineering project.

But even before getting into requirements, it is important to understand the objectives. It is a common misconception that performance work is done only to measure the response time of the system. In literal terms, measuring the performance of a system is purely Performance Testing, which is part of a larger discipline called Performance Engineering. Performance testing is a means; an enabler in achieving the Performance Engineering objectives. So what are these objectives?
- Measuring and improving performance of an application
- Meeting the non-functional requirement targets
- Improving user experience
- Benchmarking the application and hardware
- Validating hardware sizing

Once the objectives are clear, the next step is to define the scope at a high level, meaning which modules or what part of the solution will need to be tested as part of the performance exercise. To go deeper into the objectives and scope, it is essential to have a thorough understanding of the system. This understanding can come not just by studying the application but also by studying the business.

1.1. Asking the right questions:
The key areas to probe are:
- Customer base
- Growth rate
- Concurrency
- Volume centric vs. user centric behavior
- Most common transactions
- Response time requirements
- User arrival pattern

Gathering requirements for performance testing is the most challenging task given that there is no one place with consolidated information and most sources are external. Readily available non-functional performance requirements and statistics are a rare occurrence. However, it is not the lack of availability that adds to the challenge; it is the process of gathering and consolidating data from various sources that is cumbersome. More than getting the right answers, it is about asking the right questions. Since the process involves dealing with business, it is important to frame questions in terms comprehensible to a business mind. Instead of asking what the concurrency or throughput target is, try asking what the customer base of the business is and how many of these customers will be accessing the system at any given time. The following should give an idea:
- What is the expected business growth rate?
- Is the system volume centric or user centric?
- What response time is the system required to serve in case of web based OLTP transactions?


- What are the most common use cases, i.e. the transactions that happen on the system most frequently?
- Do all users arrive into the system over a small window or are they spread across the day?
- What are the peak periods of access to the system?
- Are there periodic tasks that the system is designed to accomplish? E.g. end of month/quarter reports, close of business, seasonal sales, year end closing, and so on.

1.2. Picking the right resources:
On enterprise level projects there can be several sources of information when gathering the non-functional requirements: Business Analysts, RFPs, business volume reports, audit reports, inputs from legacy systems, capacity sizing documents, web server access logs, data warehouses and Google Analytics.

Business Analysts (BAs)
BAs are always the first source of information for non-functional requirements. They may or may not have all the information required, but they will be able to make the connection to the right business contacts.

RFPs
RFPs usually contain a non-functional requirements section. The requirements specific to performance may be few and non-elaborated but will still contain response times, customer base, transaction volumes etc.

Business Reports
There are several reports that the business maintains, like volume, accounting and auditing reports, that can provide insight into business statistics.

Legacy Systems
In the case of legacy modernization projects, there already is a system, maybe a mainframe, that is still serving the business. Running simple SELECT queries on this system can help in studying the real world transaction volumes and load patterns.

Hardware Sizing Documents
In the initial stages of the SDLC, enterprise projects go through the process of determining the hardware required to support the solution. This sizing is based on the throughput that the system is expected to achieve. So, either at a high level or in detail, some study has already been done at this stage that can often be reused rather than reinventing the wheel.

Google Analytics
For enterprise applications with already existing websites, Google Analytics is a web-analytics solution that provides detailed insights into the website traffic. It reports traffic patterns, sources of incoming load, navigation patterns, detailed load patterns over a period of time and much more. The hourly distribution of load helps determine the peak hours of the day and the % increase in load during the peak hour. Having access to this information also helps in deciding off-peak windows for scheduling batch and cron jobs during the day.

Data Warehouse
Most enterprise projects maintain data warehouses for storing archived information that can be accessed to obtain non-functional details.

Domain Research
In most cases there is existing research in the market covering various kinds of applications across several domains. If there is absolutely no information available in house, then such research can be a good place to start from.

Log Parsers
In the case of implementations with an existing system in place, server access logs are excellent sources of real usage information. There are several log parsing tools in the market that turn access logs into comprehensible, meaningful statistics. AWStats is one such open source log parsing tool, used here as an example. It extracts data from a web server access log and converts it into meaningful server statistics, much like a web analytics tool, only that it works offline. It produces graphs that provide insight into load patterns in terms of user visits, page visits, bandwidth etc. The tool lists the most commonly accessed pages, which helps in determining the high volume transactions, and indicates the browser most commonly used to access the application website; with the variety of browsers and features in the market, this information is useful in deciding what browser to simulate when loading the application. The most important use of the tool is in studying the user arrival and load pattern: the hourly graphs outline the user arrival pattern, while the weekly, monthly and yearly graphs help in determining the peak periods. Its charts show the website usage patterns in terms of top accessed URLs, top downloads, average user visit durations and the distribution of browsers for incoming requests.
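A first cut of the same statistics can also be produced with a few lines of scripting. Below is a minimal Python sketch (not AWStats itself) that tallies hits per hour and the top requested pages from a combined-format access log; the log path and format are assumptions.

import re
from collections import Counter

# Combined log format: IP - - [dd/Mon/yyyy:HH:MM:SS zone] "METHOD /path HTTP/1.1" status bytes
LINE = re.compile(r'\[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)')

hits_per_hour = Counter()
hits_per_page = Counter()

with open("access.log") as log:        # file name is an assumption
    for line in log:
        m = LINE.search(line)
        if not m:
            continue
        hour = m.group("ts")[12:14]    # HH from dd/Mon/yyyy:HH:MM:SS
        hits_per_hour[hour] += 1
        hits_per_page[m.group("path")] += 1

print("Peak hours:", hits_per_hour.most_common(3))
print("Top pages :", hits_per_page.most_common(10))

The peak-hour counts feed directly into the user arrival pattern, and the top pages feed into the high volume transaction list discussed above.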


The yearly distribution of load helps in understanding any seasonal workloads experienced by the business and in turn the application.

1.3. Categorization
One other premise that can go a long way in maximizing the code coverage of performance testing efforts and de-risking the system is categorization. Several enterprise transactions can be classified as variations or flavors of one base transaction. Even though there will be slight variations in input parameters, the backend tables and the data access objects will be the same. For instance, take a customer updating his phone number vs. updating his address in the profile: even though both transactions start out differently, they essentially perform an UPDATE on the profile table, and one can be termed a flavor of the other.
Along similar lines, the transaction could be the same but coming in from different sources. E.g. a request for account creation could come in from the web, from an agency or from customer service agents over the phone. However different the sources, the execution flow in all cases would involve a call to the same WebService and would end in an INSERT into the accounts and related reference tables.
Once you have identified sets of similar transactions, combine the volumes of each, select the transaction with the highest volume to represent the set, and load test it to the combined volume, as sketched below. This approach covers wider ground while limiting the effort involved in preparing and maintaining test frameworks for each transaction.
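An illustrative sketch of this bookkeeping follows; the transaction names, categories and volumes are invented, not taken from the paper's case study. It groups the flavors of each base transaction, sums their volumes and keeps the highest-volume member as the representative.

from collections import defaultdict

# (transaction, base category, hourly volume); illustrative values only
transactions = [
    ("update_phone",       "profile_update",  900),
    ("update_address",     "profile_update",  300),
    ("create_account_web", "account_create", 1200),
    ("create_account_ivr", "account_create",  400),
]

groups = defaultdict(list)
for name, category, volume in transactions:
    groups[category].append((name, volume))

for category, members in groups.items():
    combined = sum(v for _, v in members)
    representative = max(members, key=lambda m: m[1])[0]
    # Load test the representative at the combined volume of the set
    print(f"{category}: test '{representative}' at {combined}/hour")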

2. DEFINING SCOPE:
When selecting the transactions to model, consider high volume, complex design, business impact, resource intensiveness and seasonal peaks. With this gathered information, define the categories, set targets for the combined use case volumes in each category, and test the high volume use case in each category.

Most complex enterprise applications today are heavily data dependent. A simple example of such a transaction would be a funds transfer in a bank account. To complete this transaction, there is a pre-requisite of having enough funds in a source account. If we keep executing this transaction over a set of accounts, the data will need to be refreshed, either by using a different set of accounts or by changing the available balance on the existing accounts.
To make it more complex, there are systems like Service Request Management Systems that are designed around the flow of data from one stage to the next. Performance testing such systems becomes a nightmare because one successful execution of the tests requires useful data to be created at each stage, with the entire cycle repeated for the next run.
This added complexity introduces another factor, the Return on Investment, i.e. whether the effort involved in preparing for and maintaining a test case from one run to the next is worth the benefit of testing it.
In essence, no single factor can sufficiently determine the transaction set; it has to be a combination of all of them, as sketched below. Whatever the selection process, the choices are influenced by aggressive delivery schedules and there is always a trade-off.
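A minimal sketch of such a multi-factor selection, where the threshold value, use case names and factor flags are all assumptions for illustration: anything above the volume threshold stays in scope, as does anything flagged as business critical, resource intensive or seasonal, whatever its volume.

VOLUME_THRESHOLD = 500  # assumed hourly volume below which a use case is normally out of scope

# (use case, hourly volume, business critical?, resource intensive?, seasonal peak?)
use_cases = [
    ("funds_transfer", 2500, True,  False, False),
    ("rate_enquiry",    150, False, False, False),
    ("year_end_close",   20, True,  True,  True),
]

in_scope = [
    name for name, volume, critical, heavy, seasonal in use_cases
    if volume >= VOLUME_THRESHOLD or critical or heavy or seasonal
]
print("Selected for load testing:", in_scope)  # rate_enquiry is dropped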


3. CREATING WORKLOAD MODEL:
The factors to be considered are growth rates, transactional distribution and complex transactions.

When defining targets it is important to account for the growth rate. Non-functional requirements are usually defined in the initial stages of the project, and by the time the solution goes to production, business volumes have grown considerably. The targets defined for performance testing should be raised by the growth rate factor up to the roll out dates.
For a simple system where most transactions take the same amount of time to complete, the conversion from throughput to concurrency and vice versa can be generalized to a simple formula:

T = C / (tt + rt)

where T is the throughput in page views per second, C is the concurrency, tt is the think time between pages in seconds, and rt is the response time of each page in seconds.
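As a quick sanity check of this relation (the numbers here are assumptions, not recommendations): each user requests one page every tt + rt seconds, so the conversion can be run in either direction.

def throughput(concurrency: float, think_time_s: float, resp_time_s: float) -> float:
    # T = C / (tt + rt)
    return concurrency / (think_time_s + resp_time_s)

def concurrency(throughput_pv_s: float, think_time_s: float, resp_time_s: float) -> float:
    # Inverse: C = T * (tt + rt)
    return throughput_pv_s * (think_time_s + resp_time_s)

print(throughput(100, 10, 4))   # ~7.14 page views/sec sustained by 100 users
print(concurrency(20, 10, 4))   # 280 users needed for 20 page views/sec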
However, in the case of complex, longer transactions, the workload model has to be worked out differently. Let us take the example of a generic Core Banking Application. A core banking solution comprises several modules that cater to Teller Banking, Online/Net Banking, Tele Banking, Mobile Banking, Customer Service etc. All these modules function as different entry points into the system. Despite the different interfaces and web layers, they all access the same backend services, data objects and database tables. So if we were to define scope and create a workload model for this application, we would have to look at the architecture as a whole, considering the requirements of the individual modules and how they interact with each other and with the external interfaces.

Unlike functional testing, performance testing efforts have to be limited to a select number of transactions, so before deriving a workload model we first have to select the transactions that form the scope of performance testing within each module. For simplicity, let us work with three modules of our core banking application: Teller Banking, Online Banking and Phone Banking. Functionally, there are a total of 22 use cases arising from these modules. For a business, the ideal risk-free scenario is to performance test all 22 use cases. However, the effort involved in creating a load test framework for 22 use cases and maintaining it across builds and releases can be very challenging and time consuming. Projects seldom have the resources and the time to support the ask, and the benefit from load testing every use case is usually not worth the effort involved. So we need to draw a line at a certain throughput, i.e. define a threshold below which a use case will not be considered for load testing; the use cases above it are chosen on account of their high volumes.

Now that we have defined the scope, we will derive a workload using the requirements and data available. In most enterprise applications, the requirements are a combination of volume-centric and user-centric targets, i.e. module level concurrency and business volume targets for every use case. For instance, in our Banking example it is easy to know how many bank tellers will be using the core banking application, how many customer service agents will be working on the customer service module, and so on. Assuming that we have statistics on transaction volumes from, say, the previous year, we can derive a workload model with simple arithmetic. But first let us define some variables:

Total application concurrency: C
Concurrency of a module y: Cy
Total number of modules in the application: m

Therefore, C = C1 + C2 + ... + Cm. For the sake of simplicity, let us represent this sum by SUM[C1:Cm].

Now let us get into the distribution within a module. Let the total number of transactions in module y be n, and consider a transaction x in module y with a target volume of Vx per hour and a length of Lx pages.

It is important to note that the target transaction volumes should be those expected at the time of the rollout. So, if the requirements were defined in 2016, the application goes live in 2017 and the annual growth rate is 10%, then the target volumes for performance testing should be 110% of the 2016 volumes, as in the short example below.
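A one-line compounding helper, under the same assumptions (10% annual growth, one year to rollout):

def rollout_target(base_volume: float, annual_growth: float, years: float) -> float:
    # Compound the agreed volume up to the rollout date
    return base_volume * (1 + annual_growth) ** years

print(rollout_target(1000, 0.10, 1))  # a 2016 volume of 1000/hour becomes 1100/hour at the 2017 rollout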
Since each transaction has its own length, i.e. a different number of pages, it is important to first translate business volumes into page views and then work out the distribution. Hence, the target page views per hour of transaction x:

Tx = Vx * Lx

User distribution, i.e. the distribution of the module level concurrency amongst its transactions or use cases, will be a function of the target page views Tx. Therefore, the concurrency of transaction x in module y is

Cx = ROUND( (Tx / SUM[T1:Tn]) * Cy )
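Putting the pieces together, here is a sketch of the derivation for a single hypothetical module; the use case names, volumes, lengths and the module concurrency Cy are invented for illustration.

# Use cases of one module: name -> (target volume per hour Vx, length in pages Lx)
module_y = {
    "cash_deposit":    (1200, 4),
    "cash_withdrawal": (1800, 3),
    "funds_transfer":  ( 600, 6),
}
Cy = 200  # concurrency allotted to this module (assumed)

# Tx = Vx * Lx: hourly page views per use case
T = {name: vx * lx for name, (vx, lx) in module_y.items()}
total_T = sum(T.values())

# Cx = ROUND(Tx / SUM[T1:Tn] * Cy): each use case's share of the module concurrency
C = {name: round(tx / total_T * Cy) for name, tx in T.items()}
print(C)  # {'cash_deposit': 70, 'cash_withdrawal': 78, 'funds_transfer': 52}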


The sample workload model for our example was built this way; note that the values are mere assumptions and in no way represent the actual volumes of a bank. The information at hand was the distribution of a concurrency of 604 users across the three modules, Teller Banking, Online Banking and Phone Banking. Also known were the target volumes for each of the shortlisted use cases. A study of the use case navigation and call flow helped determine the length (number of pages) of each use case. Applying the above formulae to the given information, the targeted throughput (page views per hour) per use case and a concurrency distribution within each module were calculated.

Once created, it is important to get a sign-off on the workload model before starting execution. This ensures that the requirements set forth for the performance testing exercise are correct and validates the assumptions made. Since a workload model relates more to the business, it is important to represent the information well: a pictorial representation is more likely to be noticed and understood than a spreadsheet containing a whole lot of numbers.
4. WORKLOAD VALIDATION:
Before executing against the complete workload model, it is recommended to do an early validation by reverse calculation, working from the think time between pages (TT), the average response time of each page (RT) and the transaction lengths to the achievable throughput.
Requirement analysis, market research and solution design are based on a series of assumptions, and it is important to ensure that those assumptions are correct by validating that the goals are achievable. This validation can be done without having to execute the load tests, just by doing some reverse calculations. For example, let us assume that the average think time between pages, TT, is set at 10 seconds and the response time target for each web page, RT, is 4 seconds. Then the hourly volume of a business transaction x that can be achieved by the derived concurrency Cx is

Vx = Cx * 3600 / (Lx * (TT + RT))

where Lx is the length of x, i.e. its number of pages. If the achievable Vx is in line with the targeted business transaction volumes, then it is safe to say that the assumptions about think times and the response time requirements are correct.
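The reverse calculation is easy to script. A sketch with the same assumed TT = 10 s and RT = 4 s, and an invented target volume:

def achievable_volume(cx: int, lx: int, tt_s: float, rt_s: float) -> float:
    # Vx = Cx * 3600 / (Lx * (TT + RT)): hourly volume a concurrency Cx can sustain
    return cx * 3600 / (lx * (tt_s + rt_s))

target_vx = 1200                                  # target hourly volume (assumed)
vx = achievable_volume(cx=70, lx=4, tt_s=10, rt_s=4)
print(f"achievable {vx:.0f}/hour vs target {target_vx}/hour")
if vx < target_vx:
    print("Revisit the think time / response time assumptions or raise Cx")
else:
    print("Targets are achievable with the derived concurrency")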
Validation can also be done post-execution, at either the front end or the back end. At the front end, there are load generation tools that report execution counts for the transactions under test. Take the IBM Rational Performance Tester load test tool as an example: in the test report, as one of the metrics, you can see the number of hits made to each page in the test suite; this is a count of how many transactions were successfully completed on the system. At the back end, after every test run, a simple query on the database can give a count of the volumes achieved during the run (a sketch follows).
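A sketch of the back end count using Python's standard sqlite3 driver; the database, table and column names are hypothetical and must be adapted to the application's schema.

import sqlite3

conn = sqlite3.connect("bank.db")  # hypothetical database
row = conn.execute(
    "SELECT COUNT(*) FROM transfers "       # hypothetical table
    "WHERE created_at BETWEEN ? AND ?",
    ("2017-01-10 09:00:00", "2017-01-10 10:00:00"),  # the test run window
).fetchone()
print("Volumes achieved during the run:", row[0])
conn.close()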
5. THE BIRD'S EYE VIEW:
For performance testing to reveal accurate characteristics of a system, the workload model should be a close representation of the real world production load pattern. For complex enterprise applications, the user interface is just one entry point into the system. There are several other interfaces, WebServices, scheduled jobs etc. that share the system resources. To simulate a real world production load pattern it is essential to look at the complete picture and account for incoming load from all possible sources.
With the increasing complexity of business models and the interdependence on business partners and service providers, interaction with external subsystems through exposed WebServices and messaging interfaces is one primary source of incoming load. Other sources are inter-module communications between the modules under test and those that are out of scope of the performance test exercise.
Another activity to account for is the daily batch jobs and schedulers that run during regular business hours. For those that run during off-peak hours, it is important to test and ensure that all scheduled batch jobs complete within the designated window and do not overflow into regular business hours. Along similar lines, there are regular backup and archival activities that need to be allocated resources.
One other consideration that needs to go into completing a workload is the recurring business activities that take place over and above the regular tasks, for example Close-of-Business, End-of-Month reporting, quarterly reports etc.

6. SEASONAL WORKLOAD MODELS:
These models are business critical. There are a few domains that every so often experience a substantial variation in their load pattern. These are called seasonal workloads. For applications that cater to these domains, ensuring performance and stability during such seasonal workloads also becomes the responsibility of the performance test exercise. Some examples of such seasonal workloads are:
- eCommerce applications for retailers: end of season sales, holidays like Christmas and Thanksgiving
- Banking and financial applications: end of year closing
- Job portals: graduation period
- Human resource management systems: appraisals, etc.


7. SELECTIVE MODELLING RISK ANALYSIS:
There is always some amount of risk involved with selective modeling. Some transaction, some piece of code, SQL, stored procedure etc. always rolls out without being performance tested. An untested transaction can consume excessive system resources, starving other transactions of computational resources and delaying overall system responses, or, in the worst case scenario, crashing the system.
However small, the risk associated with selective modeling can raise several flags if it has the potential to cause loss of revenue for the business. Because it is highly impractical to load test every transaction, a mitigation strategy needs to be defined. There is no one thing that can be done to ensure that the system is free from performance problems; several efforts have to run in parallel to cover maximum ground:
- Use functional tests, UAT and system tests to detect bad transactions
- Monitor servers during UAT and functional tests
- Load the test environments with near-production volume data
- Analyze offline reports from test servers for any abnormal system usage
- Plan one round of performance testing with UAT or functional tests running in parallel on the same environment

Tips:
These tips are lessons learnt from requirement gathering processes with several customers and hence are generic and applicable to all domains like banking, insurance, retail, telecommunications etc.:
- Make sure you have a complete understanding of the business that is being served by the application: what major functionalities it caters to and what external systems it interacts with. Try to relate that to the solution design.
- If and when possible, visit the business on-site to understand the system usage and study the load patterns.
- Always set targets at peak volumes, not average volumes.
- Account for growth rates by targeting the volumes projected for the rollout timeline.
- For a new system with no existing data, derive the data volumes. During execution, load test with databases holding at least near-production volumes of data.
- Ensure that there is room for server maintenance activities at average loads.
- Last but most important, get a sign-off on the requirements set forth for performance before starting execution.

CONCLUSION:
In this paper we have gone over the process of gathering and defining requirements for performance testing of enterprise applications. We have seen how workload models can be derived for simple as well as complex use cases using the data available from various sources on projects. We listed some factors that can help in defining the scope of performance testing activities, the risks involved and possible mitigations for addressing business concerns arising from not performance testing all transactions.
In conclusion, there is no one defined method for creating a comprehensive workload model. The selection process has to be a factor of business priorities, application complexity and project timelines. While there is always some amount of risk involved with selective modeling, a lot can be done to mitigate or minimize the possible impact on business.

REFERENCES:
[1] AWStats, open source log analyzer, https://awstats.sourceforge.io/
[2] Google Analytics, https://analytics.google.com/

