It’s a long way to Continuous Performance Testing

Josef Mayrhofer
Feb 14, 2020
Updated: Jul 8, 2022

In recent years, many companies have started making software testing part of the delivery pipeline. Not only functional checks but also security and performance validations are executed, fully or partly automated, as soon as a new build has been created and deployed to a test environment. Many of us realized that this is an outstanding idea, but in practice it turned out to be far from simple. In this post, I will introduce some challenges of shift-left testing and share ideas about how mature organizations have addressed them.

Before we explore testing early in a DevOps environment, you must understand the intention behind it and the benefits involved.

Shift left means bringing quality into our development processes and enabling developers to check functionality, security, and performance in a fully automated fashion. Unit tests and test-driven development touch on these ideas, but they are not a silver bullet. Think about an online banking solution. Your team is validating all kinds of payments via unit tests. These single-user test executions won’t tell you much about the reliability and security of your payment transactions under load. If there are 5000 payment transactions per hour in production and you fail to validate application response times before deploying to prod, chances are high that you will experience big surprises.
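To make the production figure concrete, a quick back-of-the-envelope conversion helps. The sketch below assumes the load is spread evenly over the hour (real traffic usually has peaks), and the function name is purely illustrative.

```python
def arrival_rate(transactions_per_hour):
    """Return (transactions per second, seconds between two transactions).

    Assumes an evenly distributed load, which is a simplification.
    """
    per_second = transactions_per_hour / 3600
    return per_second, 1 / per_second


rate, gap = arrival_rate(5000)
print(f"{rate:.2f} tx/s, one transaction every {gap:.2f} s")
# prints: 1.39 tx/s, one transaction every 0.72 s
```

A single-user unit test never comes close to this arrival rate, which is why it says little about response times under production load.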

Problems with shift left

1. Developers hate testing

2. Not enough time to update testing scripts

3. Setup is time-consuming

4. Reporting is weak

5. Reusability is bad

Would continuous testing solve these problems?

Continuous testing means quality is fully integrated into the development process. It’s no longer a neglected cousin required to fulfill some compliance policies. All teams involved in the value stream follow a quality-first approach. Testing is no longer a showstopper or overhead that needs to be avoided. Testing is done in an entirely automated fashion, from the moment the first build has been created until the last release has been deployed to user acceptance environments.

If nobody likes testing, how should it become fully part of the development process?

The answer to that question is automation. Similar to robotic process automation, which is used to remove manual work, we also use automation in preproduction to check the quality of our products. These days, however, we write automation scripts for too many purposes. A functional testing team codes its functional tests, and the performance testing and security testing teams do the same for their objectives. On top of that, a monitoring team writes health checks for production to make sure that the deployed services are working as expected. To sum up, we create four automation scripts for testing the same functionality. This is a huge overhead and a waste of time and money. No wonder we are not making enough progress in increasing automated test coverage.

Automate the automation

This sounds crazy, doesn’t it? Yet it seems to be the solution to our automation problem. We should stop wasting money and think more about how we can write testing scripts only once and reuse them across the entire value stream, for all quality assurance tasks. Once this puzzle has been solved, we will see a tremendous increase in test coverage and a dramatic reduction in spending on software testing. Model-driven testing is a good starting point, but it is still not useful for performance and security testing.

The road to continuous performance testing

Over the past 30 years, we have been used to running performance tests shortly before the code is shipped to production. This causes several problems, such as

1. Late detection of flaws

2. Rework is expensive

3. Developers are no longer around to fix issues

4. Time to market is at risk

5. Performance shortcomings are ignored

For sure, we have to shift performance testing to the left and start it earlier in the development process, but does it work as we expect? Based on my experience, shift left is still a big upfront investment and requires permanent maintenance and care. I’ve seen it in a recent project. Performance engineers implemented a nice Jenkins-based performance testing gate. They developed API-level performance testing scripts, created the required data sets, and executed these CI/CD performance pipelines daily in a fully automated fashion. Daily code changes broke the testing scripts, and we ended up manually checking all the scripts every day before we executed the CI/CD performance tests. This semi-automated approach was not really what we were looking for, and we have seen these problems in all our CI/CD performance testing projects.

How to solve challenges in continuous performance testing?

First of all, we need to fix the scripting problem. Developers could maintain the testing scripts, but this would hold them back from their own tasks. Additional automation engineers could be assigned to agile teams, but this would increase costs. Unit tests are still in our developers’ hands, but who should be in charge of all the other automated testing scripts? Wouldn’t it be amazing if they could be created from a formal specification? Well, there are some promising technological enhancements, but at the moment nothing solves the automation puzzle completely. At least one automated testing script has to be implemented by hand. You can tag the functional testing scripts that should be used for performance testing and let a script converter do the job. Convert them to your favorite load or security testing format and let your build pipeline dynamically run these automatically created scripts.
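The tagging idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `performance_candidate` decorator, the registry, and the test names are hypothetical, and instead of converting scripts to a load testing tool's format the sketch simply replays the tagged functional test concurrently with a thread pool.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical registry of functional tests tagged for reuse as load tests.
PERFORMANCE_CANDIDATES = []


def performance_candidate(func):
    """Tag a functional test so the pipeline can reuse it for load testing."""
    PERFORMANCE_CANDIDATES.append(func)
    return func


@performance_candidate
def test_create_payment():
    # Placeholder for a real API call plus functional assertions.
    time.sleep(0.01)
    assert True


def test_admin_report():
    # Not tagged, so it stays a functional-only test.
    assert True


def timed_call(func):
    """Run one test once and return its duration in seconds."""
    start = time.perf_counter()
    func()
    return time.perf_counter() - start


def run_as_load_test(func, virtual_users=5, iterations=4):
    """Replay a tagged test concurrently and collect all response times."""
    with ThreadPoolExecutor(max_workers=virtual_users) as pool:
        futures = [pool.submit(timed_call, func)
                   for _ in range(virtual_users * iterations)]
        return [future.result() for future in futures]


if __name__ == "__main__":
    for test in PERFORMANCE_CANDIDATES:
        timings = run_as_load_test(test)
        print(test.__name__, len(timings), f"max={max(timings):.3f}s")
```

The point of the design is that the functional test remains the single source of truth: when a code change breaks it, it is fixed once, and the load test built on top of it is repaired at the same time.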

Secondly, we need a fully automated performance gate. We all hate repetitive tasks such as manually reviewing the same key performance indicators over and over again. After a while, we get tired and overlook warning signs. A much better approach is to agree on a traffic light scheme based on preconfigured thresholds, such as

Green: Ok, move on

Yellow: Attention, analysis is required

Red: Stop, fix required

This gives clear indicators and removes all the doubts. Fully automated performance validation is the way to go in the 21st century. Such baselines can be used for production monitoring as well.
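A traffic light gate of this kind could look roughly like the following sketch. The transaction names, the threshold values, and the choice of the 95th percentile response time as the indicator are assumptions made for illustration, not values from a real project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    yellow_ms: float  # at or above this value: analysis is required
    red_ms: float     # at or above this value: stop, fix required


def gate(p95_ms, threshold):
    """Map a measured 95th percentile response time to a traffic light."""
    if p95_ms >= threshold.red_ms:
        return "red"
    if p95_ms >= threshold.yellow_ms:
        return "yellow"
    return "green"


def evaluate(measured, thresholds):
    """Rate every measured transaction against its configured threshold."""
    return {name: gate(value, thresholds[name])
            for name, value in measured.items()}


def overall(results):
    """The worst individual light decides the result of the whole gate."""
    for light in ("red", "yellow"):
        if light in results.values():
            return light
    return "green"


# Hypothetical thresholds and measurements, in milliseconds.
THRESHOLDS = {
    "create_payment": Threshold(yellow_ms=800, red_ms=1500),
    "get_balance": Threshold(yellow_ms=300, red_ms=600),
}
MEASURED = {"create_payment": 640.0, "get_balance": 410.0}

results = evaluate(MEASURED, THRESHOLDS)
print(results)           # {'create_payment': 'green', 'get_balance': 'yellow'}
print(overall(results))  # yellow
```

In a build pipeline, the overall result would typically be turned into an exit code so that a red light fails the build while a yellow light only raises a warning.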

Finally, all results must be stored in a time-series database. Dashboards are in place to visualize key performance indicators. Service owners can review important performance indicators and compare response times for different releases on trending charts.
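As a rough illustration of storing results and comparing releases, the sketch below uses an in-memory SQLite table as a stand-in for a real time-series database. The table layout, column names, release labels, and sample values are all hypothetical.

```python
import sqlite3


def open_store():
    """Create an in-memory table that stands in for a time-series database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE response_times ("
        "ts REAL, release_tag TEXT, transaction_name TEXT, duration_ms REAL)"
    )
    return conn


def record(conn, ts, release_tag, transaction_name, duration_ms):
    """Store one measured response time with its timestamp and release."""
    conn.execute(
        "INSERT INTO response_times VALUES (?, ?, ?, ?)",
        (ts, release_tag, transaction_name, duration_ms),
    )


def average_by_release(conn, transaction_name):
    """Return (release, average duration) pairs, oldest release first."""
    return conn.execute(
        "SELECT release_tag, AVG(duration_ms) FROM response_times "
        "WHERE transaction_name = ? "
        "GROUP BY release_tag ORDER BY MIN(ts)",
        (transaction_name,),
    ).fetchall()


conn = open_store()
# Hypothetical sample data: (timestamp, release, transaction, duration in ms).
samples = [
    (1.0, "1.0", "create_payment", 400.0),
    (2.0, "1.0", "create_payment", 420.0),
    (3.0, "1.1", "create_payment", 500.0),
    (4.0, "1.1", "create_payment", 520.0),
]
for sample in samples:
    record(conn, *sample)

trend = average_by_release(conn, "create_payment")
print(trend)  # [('1.0', 410.0), ('1.1', 510.0)]
```

A dashboard would plot these per-release averages as a trend line, which makes a slowdown such as the one between the two sample releases visible at a glance.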

I hope that such concepts will soon be picked up by our industry leaders. Happy Performance Engineering!