Updated: Jul 5
In recent years, many companies have started to make software testing an integral part of their delivery pipeline. They not only do functional checks but also see that security and performance validations are executed—either fully or partly automated—as soon as a new build has been created and deployed on a test environment. Although this looks like a wonderful idea, experience has taught us that it’s not so simple in reality. In this post, I’ll introduce some of the challenges that shift-left testing can present—and share the ways that mature organizations have overcome them.
Before saying anything more about testing early in a DevOps environment, I’d first like to explain the intentions behind this testing and the benefits it gives.
Shift left is all about bringing quality to our development processes and enabling developers to check functionality, security, and performance in a fully automated fashion. Unit tests and test-driven development address this aim to some extent, but they are certainly not a silver bullet. Consider an online banking solution. Your team validates all kinds of payments via unit tests, but this single-user test execution won’t tell you much about the reliability and security of your payment transactions. If there are 5000 payment transactions per hour in production and you skip validating application response times before deploying to production, chances are you’ll get some big and unwelcome surprises.
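To make the banking example concrete, here is a small back-of-the-envelope calculation. It assumes the 5000 transactions are spread evenly over the hour, which is a simplification: real traffic usually has peaks well above the average, so treat the result as a lower bound for a load test rather than a target.

```python
# Hourly volume taken from the banking example above.
TRANSACTIONS_PER_HOUR = 5000
SECONDS_PER_HOUR = 3600

# Average arrival rate, assuming an even spread over the hour (an assumption).
avg_per_second = TRANSACTIONS_PER_HOUR / SECONDS_PER_HOUR

# Pacing: how often a load test must start a new transaction to match that rate.
pacing_seconds = SECONDS_PER_HOUR / TRANSACTIONS_PER_HOUR

print(f"average rate: {avg_per_second:.2f} transactions/second")
print(f"one new transaction every {pacing_seconds:.2f} seconds")
```

A unit test exercises one of these transactions in isolation. It says nothing about how the system behaves when a new one arrives roughly every 0.72 seconds while earlier ones are still in flight.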
Problems with shift left
developers hate testing
insufficient time to update testing scripts
setup is time-consuming
reporting is weak
reusability is bad
Would continuous testing solve these problems?
Continuous testing means that quality is fully integrated into the development process. It’s no longer a neglected cousin required only to fulfill certain compliance policies. Continuous testing means that all teams involved in the value stream need to follow a quality-first approach. Testing is not a showstopper or an overhead that needs to be avoided. Testing is done in an entirely automated fashion from the moment the first build has been created until the last release has been deployed to the user-acceptance environments.
If nobody likes testing, how can it become an intrinsic part of the development process?
The answer to that question is automation. Much like robotic process automation, it can free us from doing the manual work, and we can use it in preproduction to check the quality of our products. These days, however, too many automation scripts are written for too many purposes. A functional testing team codes their functional tests; performance-testing and security-testing teams do the same for their objectives. Meanwhile, a monitoring team performs health checks in production to make sure the created services are working as expected. The result? A total of four automation scripts for testing the same functionality. It’s a huge overhead and a big waste of time. No wonder our progress in increasing automated test coverage is so slow.
Automate the automation
This sounds crazy, doesn’t it? It goes a long way toward solving our automation problem, though. We should stop wasting money and focus on writing testing scripts only once, then reuse them for all the quality-assurance tasks across the entire value stream. Once you’ve cracked this nut, you’ll see a tremendous increase in test coverage and a dramatic reduction in software testing spending. Model-driven testing is a good starting point, but it isn’t very useful for performance and security testing.
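The idea of writing a scenario once and reusing it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: payment_scenario is a hypothetical placeholder for a real test step, and the three wrappers (functional_check, load_check, health_check) are names invented here, not part of any particular tool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def payment_scenario():
    """Hypothetical scenario, written once. Replace the body with a real call."""
    time.sleep(0.01)  # stand-in for calling the system under test
    return True       # True means the business check passed


def timed(scenario):
    """Run the scenario once and return (passed, elapsed_seconds)."""
    start = time.perf_counter()
    passed = scenario()
    return passed, time.perf_counter() - start


def functional_check(scenario):
    """Functional use: one run, only the pass/fail result matters."""
    passed, _ = timed(scenario)
    return passed


def load_check(scenario, users=5, iterations=4):
    """Performance use: the same scenario run by several concurrent workers."""
    total_runs = users * iterations
    with ThreadPoolExecutor(max_workers=users) as pool:
        results = list(pool.map(lambda _: timed(scenario), range(total_runs)))
    return {
        "runs": len(results),
        "failures": sum(1 for passed, _ in results if not passed),
        "max_seconds": max(elapsed for _, elapsed in results),
    }


def health_check(scenario, max_seconds=1.0):
    """Monitoring use: one run that must pass within a time budget."""
    passed, elapsed = timed(scenario)
    return passed and elapsed <= max_seconds


if __name__ == "__main__":
    print("functional:", functional_check(payment_scenario))
    print("load:", load_check(payment_scenario))
    print("health:", health_check(payment_scenario))
```

The point of the design is that only payment_scenario has to change when the application changes. The functional, load, and monitoring views are thin wrappers around it, so they stay in sync automatically.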
The road to continuous performance testing
Over the past 30 years, companies have delayed their performance testing until just before the code was shipped to production. This habit results in a whole series of headaches such as:
late detection of flaws
original developers are no longer around to fix issues
time to market is at risk
performance shortcomings are ignored
Now, everyone agrees we have to shift performance testing to the left and start it earlier in the development process. But does this work as expected? Based on my experience, shift left is still a big upfront investment and requires continuous maintenance and care. I’ve seen it in a recent project. Performance engineers implemented a nice Jenkins-based performance testing gate. They developed API-level performance testing scripts, created the required data sets, and executed these fully automated CI/CD performance pipelines on a daily basis. Frequent code changes broke the testing scripts, and we ended up having to manually check all the scripts every day before executing the CI/CD performance tests. This semi-automated working method was not what we were after, but these problems presented themselves in every CI/CD performance testing project.
How to stop hiccups in continuous performance testing
First of all, we need to fix the scripting problem. Developers can maintain the scripts, but this keeps them away from their main tasks. Assigning additional automation engineers to agile teams will increase the costs as well. Unit tests are still in the hands of our developers, but what about all other automated testing scripts? Wouldn’t it be amazing if they could be created based on a formal specification? Well, there are some promising technological enhancements in the pipeline, but nothing solves the automation puzzle at the moment. At least one automated testing script has to be implemented. You can tag the functional testing scripts to be used for performance testing and let a script converter do the job: it converts them to your favorite load or security testing format, and your build pipeline runs these automatically created scripts.
Secondly, you’ll need a fully automated performance gate. We all hate repetitive tasks, such as manually reviewing the same key performance indicators over and over again. After a while, we get tired and miss warning indicators. A much better option is to agree on a traffic-light approach based on preconfigured thresholds, such as:
Green: Ok, move on
Yellow: Attention, analysis is required
Red: Stop, fix required
This gives clear indicators and removes all doubts. Fully automated performance validation is definitely the way to go in the 21st century. Such baselines can be used for production monitoring as well. Finally, all results must be stored in a time-series database. Having dashboards in place will let you visualize key performance indicators. Service owners can review the important performance indicators, and you can use trending charts to compare the response times for different releases.
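The traffic-light gate described above could be sketched as follows. The KPI names and threshold values are made up for illustration, and treating a missing measurement as red is a design choice of this sketch, not something prescribed by any tool.

```python
from enum import Enum


class Light(Enum):
    GREEN = "Ok, move on"
    YELLOW = "Attention, analysis is required"
    RED = "Stop, fix required"


# Hypothetical thresholds in milliseconds: (yellow_from, red_from) per KPI.
THRESHOLDS = {
    "login_p95_ms": (800, 1500),
    "payment_p95_ms": (1000, 2000),
}


def classify(value, yellow_from, red_from):
    """Map one measured value to a traffic light."""
    if value >= red_from:
        return Light.RED
    if value >= yellow_from:
        return Light.YELLOW
    return Light.GREEN


def evaluate_gate(measurements, thresholds=THRESHOLDS):
    """Return (overall_light, per_kpi_lights) for one test run.

    A KPI without a measurement counts as RED, so a broken test
    cannot slip through the gate silently (an assumption of this sketch).
    """
    per_kpi = {}
    for kpi, (yellow_from, red_from) in thresholds.items():
        value = measurements.get(kpi)
        if value is None:
            per_kpi[kpi] = Light.RED
        else:
            per_kpi[kpi] = classify(value, yellow_from, red_from)

    # The worst individual light decides the overall result.
    lights = list(per_kpi.values())
    if Light.RED in lights:
        overall = Light.RED
    elif Light.YELLOW in lights:
        overall = Light.YELLOW
    else:
        overall = Light.GREEN
    return overall, per_kpi


if __name__ == "__main__":
    run = {"login_p95_ms": 420, "payment_p95_ms": 1250}
    overall, details = evaluate_gate(run)
    print(overall.name, "-", overall.value)
    for kpi, light in details.items():
        print(f"  {kpi}: {light.name}")
```

In a build pipeline, the overall light would typically decide whether the stage passes, for example by failing the job on red. Storing each run's measurements alongside the release identifier is what later makes the trend charts across releases possible.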
I’m hoping that our industry leads will embrace these concepts soon! Happy Performance Engineering!