Performance Testing of Data Virtualization Platforms

Updated: Apr 15, 2022

The only truth is in your data. Businesses are increasingly realizing the significant potential of their distributed data lakes. On the one hand, customer records, product data, and process information are of little value when viewed in isolation. On the other hand, building dedicated data marts often requires high effort while providing too little flexibility.

For several reasons, virtual data access layers are a better solution for such data integration and reporting nightmares:

  1. Access protection. Restrict access to confidential data sets through a single point of control.

  2. Data analysis across different data sources. Run complex queries across your entire data universe (see the query sketch after this list).

  3. Quick time to market. Reduce implementation time to the absolute minimum by using proven technology.

  4. A single access layer. Provide one data access layer for all your teams in the reporting and analytics stream.
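
To make benefit 2 concrete, here is a minimal sketch of what a federated query through a virtualization layer can look like. It assumes the layer exposes a standard ODBC endpoint; the DSN dv_layer and the virtual views crm.customers and erp.orders are hypothetical names, not tied to any specific product.

```python
# Minimal sketch: one SQL statement spanning two physical sources.
# Assumptions: the virtualization layer exposes an ODBC endpoint
# (DSN "dv_layer" is hypothetical) and maps "crm.customers" and
# "erp.orders" as virtual views onto two separate source systems.
import pyodbc

conn = pyodbc.connect("DSN=dv_layer;UID=analyst;PWD=secret")
cursor = conn.cursor()

# The layer federates the join; the client sees a single database.
cursor.execute("""
    SELECT c.customer_id, c.region, SUM(o.amount) AS revenue
    FROM crm.customers AS c
    JOIN erp.orders    AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.region
""")
for row in cursor.fetchall():
    print(row)
conn.close()
```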

We’ve seen the significant benefits of such a virtual data access layer, but what is its performance impact?

Businesses realize that the speed and reliability of mission-critical applications are critical for their day-to-day operations. Customers are no longer satisfied by colorful design alone; the response times of web pages and IT services matter much more nowadays. This performance-aware mindset has reached many decision-makers, who now automatically ask essential questions before deciding on new products, such as:

  1. What is the performance impact?

  2. Is this solution reliable?

  3. What are the hardware requirements?

  4. Which product delivers the best performance?

All of us working in the non-functional testing business agree that we can’t validate such requirements through manual testing or assumptions. A single-user test on different data virtualization solutions will give you a rough idea, but it won’t provide insights into reliability, hardware sizing, or stability. Don’t spend time and money on guesswork. Hire an experienced performance engineer who collects your requirements, creates a meaningful load testing approach, implements it, and provides the necessary insights.

How do you load test a data virtualization layer?

You might think that data virtualization creates high overhead and that customers won’t accept the negative performance impact of such an additional layer. Before I started this project, I had similar concerns, but my performance analysis was an eye-opener.

At a high level, I:

  1. Collected non-functional requirements (NFRs)

  2. Documented the load testing approach

  3. Implemented test scripts (see the harness sketch after this list)

  4. Executed tests

  5. Documented results

  6. Supported tuning
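
As a hedged illustration of steps 3 and 4, below is a minimal load-test harness: it spins up N concurrent virtual users, runs the same query against the virtualization layer, and reports response-time figures. The DSN and query are hypothetical placeholders; a real project would typically use a dedicated load testing tool, but the principle is the same.

```python
# Minimal load-test sketch: N concurrent virtual users each run the
# same query against the virtualization layer; we report the median
# and 95th-percentile response times. The DSN "dv_layer" and QUERY
# are hypothetical placeholders for your real test scripts.
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

import pyodbc

QUERY = "SELECT COUNT(*) FROM crm.customers"  # replace with a real test case

def run_query() -> float:
    """Execute one query and return its response time in seconds."""
    start = time.perf_counter()
    conn = pyodbc.connect("DSN=dv_layer;UID=loadtest;PWD=secret")
    conn.cursor().execute(QUERY).fetchall()
    conn.close()
    return time.perf_counter() - start

def load_test(users: int, iterations: int) -> None:
    """Run users*iterations queries with `users` concurrent workers."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [pool.submit(run_query) for _ in range(users * iterations)]
        timings = sorted(f.result() for f in futures)
    print(f"{users} users: median={statistics.median(timings):.3f}s "
          f"p95={timings[int(len(timings) * 0.95)]:.3f}s")

if __name__ == "__main__":
    load_test(users=10, iterations=20)
```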

Don’t cut down your NFR and load testing conceptualization activities, because they are key to the success of your entire journey. We selected essential test cases that reflect the intended use of the data virtualization suite. After several reviews, we came to a short list of no more than 5 test cases and 10 test scenarios, which allowed us to answer all the critical questions raised by the project lead and the enterprise architect.

Some best practices:

  1. Check NFRs before you start

  2. Align hardware sizing

  3. Monitor all layers

  4. Test with both current and projected future user and data volumes (see the sketch after this list)
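
For the last practice, here is a small sketch of how current and future growth scenarios might be parameterized. The user counts and data volumes are hypothetical examples; yours should come from the NFRs.

```python
# Sketch: step through current and projected future load levels.
# All figures below are hypothetical; take the real ones from your NFRs.
GROWTH_SCENARIOS = {
    "current":        {"users": 50,  "rows_per_source": 1_000_000},
    "growth_1_year":  {"users": 75,  "rows_per_source": 2_000_000},
    "growth_3_years": {"users": 150, "rows_per_source": 5_000_000},
}

for name, profile in GROWTH_SCENARIOS.items():
    print(f"Scenario '{name}': {profile['users']} concurrent users, "
          f"{profile['rows_per_source']:,} rows per source")
    # load_test(users=profile["users"], iterations=20)  # harness from above
```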

The outcome of this load test was that there are vast differences between the available solutions. Data virtualization has an overhead, but the access protection and flexibility it provides outweigh the slight performance degradation.

Takeaway: Performance testing reduces the risk of selecting the wrong product and helps you size your hardware and licenses accordingly. The benefits of load and performance tests far outweigh the corresponding investment.

Keep doing the good things. Happy Performance Testing!
