9. Does pairwise testing really work? Evidence, data, and case studies.
This lesson presents empirical evidence, gathered on multiple real-world projects, that compared the effectiveness of Hexawise-generated tests with manually selected tests. The data shows that Hexawise-generated tests are faster to create than tests written by hand, and that they are also more thorough and more efficient.
IEEE study of 10 real-world projects: 30-50% faster to create tests
IEEE study of 10 real-world projects: more than twice as many defects per tester hour
A 10-project study published in IEEE Computer found that tests generated by Hexawise resulted in testers finding an average of 2.4 times as many defects per tester hour as tests that had been manually selected by experienced testers.
While testers are often surprised that the benefits from packing more coverage into fewer tests can be so large, hundreds of additional projects have shown similar results.
IEEE study of 10 real-world projects: consistently more thorough testing coverage achieved
The same IEEE Computer study also found that using Hexawise-generated tests increased testing thoroughness every time. Testers using Hexawise consistently found more defects. On average, the much smaller number of Hexawise-generated tests found 13% more defects than the larger number of hand-selected tests. To make the comparison "apples to apples", both sets of tests were designed to test (a) the same systems (b) at the same times.
The chart above shows the testing thoroughness of 21 Hexawise tests (in green) compared to the thoroughness of 51 actual tests used by a financial services firm (in blue).
8-project BCBS study: testers using Hexawise created tests in much less time.
Testers at BCBSNC found that, once they put their test inputs into Hexawise, creating tests was much faster than selecting and documenting tests by hand.
For example, one tester, who had recently spent more than one full day putting test cases together by hand, attended her first Hexawise training session shortly afterwards. During that training session, she generated a powerful set of tests with Hexawise in less than an hour. Her Hexawise-generated tests were also more thorough.
8-project BCBS study: Hexawise-generated tests found three times as many defects per tester hour.
In a project involving insurance claims testing, BCBSNC testers had already selected 48 test scripts to execute. Another tester used Hexawise to create a smaller set of 16 tests for the same system that packed as much coverage as possible into each optimized test. Both sets of tests were executed. They revealed the same two defects, but the Hexawise tests took only a third as long to execute.
This result was repeated in 4 other projects: on average, testers needed only one third as many Hexawise-generated tests as manually selected tests to achieve the same level of testing thoroughness.
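The "three times as many defects per tester hour" figure follows directly from the insurance claims numbers above. The short Python sketch below works through that arithmetic. The absolute execution time of 3 hours for the manual set is a hypothetical placeholder (the source only states the ratio), and the function name is illustrative.

```python
from fractions import Fraction


def defects_per_hour(defects, hours):
    """Defects found divided by tester hours spent executing the tests."""
    return Fraction(defects) / Fraction(hours)


# Figures stated in the lesson: 48 manual tests, 16 Hexawise tests,
# and the same two defects found by both sets.
manual_tests, hexawise_tests = 48, 16
defects_found = 2

# ASSUMPTION: the absolute duration is not given in the source.
# Only the ratio (Hexawise took one third as long) is stated.
manual_hours = Fraction(3)
hexawise_hours = manual_hours / 3

manual_rate = defects_per_hour(defects_found, manual_hours)
hexawise_rate = defects_per_hour(defects_found, hexawise_hours)

rate_ratio = hexawise_rate / manual_rate
test_reduction = Fraction(manual_tests - hexawise_tests, manual_tests)

print(f"Defects per hour ratio: {rate_ratio}")          # 3
print(f"Reduction in test count: {float(test_reduction):.0%}")  # 67%
```

Because the ratio does not depend on the placeholder duration, any value for `manual_hours` gives the same threefold result.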
8-project BCBS study: 69% reduction in the number of required tests
Multiple projects confirmed that the hoped-for efficiency benefits from Hexawise were in fact consistently achieved on real-world projects. Testers using Hexawise were able to create small sets of unusually powerful tests. On average, testers only needed one third as many tests to achieve the same level of testing thoroughness.
8-project BCBS study: Hexawise-generated tests were consistently much more thorough.
The charts above show the thoroughness of two different sets of tests designed to test the same system. The coverage of the 58 Hexawise tests (in orange) is far superior to the coverage of the 72 actual tests that had recently been used by BCBSNC's test automation team (in blue). The slope of the Hexawise coverage chart also shows how Hexawise front-loads coverage to find defects as early as possible.
There were 12,088 pairs of values within this system to be tested. The optimized Hexawise tests tested all of them. The original BCBSNC tests, however, had failed to test more than 5,000 of those pairs. In other words, the Hexawise tests, while fewer in number, left more than 5,000 fewer gaps in coverage.
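To make "pairs of values" and "front-loaded coverage" concrete, here is a minimal Python sketch of how pairwise coverage can be counted for any set of tests. It is not Hexawise's implementation. The parameter names, values, and both test sets are hypothetical and far smaller than the BCBSNC system described above.

```python
from itertools import combinations, product


def all_pairs(parameters):
    """Every pair of values drawn from two different parameters."""
    pairs = set()
    for (p1, vals1), (p2, vals2) in combinations(sorted(parameters.items()), 2):
        for v1, v2 in product(vals1, vals2):
            pairs.add(((p1, v1), (p2, v2)))
    return pairs


def pairs_in_test(test):
    """The value pairs that a single test exercises."""
    return set(combinations(sorted(test.items()), 2))


def cumulative_coverage(parameters, tests):
    """Return the total pair count and the pairs covered after each test."""
    target = all_pairs(parameters)
    covered = set()
    curve = []
    for test in tests:
        covered |= pairs_in_test(test) & target
        curve.append(len(covered))
    return len(target), curve


# Hypothetical system: 3 parameters with 2 values each.
parameters = {
    "browser": ["Chrome", "Firefox"],
    "os": ["Windows", "macOS"],
    "plan": ["Basic", "Premium"],
}

# A pairwise-complete set of 4 tests.
optimized = [
    {"browser": "Chrome", "os": "Windows", "plan": "Basic"},
    {"browser": "Chrome", "os": "macOS", "plan": "Premium"},
    {"browser": "Firefox", "os": "Windows", "plan": "Premium"},
    {"browser": "Firefox", "os": "macOS", "plan": "Basic"},
]

# A hand-picked set of 5 tests that varies one thing at a time.
hand_picked = [
    {"browser": "Chrome", "os": "Windows", "plan": "Basic"},
    {"browser": "Chrome", "os": "Windows", "plan": "Premium"},
    {"browser": "Firefox", "os": "Windows", "plan": "Basic"},
    {"browser": "Chrome", "os": "macOS", "plan": "Basic"},
    {"browser": "Firefox", "os": "Windows", "plan": "Premium"},
]

total, optimized_curve = cumulative_coverage(parameters, optimized)
_, hand_picked_curve = cumulative_coverage(parameters, hand_picked)

print(total)              # 12
print(optimized_curve)    # [3, 6, 9, 12]
print(hand_picked_curve)  # [3, 5, 7, 9, 10]
```

In this toy example the four optimized tests cover all 12 pairs, while the five hand-picked tests still miss two of them. The per-test curve is the same kind of data that the coverage charts plot.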
Consistent findings from more than 3,500 testers using Hexawise at our largest client: "Hexawise just plain works."
Our largest client has more than 3,500 testers designing tests with Hexawise. The vast majority of those testers decided to sign up one-at-a-time to use their company's unlimited-use enterprise license of Hexawise. That says a lot. In other words, the number of Hexawise users did not grow from dozens to hundreds to thousands of users because of a top-down mandate that imposed a new tool on testers. The testers had to individually choose to sign up for their Hexawise licenses.
If you ask the hundreds of new testers who sign up each month why they decided to create their Hexawise accounts, do you know what they will tell you? Their most popular answer, by an overwhelming margin, is that they heard good things about Hexawise from other testers. Enthusiastic word-of-mouth recommendations drove global adoption throughout the firm for the obvious reasons that Hexawise works well and is enjoyable to use. More specifically:
- Software testers recommending Hexawise say it really helps them design tests faster, execute fewer, more powerful tests, and remove many tedious, error-prone steps they previously had to do by hand.
- Managers recommend Hexawise to other managers for similar reasons: they see that Hexawise helps get higher-quality products to market in less time. They see fewer, more powerful tests being created faster; they receive more objective and insightful reporting on testing coverage achieved; and they're better able to assess "how much testing is enough?"