This is the second in our series on metrics. We started with a discussion on residual risk (here) and now move on to coverage.
Coverage can be measured in dozens of different ways: functional coverage, requirements coverage, architecture coverage, risk coverage, code coverage, even database coverage. But as testing metrics they are all weak, because they share the same fallacy at their core.
Test coverage metrics ignore the actual quality of what was done. Did we really exercise a feature or piece of code, or did we merely touch it? I could cover a piece of code with one test, showing it as tested, yet there might be a hundred different data combinations relevant to that code, each of which might cause it to fail. Coverage is not guaranteed to answer those questions.
Let’s take the simplest example. We have a line of code that echoes/prints a string to the screen. From a functional-coverage perspective, there is a function described in the specification that requires this to happen. From a code-coverage perspective, it is a line of code that needs to be touched. So if I get that line to execute once, I have definitely satisfied the code metric, and possibly the functional metric too (if it only states a valid output as the success criterion).
But what about a test case with a string 1,000 characters long? In some languages string variables can only hold 255 characters, so that test would fail. Yet the coverage metric states that we have “tested” it (see the SoftEd blog). So this metric has at best a tentative link to evaluating whether something is tested. It tells you that the line was executed, or that the functionality ran at least once successfully, but it tells you nothing valuable about the actual testing (or lack of it) that was done on that line of code or functionality.
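A minimal sketch of this idea, assuming a hypothetical `echo` function with a legacy 255-character limit (the function name and limit are illustrative, not from any real system). One “happy path” test executes the line, so a line-coverage tool would report it as covered, yet a longer input still fails:

```python
MAX_LEN = 255  # assumed legacy limit, as in some older string types

def echo(s: str) -> str:
    # Simulates a print/echo routine backed by a fixed-size string buffer.
    if len(s) > MAX_LEN:
        raise ValueError(f"string exceeds {MAX_LEN} characters")
    return s

# One short test touches the line, so line coverage reports it as "tested"...
assert echo("hello") == "hello"

# ...but a 1,000-character input, equally valid from the spec's point of
# view, fails despite the coverage report:
try:
    echo("x" * 1000)
except ValueError:
    print("long input fails even though the line was 'covered'")
```

The point is not the 255-character limit itself but that a coverage tool counts both runs identically: the line was executed, so it is “covered”, regardless of how many relevant inputs were never tried.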
Someone not familiar with the limits of this metric could deduce something fundamentally wrong. A manager could think that coverage means a sufficient amount of testing has been done. They might make deductions about risk, or calculate expected future effort by extrapolating the speed of coverage. But coverage tells you nothing about what has or has not been tested; it only confirms whether you have ‘touched’ something.
However, that in itself has value! As a tester I can use coverage to tell me whether I have covered the code I was expecting to. It can show which areas of the code have not been tested at all (there will always be some, due to error scenarios that can’t be reproduced in a test environment). Functional coverage might also aid me in planning sessions for Session Based Test Management (SBTM). But using such a metric for reporting, without further specification and explanation, would be wrong. It should not be a metric but another tool testers can use to improve their work, where it fits the context.
What I’d suggest as an alternative is to take the numbers away. Reporting on coverage goes hand in hand with reporting on test progress. So talk about the application and its (new) functionality. Describe where the application is at and what you expect will be covered in the future. This story will inform better than a metric, is far less open to speculation, and highlights the fact that coverage and progress are not static. Metrics cannot convey context and complexity.
This is the 2nd article in a series of five on test metrics. This part was written by Oliver Erlewein.
Previous Article – Residual Risk
Next Article – Defect Density