Code coverage
In computer science, test coverage is a measure used to describe the degree to which the source code of a program is executed when a particular test suite runs. A program with high test coverage, measured as a percentage, has had more of its source code executed during testing, which suggests it has a lower chance of containing undetected software bugs compared to a program with low test coverage. Many different metrics can be used to calculate test coverage; some of the most basic are the percentage of program subroutines and the percentage of program statements called during execution of the test suite.
Test coverage was among the first methods invented for systematic software testing. The first published reference was by Miller and Maloney in Communications of the ACM in 1963.
Coverage criteria
To measure what percentage of code has been exercised by a test suite, one or more coverage criteria are used. Coverage criteria are usually defined as rules or requirements that a test suite needs to satisfy.
Basic coverage criteria
There are a number of coverage criteria, the main ones being:
- Function coverage: has each function in the program been called?
- Statement coverage: has each statement in the program been executed?
- Edge coverage: has every edge in the control-flow graph been executed?
- Branch coverage: has each branch of each control structure been executed? For example, given an if statement, have both the true and false branches been executed? This is a subset of edge coverage.
- Condition coverage: has each Boolean sub-expression evaluated both to true and false?
int foo
Assume this function is a part of some bigger program and this program was run with some test suite.
- Function coverage for this function is satisfied if, during this execution, the function foo was called at least once.
- Statement coverage for this function is satisfied if it was called with arguments for which both if conditions are met, as in this case every line in the function is executed, including the assignment z = x;
- Branch coverage is satisfied by two tests calling foo: in the first case, both if conditions are met and z = x; is executed, while in the second case, the first condition is not satisfied, which prevents executing z = x;
- Condition coverage can be satisfied by two tests calling foo. Both are necessary because in the first case one of the conditions evaluates to true, while in the second it evaluates to false. At the same time, the first case makes the other condition false, while the second makes it true.
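The difference between function, statement and branch coverage can be sketched with a small hand-instrumented function. This is only an illustration: the body of foo below is a hypothetical stand-in, and its two conditions (x > 0 and y > 0), the executed set and the points_reached helper are all assumptions introduced for the example.

```python
# Hypothetical stand-in for foo; the conditions x > 0 and y > 0 are assumed
# for illustration and are not taken from the text.
executed = set()  # labels of the program points reached so far


def foo(x, y):
    executed.add("foo called")
    z = 0
    if x > 0 and y > 0:
        executed.add("branch taken")      # the assignment z = x runs here
        z = x
    else:
        # This else exists only to record the outcome; it adds no behaviour.
        executed.add("branch not taken")
    return z


def points_reached(tests):
    """Run foo on every (x, y) pair and return the labels that were reached."""
    executed.clear()
    for x, y in tests:
        foo(x, y)
    return set(executed)


one_test = points_reached([(1, 1)])
two_tests = points_reached([(1, 1), (0, 1)])
print(sorted(one_test))
print(sorted(two_tests))
```

Under these assumptions a single call in which both conditions hold reaches the assignment, which is enough for function and statement coverage of the original function, but only the two-call suite records both outcomes of the decision.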
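Condition coverage does not necessarily imply branch coverage. For example, consider the following code fragment:
if a and b then
Condition coverage can be satisfied by two tests:
- a=true, b=false
- a=false, b=true
However, this set of tests does not satisfy branch coverage, since neither case meets the if condition.
Fault injection may be necessary to ensure that all conditions and branches of exception handling code have adequate coverage during testing.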
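The gap between condition coverage and branch coverage for a decision of the form a and b can be checked mechanically. The following is a minimal sketch; the helper names condition_coverage and branch_coverage are made up for this example, and each test is represented as an (a, b) pair.

```python
def condition_coverage(tests):
    """True if each of a and b takes both truth values across the tests."""
    return all({t[i] for t in tests} == {True, False} for i in range(2))


def branch_coverage(tests):
    """True if the decision `a and b` evaluates to both True and False."""
    return {a and b for a, b in tests} == {True, False}


tests = [(True, False), (False, True)]
print(condition_coverage(tests))  # each condition was seen true and false
print(branch_coverage(tests))     # the decision itself was never true

# Adding a test in which both conditions hold exercises the true branch too.
extended = tests + [(True, True)]
print(branch_coverage(extended))
```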
Modified condition/decision coverage
A combination of function coverage and branch coverage is sometimes also called decision coverage. This criterion requires that every point of entry and exit in the program has been invoked at least once, and every decision in the program has taken on all possible outcomes at least once. In this context, a decision is a Boolean expression composed of conditions and zero or more Boolean operators. This definition is not the same as branch coverage; however, some use the term decision coverage as a synonym for branch coverage.
Condition/decision coverage requires that both decision and condition coverage be satisfied. However, for safety-critical applications it is often required that modified condition/decision coverage (MC/DC) be satisfied. This criterion extends the condition/decision criterion with the requirement that each condition should affect the decision outcome independently. For example, consider the following code:
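if (a or b) and c then
The condition/decision criteria will be satisfied by the following set of tests:
- a=true, b=true, c=true
- a=false, b=false, c=false
However, this set of tests does not satisfy modified condition/decision coverage, since in the first test the value of b, and in the second test the value of c, does not influence the outcome. The following set of tests is needed to satisfy modified condition/decision coverage:
- a=false, b=true, c=false
- a=false, b=true, c=true
- a=false, b=false, c=true
- a=true, b=false, c=true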
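The independence requirement can be checked with a short script. The sketch below assumes the decision is (a or b) and c and uses the strict "unique cause" reading of the criterion, in which two tests demonstrate a condition's independent effect if they differ only in that condition and produce different outcomes. The function names are hypothetical.

```python
from itertools import combinations


def decision(a, b, c):
    # Assumed decision for this example.
    return (a or b) and c


def independently_shown(tests, index):
    """True if two tests differ only in condition `index` and flip the decision."""
    for t1, t2 in combinations(tests, 2):
        differing = [i for i in range(3) if t1[i] != t2[i]]
        if differing == [index] and decision(*t1) != decision(*t2):
            return True
    return False


def satisfies_mcdc(tests):
    """True if every one of the three conditions has an independence pair."""
    return all(independently_shown(tests, i) for i in range(3))


cd_tests = [(True, True, True), (False, False, False)]
mcdc_tests = [
    (False, True, False),
    (False, True, True),
    (False, False, True),
    (True, False, True),
]
print(satisfies_mcdc(cd_tests))
print(satisfies_mcdc(mcdc_tests))
```

Four tests suffice here for three conditions, compared with the eight tests that exhaustive combination testing would need.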
Multiple condition coverage
This criterion requires that all combinations of conditions inside each decision are tested. For example, the code fragment from the previous section requires eight tests:
- a=false, b=false, c=false
- a=false, b=false, c=true
- a=false, b=true, c=false
- a=false, b=true, c=true
- a=true, b=false, c=false
- a=true, b=false, c=true
- a=true, b=true, c=false
- a=true, b=true, c=true
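The test list above is simply every combination of truth values for the three conditions, so it can be generated rather than written by hand. A minimal sketch, with all_combinations as a hypothetical helper name:

```python
from itertools import product


def all_combinations(n_conditions):
    """Every assignment of False/True to n_conditions conditions."""
    return list(product([False, True], repeat=n_conditions))


combos = all_combinations(3)
for a, b, c in combos:
    print(f"a={a}, b={b}, c={c}")
```

The number of required tests doubles with each additional condition, which is why this criterion quickly becomes expensive for decisions with many conditions.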
Parameter value coverage
The idea of parameter value coverage (PVC) is that all common possible values for a parameter are tested. For example, common values for a string are: 1) null, 2) empty, 3) whitespace, 4) valid string, 5) invalid string, 6) single-byte string, 7) double-byte string. It may also be appropriate to use very long strings. Failure to test each possible parameter value may leave a bug undetected. Testing only one of these could result in 100% code coverage, as each line is covered, but as only one of seven options is tested, PVC is only about 14.3%.
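The percentage in this example is the share of value categories that the tests exercise. A minimal sketch of that arithmetic follows; the category labels and the function name parameter_value_coverage are illustrative assumptions, not part of any standard tool.

```python
# Illustrative labels for the seven common string categories.
STRING_CATEGORIES = [
    "null", "empty", "whitespace", "valid",
    "invalid", "single-byte", "double-byte",
]


def parameter_value_coverage(tested, categories=STRING_CATEGORIES):
    """Percentage of the value categories exercised by at least one test."""
    covered = set(tested) & set(categories)
    return 100.0 * len(covered) / len(categories)


print(round(parameter_value_coverage(["valid"]), 1))
print(parameter_value_coverage(STRING_CATEGORIES))
```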
Other coverage criteria
There are further coverage criteria, which are used less often:
- Linear Code Sequence and Jump (LCSAJ) coverage, a.k.a. JJ-path coverage: has every LCSAJ/JJ-path been executed?
- Path coverage: has every possible route through a given part of the code been executed?
- Entry/exit coverage: has every possible call and return of the function been executed?
- Loop coverage: has every possible loop been executed zero times, once, and more than once?
- State coverage: has each state in a finite-state machine been reached and explored?
- Data-flow coverage: has each variable definition and its usage been reached and explored?
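As an illustration of one of these criteria, loop coverage asks whether a loop body has run zero times, exactly once, and more than once across the test suite. The sketch below counts iterations by hand; the functions total and loop_iteration_classes are hypothetical names used only for this example.

```python
def total(values):
    """Sum the values and report how many times the loop body ran."""
    iterations = 0
    result = 0
    for v in values:
        iterations += 1
        result += v
    return result, iterations


def loop_iteration_classes(run_counts):
    """Map observed iteration counts to the three loop-coverage classes."""
    classes = set()
    for n in run_counts:
        if n == 0:
            classes.add("zero")
        elif n == 1:
            classes.add("once")
        else:
            classes.add("more than once")
    return classes


suite = [[], [5], [1, 2, 3]]
counts = [total(case)[1] for case in suite]
print(sorted(loop_iteration_classes(counts)))
```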
Some standards mandate specific coverage levels. For example, the ECSS-E-ST-40C standard demands 100% statement and decision coverage for two out of four different criticality levels; for the other ones, target coverage values are up to negotiation between supplier and customer.
However, setting specific target values, and in particular 100%, has been criticized by practitioners for various reasons.
Martin Fowler writes: "I would be suspicious of anything like 100% - it would smell of someone writing tests to make the coverage numbers happy, but not thinking about what they are doing".
Some of the coverage criteria above are connected. For instance, path coverage implies decision, statement and entry/exit coverage. Decision coverage implies statement coverage, because every statement is part of a branch.
Full path coverage, of the type described above, is usually impractical or impossible. Any module with a succession of n decisions in it can have up to 2^n paths within it; loop constructs can result in an infinite number of paths. Many paths may also be infeasible, in that there is no input to the program under test that can cause that particular path to be executed. However, a general-purpose algorithm for identifying infeasible paths has been proven to be impossible. Basis path testing is, for instance, a method of achieving complete branch coverage without achieving complete path coverage.
Methods for practical path coverage testing instead attempt to identify classes of code paths that differ only in the number of loop executions, and to achieve "basis path" coverage the tester must cover all the path classes.
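The growth in the number of paths can be made concrete by enumerating them for a straight-line succession of independent two-way decisions. This is a sketch under that simplifying assumption (no loops, no infeasible paths); enumerate_paths is a hypothetical helper.

```python
from itertools import product


def enumerate_paths(n_decisions):
    """Every path through n sequential two-way decisions, one outcome per decision."""
    return list(product(("taken", "skipped"), repeat=n_decisions))


for n in (1, 2, 3, 10):
    # Branch coverage needs only two outcomes per decision; paths grow as 2**n.
    print(n, len(enumerate_paths(n)))
```

Branch coverage of the same code needs each decision to go both ways at least once, which two well-chosen tests can achieve regardless of n, whereas the path count doubles with every added decision.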
In practice
The target software is built with special options or libraries and run under a controlled environment, to map every executed function to the function points in the source code. This allows testing parts of the target software that are rarely or never accessed under normal conditions, and helps provide assurance that the most important conditions have been tested. The resulting output is then analyzed to see what areas of code have not been exercised, and the tests are updated to include these areas as necessary. Combined with other test coverage methods, the aim is to develop a rigorous, yet manageable, set of regression tests.
In implementing test coverage policies within a software development environment, one must consider the following:
- Are there coverage requirements for the end product certification and, if so, what level of test coverage is required? The typical level of rigor progression is as follows: Statement, Branch/Decision, Modified Condition/Decision Coverage, LCSAJ (Linear Code Sequence and Jump)
- Will coverage be measured against tests that verify requirements levied on the system under test?
- Is the object code generated directly traceable to source code statements? Certain certifications require coverage at the assembly level if this is not the case: "Then, additional verification should be performed on the object code to establish the correctness of such generated code sequences" (para-6.4.4.2).
Generally, test coverage tools incur computation and logging in addition to the actual program thereby slowing down the application, so typically this analysis is not done in production. As one might expect, there are classes of software that cannot be feasibly subjected to these coverage tests, though a degree of coverage mapping can be approximated through analysis rather than direct testing.
There are also some sorts of defects which are affected by such tools. In particular, some race conditions or similar real-time-sensitive operations can be masked when run under test environments; conversely, some of these defects may become easier to find as a result of the additional overhead of the testing code.
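The extra computation and logging described above can be seen in a toy line-coverage recorder. The sketch below uses Python's standard sys.settrace hook, which calls a function for every executed line; it only illustrates the principle and is not how any particular coverage tool is implemented. The names trace_lines and classify are hypothetical.

```python
import sys


def trace_lines(func, *args):
    """Run func(*args) and return the line offsets executed inside func.

    Offsets are relative to the `def` line of func (which is offset 0).
    """
    code = func.__code__
    hit = set()

    def tracer(frame, event, arg):
        # Called by the interpreter for traced events; this is the overhead.
        if frame.f_code is code and event == "line":
            hit.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier trace hook
    return hit


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"


print(sorted(trace_lines(classify, 5)))
print(sorted(trace_lines(classify, -1)))
```

Because the hook runs on every executed line, the traced program runs slower than the untraced one, which is the slowdown, and the possible timing disturbance, that the text refers to.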
Statement coverage (C1) and branch or condition coverage (C2) are the criteria most commonly used in practice. With a combination of C1 and C2, it is possible to cover most statements in a code base. Statement coverage also implies function coverage, because a function's statements can only be executed if the function is called; it does not by itself imply loop, path, state or data-flow coverage, which are separate criteria. With C1 and C2, it is possible to achieve nearly 100% code coverage in many software projects.
Usage in industry
Test coverage is one consideration in the safety certification of avionics equipment. The guidelines by which avionics gear is certified by the Federal Aviation Administration (FAA) are documented in DO-178B and DO-178C.
Test coverage is also a requirement in part 6 of the automotive safety standard ISO 26262 Road Vehicles - Functional Safety.