==========================
Source-based Code Coverage
==========================

.. contents::
   :local:

Introduction
============

This document explains how to use clang's source-based code coverage feature.
It's called "source-based" because it operates on AST and preprocessor
information directly. This allows it to generate very precise coverage data.

Clang ships two other code coverage implementations:

* :doc:`SanitizerCoverage` - A low-overhead tool meant for use alongside the
  various sanitizers. It can provide up to edge-level coverage.

* gcov - A GCC-compatible coverage implementation which operates on DebugInfo.
  This is enabled by ``-ftest-coverage`` or ``--coverage``.

From this point onwards "code coverage" will refer to the source-based kind.

The code coverage workflow
==========================

The code coverage workflow consists of three main steps:

* Compiling with coverage enabled.

* Running the instrumented program.

* Creating coverage reports.

The next few sections work through a complete, copy-'n-paste friendly example
based on this program:

.. code-block:: cpp

    % cat <<EOF > foo.cc
    #define BAR(x) ((x) || (x))
    template <typename T> void foo(T x) {
      for (unsigned I = 0; I < 10; ++I) { BAR(I); }
    }
    int main() {
      foo<int>(0);
      foo<float>(0);
      return 0;
    }
    EOF

Compiling with coverage enabled
===============================

To compile code with coverage enabled, pass ``-fprofile-instr-generate
-fcoverage-mapping`` to the compiler:

.. code-block:: console

    # Step 1: Compile with coverage enabled.
    % clang++ -fprofile-instr-generate -fcoverage-mapping foo.cc -o foo

Note that linking together code with and without coverage instrumentation is
supported. Uninstrumented code simply won't be accounted for in reports.

To compile code with Modified Condition/Decision Coverage (MC/DC) enabled,
pass ``-fcoverage-mcdc`` in addition to the clang options specified above.
MC/DC is an advanced form of code coverage most applicable in the embedded
space.

Running the instrumented program
================================

The next step is to run the instrumented program. When the program exits it
will write a **raw profile** to the path specified by the ``LLVM_PROFILE_FILE``
environment variable. If that variable is not set, the profile is written
to ``default.profraw`` in the current directory of the program. If
``LLVM_PROFILE_FILE`` contains a path to a non-existent directory, the missing
directory structure will be created. Additionally, the following special
**pattern strings** are rewritten:

* "%p" expands out to the process ID.

* "%h" expands out to the hostname of the machine running the program.

* "%t" expands out to the value of the ``TMPDIR`` environment variable. On
  Darwin, this is typically set to a temporary scratch directory.

* "%Nm" expands out to the instrumented binary's signature. When this pattern
  is specified, the runtime creates a pool of N raw profiles which are used for
  on-line profile merging. The runtime takes care of selecting a raw profile
  from the pool, locking it, and updating it before the program exits. If N is
  not specified (i.e. the pattern is "%m"), it's assumed that ``N = 1``. The
  merge pool specifier can only occur once per filename pattern.

* "%c" expands out to nothing, but enables a mode in which profile counter
  updates are continuously synced to a file. This means that if the
  instrumented program crashes, or is killed by a signal, perfect coverage
  information can still be recovered. Continuous mode does not support value
  profiling for PGO, and is only supported on Darwin at the moment. Support for
  Linux may be mostly complete but requires testing, and support for Windows
  may require more extensive changes: please get involved if you are interested
  in porting this feature.

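As a rough illustration of the simpler pattern strings listed above, the
sketch below expands "%p" and "%h" in a filename pattern. The
``expand_pattern`` helper is hypothetical and is not the profile runtime's
implementation: it takes the process ID and hostname as parameters and copies
every other pattern (such as "%t", "%Nm" or "%c") through unchanged.

.. code-block:: cpp

    #include <cassert>
    #include <cstddef>
    #include <iostream>
    #include <string>

    // Hypothetical helper (not part of the runtime): expands "%p" and "%h".
    std::string expand_pattern(const std::string &pattern, long pid,
                               const std::string &hostname) {
      std::string out;
      for (std::size_t i = 0; i < pattern.size(); ++i) {
        if (pattern[i] == '%' && i + 1 < pattern.size()) {
          const char spec = pattern[i + 1];
          if (spec == 'p' || spec == 'h') {
            out += (spec == 'p') ? std::to_string(pid) : hostname;
            ++i; // skip the specifier character
            continue;
          }
        }
        out += pattern[i]; // ordinary character or unhandled pattern
      }
      return out;
    }

    int main() {
      const std::string path =
          expand_pattern("foo-%h-%p.profraw", 1234, "builder");
      assert(path == "foo-builder-1234.profraw");
      std::cout << path << "\n";
      return 0;
    }

Including "%p" in the pattern gives each process its own raw profile, which is
why it helps when several instrumented processes run at the same time.
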
.. code-block:: console

    # Step 2: Run the program.
    % LLVM_PROFILE_FILE="foo.profraw" ./foo

Note that continuous mode is also used on Fuchsia where it's the only supported
mode, but the implementation is different. The Darwin and Linux implementation
relies on padding and the ability to map a file over the existing memory
mapping which is generally only available on POSIX systems and isn't suitable
for other platforms.

On Fuchsia, we rely on the ability to relocate counters at runtime using a
level of indirection. On every counter access, we add a bias to the counter
address. This bias is stored in the ``__llvm_profile_counter_bias`` symbol
that's provided by the profile runtime and is initially set to zero, meaning no
relocation. The runtime can map the profile into memory at arbitrary locations,
and set the bias to the offset between the original and the new counter
location, at which point every subsequent counter access will be to the new
location, which allows updating the profile directly akin to the continuous
mode.

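The indirection described above can be pictured with a small, self-contained
model. This is only a sketch under simplifying assumptions, not the code the
compiler or runtime emits: the ``Instrumented`` struct and the ``increment``
and ``relocate`` helpers are hypothetical, and a plain array stands in for the
memory-mapped profile.

.. code-block:: cpp

    #include <cassert>
    #include <cstdint>
    #include <cstring>
    #include <iostream>

    // Hypothetical stand-in for the profile state of an instrumented binary.
    struct Instrumented {
      std::uint64_t counters[2] = {0, 0}; // counters embedded in the binary
      std::intptr_t bias = 0;             // zero means "no relocation"
    };

    // Every counter update first adds the bias to the counter's address.
    void increment(Instrumented &prog, int index) {
      const auto addr =
          reinterpret_cast<std::uintptr_t>(&prog.counters[index]) + prog.bias;
      ++*reinterpret_cast<std::uint64_t *>(addr);
    }

    // Copies the current counts to `target` and redirects later updates to it.
    void relocate(Instrumented &prog, std::uint64_t *target) {
      std::memcpy(target, prog.counters, sizeof(prog.counters));
      prog.bias = reinterpret_cast<std::intptr_t>(target) -
                  reinterpret_cast<std::intptr_t>(prog.counters);
    }

    int main() {
      Instrumented prog;
      std::uint64_t mapped[2] = {0, 0}; // stands in for the mapped profile
      increment(prog, 0);               // updates the original counter
      relocate(prog, mapped);
      increment(prog, 0);               // updates the relocated counter
      increment(prog, 1);
      assert(prog.counters[0] == 1 && mapped[0] == 2 && mapped[1] == 1);
      std::cout << mapped[0] << " " << mapped[1] << "\n";
      return 0;
    }

The model also makes the costs visible: each update performs an extra addition,
and the counters exist twice once they have been relocated.
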
The advantage of this approach is that it doesn't require any special OS
support. The disadvantage is the extra overhead due to additional instructions
required for each counter access (overhead both in terms of binary size and
performance) plus duplication of counters (i.e. one copy in the binary itself
and another copy that's mapped into memory). This implementation can also be
enabled for other platforms by passing the ``-runtime-counter-relocation``
option to the backend during compilation.

.. code-block:: console

    % clang++ -fprofile-instr-generate -fcoverage-mapping -mllvm -runtime-counter-relocation foo.cc -o foo

For a program such as the `Lit <https://llvm.org/docs/CommandGuide/lit.html>`_
testing tool which invokes other programs, it may be necessary to set
``LLVM_PROFILE_FILE`` for each invocation. The pattern strings "%p" or "%Nm"
may help to avoid corruption due to concurrency. Note that "%p" is also a Lit
token and needs to be escaped as "%%p".

Creating coverage reports
=========================

Raw profiles have to be **indexed** before they can be used to generate
coverage reports. This is done using the "merge" tool in ``llvm-profdata``
(which can combine multiple raw profiles and index them at the same time):

.. code-block:: console

    # Step 3(a): Index the raw profile.
    % llvm-profdata merge -sparse foo.profraw -o foo.profdata

For an example of merging multiple profiles created by testing,
see the LLVM `coverage build script <https://github.com/llvm/llvm-zorg/blob/main/zorg/jenkins/jobs/jobs/llvm-coverage>`_.

There are several ways to render coverage reports. The simplest option is to
generate a line-oriented report:

.. code-block:: console

    # Step 3(b): Create a line-oriented coverage report.
    % llvm-cov show ./foo -instr-profile=foo.profdata

This report includes a summary view as well as dedicated sub-views for
templated functions and their instantiations. For our example program, we get
distinct views for ``foo<int>(...)`` and ``foo<float>(...)``. If
``-show-line-counts-or-regions`` is enabled, ``llvm-cov`` displays sub-line
region counts (even in macro expansions):

.. code-block:: none

        1|     20|#define BAR(x) ((x) || (x))
                                 ^20     ^2
        2|      2|template <typename T> void foo(T x) {
        3|     22|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
                                         ^22     ^20  ^20^20
        4|      2|}
    ------------------
    | void foo<int>(int):
    |      2|      1|template <typename T> void foo(T x) {
    |      3|     11|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
    |                                      ^11     ^10  ^10^10
    |      4|      1|}
    ------------------
    | void foo<float>(float):
    |      2|      1|template <typename T> void foo(T x) {
    |      3|     11|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
    |                                      ^11     ^10  ^10^10
    |      4|      1|}
    ------------------

If ``--show-branches=count`` and ``--show-expansions`` are also enabled, the
sub-views will show detailed branch coverage information in addition to the
region counts:

.. code-block:: none

    ------------------
    | void foo<float>(float):
    |      2|      1|template <typename T> void foo(T x) {
    |      3|     11|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
    |                                      ^11     ^10  ^10^10
    |  ------------------
    |  |  |    1|     10|#define BAR(x) ((x) || (x))
    |  |  |                             ^10     ^1
    |  |  ------------------
    |  |  |  Branch (1:17): [True: 9, False: 1]
    |  |  |  Branch (1:24): [True: 0, False: 1]
    |  |  ------------------
    |  ------------------
    |  |  Branch (3:23): [True: 10, False: 1]
    |  ------------------
    |      4|      1|}
    ------------------

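The branch counts in this sub-view can be checked by hand. The sketch below
replays a single call to ``foo`` and tallies the outcome of each condition,
taking the short-circuit behavior of ``||`` into account. It is an
illustration only and does not use any coverage tooling; the ``BranchCount``
and ``FooBranches`` types and the ``replay_foo`` function are hypothetical
names.

.. code-block:: cpp

    #include <cassert>
    #include <iostream>

    // True/false tallies for one branch condition.
    struct BranchCount {
      unsigned taken_true = 0;
      unsigned taken_false = 0;
      void record(bool outcome) {
        if (outcome)
          ++taken_true;
        else
          ++taken_false;
      }
    };

    struct FooBranches {
      BranchCount loop_cond; // "I < 10"
      BranchCount bar_lhs;   // first "(x)" in BAR
      BranchCount bar_rhs;   // second "(x)" in BAR
    };

    // Replays one call to foo(), recording every branch outcome.
    FooBranches replay_foo() {
      FooBranches b;
      for (unsigned I = 0;; ++I) {
        const bool keep_going = I < 10;
        b.loop_cond.record(keep_going);
        if (!keep_going)
          break;
        const bool lhs = I != 0;
        b.bar_lhs.record(lhs);
        if (!lhs)
          b.bar_rhs.record(I != 0); // only evaluated when lhs is false
      }
      return b;
    }

    int main() {
      const FooBranches b = replay_foo();
      assert(b.bar_rhs.taken_true == 0 && b.bar_rhs.taken_false == 1);
      std::cout << "loop:    " << b.loop_cond.taken_true << "/"
                << b.loop_cond.taken_false << "\n"
                << "BAR lhs: " << b.bar_lhs.taken_true << "/"
                << b.bar_lhs.taken_false << "\n"
                << "BAR rhs: " << b.bar_rhs.taken_true << "/"
                << b.bar_rhs.taken_false << "\n";
      return 0;
    }

The second condition of ``BAR`` is evaluated only when ``I`` is zero, so its
"true" outcome is never taken in either instantiation.
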
If the application was instrumented for Modified Condition/Decision Coverage
(MC/DC) using the clang option ``-fcoverage-mcdc``, an MC/DC sub-view can be
enabled using ``--show-mcdc``. It shows detailed MC/DC information for each
complex boolean expression (one with multiple conditions) containing at most
six conditions.

To generate a file-level summary of coverage statistics instead of a
line-oriented report, try:

.. code-block:: console

    # Step 3(c): Create a coverage summary.
    % llvm-cov report ./foo -instr-profile=foo.profdata
    Filename                     Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    /tmp/foo.cc                       13                 0   100.00%           3                 0   100.00%          13                 0   100.00%          12                 2    83.33%
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    TOTAL                             13                 0   100.00%           3                 0   100.00%          13                 0   100.00%          12                 2    83.33%

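Each "Cover" column in the summary appears to be the share of items that were
not missed. The sketch below recomputes two of the percentages from the totals
shown above. The ``cover_percent`` helper is hypothetical, and returning zero
for an empty set is an arbitrary choice made here, not a statement about what
``llvm-cov`` prints in that case.

.. code-block:: cpp

    #include <cassert>
    #include <cstdio>

    // Percentage of items that were covered (hypothetical helper).
    double cover_percent(unsigned total, unsigned missed) {
      if (total == 0)
        return 0.0; // arbitrary choice for this sketch
      return 100.0 * (total - missed) / total;
    }

    int main() {
      assert(cover_percent(13, 0) == 100.0);
      std::printf("Regions:  %.2f%%\n", cover_percent(13, 0));
      std::printf("Branches: %.2f%%\n", cover_percent(12, 2));
      return 0;
    }

With 12 branch outcomes and 2 missed, this gives the 83.33% shown in the
"Branches" columns of the report.
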
The ``llvm-cov`` tool supports specifying a custom demangler, writing out
reports in a directory structure, and generating HTML reports. For the full
list of options, please refer to the `command guide
<https://llvm.org/docs/CommandGuide/llvm-cov.html>`_.

A few final notes:

* The ``-sparse`` flag is optional but can result in dramatically smaller
  indexed profiles. This option should not be used if the indexed profile will
  be reused for PGO.

* Raw profiles can be discarded after they are indexed. Advanced use of the
  profile runtime library allows an instrumented program to merge profiling
  information directly into an existing raw profile on disk. The details are
  out of scope.

* The ``llvm-profdata`` tool can be used to merge together multiple raw or
  indexed profiles. To combine profiling data from multiple runs of a program,
  try e.g.:

  .. code-block:: console

    % llvm-profdata merge -sparse foo1.profraw foo2.profdata -o foo3.profdata

Exporting coverage data
=======================

Coverage data can be exported into JSON using the ``llvm-cov export``
sub-command. There is a comprehensive reference which defines the structure of
the exported data at a high level in the llvm-cov source code.

Interpreting reports
====================

There are six statistics tracked in a coverage summary:

* Function coverage is the percentage of functions which have been executed at
  least once. A function is considered to be executed if any of its
  instantiations are executed.

* Instantiation coverage is the percentage of function instantiations which
  have been executed at least once. Template functions and static inline
  functions from headers are two kinds of functions which may have multiple
  instantiations. This statistic is hidden by default in reports, but can be
  enabled via the ``-show-instantiation-summary`` option.

* Line coverage is the percentage of code lines which have been executed at
  least once. Only executable lines within function bodies are considered to be
  code lines.

* Region coverage is the percentage of code regions which have been executed at
  least once. A code region may span multiple lines (e.g. in a large function
  body with no control flow). However, it's also possible for a single line to
  contain multiple code regions (e.g. in "return x || y && z").

* Branch coverage is the percentage of "true" and "false" branches that have
  been taken at least once. Each branch is tied to individual conditions in the
  source code that may each evaluate to either "true" or "false". These
  conditions may comprise larger boolean expressions linked by boolean logical
  operators. For example, "x = (y == 2) || (z < 10)" is a boolean expression
  that is comprised of two individual conditions, each of which evaluates to
  either true or false, producing four total branch outcomes.

* Modified Condition/Decision Coverage (MC/DC) is the percentage of individual
  branch conditions that have been shown to independently affect the decision
  outcome of the boolean expression they comprise. This is accomplished using
  the analysis of executed control flow through the expression (i.e. test
  vectors) to show that as a condition's outcome is varied between "true" and
  "false", the decision's outcome also varies between "true" and "false", while
  the outcome of all other conditions is held fixed (or they are masked out as
  unevaluatable, as happens in languages whose logical operators have
  short-circuit semantics). MC/DC builds on top of branch coverage and
  requires that all code blocks and all execution paths have been tested. This
  statistic is hidden by default in reports, but it can be enabled via the
  ``-show-mcdc-summary`` option as long as code was also compiled using the
  clang option ``-fcoverage-mcdc``.

  * Boolean expressions that are only comprised of one condition (and therefore
    have no logical operators) are not included in MC/DC analysis and are
    trivially deducible using branch coverage.

Of these six statistics, function coverage is usually the least granular while
branch coverage (with MC/DC) is the most granular. 100% branch coverage for a
function implies 100% region coverage for a function. The project-wide totals
for each statistic are listed in the summary.

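To make the MC/DC definition above more concrete, the sketch below checks
which conditions of a two-condition "a || b" decision have been shown to
independently affect the outcome, given a set of executed test vectors. It is
a simplified model rather than the analysis ``llvm-cov`` performs: the
``TestVector`` type and the ``evaluate_or`` and ``independently_affects``
functions are hypothetical, and a condition skipped by short-circuiting is
treated as matching any value.

.. code-block:: cpp

    #include <cassert>
    #include <iostream>
    #include <vector>

    // One executed test vector for a two-condition decision. Each condition is
    // 1 (true), 0 (false) or -1 (not evaluated because of short-circuiting).
    struct TestVector {
      int cond[2];
      bool result;
    };

    // Evaluates "a || b" with short-circuit semantics and records the vector.
    TestVector evaluate_or(bool a, bool b) {
      if (a)
        return TestVector{{1, -1}, true};
      return TestVector{{0, b ? 1 : 0}, b};
    }

    // Condition `c` is covered if two executed vectors differ in `c` and in
    // the decision result, while the other condition matches or was skipped.
    bool independently_affects(const std::vector<TestVector> &executed, int c) {
      const int other = 1 - c;
      for (const TestVector &x : executed) {
        for (const TestVector &y : executed) {
          if (x.cond[c] < 0 || y.cond[c] < 0 || x.cond[c] == y.cond[c])
            continue;
          if (x.result == y.result)
            continue;
          if (x.cond[other] < 0 || y.cond[other] < 0 ||
              x.cond[other] == y.cond[other])
            return true;
        }
      }
      return false;
    }

    int main() {
      // BAR(I) in the example program evaluates "(I) || (I)" for I = 0..9.
      std::vector<TestVector> executed;
      for (unsigned I = 0; I < 10; ++I)
        executed.push_back(evaluate_or(I != 0, I != 0));
      assert(independently_affects(executed, 0));
      assert(!independently_affects(executed, 1));
      const int covered = independently_affects(executed, 0) +
                          independently_affects(executed, 1);
      std::cout << "covered conditions: " << covered << " of 2\n";
      return 0;
    }

Under this model, only the first condition of ``BAR`` is covered in the example
program. Covering the second would need a run in which the first condition is
false and the second is true, which cannot happen when both are the same
expression.
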
Format compatibility guarantees
===============================

* There are no backwards or forwards compatibility guarantees for the raw
  profile format. Raw profiles may be dependent on the specific compiler
  revision used to generate them. It's inadvisable to store raw profiles for
  long periods of time.

* Tools must retain **backwards** compatibility with indexed profile formats.
  These formats are not forwards-compatible: i.e., a tool which uses format
  version X will not be able to understand format version (X+k).

* Tools must also retain **backwards** compatibility with the format of the
  coverage mappings emitted into instrumented binaries. These formats are not
  forwards-compatible.

* The JSON coverage export format has a (major, minor, patch) version triple.
  Only a major version increment indicates a backwards-incompatible change. A
  minor version increment is for added functionality, and patch version
  increments are for bugfixes.

Impact of llvm optimizations on coverage reports
================================================

llvm optimizations (such as inlining or CFG simplification) should have no
impact on coverage report quality. This is due to the fact that the mapping
from source regions to profile counters is immutable, and is generated before
the llvm optimizer kicks in. The optimizer can't prove that profile counter
instrumentation is safe to delete (because it's not: it affects the profile the
program emits), and so leaves it alone.

Note that this coverage feature does not rely on information that can degrade
during the course of optimization, such as debug info line tables.

Using the profiling runtime without static initializers
=======================================================

By default the compiler runtime uses a static initializer to determine the
profile output path and to register a writer function. To collect profiles
without using static initializers, do this manually:

* Export an ``int __llvm_profile_runtime`` symbol from each instrumented shared
  library and executable. When the linker finds a definition of this symbol, it
  knows to skip loading the object which contains the profiling runtime's
  static initializer.

* Forward-declare ``void __llvm_profile_initialize_file(void)`` and call it
  once from each instrumented executable. This function parses
  ``LLVM_PROFILE_FILE``, sets the output path, and truncates any existing files
  at that path. To get the same behavior without truncating existing files,
  pass a filename pattern string to ``void __llvm_profile_set_filename(char
  *)``. These calls can be placed anywhere so long as they precede all calls
  to ``__llvm_profile_write_file``.

* Forward-declare ``int __llvm_profile_write_file(void)`` and call it to write
  out a profile. This function returns 0 when it succeeds, and a non-zero value
  otherwise. Calling this function multiple times appends profile data to an
  existing on-disk raw profile.

In C++ files, declare these as ``extern "C"``.

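The steps above might be combined as in the following sketch. The
``COVERAGE_BUILD`` macro is a hypothetical, user-defined switch (passed as
``-DCOVERAGE_BUILD`` alongside the coverage flags); without it the sketch falls
back to stand-ins so that it also builds and runs uninstrumented. The wrapper
names ``init_profile_file``, ``write_profile_file`` and ``write_profile`` are
likewise made up for illustration.

.. code-block:: cpp

    #include <cstdio>

    #ifdef COVERAGE_BUILD // hypothetical macro, defined only when instrumenting
    extern "C" {
    int __llvm_profile_runtime = 0; // keeps the static initializer out
    void __llvm_profile_initialize_file(void);
    int __llvm_profile_write_file(void);
    }
    static void init_profile_file() { __llvm_profile_initialize_file(); }
    static int write_profile_file() { return __llvm_profile_write_file(); }
    #else
    // Stand-ins so that the sketch also builds without instrumentation.
    static void init_profile_file() {}
    static int write_profile_file() { return 0; }
    #endif

    // Sets up the output path, then writes the profile. Returns 0 on success.
    int write_profile() {
      init_profile_file();
      return write_profile_file();
    }

    int main() {
      // ... run the code whose coverage should be recorded ...
      const int status = write_profile();
      if (status != 0)
        std::fprintf(stderr, "failed to write profile\n");
      return status;
    }

Since ``__llvm_profile_initialize_file`` truncates existing files at the output
path, a wrapper like ``write_profile`` is meant to be called once, near the end
of the program.
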
Using the profiling runtime without a filesystem
------------------------------------------------

The profiling runtime also supports freestanding environments that lack a
filesystem. The runtime ships as a static archive that's structured to make
dependencies on a hosted environment optional, depending on what features
the client application uses.

The first step is to export ``__llvm_profile_runtime``, as above, to disable
the default static initializers. Instead of calling the ``*_file()`` APIs
described above, use the following to save the profile directly to a buffer
under your control:

* Forward-declare ``uint64_t __llvm_profile_get_size_for_buffer(void)`` and
  call it to determine the size of the profile. You'll need to allocate a
  buffer of this size.

* Forward-declare ``int __llvm_profile_write_buffer(char *Buffer)`` and call it
  to copy the current counters to ``Buffer``, which is expected to already be
  allocated and big enough for the profile.

* Optionally, forward-declare ``void __llvm_profile_reset_counters(void)`` and
  call it to reset the counters before entering a specific section to be
  profiled. This is only useful if there is some setup that should be excluded
  from the profile.

In C++ files, declare these as ``extern "C"``.

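A buffer-based variant could look like the sketch below. It rests on several
assumptions: ``COVERAGE_BUILD`` is a hypothetical, user-defined macro that is
set only for instrumented builds, the 64 KiB capacity of ``profile_buffer`` is
arbitrary, and ``__llvm_profile_write_buffer`` is assumed to return 0 on
success like ``__llvm_profile_write_file``. Without the macro, stand-ins are
used so that the sketch builds and runs anywhere.

.. code-block:: cpp

    #include <stddef.h>
    #include <stdint.h>

    #ifdef COVERAGE_BUILD // hypothetical macro, defined only when instrumenting
    extern "C" {
    int __llvm_profile_runtime = 0; // disables the default static initializer
    uint64_t __llvm_profile_get_size_for_buffer(void);
    int __llvm_profile_write_buffer(char *Buffer);
    }
    static uint64_t profile_size() {
      return __llvm_profile_get_size_for_buffer();
    }
    static int profile_copy(char *buffer) {
      return __llvm_profile_write_buffer(buffer);
    }
    #else
    // Stand-ins so that the sketch also builds without instrumentation.
    static uint64_t profile_size() { return 4; }
    static int profile_copy(char *buffer) {
      for (int i = 0; i < 4; ++i)
        buffer[i] = 0;
      return 0;
    }
    #endif

    static char profile_buffer[64 * 1024]; // hypothetical fixed capacity

    // Copies the current profile into profile_buffer. Returns the number of
    // bytes written, or 0 if the profile does not fit or the copy fails.
    size_t snapshot_profile() {
      const uint64_t size = profile_size();
      if (size > sizeof(profile_buffer))
        return 0;
      if (profile_copy(profile_buffer) != 0)
        return 0;
      return static_cast<size_t>(size);
    }

    int main() { return snapshot_profile() == 0 ? 1 : 0; }

How the bytes leave the device (for example over a serial link) is up to the
client application; once on a host, they can be saved as a raw profile file.
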
Collecting coverage reports for the llvm project
================================================

To prepare a coverage report for llvm (and any of its sub-projects), add
``-DLLVM_BUILD_INSTRUMENTED_COVERAGE=On`` to the cmake configuration. Raw
profiles will be written to ``$BUILD_DIR/profiles/``. To prepare an html
report, run ``llvm/utils/prepare-code-coverage-artifact.py``.

To specify an alternate directory for raw profiles, use
``-DLLVM_PROFILE_DATA_DIR``. To change the size of the profile merge pool, use
``-DLLVM_PROFILE_MERGE_POOL_SIZE``.

Drawbacks and limitations
=========================

* Prior to version 2.26, the GNU binutils BFD linker is not able to link
  programs compiled with ``-fcoverage-mapping`` in its ``--gc-sections`` mode.
  Possible workarounds include disabling ``--gc-sections``, upgrading to a
  newer version of BFD, or using the Gold linker.

* Code coverage does not handle unpredictable changes in control flow or stack
  unwinding in the presence of exceptions precisely. Consider the following
  function:

  .. code-block:: cpp

    int f() {
      may_throw();
      return 0;
    }

  If the call to ``may_throw()`` propagates an exception into ``f``, the code
  coverage tool may mark the ``return`` statement as executed even though it is
  not. A call to ``longjmp()`` can have similar effects.

| 447 | Clang implementation details |
| 448 | ============================ |
| 449 | |
| 450 | This section may be of interest to those wishing to understand or improve |
| 451 | the clang code coverage implementation. |
| 452 | |
| 453 | Gap regions |
| 454 | ----------- |
| 455 | |
| 456 | Gap regions are source regions with counts. A reporting tool cannot set a line |
| 457 | execution count to the count from a gap region unless that region is the only |
| 458 | one on a line. |
| 459 | |
| 460 | Gap regions are used to eliminate unnatural artifacts in coverage reports, such |
| 461 | as red "unexecuted" highlights present at the end of an otherwise covered line, |
| 462 | or blue "executed" highlights present at the start of a line that is otherwise |
| 463 | not executed. |

Branch regions
--------------

When viewing branch coverage details in source-based file-level sub-views using
``--show-branches``, it is recommended that users show all macro expansions
(using option ``--show-expansions``) since macros may contain hidden branch
conditions. The coverage summary report will always include these macro-based
boolean expressions in the overall branch coverage count for a function or
source file.
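
A minimal sketch of such a hidden condition follows. The macro ``IN_RANGE`` and
the function ``is_digit`` are invented for this example:

.. code-block:: cpp

    // The macro body holds two conditions joined by &&. Neither is visible
    // at the use site.
    #define IN_RANGE(v, lo, hi) ((v) >= (lo) && (v) <= (hi))

    bool is_digit(char c) {
      if (IN_RANGE(c, '0', '9'))  // Looks like one condition, expands to two.
        return true;
      return false;
    }

Without the macro expansion shown, the two conditions inside ``IN_RANGE`` still
contribute to the branch totals of ``is_digit``, which can make the summary
numbers hard to match against the visible source.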

Branch coverage is not tracked for constant-folded branch conditions, since
no branches are generated for them. In the source-based file-level sub-view,
these conditions are shown as ``[Folded - Ignored]`` so that users are informed
about what happened.
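
The following sketch contrasts a condition that can be folded with one that
cannot. The constant ``kChecksEnabled`` and the function ``checked_half`` are
hypothetical, and the comments state the expected outcome rather than verified
tool output:

.. code-block:: cpp

    // Hypothetical compile-time switch.
    constexpr bool kChecksEnabled = true;

    int checked_half(int x) {
      if (kChecksEnabled) {  // Constant condition: expected to be folded.
        if (x < 0)           // Runtime condition: a normal branch with counts.
          return 0;
      }
      return x / 2;
    }

Only ``x < 0`` depends on a runtime value, so it is the only condition here
that is expected to carry true and false counts.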

Branch coverage is tied directly to branch-generating conditions in the source
code. Users should not see hidden branches that aren't actually tied to the
source code.

MC/DC Instrumentation
---------------------

When instrumenting for Modified Condition/Decision Coverage (MC/DC) using the
clang option ``-fcoverage-mcdc``, there are two hard limits.

The maximum number of terms is limited to 32767, which is practical for
handwritten expressions. To be more restrictive in order to enforce coding rules,
use ``-Xclang -fmcdc-max-conditions=n``. Expressions with more than ``n``
conditions will generate warnings and will be excluded from the MC/DC coverage.

The number of test vectors (the maximum number of possible combinations of
expressions) is limited to 2,147,483,646. At this limit, approximately
256MiB (2^31 bits / 8) is used to record test vectors.

To reduce memory usage, users can limit the maximum number of test vectors per
expression with ``-Xclang -fmcdc-max-test-vectors=m``.
If the number of test vectors resulting from the analysis of an expression
exceeds ``m``, a warning will be issued and the expression will be excluded
from the MC/DC coverage.

The number of test vectors ``m``, for ``n`` terms in an expression, can be
``m <= 2^n`` in the theoretical worst case, but is usually much smaller.
In simple cases, such as expressions that chain a single operator,
``m == n+1``. For example, ``(a && b && c && d && e && f && g)``
requires 8 test vectors.

Expressions such as ``((a0 && b0) || (a1 && b1) || ...)`` can cause the
number of test vectors to increase exponentially.
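
These growth rates can be checked with a small model. The sketch below is a
simplification, not clang's implementation: it assumes each test vector
corresponds to one distinct short-circuit evaluation path and counts the paths
that end in true and in false. All names (``PathCounts``, ``and_chain``,
``or_of_and_pairs``) are invented for this example:

.. code-block:: cpp

    #include <cstdint>

    // Number of short-circuit evaluation paths ending in true and in false.
    struct PathCounts {
      std::uint64_t true_paths;
      std::uint64_t false_paths;
      std::uint64_t total() const { return true_paths + false_paths; }
    };

    // A single condition has one true path and one false path.
    PathCounts condition() { return {1, 1}; }

    // lhs && rhs: rhs is evaluated only on the true paths of lhs.
    PathCounts logical_and(PathCounts lhs, PathCounts rhs) {
      return {lhs.true_paths * rhs.true_paths,
              lhs.false_paths + lhs.true_paths * rhs.false_paths};
    }

    // lhs || rhs: rhs is evaluated only on the false paths of lhs.
    PathCounts logical_or(PathCounts lhs, PathCounts rhs) {
      return {lhs.true_paths + lhs.false_paths * rhs.true_paths,
              lhs.false_paths * rhs.false_paths};
    }

    // Models (c1 && c2 && ... && cn) for n >= 1.
    PathCounts and_chain(unsigned n) {
      PathCounts result = condition();
      for (unsigned i = 1; i < n; ++i)
        result = logical_and(result, condition());
      return result;
    }

    // Models ((a0 && b0) || (a1 && b1) || ...) with k pairs, k >= 1.
    PathCounts or_of_and_pairs(unsigned k) {
      const PathCounts pair = logical_and(condition(), condition());
      PathCounts result = pair;
      for (unsigned i = 1; i < k; ++i)
        result = logical_or(result, pair);
      return result;
    }

Under this model ``and_chain(7).total()`` is 8, matching the seven-term
example, while ``or_of_and_pairs(k).total()`` is ``2^(k+1) - 1``. Twenty
conditions arranged as 10 pairs therefore already yield 2047 paths.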

A boolean expression that is nested inside another boolean expression, but
separated from it by a non-logical operator, is also not supported.
For example, in ``x = (a && b && c && func(d && f))``, the ``d && f`` case
starts a new boolean expression that is separated from the other conditions by
the operator ``func()``. When this is encountered, a warning will be generated
and the boolean expression will not be instrumented.
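
One possible way to avoid the nesting is to give the inner expression its own
statement. The sketch below assumes that the two resulting expressions are then
treated as separate decisions, which is not confirmed here. ``func`` and the
other names are placeholders. Note that hoisting changes evaluation: the inner
expression is no longer skipped by short-circuiting, which matters if its
operands have side effects:

.. code-block:: cpp

    // Placeholder for any call that takes the result of a boolean expression.
    bool func(bool flag) { return flag; }

    // Nested form: `d && f` sits inside the outer expression, behind func().
    bool nested(bool a, bool b, bool c, bool d, bool f) {
      return a && b && c && func(d && f);
    }

    // Hoisted form: two separate boolean expressions.
    bool hoisted(bool a, bool b, bool c, bool d, bool f) {
      bool inner = d && f;  // Evaluated even when a, b, or c is false.
      return a && b && c && func(inner);
    }

For side-effect-free operands both functions return the same value.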

Switch statements
-----------------

The region mapping for a switch body consists of a gap region that covers the
entire body (starting from the ``{`` in ``switch (...) {``, and terminating
where the last case ends). This gap region has a zero count: this causes "gap"
areas in between case statements, which contain no executable code, to appear
uncovered.

When a switch case is visited, the parent region is extended: if the parent
region has no start location, its start location becomes the start of the case.
This is used to support switch statements without a ``CompoundStmt`` body, in
which the switch body and the single case share a count.

For switches with ``CompoundStmt`` bodies, a new region is created at the start
of each switch case.
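
The two body forms look like this in source. The function names are
hypothetical, and the comments restate the rules above rather than show actual
mapping output:

.. code-block:: cpp

    // Switch with a CompoundStmt body: a zero-count gap region spans the
    // braces, and each case starts a new region.
    int with_braces(int x) {
      switch (x) {
      case 0:
        return 10;
      case 1:
        return 20;
      }
      return -1;
    }

    // Switch without a CompoundStmt body: the body is the single case
    // itself, so the body and the case share a count.
    int without_braces(int x) {
      switch (x)
      case 0:
        return 10;
      return -1;
    }
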

Branch regions are also generated for each switch case, including the default
case. If there is no explicitly defined default case in the source code, a
branch region is generated to correspond to the implicit default case that is
generated by the compiler. The implicit branch region is tied to the line and
column number of the switch statement condition since no source code for the
implicit case exists.
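
The difference between an explicit and an implicit default case can be
sketched as follows. The function names are made up for this example, and the
comments describe where the branch regions are expected to be attributed:

.. code-block:: cpp

    // Explicit default: one branch region per case label, including
    // `default`.
    int explicit_default(int x) {
      switch (x) {
      case 1:
        return 1;
      default:
        return 0;
      }
    }

    // No default label: the branch region for the implicit default case is
    // tied to the condition `x` in `switch (x)`.
    int implicit_default(int x) {
      int result = 0;
      switch (x) {
      case 1:
        result = 1;
        break;
      }
      return result;
    }

Calling ``implicit_default(2)`` takes the implicit default path, so its count
would be reported against the switch condition rather than a case label.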