Deterministic builds
====================

Chromium's build is deterministic. This means that building Chromium at the
same revision will produce exactly the same binary in two builds, even if
these builds are on different machines, in build directories with different
names, or if one build is a clobber build and the other build is an incremental
build with the full build done at a different revision. This is a project goal,
and we have bots that verify that it's true.

Furthermore, even if a binary is built at two different revisions but none of
the revisions in between logically affect a binary, then builds at those two
revisions should produce exactly the same binary too (imagine a revision that
modifies code in `chrome/` while we're looking at `base_unittests`). This isn't
enforced by bots, and it's currently not always true in Chromium's build -- but
it's true for some binaries at least, and it's supposed to become more true
over time.

Having deterministic builds is important, among other things, so that swarming
can cache test results based on the hash of test inputs.

This document currently describes how to handle failures on the deterministic
bots.

There's also
https://www.chromium.org/developers/testing/isolated-testing/deterministic-builds;
over time all documentation over there will move here.

Handling failures on the deterministic bots
-------------------------------------------

This section describes what to do when `compare_build_artifacts` is failing on
a bot.

The deterministic bots make sure that building the same revision of Chromium
always produces the same output.

To analyze the failing step, it's useful to understand what the step is doing.

There are two types of checks.

1. The full determinism check makes sure that build artifacts are independent
   of the name of the build directory, and that full and incremental builds
   produce the same output. This is done by having bots that have two build
   directories: `out/Release` does incremental builds, and `out/Release.2`
   does full clobber builds. After doing the two builds, the bot checks
   that all built files needed to run tests on swarming are identical in the
   two build directories. The full determinism check is currently used on
   Linux and Windows bots. (`Deterministic Linux (dbg)` has one more check:
   it doesn't use goma for the incremental build, to check that using goma
   doesn't affect built files either.)

2. The simple determinism check does a clobber build in `out/Release`, moves
   this to a different location (`out/Release.1`), then does another clobber
   build in `out/Release`, moves that to another location (`out/Release.2`),
   and then does the same comparison as the full determinism check. Since both
   builds are done at the same path, and since both are clobber builds,
   this doesn't check that the build is independent of the name of the build
   directory, and it doesn't check that incremental and full builds produce
   the same results. This check is used on Android and macOS, but over time
   all platforms should move to the full determinism check.
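
Both checks end in the same kind of comparison: are the files in two build
directories byte-for-byte identical? The following is a minimal sketch of that
idea, not the actual `compare_build_artifacts` implementation. The function
names are hypothetical, and unlike the bots (which only compare files needed to
run tests on swarming) it compares every file it finds:

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def differing_files(first_dir: str, second_dir: str) -> list[str]:
    """List relative paths that differ, or that exist in only one directory.

    Hypothetical helper for illustration; compares all files, not just the
    ones needed to run tests.
    """
    first, second = Path(first_dir), Path(second_dir)
    names = set()
    for root in (first, second):
        names.update(
            p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
        )
    different = []
    for name in sorted(names):
        a, b = first / name, second / name
        if not (a.is_file() and b.is_file()) or file_digest(a) != file_digest(b):
            different.append(name)
    return different
```

Calling `differing_files("out/Release", "out/Release.2")` on two finished
builds would return an empty list if the build were fully deterministic.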

### Understanding `compare_build_artifacts` error output

`compare_build_artifacts` prints a list of all files it compares, followed by
`": None"` for files that have no difference. Files that are different between
the two build directories are followed by `": DIFFERENT(expected)"` or
`": DIFFERENT(unexpected)"`, followed by e.g. `"different size: 195312640 !=
195311616"` if the two files have different size, or by e.g. `"70 out of
5091840 bytes are different (0.00%)"` if they're the same size.

You can ignore lines that say `": None"` or `": DIFFERENT(expected)"`; these
don't turn the step red. `": DIFFERENT(expected)"` is for files that are known
to not yet be deterministic; these are listed in
[`src/tools/determinism/deterministic_build_whitelist.pyl`][1]. If the
deterministic bots turn red, you usually do *not* want to add an entry to this
list, but figure out what introduced the nondeterminism and revert that.

[1]: https://chromium.googlesource.com/chromium/src/+/HEAD/tools/determinism/deterministic_build_whitelist.pyl

If only a few bytes are different, the script prints a diff of the hexdump
of the two files. Most of the time, you can ignore this.
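
To make the per-file messages above concrete, here is a small sketch that
produces messages of the same shape for two in-memory byte strings. The
function name and the exact formatting are assumptions modeled on the example
messages quoted above; the real script may compute or format them differently:

```python
from typing import Optional


def describe_difference(first: bytes, second: bytes) -> Optional[str]:
    """Summarize how two files' contents differ, or return None if identical.

    Hypothetical helper; message wording mirrors the examples in this document.
    """
    if first == second:
        return None
    if len(first) != len(second):
        return f"different size: {len(first)} != {len(second)}"
    # Same size: count the byte positions whose values differ.
    differing = sum(a != b for a, b in zip(first, second))
    percent = 100.0 * differing / len(first)
    return f"{differing} out of {len(first)} bytes are different ({percent:.2f}%)"
```

With two decimal places, a handful of differing bytes in a multi-megabyte file
rounds to `0.00%`, which matches the example message above.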

After this list of filenames, the script prints a summary that looks like

```
Equals: 5454
Expected diffs: 3
Unexpected diffs: 60
Unexpected files with diffs:
```

followed by a list of all files that contained `": DIFFERENT(unexpected)"`.
This is the most interesting part of the output.
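
If you have saved the step's log locally, a short script can pull out the
unexpected files and tally their extensions, which can hint at what they have
in common. This is only a sketch: it assumes each per-file line has the form
`<path>: DIFFERENT(unexpected)...`, and both function names are hypothetical:

```python
from collections import Counter
from pathlib import PurePosixPath

# Marker described in this document for files that turn the step red.
MARKER = ": DIFFERENT(unexpected)"


def unexpected_files(log_text: str) -> list[str]:
    """Return the paths of all log lines that contain the unexpected marker."""
    files = []
    for line in log_text.splitlines():
        name, sep, _ = line.partition(MARKER)
        if sep:  # The marker was found on this line.
            files.append(name.strip())
    return files


def count_by_extension(files: list[str]) -> Counter:
    """Count files per final extension, e.g. '.pdb' for 'foo.dll.pdb'."""
    return Counter(PurePosixPath(f).suffix or "(none)" for f in files)
```

For example, if nearly all unexpected files end in `.pdb`, a change affecting
debug info generation would be a natural first suspect.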

After that, the script tries to compute all build inputs of each file with
a difference, and compares the inputs. For example, if a .exe is different,
this will try to find all .obj files the .exe consists of, and try to compare
these too. Nowadays, the compile step is usually deterministic, so this can
usually be ignored too. Here's an example output:

```
fixed_build_dir C:\b\s\w\ir\cache\builder\src\out\Release exists. will try to use orig dir.
Checking verifier_test_dll_2.dll.pdb difference: (1 deps)
```

### Diagnosing bot redness

Things to do, in increasing order of effort and effectiveness:

- Look at the list of files following `"Unexpected files with diffs:"` and check
  if they have something in common. If the blame list on the first red build
  has a change to that common thing, try reverting it and see if it helps.
  If many, seemingly unrelated files have differences, look for changes to
  the build config (Ctrl-F ".gn") or for toolchain changes (Ctrl-F "clang").

- The deterministic bots try to upload a tar archive to Google Storage.
  Use `gsutil.py ls gs://chrome-determinism` to see available archives,
  and use e.g. `gsutil.py cp gs://chrome-determinism/Windows\
  deterministic/9998/deterministic_build_diffs.tgz .` to copy one archive to
  your workstation. You can then look at the diffs in more detail. See
  https://bugs.chromium.org/p/chromium/issues/detail?id=985285#c6 for an
  example.

- Try to reproduce the problem locally. First, set up two build directories
  with identical args.gn, then do a full build at the last known green
  revision in the first build directory:

  ```
  $ gn clean out/gn
  $ autoninja -C out/gn base_unittests
  ```

  Then, sync to the first bad revision (make sure to also run `gclient sync`
  to update dependencies), do an incremental build in the
  first build directory and a full build in the second build directory, and
  run `compare_build_artifacts.py` to compare the outputs:

  ```
  $ autoninja -C out/gn base_unittests
  $ gn clean out/gn2
  $ autoninja -C out/gn2 base_unittests
  $ tools/determinism/compare_build_artifacts.py \
      --first-build-dir out/gn \
      --second-build-dir out/gn2 \
      --target-platform linux
  ```

  This will hopefully reproduce the error, and then you can binary search
  between good and bad revisions to identify the bad commit.
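
The binary search in the last step can be sketched as follows. This is an
illustration only: `is_deterministic` is a hypothetical callback standing in
for "sync to this revision, build in both directories, and run
`compare_build_artifacts.py`", and the sketch assumes the nondeterminism
reproduces reliably at every revision after the bad commit:

```python
from typing import Callable, Sequence


def first_bad_revision(
    revisions: Sequence[str],
    is_deterministic: Callable[[str], bool],
) -> str:
    """Find the first revision at which the build stops being deterministic.

    `revisions` is ordered oldest to newest; the first entry is assumed to be
    known good and the last entry known bad, so neither is re-tested.
    """
    if len(revisions) < 2:
        raise ValueError("need at least one good and one bad revision")
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_deterministic(revisions[mid]):
            lo = mid
        else:
            hi = mid
    return revisions[hi]
```

Each call to `is_deterministic` costs two builds, so halving the range on
every step matters: a blame list of 64 commits needs about six checks rather
than 64.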

Things *not* to do:

- Don't clobber the deterministic bots. Clobbering a deterministic bot will
  turn it green if build nondeterminism is caused by incremental and full
  clobber builds producing different outputs. However, this is one of the
  things we want these bots to catch, and clobbering them only removes the
  symptom on this one bot -- all CQ bots will still have nondeterministic
  incremental builds, which is (among other things) bad for caching. So while
  clobbering a deterministic bot might make it green, it's papering over issues
  that the deterministic bots are supposed to catch.

- Don't add entries to `src/tools/determinism/deterministic_build_whitelist.pyl`.
  Instead, try to revert commits introducing nondeterminism.