# GPU Pixel Testing With Gold

This page describes various extra details of the Skia Gold service
that the GPU pixel tests use. For information on running the tests locally, see
[this section][local pixel testing]. For common information on triaging,
modification, or general pixel wrangling, see [GPU Pixel Wrangling] or these
sections ([1][pixel debugging], [2][pixel updating]) of the general GPU testing
documentation.

[local pixel testing]: gpu_testing.md#Running-the-pixel-tests-locally
[GPU Pixel Wrangling]: http://go/gpu-pixel-wrangler
[pixel debugging]: gpu_testing.md#Debugging-Pixel-Test-Failures-on-the-GPU-Bots
[pixel updating]: gpu_testing.md#Updating-and-Adding-New-Pixel-Tests-to-the-GPU-Bots

[TOC]

## Skia Gold

[Gold][gold documentation] is an image diff service developed by the Skia team.
It was originally developed solely for Skia's usage and only supported
post-submit tests, but has been picked up by other projects such as Chromium and
PDFium and now supports trybots. Unlike other image diff solutions in Chromium,
comparisons are done in an external service instead of locally on the testing
machine.

[gold documentation]: https://skia.org/dev/testing/skiagold

### Why Gold

Gold has three main advantages over the traditional local image comparison
historically used by Chromium:

1. Triage time can be much lower. Because triaging is handled by an external
   service, new golden images don't need to go through the CQ and wait for
   waterfall bots to pick up the CL. Once an image is triaged in Gold, it
   becomes immediately available for future test runs.
2. Gold supports multiple approved images per test. It is not uncommon for
   tests to produce images that are visually indistinguishable, but differ in
   a handful of pixels by a small RGB value. Fuzzy image diffing can solve this
   problem, but introduces its own set of issues, such as possibly causing a
   test to erroneously pass. Since most tests that exhibit this behavior only
   actually produce 2 or 3 possible valid images, being able to say that any of
   those images is acceptable is simpler and less error-prone.
3. Better image storage. Traditionally, images had to either be included
   directly in the repository or uploaded to a Google Storage bucket and pulled
   in using the image's hash. The former allowed users to easily see which
   images were currently approved, but storing large or numerous binary files
   in git is generally discouraged due to the way git's history works. The
   latter worked around the git issues, but made it much more difficult to
   actually see what was being used, since the only thing the user had to go on
   was a hash. Gold moves the images out of the repository, but provides a GUI
   for easily seeing which images are currently approved for a particular test.

### How It Works

Gold consists of two main parts: the Gold instance/service and the `goldctl`
binary. A Gold instance in turn consists of two parts: a Google Storage bucket
that data is uploaded to and a server running on GCE that ingests the data and
provides a way to triage diffs. `goldctl` simply provides a standardized way
of interacting with Gold - uploading data to the correct place, retrieving
baselines/golden information, etc.

In general, the following order of events occurs when running a Gold-enabled
test:

1. The test produces an image and passes it to `goldctl`, along with some
   information about the hardware and software configuration that the image was
   produced on, the test name, etc.
2. `goldctl` checks whether the hash of the produced image is in the list of
   approved hashes.
    1. If it is, `goldctl` exits with a non-failing return code and nothing else
       happens. At this point, the test is finished.
    2. If it is not, `goldctl` uploads the image and metadata to the storage
       bucket and exits with a failing return code.
3. The server sees the new data in the bucket and ingests it, showing a new
   untriaged image in the GUI.
4. A user approves the new image in the GUI, and the server adds the image's
   hash to the baselines. See the [Waterfall Bots](#Waterfall-Bots) and
   [Trybots](#Trybots) sections for specifics on this.
5. The next time the test is run, the new image is in the baselines, and
   assuming the test produces the same image again, the test passes.

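The order of events above can be sketched in a few lines of Python. This is
only an illustration of the control flow, not `goldctl`'s actual
implementation: the function names, the in-memory `approved` and `bucket`
structures, and the choice of SHA-256 as the image hash are all assumptions
made for the example.

```python
import hashlib


def check_image(image_bytes, test_name, approved, bucket):
    """Hypothetical stand-in for the goldctl check; returns an exit code."""
    # The hash function is an assumption; any stable content hash would do.
    digest = hashlib.sha256(image_bytes).hexdigest()
    if digest in approved.get(test_name, set()):
        # Step 2.1: the hash is already approved, so the test passes.
        return 0
    # Step 2.2: unknown image, so upload it with its metadata and fail.
    bucket.append({"test": test_name, "digest": digest})
    return 1


def triage_positive(test_name, digest, approved):
    """Step 4: a user approves the image, which adds it to the baselines."""
    approved.setdefault(test_name, set()).add(digest)


approved = {}  # Baselines: test name -> set of approved hashes.
bucket = []    # Stand-in for the Google Storage bucket.
image = b"fake image bytes"

first = check_image(image, "FooTest", approved, bucket)   # New image: fails.
triage_positive("FooTest", bucket[0]["digest"], approved)
second = check_image(image, "FooTest", approved, bucket)  # Approved: passes.
print(first, second)
```

The second run passes without uploading anything, which mirrors step 5.
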
While this is the general order of events, there are several differences between
waterfall/CI bots and trybots.

#### Waterfall Bots

Waterfall bots are the simpler of the two bot types. There is only a single
set of baselines to worry about, which is whatever baselines were approved for
a git revision. Additionally, any new images that are produced on waterfalls are
all lumped into the same group of "untriaged images on master", and any images
that are approved from here will immediately be added to the set of baselines
for master.

Since not all waterfall bots have a trybot counterpart that can be relied upon
to catch newly produced images before a CL is committed, it is likely that a
change that produces new goldens on the CQ will end up making some of the
waterfall bots red for a bit, particularly those on chromium.gpu.fyi. They will
remain red until the new images are triaged as positive or the tests stop
producing the untriaged images. So, it is best to keep an eye out for a few
hours after your CL is committed for any new images from the waterfall bots that
need triaging.

#### Trybots

Trybots are a little more complicated when it comes to retrieving and approving
images. First, the set of baselines that are provided when requested by a test
is the union of the master baselines for the current revision and any baselines
that are unique to the CL. For example, if an image with the hash `abcd` is in
the master baselines for `FooTest` and the CL being tested has also approved
an image with the hash `abef` for `FooTest`, then the provided baselines will
contain both `abcd` and `abef` for `FooTest`.

When an image associated with a CL is approved, the approval only applies to
that CL until the CL is merged. Once this happens, any baselines produced by the
CL are automatically merged into the master baselines for whatever git revision
the CL was merged as. In the above example, if the CL was merged as commit
`ffff`, then both `abcd` and `abef` would be approved images on master from
`ffff` onward.

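The `FooTest` example can be modeled with plain sets. The sketch below is
hypothetical (the function names and dictionary layout are not Gold's API); it
only shows the union that a trybot sees and the merge that happens when the CL
lands.

```python
def baselines_for_trybot(master, cl_approved):
    """Union of the master baselines and the baselines unique to the CL."""
    tests = set(master) | set(cl_approved)
    return {t: master.get(t, set()) | cl_approved.get(t, set()) for t in tests}


def merge_cl(master, cl_approved):
    """When the CL lands, its approvals become part of the master baselines."""
    for test, digests in cl_approved.items():
        master.setdefault(test, set()).update(digests)


master = {"FooTest": {"abcd"}}
cl_approved = {"FooTest": {"abef"}}

# While the CL is in review, only its trybots see both hashes.
provided = baselines_for_trybot(master, cl_approved)
before_merge = set(master["FooTest"])

# After the CL is merged, master accepts both hashes as well.
merge_cl(master, cl_approved)
print(sorted(provided["FooTest"]), sorted(before_merge), sorted(master["FooTest"]))
```
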
## Triaging Less Common Failures

### Triaging Images Without A Specific Build

You can see all untriaged images that are currently being produced on
ToT on the [GPU Gold instance's main page][gpu gold instance] and currently
untriaged images for a CL by substituting the Gerrit CL number into
`https://chrome-gold.skia.org/search?issue=[CL Number]&unt=true&master=true`.

[gpu gold instance]: https://chrome-gold.skia.org

It's possible, particularly if a test is regularly producing multiple images,
for an image to be untriaged but not show up on the front page of the Gold
instance (for details, see [this crbug comment][untriaged non tot comment]). To
see all such images, visit [this link][untriaged non tot].

[untriaged non tot comment]: https://bugs.chromium.org/p/skia/issues/detail?id=9189#c4
[untriaged non tot]: https://chrome-gold.skia.org/search?fdiffmax=-1&fref=false&frgbamax=255&frgbamin=0&head=false&include=false&limit=50&master=false&match=name&metric=combined&neg=false&offset=0&pos=false&query=source_type%3Dchrome-gpu&sort=desc&unt=true

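If you look up untriaged images for CLs often, the per-CL search URL from this
section can be filled in with a small helper. This is a convenience sketch: the
function name is made up, and the CL number used below is a placeholder rather
than a real CL.

```python
from urllib.parse import urlencode

GOLD_INSTANCE = "https://chrome-gold.skia.org"


def untriaged_url_for_cl(cl_number):
    """Build the search URL for untriaged images on a Gerrit CL."""
    # Same query parameters as the URL template in the text above.
    query = urlencode({"issue": cl_number, "unt": "true", "master": "true"})
    return f"{GOLD_INSTANCE}/search?{query}"


# 1234567 is a placeholder CL number.
print(untriaged_url_for_cl(1234567))
```
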
### Finding A Failed Build

If for some reason you know that a test run produced a bad image, but do not
have a direct link to the failed build (e.g. you found a bad image using the
untriaged non-ToT link from above), you may want to find the failed Swarming
task to help debug the issue. Gold currently provides a list of CLs that were
under test when a particular image was produced, but does not provide a link to
the build that produced it, so the following workaround can be used.

Assuming the failure is relatively recent (within the past month or so), you
can use the test history view to help find the failed run. To do so, search for
the test name at `https://ci.chromium.org/ui/search?t=TESTS` and look through
the history for the failed build (represented in red). Click on the group of
builds and follow the link for the failing build, from which you can get to the
Swarming task like normal by scrolling to the failed step and clicking on the
link for the failed shard number.

### Triaging A Specific Image

If for some reason an image is not showing up in Gold but you know the hash, you
can manually navigate to the page for it by filling in the correct information
in `https://chrome-gold.skia.org/detail?test=[test_name]&digest=[hash]`.
From there, you should be able to triage it as normal.

If this happens, please also file a bug in [Skia's bug tracker][skia crbug] so
that the root cause can be investigated and fixed. It's likely that you will
be unable to directly edit the owner, CC list, etc., in which case
ping kjlubick@ with a link to the filed bug to help speed up triaging. Include
as much detail as possible, such as links to the failed Swarming task and
the triage link for the problematic image.

[skia crbug]: https://bugs.chromium.org/p/skia

## Inexact Matching

By default, Gold uses exact matching with support for multiple baselines per
test. This works well for most of the GPU tests, but there are a handful of
tests such as `Pixel_CSS3DBlueBox` that are prone to noise, which causes them to
need additional triaging at times.

For cases like this, using inexact matching can help, as it allows a comparison
to pass if there are only minor differences between the produced image and a
known-good image. Images that pass in this way will be automatically approved
in Gold, so there is still a record of exactly what was produced.

To enable this functionality, simply add a `matching_algorithm` field to the
`PixelTestPage` definition for the test (see other uses of this in the file for
concrete examples).

In order to determine which values to use, you can use the script located at
`//content/test/gpu/gold_inexact_matching/determine_gold_inexact_parameters.py`.

More complete documentation can be found in the `--help` output of the script,
but in general:

* Use the `binary_search` optimization algorithm if you only want to vary
  a single parameter, e.g. you only want to use a Sobel filter.
* Use the `local_minima` optimization algorithm if you want to vary multiple
  parameters, such as using fuzzy diffing + a Sobel filter together.
* The default boundaries and weights generally work and give good results, but
  you may need to tune them to better suit your particular test, e.g.
  increasing the maximum number of differing pixels if your image is large.

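To give a feel for what tuning a single parameter by binary search means, here
is a self-contained sketch. It is not the implementation used by
`determine_gold_inexact_parameters.py`; the pixel representation, the function
names, and the `delta_threshold` and `upper_bound` parameters are assumptions
chosen for illustration. It searches for the smallest allowed number of
differing pixels that lets every sample image pass a simple fuzzy comparison.

```python
def count_differing_pixels(image, reference, delta_threshold):
    """Count pixels whose summed per-channel difference exceeds the threshold."""
    return sum(
        1
        for p, q in zip(image, reference)
        if sum(abs(a - b) for a, b in zip(p, q)) > delta_threshold
    )


def smallest_passing_limit(images, reference, delta_threshold, upper_bound):
    """Binary search the smallest pixel limit under which every image passes.

    Returns None if the images do not all pass even at upper_bound.
    """

    def all_pass(limit):
        return all(
            count_differing_pixels(img, reference, delta_threshold) <= limit
            for img in images
        )

    if not all_pass(upper_bound):
        return None
    low, high = 0, upper_bound
    while low < high:
        mid = (low + high) // 2
        if all_pass(mid):
            high = mid     # mid is enough; try a stricter limit.
        else:
            low = mid + 1  # mid is too strict.
    return low


# Images are flat lists of (R, G, B) tuples; all values here are made up.
reference = [(0, 0, 255)] * 8
noisy_a = [(0, 0, 255)] * 6 + [(10, 0, 255), (0, 12, 250)]
noisy_b = [(0, 0, 255)] * 7 + [(0, 0, 200)]

limit = smallest_passing_limit(
    [noisy_a, noisy_b], reference, delta_threshold=5, upper_bound=8
)
print(limit)
```

Binary search is valid here because passing is monotonic in the limit: if a set
of images passes with some limit, it also passes with any larger one. Varying
several parameters at once does not have a single ordered search space like
this, which is where a different strategy such as `local_minima` is needed.
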
## Working On Gold

### Modifying Gold And goldctl

Although uncommon, changes to the Gold service and `goldctl` binary may be
needed. To do so, simply get a checkout of the
[Skia infrastructure repo][skia infra repo] and go through the same steps as
a Chromium CL (`git cl upload`, etc.).

[skia infra repo]: https://skia.googlesource.com/buildbot/

The Gold service code is located in the `//golden/` directory, while `goldctl`
is located in `//gold-client/`. Once your change is merged, you will have to
either contact [email protected] to roll the service version or follow the
steps in [Rolling goldctl](#Rolling-goldctl) to roll the `goldctl` version used
by Chromium.

### Rolling goldctl

`goldctl` is available as a CIPD package and is DEPSed in as part of `gclient
sync`. To update the binary used in Chromium, perform the following steps:

1. (One-time only) get an [infra checkout][infra repo]
1. Run ``infra $ eval `./go/env.py` `` to ensure that the environment in the
   terminal is correct
1. Run `infra $ cd go/src/infra`
1. Run `infra/go/src/infra $ go get go.skia.org/infra`
1. Run `infra/go/src/infra $ go mod tidy`
1. Upload the changelist ([sample CL][sample roll cl])
1. Once the CL is merged, the goldctl autoroller should automatically detect it
   and create Chromium CLs to roll the DEPS version.

[infra repo]: https://chromium.googlesource.com/infra/infra/
[sample roll cl]: https://chromium-review.googlesource.com/c/infra/infra/+/4218809

If you want to make sure that `goldctl` builds after the update before
committing (e.g. to ensure that no extra third party dependencies were added),
run the following after the `go mod tidy` step:

1. `infra/go/src/infra $ rm -f "$GOBIN/goldctl"` to avoid accidentally checking
   a stale binary at the end
1. `infra/go/src/infra $ go install -v go.skia.org/infra/gold-client/cmd/goldctl`
1. `infra/go/src/infra $ "$GOBIN/goldctl"` to ensure that the binary runs