# GPU Bot Details

This page describes in detail how the GPU bots are set up, which files affect
their configuration, and how to both modify their behavior and add new bots.

[TOC]

## Overview of the GPU bots' setup

Chromium's GPU bots, unlike the majority of the project's test machines,
are physical pieces of hardware. When end users run the Chrome browser, they
are almost surely running it on a physical piece of hardware with a real
graphics processor. There are some portions of the code base which simply
cannot be exercised by running the browser in a virtual machine, or on a
software implementation of the underlying graphics libraries. The GPU bots were
developed and deployed in order to cover these code paths, and avoid
regressions that are otherwise inevitable in a project the size of the Chromium
browser.

The GPU bots are used on the [chromium.gpu] and [chromium.gpu.fyi]
waterfalls, and various tryservers, as described in [Using the GPU Bots].

[chromium.gpu]: https://ci.chromium.org/p/chromium/g/chromium.gpu/console
[chromium.gpu.fyi]: https://ci.chromium.org/p/chromium/g/chromium.gpu.fyi/console
[Using the GPU Bots]: gpu_testing.md#Using-the-GPU-Bots

All of the physical hardware for the bots lives in the Swarming pool, and most
of it in the chromium.tests.gpu Swarming pool. The waterfall bots are simply
virtual machines which spawn Swarming tasks with the appropriate tags to get
them to run on the desired GPU and operating system type. So, for example, the
[Win10 x64 Release (NVIDIA)] bot is actually a virtual machine which spawns all
of its jobs with the Swarming parameters:

[Win10 x64 Release (NVIDIA)]: https://ci.chromium.org/p/chromium/builders/ci/Win10%20x64%20Release%20%28NVIDIA%29

```json
{
  "gpu": "10de:2184",
  "os": "Windows-10",
  "pool": "chromium.tests.gpu"
}
```

Since the GPUs in the Swarming pool are mostly homogeneous, this is sufficient
to target the pool of Windows 10-like NVIDIA machines. (There are a few Windows
7-like NVIDIA bots in the pool, which necessitates the OS specifier.)

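As a rough illustration of why these three parameters are enough, the following
Python sketch models dimension matching in a simplified way: a task can land on
a bot only if the bot advertises every value the task requests. The bot names
and the lists of advertised values are hypothetical, and real Swarming
scheduling involves far more than this check.

```python
def bot_matches(bot_dimensions, task_dimensions):
    """Return True if the bot advertises every value the task asks for."""
    return all(
        value in bot_dimensions.get(key, [])
        for key, value in task_dimensions.items()
    )


# Hypothetical bots; names and advertised values are illustrative only.
BOTS = {
    "win10-nvidia-1": {
        "gpu": ["10de", "10de:2184"],
        "os": ["Windows", "Windows-10"],
        "pool": ["chromium.tests.gpu"],
    },
    "win7-nvidia-1": {
        "gpu": ["10de", "10de:2184"],
        "os": ["Windows", "Windows-2008ServerR2-SP1"],
        "pool": ["chromium.tests.gpu"],
    },
}

# The Swarming parameters quoted in the JSON block above.
TASK = {"gpu": "10de:2184", "os": "Windows-10", "pool": "chromium.tests.gpu"}


def eligible_bots(bots, task):
    """Return the sorted names of all bots that can run the task."""
    return sorted(name for name, dims in bots.items() if bot_matches(dims, task))


print(eligible_bots(BOTS, TASK))
```

Dropping the `os` entry from `TASK` makes both hypothetical bots eligible,
which is the situation the OS specifier is there to avoid.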
Details about the bots can be found on [chromium-swarm.appspot.com] and by
using `src/tools/luci-go/swarming`, for example `swarming bots`.
If you are authenticated with @google.com credentials you will be able to make
queries of the bots and see, for example, which GPUs are available.

[chromium-swarm.appspot.com]: https://chromium-swarm.appspot.com/

The waterfall bots run tests on a single GPU type in order to make it easier to
see regressions or flakiness that affect only a certain type of GPU.

The tryservers like `win-rel` which include GPU tests, on the other hand, run
tests on more than one GPU type. As of this writing, the Windows tryservers run
tests on NVIDIA and Intel GPUs; the Mac tryservers run tests on Intel and AMD
GPUs. The way these tryservers' tests are specified is simply by *mirroring*
how one or more waterfall bots work. This is an inherent property of the
[`chromium_trybot` recipe][chromium_trybot.py], which was designed to eliminate
differences in behavior between the tryservers and waterfall bots. Since the
tryservers mirror waterfall bots, if the waterfall bot is working, the
tryserver should almost certainly be working as well.

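The idea of mirroring can be sketched in a few lines of Python: the trybot has
no test list of its own and simply runs whatever its mirrored waterfall bots
run. The data below is hypothetical (the second bot name and both test lists
are invented for illustration), and this is not how the recipe is implemented.

```python
# Hypothetical data: which test suites each waterfall (CI) bot runs.
CI_BOT_TESTS = {
    "Win10 x64 Release (NVIDIA)": ["gl_tests", "angle_end2end_tests"],
    "Example Win10 Intel Bot": ["gl_tests"],
}

# Hypothetical mirror configuration for one tryserver.
TRYBOT_MIRRORS = {
    "win-rel": ["Win10 x64 Release (NVIDIA)", "Example Win10 Intel Bot"],
}


def trybot_test_steps(trybot):
    """List (mirrored CI bot, test suite) pairs the trybot would run."""
    return [
        (ci_bot, suite)
        for ci_bot in TRYBOT_MIRRORS[trybot]
        for suite in CI_BOT_TESTS[ci_bot]
    ]


for ci_bot, suite in trybot_test_steps("win-rel"):
    print(f"{suite} on {ci_bot}")
```

Because the trybot's steps are derived entirely from the waterfall bots, a
change to a waterfall bot's tests shows up on the trybot without further edits.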
[chromium_trybot.py]: https://chromium.googlesource.com/chromium/tools/build/+/main/recipes/recipes/chromium_trybot.py

There are some GPU configurations on the waterfall backed by only one machine,
or a very small number of machines in the Swarming pool. A few examples are:

<!-- XXX: update this list -->
* [Mac Pro Release (AMD)](https://ci.chromium.org/p/chromium/builders/ci/Mac%20Pro%20FYI%20Release%20(AMD))
* [Linux Release (AMD RX 5500 XT)](https://ci.chromium.org/p/chromium/builders/ci/Linux%20FYI%20Release%20(AMD%20RX%205500%20XT))

There are a few reasons to continue to support running tests on a
specific machine: it might be too expensive to deploy the required multiple
copies of said hardware, the hardware pool may have naturally died over time, or
the configuration might not be reliable enough to begin scaling it up.

## Adding a new isolated test to the bots

Adding a new test step to the bots requires that the test run via an isolate.
Isolates describe both the binary and data dependencies of an executable, and
are the underpinning of how the Swarming system works. See the [LUCI]
documentation for background on [Isolates] and [Swarming]. Note that with the
transition towards less Chromium-specific tools, you may see terms such as
"CAS inputs" instead of "isolate". These newer systems are functionally
identical to the older ones from a user's perspective, so the terms can be
safely interchanged.

[LUCI]: https://github.com/luci/luci-py
[Isolates]: https://github.com/luci/luci-py/blob/master/appengine/isolate/doc/README.md
[Swarming]: https://github.com/luci/luci-py/blob/master/appengine/swarming/doc/README.md

### Adding a new isolate

1.  Define your target using the `template("test")` template in
    [`src/testing/test.gni`][testing/test.gni]. See `test("gl_tests")` in
    [`src/gpu/BUILD.gn`][gpu/BUILD.gn] for an example. For a more complex
    example which invokes a series of scripts which finally launches the
    browser, see `telemetry_gpu_integration_test` in
    [`chrome/test/BUILD.gn`][chrome/test/BUILD.gn].
2.  Add an entry to
    [`src/testing/buildbot/gn_isolate_map.pyl`][gn_isolate_map.pyl] that refers
    to your target. Find a similar target to yours in order to determine the
    `type`. The type is referenced in [`src/tools/mb/mb.py`][mb.py].

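To give a feel for the second step, here is a sketch of what an entry might
look like and how it can be sanity-checked. A `.pyl` file is a Python literal,
so `ast.literal_eval` can read it. The target name, the `label` key and the
`type` value below are assumptions made for illustration; copy the real shape
from a similar existing entry in `gn_isolate_map.pyl`.

```python
import ast

# Hypothetical .pyl snippet; the entry name, label and type are invented.
PYL_SNIPPET = """
{
  # Each key names a target; the value describes how to run it.
  "my_gpu_tests": {
    "label": "//gpu:my_gpu_tests",
    "type": "console_test_launcher",
  },
}
"""


def load_isolate_map(text):
    """Parse .pyl text (a Python literal) into a dict."""
    return ast.literal_eval(text.strip())


def missing_keys(entry, required=("label", "type")):
    """Return the keys this sketch expects that the entry lacks."""
    return [key for key in required if key not in entry]


isolate_map = load_isolate_map(PYL_SNIPPET)
for name, entry in isolate_map.items():
    print(name, entry["type"], missing_keys(entry))
```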
[testing/test.gni]: https://chromium.googlesource.com/chromium/src/+/main/testing/test.gni
[gpu/BUILD.gn]: https://chromium.googlesource.com/chromium/src/+/main/gpu/BUILD.gn
[chrome/test/BUILD.gn]: https://chromium.googlesource.com/chromium/src/+/main/chrome/test/BUILD.gn
[gn_isolate_map.pyl]: https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/gn_isolate_map.pyl
[mb.py]: https://chromium.googlesource.com/chromium/src/+/main/tools/mb/mb.py

At this point you can build and upload your isolate to the isolate server.

See [Isolated Testing for SWEs] for the most up-to-date instructions. These
instructions are a copy that shows how to run an isolate that's been uploaded
to the isolate server on your local machine rather than on Swarming.

[Isolated Testing for SWEs]: https://www.chromium.org/developers/testing/isolated-testing/for-swes

If `cd`'d into `src/`:

1.  `./tools/mb/mb.py isolate //out/Release [target name]`
    *   For example: `./tools/mb/mb.py isolate //out/Release angle_end2end_tests`
1.  `./tools/luci-go/isolate batcharchive -cas-instance chromium-swarm out/Release/[target name].isolated.gen.json`
    *   For example: `./tools/luci-go/isolate batcharchive -cas-instance chromium-swarm out/Release/angle_end2end_tests.isolated.gen.json`

See the section below on [isolate server credentials](#Isolate-server-credentials).

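If you run these two commands often, a small helper can assemble them for any
target. The sketch below only builds and prints the command lines quoted above;
the function name and the `out_dir` parameter are hypothetical conveniences,
not part of any Chromium tool.

```python
import shlex


def isolate_commands(target, out_dir="out/Release"):
    """Build the two command lines listed above for the given target."""
    build = ["./tools/mb/mb.py", "isolate", f"//{out_dir}", target]
    archive = [
        "./tools/luci-go/isolate",
        "batcharchive",
        "-cas-instance",
        "chromium-swarm",
        f"{out_dir}/{target}.isolated.gen.json",
    ]
    return [build, archive]


for command in isolate_commands("angle_end2end_tests"):
    # Printing keeps the sketch side-effect free.
    print(shlex.join(command))
```

To actually run them, each list could be passed to
`subprocess.run(command, check=True)` from inside `src/`.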
### Adding your new isolate to the tests that are run on the bots

See [Adding new steps to the GPU bots] for details on this process.

[Adding new steps to the GPU bots]: gpu_testing.md#Adding-new-steps-to-the-GPU-Bots

## Relevant files that control the operation of the GPU bots

In the [`chromium/src`][chromium/src] workspace:

*   [`src/testing/buildbot`][src/testing/buildbot]:
    *   [`chromium.gpu.json`][chromium.gpu.json] and
        [`chromium.gpu.fyi.json`][chromium.gpu.fyi.json] define which steps are
        run on which bots. These files are autogenerated. Don't modify them
        directly!
    *   [`waterfalls.pyl`][waterfalls.pyl],
        [`test_suites.pyl`][test_suites.pyl], [`mixins.pyl`][mixins.pyl],
        [`test_suite_exceptions.pyl`][test_suite_exceptions.pyl], and
        [`buildbot_json_magic_substitutions.py`][buildbot_substitutions.py]
        define the configuration for the autogenerated json files above.
        Run [`generate_buildbot_json.py`][generate_buildbot_json.py] to
        generate the json files after you modify these pyl files.
    *   [`generate_buildbot_json.py`][generate_buildbot_json.py]
        *   The generator script for all the waterfalls, including
            `chromium.gpu.json` and `chromium.gpu.fyi.json`.
        *   See the [README for generate_buildbot_json.py] for documentation
            on this script and the descriptions of the waterfalls and test
            suites.
        *   When modifying this script, don't forget to also run it, to
            regenerate the JSON files. Don't worry; the presubmit step will
            catch this if you forget.
        *   See [Adding new steps to the GPU bots] for more details.
    *   [`gn_isolate_map.pyl`][gn_isolate_map.pyl] defines all of the isolates'
        behavior in the GN build.
*   [`src/tools/mb/mb_config.pyl`][mb_config.pyl]
    *   Defines the GN arguments for all of the bots.
*   [`src/infra/config`][src/infra/config]:
    *   Definitions of how bots are organized on the waterfall,
        how builds are triggered, which VMs or machines are used for the
        builder itself, i.e. for compilation and scheduling swarmed tasks
        on GPU hardware, and underlying build configuration details, including
        which CI bots are mirrored on which trybots. The build config/mirroring
        information was previously in the [`tools/build`][tools/build] repo, but
        all GPU-related configurations have been migrated to be fully src-side.
        See
        [README.md](https://chromium.googlesource.com/chromium/src/+/main/infra/config/README.md)
        in this directory for up-to-date information.

[chromium/src]: https://chromium.googlesource.com/chromium/src/
[src/testing/buildbot]: https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot
[src/infra/config]: https://chromium.googlesource.com/chromium/src/+/main/infra/config
[chromium.gpu.json]: https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/chromium.gpu.json
[chromium.gpu.fyi.json]: https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/chromium.gpu.fyi.json
[gn_isolate_map.pyl]: https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/gn_isolate_map.pyl
[mb_config.pyl]: https://chromium.googlesource.com/chromium/src/+/main/tools/mb/mb_config.pyl
[generate_buildbot_json.py]: https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/generate_buildbot_json.py
[mixins.pyl]: https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/mixins.pyl
[waterfalls.pyl]: https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/waterfalls.pyl
[test_suites.pyl]: https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/test_suites.pyl
[test_suite_exceptions.pyl]: https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/test_suite_exceptions.pyl
[buildbot_substitutions.py]: https://chromium.googlesource.com/chromium/src/+/main/testing/buildbot/buildbot_json_magic_substitutions.py
[tools/build]: https://chromium.googlesource.com/chromium/tools/build/
[README for generate_buildbot_json.py]: ../../testing/buildbot/README.md

In the [`infradata/config`][infradata/config] workspace (Google internal only,
sorry):

*   [`gpu.star`][gpu.star]
    *   Defines a `chromium.tests.gpu` Swarming pool which contains all of the
        specialized hardware, except some hardware shared with Chromium:
        for example, the Windows and Linux NVIDIA
        bots, the Windows AMD bots, and the MacBook Pros with NVIDIA and AMD
        GPUs. New GPU hardware should be added to this pool.
    *   Also defines the GCEs, Mac VMs and Mac machines used for CI builders
        on GPU and GPU.FYI waterfalls and trybots.
*   [`pools.cfg`][pools.cfg]
    *   Defines the Swarming pools for GCEs and Mac VMs used for manually
        triggered trybots.

[infradata/config]: https://chrome-internal.googlesource.com/infradata/config
[gpu.star]: https://chrome-internal.googlesource.com/infradata/config/+/main/configs/chromium-swarm/starlark/bots/chromium/gpu.star
[chromium.star]: https://chrome-internal.googlesource.com/infradata/config/+/main/configs/chromium-swarm/starlark/bots/chromium/chromium.star
[pools.cfg]: https://chrome-internal.googlesource.com/infradata/config/+/main/configs/chromium-swarm/pools.cfg
[main.star]: https://chrome-internal.googlesource.com/infradata/config/+/main/main.star
[vms.cfg]: https://chrome-internal.googlesource.com/infradata/config/+/main/configs/gce-provider/vms.cfg

## Walkthroughs of various maintenance scenarios

This section describes various common scenarios that might arise when
maintaining the GPU bots, and how they'd be addressed.

### How to add a new test or an entire new step to the bots

This is described in [Adding new tests to the GPU bots].

[Adding new tests to the GPU bots]: https://chromium.googlesource.com/chromium/src/+/main/docs/gpu/gpu_testing.md#Adding-New-Tests-to-the-GPU-Bots

### How to set up new virtual machine instances

The tests use virtual machines to build binaries and to trigger tests on
physical hardware. VMs don't run any tests themselves. There are 3 types of
bots:

*   Builders - these bots build test binaries, upload them to storage and trigger
    tester bots (see below). Builds must be done on the same OS on which the
    tests will run, except for Android tests, which are built on Linux.
*   Testers - these bots trigger tests to execute in Swarming and merge results
    from multiple shards. 2-core Linux GCEs are sufficient for this task.
*   Builder/testers - these are the combination of the above and have the same
    OS constraints as builders. All trybots are of this type, while for CI bots
    it is optional.

The process is:

1.  Follow [go/request-chrome-resources](go/request-chrome-resources) to get
    approval for the VMs. Use the `GPU` project resource group.
    See this [example ticket](http://crbug.com/1012805).
    You'll need to determine how many VMs are required, which OSes, how many
    cores and in which swarming pools they will be (see below for different
    scenarios).
    *   If setting up a new GPU hardware pool, some VMs will also be needed
        for manual trybots, usually 2 VMs as of this writing.
    *   Additional action is needed for Mac VMs: the GPU resource owner will
        assign the bug to Labs to deploy them. See this
        [example ticket](http://crbug.com/964355).
1.  Once the GCE resource request is approved / Mac VMs are deployed, the VMs
    need to be added to the right Swarming pools in a CL in the
    [`infradata/config`][infradata/config] (Google internal) workspace.
    1.  GCEs for Windows CI builders and builder/testers should be added to
        the `luci-chromium-gpu-ci-win10-8` group in [`gpu.star`][gpu.star].
    1.  GCEs for Linux and Android CI builders and builder/testers should be
        added to the `luci-chromium-gpu-ci-xenial-8` group in
        [`gpu.star`][gpu.star].
    1.  VMs for Mac CI builders and builder/testers should be added to
        the `builderfull_gpu_ci_bots` group in [`gpu.star`][gpu.star].
        [Example](https://chrome-internal-review.googlesource.com/c/infradata/config/+/1166889).
    1.  GCEs for CI testers for all OSes should be added to
        the `luci-chromium-gpu-ci-xenial-2` group in [`gpu.star`][gpu.star].
        [Example](https://chrome-internal-review.googlesource.com/c/infradata/config/+/2016410).
    1.  GCEs and VMs for CQ and optional CQ GPU trybots should be added to
        a corresponding `gpu_try_bots` group in [`gpu.star`][gpu.star].
        [Example](https://chrome-internal-review.googlesource.com/c/infradata/config/+/1561384).
        These trybots are "builderful", i.e. these GCEs can't be shared among
        different bots. This is done in order to limit the number of concurrent
        builds on these bots (until [crbug.com/949379](crbug.com/949379) is
        fixed) to prevent oversubscribing GPU hardware.
        `win_optional_gpu_tests_rel` is an exception: its GCEs come from
        `luci-chromium-try-win10-*-8` groups in
        [`chromium.star`][chromium.star], see
        [CL](https://chrome-internal-review.googlesource.com/c/infradata/config/+/1708723).
        This can cause oversubscription to Windows GPU hardware; however,
        Chrome Infra insisted on making this bot builderless due to frequent
        interruptions they get from limiting the number of concurrent builds on
        it, see discussion in
        [CL](https://chromium-review.googlesource.com/c/chromium/src/+/1775098).
    1.  GCEs and VMs for manual GPU trybots should be added to a corresponding
        pool in "Manually-triggered GPU trybots" in [`gpu.star`][gpu.star].
        If adding a new pool, it should also be added to
        [`pools.cfg`][pools.cfg].
        [Example](https://chrome-internal-review.googlesource.com/c/infradata/config/+/2433332).
        This is a different mechanism to limit the load on GPU hardware,
        by having a small pool of GCEs which corresponds to some GPU hardware
        resource, and all trybots that target this GPU hardware compete for
        GCEs from this small pool.
    1.  Run [`main.star`][main.star] to regenerate
        `configs/chromium-swarm/bots.cfg` and `configs/gce-provider/vms.cfg`.
        Double-check your work there.
        Note that previously [`vms.cfg`][vms.cfg] had to be edited manually.
        Part of the difficulty was in choosing a zone. This should soon no
        longer be necessary per [crbug.com/942301](http://crbug.com/942301),
        but consult with the Chrome Infra team to find out which of the
        [zones](https://cloud.google.com/compute/docs/regions-zones/) has
        available capacity. This can also be checked on the Viceroy
        [dashboard](https://viceroy.corp.google.com/chrome_infra/Quota/chrome?duration=7d).
    1.  Get this reviewed and landed. This step associates the VM or pool of VMs
        with the bot's name on the waterfall for "builderful" bots or increases
        swarmed pool capacity for "builderless" bots.
        Note: CR+1 is not sticky in this repo, so you'll have to ping for
        re-review after every change, such as a rebase.

### How to add a new tester bot to the chromium.gpu.fyi waterfall

When deploying a new GPU configuration, it should be added to the
chromium.gpu.fyi waterfall first. The chromium.gpu waterfall should be reserved
for those GPUs which are tested on the commit queue. (Some of the bots violate
this rule – namely, the Debug bots – though we should strive to eliminate these
differences.) Once the new configuration is ready to be fully deployed on
tryservers, bots can be added to the chromium.gpu waterfall, and the tryservers
changed to mirror them.

In order to add Release and Debug waterfall bots for a new configuration,
experience has shown that at least 4 physical machines are needed in the
swarming pool. The reason is that the tests all run in parallel on the Swarming
cluster, so the load induced on the swarming bots is higher than it would be
if the tests were run strictly serially.

With these prerequisites, these are the steps to add a new (swarmed) tester bot.
(Actually, a pair of bots -- Release and Debug. If deploying just one or the
other, ignore the other configuration.) These instructions assume that you are
reusing one of the existing builders, like [`GPU FYI Win Builder`][GPU FYI Win
Builder].

1.  Work with the Chrome Infrastructure Labs team to get the (minimum 4)
    physical machines added to the Swarming pool. Use
    [chromium-swarm.appspot.com] or `src/tools/luci-go/swarming bots`
    to determine the PCI IDs of the GPUs in the bots. (These instructions will
    need to be updated for Android bots which don't have PCI buses.)

    1.  Make sure to add these new machines to the chromium.tests.gpu Swarming
        pool by creating a CL against [`gpu.star`][gpu.star] in the
        [`infradata/config`][infradata/config] (Google internal) workspace.
        Configure your Git `user.email` to an @google.com address if necessary.
        Here is one
        [example CL](https://chrome-internal-review.googlesource.com/913528)
        and a
        [second example](https://chrome-internal-review.googlesource.com/1111456).

    1.  Run [`main.star`][main.star] to regenerate
        `configs/chromium-swarm/bots.cfg`. Double-check your work there.

1.  Allocate new virtual machines for the bots as described in [How to set up
    new virtual machine
    instances](#How-to-set-up-new-virtual-machine-instances).

1.  Create a CL in the Chromium workspace which does the following. Here's an
    [example CL](https://chromium-review.googlesource.com/c/chromium/src/+/1752291).
    1.  Adds the new machines to [`waterfalls.pyl`][waterfalls.pyl] directly or
        to [`mixins.pyl`][mixins.pyl], referencing the new mixin in
        [`waterfalls.pyl`][waterfalls.pyl].
        1.  The swarming dimensions are crucial. These must match the GPU and
            OS type of the physical hardware in the Swarming pool. This is what
            causes the VMs to spawn their tests on the correct hardware. Make
            sure to use the chromium.tests.gpu pool, and that the new machines
            were specifically added to that pool.
        1.  Make triply sure that there are no collisions between the new
            hardware you're adding and hardware already in the Swarming pool.
            For example, it used to be the case that all of the Windows NVIDIA
            bots ran the same OS version. Later, the Windows 8 flavor bots were
            added. In order to avoid accidentally running tests on Windows 8
            when Windows 7 was intended, the OS in the swarming dimensions of
            the Win7 bots had to be changed from `win` to
            `Windows-2008ServerR2-SP1` (the Win7-like flavor running in our
            data center). Similarly, the Win8 bots had to have a very precise
            OS description (`Windows-2012ServerR2-SP0`).
        1.  If you're deploying a new bot that's similar to another existing
            configuration, please search around in
            [`test_suite_exceptions.pyl`][test_suite_exceptions.pyl] for
            references to the other bot's name and see if your new bot needs
            to be added to any exclusion lists. For example, some of the tests
            don't run on certain Win bots because of missing OpenGL extensions.
        1.  Run [`generate_buildbot_json.py`][generate_buildbot_json.py] to
            regenerate `src/testing/buildbot/chromium.gpu.fyi.json`.
    1.  Updates [`chromium.gpu.star`][chromium.gpu.star] or
        [`chromium.gpu.fyi.star`][chromium.gpu.fyi.star] and its related
        generated files [`cr-buildbucket.cfg`][cr-buildbucket.cfg],
        [`luci-scheduler.cfg`][luci-scheduler.cfg], and
        [`luci-milo.cfg`][luci-milo.cfg]:
        *   Use the appropriate definition for the type of the bot being added.
            For example, `ci.gpu_fyi_thin_tester()` should be used for all CI
            tester bots on the GPU FYI waterfall.
        *   Make sure to set the `triggered_by` property to the builder which
            triggers the testers (like `'GPU FYI Win Builder'`).
        *   Include a `ci.console_view_entry` for the builder's
            `console_view_entry` argument. Look at the short names and
            categories to try and come up with a reasonable organization.
        *   Make sure to set the `serialize_tests` property to `True` in the
            builder config. This is specified for waterfall bots but not trybots
            and helps avoid overloading the physical hardware. Additionally,
            if the bot is configured as a split builder/tester pair, ensure that
            the tester's builder config matches the parent builder and the
            tester is marked as being triggered by the parent builder.
    1.  Run `main.star` in [`src/infra/config`][src/infra/config] to update the
        generated files. Double-check your work there.
    1.  If you were adding a new builder, you would need to also add the new
        machine to [`src/tools/mb/mb_config.pyl`][mb_config.pyl].

1.  If the number of physical machines for the new bot permits, you should also
    add a manually-triggered trybot at the same time that the CI bot is added.
    This is described in [How to add a new manually-triggered trybot].

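The dimension collision described in the steps above can be made concrete with
a small check. This is a simplified sketch, not Swarming's implementation: the
bot names are hypothetical, each bot is assumed to list its OS values from
least to most specific, and `10de` is used only as the NVIDIA vendor ID.

```python
def matched_os_flavors(bots, task_dimensions):
    """Return the precise OS names of all bots that satisfy the task."""
    flavors = set()
    for dims in bots.values():
        if all(v in dims.get(k, []) for k, v in task_dimensions.items()):
            # Assumption of this sketch: the last value is the most specific.
            flavors.add(dims["os"][-1])
    return flavors


# Hypothetical bots modelled on the Win7 / Win8 example above.
BOTS = {
    "win7-bot": {"gpu": ["10de"], "os": ["win", "Windows-2008ServerR2-SP1"]},
    "win8-bot": {"gpu": ["10de"], "os": ["win", "Windows-2012ServerR2-SP0"]},
}

loose = {"gpu": "10de", "os": "win"}
precise = {"gpu": "10de", "os": "Windows-2008ServerR2-SP1"}

# More than one flavor means the dimensions are ambiguous.
print(sorted(matched_os_flavors(BOTS, loose)))
print(sorted(matched_os_flavors(BOTS, precise)))
```

Running a check like this against the planned dimensions before landing the CL
is one way to confirm that each bot's tasks can only reach the intended
hardware.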
While the above instructions assume that an existing parent builder will be
used, a new one can be set up by doing the same steps, but also adding the
new parent builder at the same time. There are plenty of existing parent
builder/child tester bots that you can use as a reference.

[How to add a new manually-triggered trybot]: https://chromium.googlesource.com/chromium/src/+/main/docs/gpu/gpu_testing_bot_details.md#How-to-add-a-new-manually_triggered-trybot

[chromium.gpu.star]: https://chromium.googlesource.com/chromium/src/+/main/infra/config/subprojects/ci/chromium.gpu.star
[chromium.gpu.fyi.star]: https://chromium.googlesource.com/chromium/src/+/main/infra/config/subprojects/ci/chromium.gpu.fyi.star
[cr-buildbucket.cfg]: https://chromium.googlesource.com/chromium/src/+/main/infra/config/generated/cr-buildbucket.cfg
[luci-scheduler.cfg]: https://chromium.googlesource.com/chromium/src/+/main/infra/config/generated/luci-scheduler.cfg
[luci-milo.cfg]: https://chromium.googlesource.com/chromium/src/+/main/infra/config/generated/luci-milo.cfg
[GPU FYI Win Builder]: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/GPU%20FYI%20Win%20Builder

Yuly Novikov33b7b5892021-10-13 00:17:27424### How to remove an existing bot from the chromium.gpu.fyi waterfall
425
To remove a bot, follow the steps in
[How to add a new tester bot to the chromium.gpu.fyi waterfall](#how-to-add-a-new-tester-bot-to-the-chromium_gpu_fyi-waterfall)
in reverse.
To prevent bot failures during the deletion process, pause the bot on
https://luci-scheduler.appspot.com/.
431
Kenneth Russell3a8e5c022018-05-04 21:14:49432### How to start running tests on a new GPU type on an existing try bot
Kai Ninomiyaa6429fb32018-03-30 01:30:56433
Ben Pastene9cf11392022-11-14 19:36:25434Let's say that you want to cause the `win-rel` try bot to run tests on
435CoolNewGPUType in addition to the types it currently runs (as of this writing
436only NVIDIA). To do this:
Kai Ninomiyaa6429fb32018-03-30 01:30:56437
Yuly Novikov8e92b172020-02-07 17:40:124381. Make sure there is enough hardware capacity using the available tools to
439 report utilization of the Swarming pool.
4401. Deploy Release and Debug testers on the `chromium.gpu` waterfall, following
441 the instructions for the `chromium.gpu.fyi` waterfall above. Make sure
442 the flakiness on the new bots is comparable to existing `chromium.gpu` bots
443 before proceeding.
Brian Sheedy3ee93c52022-11-11 18:29:534441. Create a CL in the [`chromium/src`][chromium/src] workspace that adds the
Ben Pastene9cf11392022-11-14 19:36:25445 new Release tester to `win-rel`'s `mirrors` list. Rerun
Brian Sheedy3ee93c52022-11-11 18:29:53446 `infra/config/main.star`.
Yuly Novikov8e92b172020-02-07 17:40:124471. Once the above CL lands, the commit queue will **immediately** start
Kai Ninomiyaa6429fb32018-03-30 01:30:56448 running tests on the CoolNewGPUType configuration. Be vigilant and make
449 sure that tryjobs are green. If they are red for any reason, revert the CL
450 and figure out offline what went wrong.
451
Kenneth Russell3a8e5c022018-05-04 21:14:49452### How to add a new manually-triggered trybot
453
Manually-triggered trybots are needed for investigating failures on a GPU type
which doesn't have a corresponding CQ trybot (due to lack of GPU resources).
Even for GPU types that have CQ trybots, it is convenient to have
manually-triggered trybots as well, since the CQ trybot often runs on more than
one GPU type, and some test suites which run on the CI bot may be disabled on
the CQ trybot (when the CQ bot has
460[no CI equivalent](https://chromium.googlesource.com/chromium/src/+/main/docs/gpu/gpu_testing_bot_details.md#how-to-add-a-new-try-bot-that-runs-a-subset-of-tests-or-extra-tests)).
Yuly Novikov8e92b172020-02-07 17:40:12461Thus, all CI bots in `chromium.gpu` and `chromium.gpu.fyi` have corresponding
462manually-triggered trybots, except a few which don't have enough hardware
463to support it. A manually-triggered trybot should be added at the same time
464a CI bot is added.
Kenneth Russell3a8e5c022018-05-04 21:14:49465
Here are the steps to set up a new trybot which runs tests on just one
particular GPU type. Suppose we are adding a manually-triggered trybot for the
Win7 NVIDIA GPUs in Release mode. We will call the new bot
`gpu-fyi-try-win7-nvidia-rel-64`.
Kenneth Russell3a8e5c022018-05-04 21:14:49470
1. If a manually-triggered trybot already exists that runs tests on the same
    group of machines (i.e. same GPU, OS and driver), the new trybot will have
    to share the VMs with it. Otherwise, create a new pool of VMs for the new
    hardware and allocate the VMs as described in
    [How to set up new virtual machine instances](#How-to-set-up-new-virtual-machine-instances),
    following the "Manually-triggered GPU trybots" instructions.
Kenneth Russell3a8e5c022018-05-04 21:14:49477
Brian Sheedya7bd47b2020-05-12 01:10:014781. Create a CL in the Chromium workspace which does the following. Here's a
479 [reference CL](https://chromium-review.googlesource.com/c/chromium/src/+/2191276)
Yuly Novikov8e92b172020-02-07 17:40:12480 exemplifying the new "GCE pool per GPU hardware pool" way.
481 1. Updates [`gpu.try.star`][gpu.try.star] and its related generated file
482 [`cr-buildbucket.cfg`][cr-buildbucket.cfg]:
483 * Add the new trybot with the right `builder` define and VMs pool.
484 For `gpu-fyi-try-win7-nvidia-rel-64` this would be
485 `gpu_win_builder()` and `luci.chromium.gpu.win7.nvidia.try`.
Brian Sheedy3ee93c52022-11-11 18:29:53486 * Add the relevant CI bots to the new trybot's `mirrors` list. If the
487 CI tester has a parent builder, the parent should be in the list as
488 well.
Yuly Novikov8e92b172020-02-07 17:40:12489 1. Run `main.star` in [`src/infra/config`][src/infra/config] to update the
490 generated files. Double-check your work there.
    1. Adds the new trybot to [`src/tools/mb/mb_config.pyl`][mb_config.pyl]
        and [`src/tools/mb/mb_config_buckets.pyl`][mb_config_buckets.pyl].
        Use the same mixin as the builder for the CI bot that this trybot
        mirrors. In the case of `gpu-fyi-try-win7-nvidia-rel-64`, the builder
        is `GPU FYI Win x64 Builder` and the mixin is thus
        `gpu_fyi_tests_release_trybot`.
Kenneth Russell3a8e5c022018-05-04 21:14:49496 1. Get this CL reviewed and landed.
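
The rule for the `mirrors` list in the steps above (each mirrored CI tester,
plus its parent builder when it has one) can be sketched in plain Python. This
is only an illustration of the rule, not the Starlark that lives in
`gpu.try.star`. The helper `build_mirrors` and the tester name
`Win7 FYI x64 Release (NVIDIA)` are assumptions made up for the example.

```python
def build_mirrors(testers, parents):
    """Return the mirrors list: each tester preceded by its parent builder.

    `testers` is a list of CI tester names and `parents` maps a tester name
    to its parent builder name. Testers without a parent are simply absent
    from `parents`. Both structures are hypothetical, for illustration only.
    """
    mirrors = []
    for tester in testers:
        parent = parents.get(tester)
        # Add the parent first (if any), then the tester, skipping duplicates
        # so that a parent shared by several testers is listed only once.
        for name in (parent, tester):
            if name and name not in mirrors:
                mirrors.append(name)
    return mirrors


# Hypothetical tester name; the parent builder name is the one used above.
testers = ['Win7 FYI x64 Release (NVIDIA)']
parents = {'Win7 FYI x64 Release (NVIDIA)': 'GPU FYI Win x64 Builder'}
print(build_mirrors(testers, parents))
# ['GPU FYI Win x64 Builder', 'Win7 FYI x64 Release (NVIDIA)']
```
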
497
Kenneth Russellfc566142018-06-26 22:34:15498At this point the new trybot should automatically show up in the
499"Choose tryjobs" pop-up in the Gerrit UI, under the
500`luci.chromium.try` heading, because it was deployed via LUCI. It
501should be possible to send a CL to it.
Kenneth Russell3a8e5c022018-05-04 21:14:49502
Kenneth Russellfc566142018-06-26 22:34:15503(It should not be necessary to modify buildbucket.config as is
504mentioned at the bottom of the "Choose tryjobs" pop-up. Contact the
505chrome-infra team if this doesn't work as expected.)
Kenneth Russell3a8e5c022018-05-04 21:14:49506
John Palmer046f9872021-05-24 01:24:56507[gpu.try.star]: https://chromium.googlesource.com/chromium/src/+/main/infra/config/subprojects/gpu.try.star
508[luci.chromium.try.star]: https://chromium.googlesource.com/chromium/src/+/main/infra/config/consoles/luci.chromium.try.star
509[tryserver.chromium.win.star]: https://chromium.googlesource.com/chromium/src/+/main/infra/config/consoles/tryserver.chromium.win.star
Kenneth Russell3a8e5c022018-05-04 21:14:49510
511
Jamie Madillda894ce2019-04-08 17:19:17512### How to add a new try bot that runs a subset of tests or extra tests
Kenneth Russell3a8e5c022018-05-04 21:14:49513
Several projects (ANGLE, Dawn) run custom tests using the Chromium recipes. They
use try bot configs that run subsets of the Chromium tests or additional slower
tests that can't be run on the main CQ.
Kai Ninomiyaa6429fb32018-03-30 01:30:56517
Brian Sheedy3ee93c52022-11-11 18:29:53518These trybots are a little different because they do not mirror any waterfall
519bots.
Kai Ninomiyaa6429fb32018-03-30 01:30:56520
Let's say the `android_optional_gpu_tests_rel` bot did not exist yet and you
wanted to add it. The process is similar to adding a CI bot, but modifies
slightly different files.
Kai Ninomiyaa6429fb32018-03-30 01:30:56524
Yuly Novikov8e92b172020-02-07 17:40:125251. Allocate new virtual machines for the bots as described in
526 [How to set up new virtual machine instances](#How-to-set-up-new-virtual-machine-instances).
5271. Make sure there is enough hardware capacity using the available tools to
528 report utilization of the Swarming pool.
1. Create a CL in the Chromium workspace that does the following.
530 1. Add your new bot `android_optional_gpu_tests_rel` to the
531 tryserver.chromium.android waterfall in
Yuly Novikov8e92b172020-02-07 17:40:12532 [`waterfalls.pyl`][waterfalls.pyl].
Brian Sheedy3ee93c52022-11-11 18:29:53533 [Here][android_optional_waterfalls_pyl] is an explicit example using the
534 real bot.
Yuly Novikov8e92b172020-02-07 17:40:12535 1. Re-run
536 [`src/testing/buildbot/generate_buildbot_json.py`][generate_buildbot_json.py]
537 to regenerate the JSON files.
Brian Sheedy3ee93c52022-11-11 18:29:53538 1. Add the bot definition to the relevant tryserver `.star` file. In this
539 example, that would be `tryserver.chromium.android.star`.
540 [Here][android_optional_tryserver_star] is an explicit example using the
541 real bot.
Yuly Novikov8e92b172020-02-07 17:40:12542 1. Run `main.star` in [`src/infra/config`][src/infra/config] to update the
543 generated files: [`luci-milo.cfg`][luci-milo.cfg],
544 [`luci-scheduler.cfg`][luci-scheduler.cfg],
545 [`cr-buildbucket.cfg`][cr-buildbucket.cfg]. Double-check your work
546 there.
Yuly Novikov55b23a62020-10-02 18:23:43547 1. Update [`src/tools/mb/mb_config.pyl`][mb_config.pyl]
Brian Sheedy3ee93c52022-11-11 18:29:53548 to include `android_optional_gpu_tests_rel`.
1. After your CL lands, you should be able to find and run
    `android_optional_gpu_tests_rel` on CLs using the "Choose tryjobs" pop-up
    in Gerrit.
Kai Ninomiyaa6429fb32018-03-30 01:30:56551
Brian Sheedy3ee93c52022-11-11 18:29:53552[android_optional_waterfalls_pyl]: https://chromium.googlesource.com/chromium/src/+/024e15a6bb8b2e74ba3a5782831be6a1c11ddf43/testing/buildbot/waterfalls.pyl#6665
553[android_optional_tryserver_star]: https://chromium.googlesource.com/chromium/src/+/b05a42bd3d0e84d55392ae984a69946e56203c71/infra/config/subprojects/chromium/try/tryserver.chromium.android.star#703
Yuly Novikov8e92b172020-02-07 17:40:12554
555
Yuly Novikov3fbea992019-06-28 18:25:42556### How to test and deploy a driver and/or OS update
Kai Ninomiyaa6429fb32018-03-30 01:30:56557
Yuly Novikov3fbea992019-06-28 18:25:42558Let's say that you want to roll out an update to the graphics drivers or the OS
559on one of the configurations like the Linux NVIDIA bots. In order to verify
560that the new driver or OS won't destabilize Chromium's commit queue,
561it's necessary to run the new driver or OS on one of the waterfalls for a day
562or two to make sure the tests are reliably green before rolling out the driver
563or OS update. To do this:
Kai Ninomiyaa6429fb32018-03-30 01:30:56564
Kenneth Russell9618adde2018-05-03 03:16:055651. Make sure that all of the current Swarming jobs for this OS and GPU
Yuly Novikov3fbea992019-06-28 18:25:42566 configuration are targeted at the "stable" version of the driver and the OS
Yuly Novikov8e92b172020-02-07 17:40:12567 in [`waterfalls.pyl`][waterfalls.pyl] and [`mixins.pyl`][mixins.pyl].
Yuly Novikov3fbea992019-06-28 18:25:425681. File a `Build Infrastructure` bug, component `Infra>Labs`, to have ~4 of
569 the physical machines already in the Swarming pool upgraded to the new
570 version of the driver or the OS.
Kenneth Russell9618adde2018-05-03 03:16:055711. If an "experimental" version of this bot doesn't yet exist, follow the
572 instructions above for [How to add a new tester bot to the chromium.gpu.fyi
573 waterfall](#How-to-add-a-new-tester-bot-to-the-chromium_gpu_fyi-waterfall)
Brian Sheedy5936c692021-12-15 23:41:38574 to deploy one. However, you do not need to request additional GCE resources
575 since there should be enough spare capacity in the GPU builderless pool to
Brian Sheedye5afe42b2022-01-05 02:03:09576 handle them. Additionally, ensure that the bot definition in
577 [`chromium.gpu.fyi.star`][chromium.gpu.fyi.star] includes a `list_view`
578 argument specifying `chromium.gpu.experimental`.
5791. If an "experimental" version does already exist, re-add it to its default
580 console in [`chromium.gpu.fyi.star`][chromium.gpu.fyi.star] by uncommenting
581 its `console_view_entry` argument and unpause it in the [luci scheduler].
Yuly Novikov3fbea992019-06-28 18:25:425821. Have this experimental bot target the new version of the driver or the OS
Yuly Novikov8e92b172020-02-07 17:40:12583 in [`waterfalls.pyl`][waterfalls.pyl] and [`mixins.pyl`][mixins.pyl].
584 [Sample CL][sample driver cl].
Kenneth Russell9618adde2018-05-03 03:16:055851. Hopefully, the new machine will pass the pixel tests. If it doesn't, then
Brian Sheedy1cea4d42019-08-12 18:09:49586 it'll be necessary to follow the instructions on
587 [updating Gold baselines (step #4)][updating gold baselines].
Kenneth Russell9618adde2018-05-03 03:16:055881. Watch the new machine for a day or two to make sure it's stable.
Brian Sheedy811cca72020-05-21 21:34:145891. When it is, add the experimental driver/OS to the `_stable` mixin using the
590 swarming OR operator `|`. For example:
Yuly Novikov3fbea992019-06-28 18:25:42591
    ```
    'win10_intel_hd_630_stable': {
      'swarming': {
        'dimensions': {
          'gpu': '8086:5912-26.20.100.7870|8086:5912-26.20.100.8141',
          'os': 'Windows-10',
          'pool': 'chromium.tests.gpu',
        },
      },
    }
    ```
603
Brian Sheedy811cca72020-05-21 21:34:14604 This will cause tests triggered using the `_stable` mixin to run on either
605 the old stable dimension or the experimental/new stable dimension.
606
607 **NOTE** There is a hard cap of 8 combinations in swarming, so you can only
608 use the OR operator in up to 3 dimensions if each dimension only has two
609 options. More than two options per dimension is allowed as long as the total
610 number of combinations is 8 or less.
Kenneth Russell384a1732019-03-16 02:36:026111. After it lands, ask the Chrome Infrastructure Labs team to roll out the
Kenneth Russell9618adde2018-05-03 03:16:05612 driver update across all of the similarly configured bots in the swarming
613 pool.
6141. If necessary, update pixel test expectations and remove the suppressions
Kai Ninomiyaa6429fb32018-03-30 01:30:56615 added above.
Brian Sheedy811cca72020-05-21 21:34:146161. Remove the old driver or OS version from the `_stable` mixin, leaving just
617 the new stable version.
Brian Sheedy5936c692021-12-15 23:41:386181. Clean up the "experimental" version of the bot by pausing it in the
Brian Sheedye5afe42b2022-01-05 02:03:09619 [luci scheduler] and commenting out its `console_view_entry` argument in
620 [`chromium.gpu.fyi.star`][chromium.gpu.fyi.star].
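
The way the `gpu` dimension of the `_stable` mixin changes during the rollout
steps above can be sketched as plain string manipulation. The helpers
`add_option` and `remove_option` are hypothetical names used only for this
illustration. In practice the value is edited by hand in
[`mixins.pyl`][mixins.pyl].

```python
def add_option(value, new_option):
    """Append new_option to an OR-ed dimension value if it is not present."""
    options = value.split('|')
    if new_option not in options:
        options.append(new_option)
    return '|'.join(options)


def remove_option(value, old_option):
    """Drop old_option from an OR-ed dimension value."""
    options = [option for option in value.split('|') if option != old_option]
    if not options:
        raise ValueError('a dimension needs at least one option')
    return '|'.join(options)


old_driver = '8086:5912-26.20.100.7870'
new_driver = '8086:5912-26.20.100.8141'

# While the update is rolling out, tests may run on either driver.
during_rollout = add_option(old_driver, new_driver)
print(during_rollout)
# 8086:5912-26.20.100.7870|8086:5912-26.20.100.8141

# Once every machine has been upgraded, only the new driver remains.
after_rollout = remove_option(during_rollout, old_driver)
print(after_rollout)
# 8086:5912-26.20.100.8141
```
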
Kai Ninomiyaa6429fb32018-03-30 01:30:56621
Kenneth Russell9618adde2018-05-03 03:16:05622Note that we leave the experimental bot in place. We could reclaim it, but it
623seems worthwhile to continuously test the "next" version of graphics drivers as
624well as the current stable ones.
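
The cap of 8 combinations mentioned in the note above is easy to check before
uploading a CL. The following is a minimal sketch, assuming the number of
combinations is the product of the number of `|`-separated options in each
dimension, as the note describes. The function name and the constant are
hypothetical and not part of any Chromium tooling.

```python
from math import prod

# Hard cap on dimension combinations stated in the note above.
SWARMING_MAX_COMBINATIONS = 8


def count_dimension_combinations(dimensions):
    """Return how many concrete dimension sets an OR-ed mixin expands to."""
    return prod(len(value.split('|')) for value in dimensions.values())


dimensions = {
    'gpu': '8086:5912-26.20.100.7870|8086:5912-26.20.100.8141',
    'os': 'Windows-10',
    'pool': 'chromium.tests.gpu',
}
combinations = count_dimension_combinations(dimensions)
print(combinations, combinations <= SWARMING_MAX_COMBINATIONS)
# 2 True
```

Three dimensions with two options each give exactly 8 combinations, which is
still allowed. A fourth OR-ed dimension would give 16 and exceed the cap.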
Kai Ninomiyaa6429fb32018-03-30 01:30:56625
Brian Sheedy1cea4d42019-08-12 18:09:49626[sample driver cl]: https://chromium-review.googlesource.com/c/chromium/src/+/1726875
Brian Sheedy5a4c0a392021-09-22 21:28:35627[updating gold baselines]: http://go/gpu-pixel-wrangler-info#how-to-keep-the-bots-green
Brian Sheedy5936c692021-12-15 23:41:38628[luci scheduler]: https://luci-scheduler.appspot.com/
Kai Ninomiyaa6429fb32018-03-30 01:30:56629
630## Credentials for various servers
631
632Working with the GPU bots requires credentials to various services: the isolate
633server, the swarming server, and cloud storage.
634
635### Isolate server credentials
636
637To upload and download isolates you must first authenticate to the isolate
638server. From a Chromium checkout, run:
639
Takuto Ikuta2d01a492021-06-04 00:28:58640* `./src/tools/luci-go/isolate login`
Kai Ninomiyaa6429fb32018-03-30 01:30:56641
642This will open a web browser to complete the authentication flow. A @google.com
643email address is required in order to properly authenticate.
644
645To test your authentication, find a hash for a recent isolate. Consult the
646instructions on [Running Binaries from the Bots Locally] to find a random hash
Takuto Ikutaf5333252019-11-06 16:07:08647from a target like `gl_tests`. Then run the following:
Kai Ninomiyaa6429fb32018-03-30 01:30:56648
649[Running Binaries from the Bots Locally]: https://www.chromium.org/developers/testing/gpu-testing#TOC-Running-Binaries-from-the-Bots-Locally
650
Kai Ninomiyaa6429fb32018-03-30 01:30:56651### Swarming server credentials
652
653The swarming server uses the same `auth.py` script as the isolate server. You
654will need to authenticate if you want to manually download the results of
655previous swarming jobs, trigger your own jobs, or run `swarming.py reproduce`
656to re-run a remote job on your local workstation. Follow the instructions
657above, replacing the service with `https://chromium-swarm.appspot.com`.
658
659### Cloud storage credentials
660
661Authentication to Google Cloud Storage is needed for a couple of reasons:
662uploading pixel test results to the cloud, and potentially uploading and
663downloading builds as well, at least in Debug mode. Use the copy of gsutil in
664`depot_tools/third_party/gsutil/gsutil`, and follow the [Google Cloud Storage
665instructions] to authenticate. You must use your @google.com email address and
666be a member of the Chrome GPU team in order to receive read-write access to the
667appropriate cloud storage buckets. Roughly:
668
6691. Run `gsutil config`
6702. Copy/paste the URL into your browser
6713. Log in with your @google.com account
6724. Allow the app to access the information it requests
6735. Copy-paste the resulting key back into your Terminal
6746. Press "enter" when prompted for a project-id (i.e., leave it empty)
675
676At this point you should be able to write to the cloud storage bucket.
677
678Navigate to
679<https://console.developers.google.com/storage/chromium-gpu-archive> to view
680the contents of the cloud storage bucket.
681
682[Google Cloud Storage instructions]: https://developers.google.com/storage/docs/gsutil