# Memory Benchmarks

This document describes benchmarks available to track Chrome's and
WebView's memory usage, where they live, what they measure, how to run them,
and how to diagnose regressions.

[TOC]

## Glossary

* **User story:** a set of actions to perform on a browser or device (e.g.
  open the Google homepage, type "foo", click search, scroll down, visit the
  first result, etc.).
* **Metric:** a data aggregation process that takes a Chrome trace as input
  (produced by a [Telemetry][] run) and produces a set of summary numbers as
  output (e.g. total GPU memory used).
* **Benchmark:** a combination of (one or more) user stories and (one or
  more) metrics.

[Telemetry]: https://github.com/catapult-project/catapult/blob/master/telemetry/README.md

## System Health

*System health* is an effort to unify top-level benchmarks (as opposed to
micro-benchmarks and regression tests) that are suitable to capture
representative user stories.

### Benchmarks

System health memory benchmarks are:

* [system_health.memory_mobile][system_health] -
  user stories running on Android devices.
* [system_health.memory_desktop][system_health] -
  user stories running on desktop platforms.

These benchmarks are run continuously on the [chromium.perf][] waterfall,
collecting and reporting results on the
[Chrome Performance Dashboard][chromeperf].

Other benchmarks maintained by the memory-infra team are discussed in the
[appendix](#Appendix).

[system_health]: https://chromium.googlesource.com/chromium/src/+/master/tools/perf/page_sets/system_health/
[chromium.perf]: https://build.chromium.org/p/chromium.perf/waterfall
[chromeperf]: https://chromeperf.appspot.com/report

### User stories

System health user stories are classified by the kind of interactions they
perform with the browser:

* `browse` stories navigate to a URL and interact with the page, e.g.
  scroll, click on elements, navigate to subpages, navigate back.
* `load` stories just navigate to a URL and wait for the page to
  load.
* `background` stories navigate to a URL, possibly interact with the
  page, and then bring another app to the foreground (thus pushing the
  browser to the background).
* `long_running` stories interact with a page for a longer period
  of time (~5 minutes).
* `blank` has a single story that just navigates to **about:blank**.

The full name of a story has the form `{interaction}:{category}:{site}` where:

* `interaction` is one of the labels given above;
* `category` is used to group together sites with a similar purpose,
  e.g. `news`, `social`, `tools`;
* `site` is a short name identifying the website in which the story mostly
  takes place, e.g. `cnn`, `facebook`, `gmail`.

For example, `browse:news:cnn` and `background:social:facebook` are two system
health user stories.

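The naming scheme above can be sketched in a few lines of Python. This is only an illustration and not part of Telemetry: `StoryName` and `parse_story_name` are hypothetical helpers, and the sketch assumes a story name has at least three colon-separated fields, with anything after the second colon treated as the site.

```python
from typing import NamedTuple

# Interaction labels as listed in this document.
INTERACTIONS = ("browse", "load", "background", "long_running", "blank")


class StoryName(NamedTuple):
    """Hypothetical container for the three parts of a story name."""
    interaction: str
    category: str
    site: str


def parse_story_name(name: str) -> StoryName:
    """Split a name of the form {interaction}:{category}:{site}."""
    parts = name.split(":", 2)  # anything after the 2nd colon is the site
    if len(parts) != 3:
        raise ValueError(f"expected 3 colon-separated fields in {name!r}")
    story = StoryName(*parts)
    if story.interaction not in INTERACTIONS:
        raise ValueError(f"unknown interaction {story.interaction!r}")
    return story


if __name__ == "__main__":
    print(parse_story_name("browse:news:cnn"))
```
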
Today, for most stories a garbage collection is forced at the end of the
story and a memory dump is then triggered. Metrics report the values
obtained from this single measurement.

## Continuous monitoring



To view data from one of the benchmarks on the
[Chrome Performance Dashboard][chromeperf] you should select:

* **Test suite:** The name of a *[benchmark](#Benchmarks)*.
* **Bot:** The name of a *platform or device configuration*. Sign in to also
  see internal bots.
* **Subtest (1):** The name of a *[metric](#Understanding-memory-metrics)*.
* **Subtest (2):** The name of a *story group*; these have the form
  `{interaction}_{category}` for system health stories.
* **Subtest (3):** The name of a *[user story](#User-stories)*
  (with `:` replaced by `_`).

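As a rough illustration of the two naming rules above (story group and story subtest), the following Python sketch derives both from a full story name. The function `dashboard_subtests` is a hypothetical helper written for this document; it is not an API of the dashboard or of Telemetry.

```python
from typing import Tuple


def dashboard_subtests(story_name: str) -> Tuple[str, str]:
    """Return (story group, story subtest) for a system health story.

    Assumes story_name has the form {interaction}:{category}:{site}.
    """
    interaction, category, _site = story_name.split(":", 2)
    story_group = f"{interaction}_{category}"   # Subtest (2)
    story_subtest = story_name.replace(":", "_")  # Subtest (3)
    return story_group, story_subtest


if __name__ == "__main__":
    print(dashboard_subtests("browse:news:cnn"))
```
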
If you are investigating a Perf dashboard alert and would like to see the
details, you can click on any point of the graph. This shows the commit range,
the buildbot output, and a link to the trace file taken during the buildbot
run. (More information about reading trace files is available in the
[memory-infra documentation][memory-infra].)

[memory-infra]: /docs/memory-infra/README.md



## How to run the benchmarks

Benchmarks may be run on a local platform/device or remotely on a try job.

### How to run locally

After building, e.g. `ChromePublic.apk`, you can run a specific system health
story with the command:

```
$SRC/tools/perf/run_benchmark run system_health.memory_mobile \
    --browser android-chromium --story-filter load:search:google
```

This will run the story with a default of 3 repetitions and produce a
`results.html` file comparing results from this and any previous benchmark
runs. In addition, you'll also get individual [trace files][memory-infra]
for each story run by the benchmark. **Note:** by default only high-level
metrics are shown; you may need to tick the "Show all" check box in order to
view some of the lower-level memory metrics.



Other useful options for this command are:

* `--pageset-repeat [n]` - override the default number of repetitions.
* `--reset-results` - clear results from any previous benchmark runs in the
  `results.html` file.
* `--results-label [label]` - give meaningful names to your benchmark runs
  so that they are easier to compare.

For WebView, make sure to [replace the system WebView][webview_install]
on your device and use `--browser android-webview`.

[memory-infra]: /docs/memory-infra/README.md
[webview_install]: https://www.chromium.org/developers/how-tos/build-instructions-android-webview

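If you script local runs, the options above can be assembled programmatically. The sketch below only builds the argument list and does not launch anything. `build_run_command` and its parameter names are hypothetical; the flags themselves are the ones listed in this section, and `src_dir` stands for your own checkout path.

```python
from typing import List, Optional


def build_run_command(src_dir: str,
                      benchmark: str,
                      browser: str,
                      story_filter: Optional[str] = None,
                      pageset_repeat: Optional[int] = None,
                      results_label: Optional[str] = None,
                      reset_results: bool = False) -> List[str]:
    """Build an argv list for a local run_benchmark invocation."""
    cmd = [f"{src_dir}/tools/perf/run_benchmark", "run", benchmark,
           "--browser", browser]
    if story_filter is not None:
        cmd += ["--story-filter", story_filter]
    if pageset_repeat is not None:
        cmd += ["--pageset-repeat", str(pageset_repeat)]
    if results_label is not None:
        cmd += ["--results-label", results_label]
    if reset_results:
        cmd.append("--reset-results")
    return cmd


if __name__ == "__main__":
    # "/path/to/src" is a placeholder for your checkout.
    print(build_run_command("/path/to/src", "system_health.memory_mobile",
                            "android-chromium",
                            story_filter="load:search:google",
                            results_label="with_patch"))
```

The returned list can be passed to `subprocess.run` as-is, which avoids shell quoting issues with story names.
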
### How to run a try job

Given a patch on a Chromium checkout, try jobs provide a convenient way to
evaluate its memory implications on devices or platforms which
may not be immediately available to developers.

To start a try job, [upload a CL][contributing] and run a command such as:

```
$SRC/tools/perf/run_benchmark try android-nexus5 system_health.memory_mobile
```

This will run all of the system health stories for you, and conveniently
provide a `results.html` file comparing measurements with/without your patch.
Options like `--story-filter` and `--pageset-repeat` may also be passed to
this command.

To see the full list of available try bots, run the command:

```
$SRC/tools/perf/run_benchmark try list
```

[contributing]: https://www.chromium.org/developers/contributing-code

## Understanding memory metrics

There is a large number of [memory-infra][] metrics, breaking down usage
attributed to different components and processes.



Most memory metrics have the form
`memory:{browser}:{processes}:{source}:{component}:{kind}`
where:

* **browser:** One of `chrome` or `webview`.
* **processes:** One of `browser_process`, `renderer_processes`,
  `gpu_process`, or `all_processes`.
* **source:** One of `reported_by_chrome` or `reported_by_os`.
* **component:** May be a Chrome component, e.g. `skia` or `sqlite`;
  details about a specific component, e.g. `v8:heap`; or a class of memory
  as seen by the OS, e.g. `system_memory:native_heap` or `gpu_memory`. If
  reported by Chrome, the metrics are gathered by `MemoryDumpProvider`s,
  probes placed in the specific components' codebase. For example, in
  `memory:chrome:all_processes:reported_by_chrome:net:effective_size_avg`,
  the component is `net`, which is Chrome's network stack, and
  `reported_by_chrome` means that this metric is gathered via probes in
  the network stack.
* **kind:** The kind of memory being reported. For metrics reported by
  Chrome this usually is `effective_size` (others are `locked_size`
  and `allocated_objects_size`); for metrics reported by the OS this usually
  is `proportional_resident_size` (others are `peak_resident_size` and
  `private_dirty_size`).

[memory-infra]: /docs/memory-infra/README.md

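The metric naming scheme above can be unpacked with a small parser. This is a minimal sketch, not a Telemetry API: `MemoryMetric` and `parse_memory_metric` are hypothetical names, and the sketch assumes that the last colon-separated field is the kind (possibly with a suffix such as `_avg`, as in the example above) and that everything between the source and the kind is the component, since components may themselves contain colons (e.g. `v8:heap`).

```python
from typing import NamedTuple


class MemoryMetric(NamedTuple):
    """Hypothetical container for the fields of a memory metric name."""
    browser: str
    processes: str
    source: str
    component: str
    kind: str


def parse_memory_metric(name: str) -> MemoryMetric:
    """Split memory:{browser}:{processes}:{source}:{component}:{kind}."""
    fields = name.split(":")
    if len(fields) < 6 or fields[0] != "memory":
        raise ValueError(f"not a memory metric name: {name!r}")
    browser, processes, source = fields[1:4]
    component = ":".join(fields[4:-1])  # may span several fields
    kind = fields[-1]
    return MemoryMetric(browser, processes, source, component, kind)


if __name__ == "__main__":
    print(parse_memory_metric(
        "memory:chrome:all_processes:reported_by_chrome:net:effective_size_avg"))
```
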
## Appendix

There are a few other benchmarks maintained by the memory-infra team.
These use the same set of metrics as system health, but differ in the kind
of stories that they run.

### memory.top_10_mobile

The [memory.top_10_mobile][memory_py] benchmark is in the process of being
deprecated in favor of system health benchmarks. This process, however, hasn't
been finalized and it is currently still the reference benchmark used for
decision making in the Android release process. Therefore, **it is important
to diagnose and fix regressions caught by this benchmark**.

The benchmark's workflow is:

- Cycle between:

  - load a page on Chrome, wait for it to load, [force garbage collection
    and measure memory][measure];
  - push Chrome to the background, force garbage collection and measure
    memory again.

- Repeat for each of 10 pages *without closing the browser*.

- Close the browser, re-open and repeat the full page set a total of 5 times.

- Story groups are either `foreground` or `background` depending on the state
  of the browser at the time of measurement.

The main difference to watch out for between this and system health benchmarks
is that, since a single browser instance is kept open and shared by many
individual stories, they are not independent of each other. In particular, **do
not use the `--story-filter` argument when trying to reproduce regressions**
on these benchmarks, as doing so will affect the results.

[memory_py]: https://cs.chromium.org/chromium/src/tools/perf/benchmarks/memory.py
[measure]: https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/internal/actions/action_runner.py#L133

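To make the order of measurements in the memory.top_10_mobile workflow concrete, the sketch below enumerates them without driving a browser. It is an illustration only: `measurement_schedule` and the placeholder page names are hypothetical, and the real benchmark logic lives in the files linked above.

```python
from typing import Iterator, List, Tuple


def measurement_schedule(pages: List[str],
                         repeats: int = 5) -> Iterator[Tuple[int, str, str]]:
    """Yield (run, page, story group) in the order measurements are taken."""
    for run in range(1, repeats + 1):  # browser is re-opened for each run
        for page in pages:             # all pages share one browser instance
            yield run, page, "foreground"  # after load and forced GC
            yield run, page, "background"  # after backgrounding and forced GC


if __name__ == "__main__":
    pages = [f"page{i}" for i in range(10)]  # placeholder page names
    schedule = list(measurement_schedule(pages))
    print(len(schedule), schedule[:3])
```

Because every page after the first is measured in a browser that has already loaded the previous pages, filtering out stories changes the state in which the remaining ones run.
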
### Dual browser benchmarks

Dual browser benchmarks are intended to assess the memory implications of
shared resources between Chrome and WebView.

* [memory.dual_browser_test][memory_extra_py] - cycle between doing Google
  searches on a WebView-based browser (a stand-in for the Google Search app)
  and loading pages on Chrome. Runs on Android devices only.

  Story groups are either `on_chrome` or `on_webview`, indicating the browser
  in the foreground at the moment when the memory measurement was made.

* [memory.long_running_dual_browser_test][memory_extra_py] - same as above,
  but the test is run for 60 iterations keeping both browsers alive for the
  whole duration of the test and without forcing garbage collection. Intended
  as a last-resort net to catch memory leaks not apparent on shorter tests.

[memory_extra_py]: https://cs.chromium.org/chromium/src/tools/perf/contrib/memory_extras/memory_extras.py