# Memory Benchmarks

This document describes the benchmarks available to track Chrome's and
WebView's memory usage: where they live, what they measure, how to run them,
and how to diagnose regressions.

[TOC]

## Glossary

* User story: a set of actions to perform on a browser or device (e.g.
  open google homepage, type "foo", click search, scroll down, visit first
  result, etc.).
* Metric: a data aggregation process that takes a Chrome trace as input
  (produced by a [Telemetry][] run) and produces a set of summary numbers as
  output (e.g. total GPU memory used).
* Benchmark: a combination of (one or more) user stories and (one or
  more) metrics.

[Telemetry]: https://github.com/catapult-project/catapult/blob/master/telemetry/README.md

## System Health

System health is an effort to unify top-level benchmarks (as opposed to
micro-benchmarks and regression tests) that are suitable for capturing
representative user stories.

### Benchmarks

System health memory benchmarks are:

* [system_health.memory_mobile][system_health] -
  user stories running on Android devices.
* [system_health.memory_desktop][system_health] -
  user stories running on desktop platforms.

These benchmarks are run continuously on the [chromium.perf][] waterfall,
collecting and reporting results on the
[Chrome Performance Dashboard][chromeperf].

Other benchmarks maintained by the memory-infra team are discussed in the
[appendix](#Appendix).

[system_health]: https://chromium.googlesource.com/chromium/src/+/master/tools/perf/page_sets/system_health/
[chromium.perf]: https://build.chromium.org/p/chromium.perf/waterfall
[chromeperf]: https://chromeperf.appspot.com/report

### User stories

System health user stories are classified by the kind of interactions they
perform with the browser:

* `browse` stories navigate to a URL and interact with the page; e.g.
  scroll, click on elements, navigate to subpages, navigate back.
* `load` stories just navigate to a URL and wait for the page to
  load.
* `background` stories navigate to a URL, possibly interact with the
  page, and then bring another app to the foreground (thus pushing the
  browser to the background).
* `long_running` stories interact with a page for a longer period
  of time (~5 mins).
* `blank` has a single story that just navigates to about:blank.

The full name of a story has the form `{interaction}:{category}:{site}` where:

* `interaction` is one of the labels given above;
* `category` is used to group together sites with a similar purpose,
  e.g. `news`, `social`, `tools`;
* `site` is a short name identifying the website in which the story mostly
  takes place, e.g. `cnn`, `facebook`, `gmail`.

For example, `browse:news:cnn` and `background:social:facebook` are two system
health user stories.
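As a small illustration of this naming scheme, the following shell sketch splits a full story name into its three parts. The `split_story` helper is hypothetical (it is not part of Telemetry) and assumes that none of the parts contains a `:` itself.

```shell
# Hypothetical helper, not part of Telemetry: split a full story name of the
# form {interaction}:{category}:{site} into its three parts.
split_story() {
  # Read the colon-separated fields; assumes no field contains a ':' itself.
  IFS=: read -r interaction category site <<EOF
$1
EOF
  echo "interaction=$interaction category=$category site=$site"
}

split_story browse:news:cnn
split_story background:social:facebook
```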

Today, for most stories a garbage collection is forced at the end of the
story and a memory dump is then triggered. Metrics report the values
obtained from this single measurement.

## Continuous monitoring

![Chrome Performance Dashboard](https://storage.googleapis.com/chromium-docs.appspot.com/79d08f59cf497c761f7099ea427704c14e9afc03.png)

To view data from one of the benchmarks on the
[Chrome Performance Dashboard][chromeperf] you should select:

* Test suite: The name of a [benchmark](#Benchmarks).
* Bot: The name of a platform or device configuration. Sign in to also
  see internal bots.
* Subtest (1): The name of a [metric](#Understanding-memory-metrics).
* Subtest (2): The name of a story group; these have the form
  `{interaction}_{category}` for system health stories.
* Subtest (3): The name of a [user story](#User-stories)
  (with `:` replaced by `_`).
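To locate a given story on the dashboard, its name has to be converted as described in the last two items. A minimal sketch of that conversion follows; the helper names are hypothetical, and the story name is assumed to have exactly the three-part form `{interaction}:{category}:{site}`.

```shell
# Hypothetical helpers for building dashboard subtest names from a story name.

# Subtest (2): story group, i.e. {interaction}_{category}.
story_group() {
  group=${1%:*}                        # drop the trailing ":{site}" part
  printf '%s\n' "$group" | tr ':' '_'
}

# Subtest (3): full story name with ':' replaced by '_'.
dashboard_story() {
  printf '%s\n' "$1" | tr ':' '_'
}

story_group browse:news:cnn        # prints browse_news
dashboard_story browse:news:cnn    # prints browse_news_cnn
```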

If you are investigating a Perf dashboard alert and would like to see the
details, you can click on any point of the graph. This shows the commit range,
the buildbot output, and a link to the trace file taken during the buildbot run.
(More information about reading trace files is available [here][memory-infra].)

[memory-infra]: /docs/memory-infra/README.md

![Chrome Performance Dashboard Alert](https://storage.googleapis.com/chromium-docs.appspot.com/perfdashboard_alert.png)

## How to run the benchmarks

Benchmarks may be run on a local platform/device or remotely via a try job.

### How to run locally

After building, e.g. `ChromePublic.apk`, you can run a specific system health
story with the command:

```
$SRC/tools/perf/run_benchmark run system_health.memory_mobile \
    --browser android-chromium --story-filter load:search:google
```

This will run the story with a default of 3 repetitions and produce a
`results.html` file comparing results from this and any previous benchmark
runs. In addition, you'll get individual [trace files][memory-infra]
for each story run by the benchmark. Note: by default only high-level
metrics are shown; you may need to tick the "Show all" check box in order to
view some of the lower-level memory metrics.

![Example results.html file](https://storage.googleapis.com/chromium-docs.appspot.com/ea60207d9bb4809178fe75923d6d1a2b241170ef.png)

Other useful options for this command are:

* `--pageset-repeat [n]` - override the default number of repetitions.
* `--reset-results` - clear results from any previous benchmark runs in the
  `results.html` file.
* `--results-label [label]` - give meaningful names to your benchmark runs,
  so that they are easier to compare.

For WebView, make sure to [replace the system WebView][webview_install]
on your device and use `--browser android-webview`.
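For instance, the options above can be combined to compare a baseline run against a run with a change applied. The sketch below only prints the two command lines so that they can be inspected before being run; `labelled_run_cmd` is a hypothetical helper, the labels `before` and `after` are arbitrary, and `SRC` is assumed to point at the Chromium checkout as in the command above.

```shell
# Hypothetical helper: print (rather than run) the command for one labelled
# benchmark run. Only flags described in this section are used.
labelled_run_cmd() {
  label=$1   # name shown for this run in results.html
  story=$2   # story to run, e.g. load:search:google
  shift 2    # any remaining arguments are passed through unchanged
  echo "\$SRC/tools/perf/run_benchmark run system_health.memory_mobile" \
       "--browser android-chromium --story-filter $story" \
       "--results-label $label" "$@"
}

# Baseline run: start from an empty results.html.
labelled_run_cmd before load:search:google --reset-results --pageset-repeat 5
# After rebuilding and installing with the change applied, run again; the
# results are added to the same results.html for comparison.
labelled_run_cmd after load:search:google --pageset-repeat 5
```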

[memory-infra]: /docs/memory-infra/README.md
[webview_install]: https://www.chromium.org/developers/how-tos/build-instructions-android-webview

### How to run a try job

Given a patch on a Chromium checkout, try jobs provide a convenient way to
evaluate its memory implications on devices or platforms which
may not be immediately available to developers.

To start a try job, [upload a CL][contributing] and run the command, e.g.:

```
$SRC/tools/perf/run_benchmark try android-nexus5 system_health.memory_mobile
```

This will run all of the system health stories for you, and conveniently
provide a `results.html` file comparing measurements with/without your patch.
Options like `--story-filter` and `--pageset-repeat` may also be passed to
this command.

To see the full list of available try bots, run the command:

```
$SRC/tools/perf/run_benchmark try list
```

[contributing]: https://www.chromium.org/developers/contributing-code

## Understanding memory metrics

There is a large number of [memory-infra][] metrics, breaking down usage
attributed to different components and processes.

![memory-infra metrics](https://storage.googleapis.com/chromium-docs.appspot.com/a73239c6367ed0f844500e51ce1e04556cb99b4f.png)

Most memory metrics have the form
`memory:{browser}:{processes}:{source}:{component}:{kind}`
where:

* browser: One of `chrome` or `webview`.
* processes: One of `browser_process`, `renderer_processes`,
  `gpu_process`, or `all_processes`.
* source: One of `reported_by_chrome` or `reported_by_os`.
* component: May be a Chrome component, e.g. `skia` or `sqlite`;
  details about a specific component, e.g. `v8:heap`; or a class of memory
  as seen by the OS, e.g. `system_memory:native_heap` or `gpu_memory`. If
  reported by Chrome, the metrics are gathered by `MemoryDumpProvider`s,
  probes placed in the specific components' codebase. For example, in
  `memory:chrome:all_processes:reported_by_chrome:net:effective_size_avg`,
  the component is `net`, which is Chrome's network stack, and
  `reported_by_chrome` means that this metric is gathered via probes in
  the network stack.
* kind: The kind of memory being reported. For metrics reported by
  Chrome this usually is `effective_size` (others are `locked_size`
  and `allocated_objects_size`); for metrics by the OS this usually is
  `proportional_resident_size` (others are `peak_resident_size` and
  `private_dirty_size`).
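Because the component part may itself contain colons (e.g. `v8:heap`), a metric name cannot be split by simply cutting at every `:`. The sketch below shows one way to do it, assuming the form given above: the first three fields after `memory` are fixed, the last field is the kind, and whatever lies in between is the component. `parse_metric` is a hypothetical helper, the second metric name is assembled from the values listed above purely for illustration, and any suffix such as `_avg` is left attached to the kind.

```shell
# Hypothetical helper: split a metric name of the form
# memory:{browser}:{processes}:{source}:{component}:{kind} into its fields.
parse_metric() {
  rest=${1#memory:}                         # drop the leading "memory:"
  browser=${rest%%:*};     rest=${rest#*:}
  processes=${rest%%:*};   rest=${rest#*:}
  reported_by=${rest%%:*}; rest=${rest#*:}
  kind=${rest##*:}                          # the last field is the kind
  case $rest in
    *:*) component=${rest%:*} ;;            # everything before the kind
    *)   component= ;;                      # name carries no component
  esac
  printf '%s\n' "browser=$browser" "processes=$processes" \
    "source=$reported_by" "component=$component" "kind=$kind"
}

parse_metric memory:chrome:all_processes:reported_by_chrome:net:effective_size_avg
parse_metric memory:webview:renderer_processes:reported_by_chrome:v8:heap:effective_size
```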

[memory-infra]: /docs/memory-infra/README.md

## Appendix

There are a few other benchmarks maintained by the memory-infra team.
These use the same set of metrics as system health, but differ in the
kind of stories that they run.

### memory.top_10_mobile

The [memory.top_10_mobile][memory_py] benchmark is in the process of being
deprecated in favor of system health benchmarks. This process, however, hasn't
been finalized and it is currently still the reference benchmark used for
decision making in the Android release process. Therefore, **it is important
to diagnose and fix regressions caught by this benchmark**.

The benchmark's workflow is:

- Cycle between:

  - load a page on Chrome, wait for it to load, [force garbage collection
    and measure memory][measure];
  - push Chrome to the background, force garbage collection and measure
    memory again.

- Repeat for each of 10 pages without closing the browser.

- Close the browser, re-open and repeat the full page set a total of 5 times.

- Story groups are either `foreground` or `background` depending on the state
  of the browser at the time of measurement.

The main difference to watch out for between this and system health benchmarks
is that, since a single browser instance is kept open and shared by many
individual stories, they are not independent of each other. In particular, **do
not use the `--story-filter` argument when trying to reproduce regressions**
on this benchmark, as doing so will affect the results.
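To make the sharing of the browser instance concrete, the workflow above can be sketched as a pair of nested loops. The sketch only prints the order of events and does not drive a browser; page names such as `page_1` are placeholders. Each measurement is taken in a browser that has already visited the preceding pages, which is why filtering stories changes the results.

```shell
# Print the order of memory measurements; no browser is actually driven.
print_measurement_order() {
  repeat=1
  while [ "$repeat" -le 5 ]; do        # the browser is re-opened 5 times
    echo "open browser (repeat $repeat)"
    page=1
    while [ "$page" -le 10 ]; do       # 10 pages share one browser instance
      echo "  page_$page foreground: force GC, measure"
      echo "  page_$page background: force GC, measure"
      page=$((page + 1))
    done
    echo "close browser"
    repeat=$((repeat + 1))
  done
}

print_measurement_order
```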

[memory_py]: https://cs.chromium.org/chromium/src/tools/perf/benchmarks/memory.py
[measure]: https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/internal/actions/action_runner.py#L133

### Dual browser benchmarks

Dual browser benchmarks are intended to assess the memory implications of
shared resources between Chrome and WebView.

* [memory.dual_browser_test][memory_extra_py] - cycle between doing Google
  searches on a WebView-based browser (a stand-in for the Google Search app)
  and loading pages on Chrome. Runs on Android devices only.

  Story groups are either `on_chrome` or `on_webview`, indicating the browser
  in foreground at the moment when the memory measurement was made.

* [memory.long_running_dual_browser_test][memory_extra_py] - same as above,
  but the test is run for 60 iterations keeping both browsers alive for the
  whole duration of the test and without forcing garbage collection. Intended
  as a last-resort net to catch memory leaks not apparent on shorter tests.

[memory_extra_py]: https://cs.chromium.org/chromium/src/tools/perf/contrib/memory_extras/memory_extras.py