# Memory Benchmarks

This document describes benchmarks available to track Chrome's and
WebView's memory usage, where they live, what they measure, how to run them,
and how to diagnose regressions.

[TOC]

## Glossary

* User story: a set of actions to perform on a browser or device (e.g.
  open google homepage, type "foo", click search, scroll down, visit first
  result, etc.).
* Metric: a data aggregation process that takes a Chrome trace as input
  (produced by a [Telemetry][] run) and produces a set of summary numbers as
  output (e.g. total GPU memory used).
* Benchmark: a combination of (one or more) user stories and (one or
  more) metrics.

[Telemetry]: https://github.com/catapult-project/catapult/blob/master/telemetry/README.md

## System Health

System health is an effort to unify top-level benchmarks (as opposed to
micro-benchmarks and regression tests) that are suitable for capturing
representative user stories.

### Benchmarks

System health memory benchmarks are:

* [system_health.memory_mobile][system_health] -
  user stories running on Android devices.
* [system_health.memory_desktop][system_health] -
  user stories running on desktop platforms.

These benchmarks are run continuously on the [chromium.perf][] waterfall,
collecting and reporting results on the
[Chrome Performance Dashboard][chromeperf].

Other benchmarks maintained by the memory-infra team are discussed in the
[appendix](#Appendix).

[system_health]: https://chromium.googlesource.com/chromium/src/+/master/tools/perf/page_sets/system_health/
[chromium.perf]: https://build.chromium.org/p/chromium.perf/waterfall
[chromeperf]: https://chromeperf.appspot.com/report

### User stories

System health user stories are classified by the kind of interactions they
perform with the browser:

* `browse` stories navigate to a URL and interact with the page; e.g.
  scroll, click on elements, navigate to subpages, navigate back.
* `load` stories just navigate to a URL and wait for the page to
  load.
* `background` stories navigate to a URL, possibly interact with the
  page, and then bring another app to the foreground (thus pushing the
  browser to the background).
* `long_running` stories interact with a page for a longer period
  of time (~5 mins).
* `blank` has a single story that just navigates to about:blank.

The full name of a story has the form `{interaction}:{category}:{site}` where:

* `interaction` is one of the labels given above;
* `category` is used to group together sites with a similar purpose,
  e.g. `news`, `social`, `tools`;
* `site` is a short name identifying the website in which the story mostly
  takes place, e.g. `cnn`, `facebook`, `gmail`.

For example, `browse:news:cnn` and `background:social:facebook` are two system
health user stories.

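The naming scheme above can be illustrated with a minimal sketch. The names `StoryName` and `parse_story_name` are hypothetical helpers, not part of Telemetry, and the code assumes that a story name has exactly the three `:`-separated parts described here.

```python
from typing import NamedTuple


class StoryName(NamedTuple):
    """Hypothetical container for the parts of a system health story name."""
    interaction: str
    category: str
    site: str


def parse_story_name(full_name: str) -> StoryName:
    """Split '{interaction}:{category}:{site}' into its three parts."""
    parts = full_name.split(":")
    if len(parts) != 3:
        # Assumption: names that do not have exactly three parts are rejected.
        raise ValueError(f"expected 3 ':'-separated parts in {full_name!r}")
    return StoryName(*parts)


if __name__ == "__main__":
    for name in ("browse:news:cnn", "background:social:facebook"):
        print(parse_story_name(name))
```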
Today, for most stories a garbage collection is forced at the end of the
story and a memory dump is then triggered. Metrics report the values
obtained from this single measurement.

## Continuous monitoring

![Chrome Performance Dashboard](https://storage.googleapis.com/chromium-docs.appspot.com/79d08f59cf497c761f7099ea427704c14e9afc03.png)

To view data from one of the benchmarks on the
[Chrome Performance Dashboard][chromeperf] you should select:

* Test suite: The name of a [benchmark](#Benchmarks).
* Bot: The name of a platform or device configuration. Sign in to also
  see internal bots.
* Subtest (1): The name of a [metric](#Understanding-memory-metrics).
* Subtest (2): The name of a story group; these have the form
  `{interaction}_{category}` for system health stories.
* Subtest (3): The name of a [user story](#User-stories)
  (with `:` replaced by `_`).

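As a rough illustration of how a story name maps to the last two subtest fields, the sketch below derives both strings from a full story name. The function `dashboard_subtests` is a hypothetical helper written for this example, assuming the three-part story name form.

```python
from typing import Tuple


def dashboard_subtests(story_name: str) -> Tuple[str, str]:
    """Return (story group, story subtest) as they would appear on the dashboard.

    Assumes story_name has the form '{interaction}:{category}:{site}'.
    """
    interaction, category, _site = story_name.split(":")
    story_group = f"{interaction}_{category}"      # Subtest (2)
    story_subtest = story_name.replace(":", "_")   # Subtest (3)
    return story_group, story_subtest


if __name__ == "__main__":
    print(dashboard_subtests("browse:news:cnn"))
```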
If you are investigating a Perf dashboard alert and would like to see the
details, you can click on any point of the graph. This shows the commit range,
the buildbot output, and a link to the trace file taken during the buildbot run.
(More information about reading trace files is available [here][memory-infra].)

[memory-infra]: /docs/memory-infra/README.md

![Chrome Performance Dashboard Alert](https://storage.googleapis.com/chromium-docs.appspot.com/perfdashboard_alert.png)

## How to run the benchmarks

Benchmarks may be run on a local platform/device or remotely via a pinpoint
try job.

### How to run a pinpoint try job

Given a patch already uploaded to code review, try jobs provide a convenient
way to evaluate its memory implications on devices or platforms which
may not be immediately available to developers.

![New pinpoint try job dialog](https://storage.googleapis.com/chromium-docs.appspot.com/yHRMmUqraqJ.png)

To start a try job go to the [pinpoint][] website, click on the `+` button to
create a new job, and fill in the required details:

[pinpoint]: https://pinpoint-dot-chromeperf.appspot.com/

* Bug ID (optional): The id of a crbug.com issue where pinpoint can post
  updates when the job finishes.
* Gerrit URL: URL to the patch you want to test. Note that your patch can
  live in chromium or any of its sub-repositories!
* Bot: Select a suitable device/platform from the drop-down menu on which
  to run your job.
* Benchmark: The name of the benchmark to run. If you are interested in
  memory, try `system_health.memory_mobile` or `system_health.memory_desktop`
  as appropriate.
* Story (optional): A pattern passed to Telemetry's `--story-filter`
  option to only run stories that match the pattern.
* Extra Test Arguments (optional): Additional command line arguments for
  Telemetry's `run_benchmark`. Of note, if you are interested in running a
  small but representative sample of system health stories you can pass
  `--story-tag-filter health_check`.

If you have more specific needs, or need to automate the creation of jobs, you
can also consider using [pinpoint_cli][].

[pinpoint_cli]: https://cs.chromium.org/chromium/src/third_party/catapult/experimental/soundwave/bin/pinpoint_cli

### How to run locally

After building, e.g. `ChromePublic.apk`, you can run a specific system health
story with the command:

```
$SRC/tools/perf/run_benchmark run system_health.memory_mobile \
    --browser android-chromium --story-filter load:search:google
```

This will run the story with a default of 3 repetitions and produce a
`results.html` file comparing results from this and any previous benchmark
runs. In addition, you'll also get individual [trace files][memory-infra]
for each story run by the benchmark. Note: by default only high level
metrics are shown; you may need to tick the "Show all" check box in order to
view some of the lower level memory metrics.

![Example results.html file](https://storage.googleapis.com/chromium-docs.appspot.com/ea60207d9bb4809178fe75923d6d1a2b241170ef.png)

Other useful options for this command are:

* `--pageset-repeat [n]` - override the default number of repetitions.
* `--reset-results` - clear results from any previous benchmark runs in the
  `results.html` file.
* `--results-label [label]` - give meaningful names to your benchmark runs,
  this way it is easier to compare them.

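When comparing several builds it can help to assemble the command line programmatically. The following is only a sketch: `build_run_benchmark_command` is a hypothetical helper, the checkout path and label are placeholder assumptions, and it uses only the flags named in this section. It prints the command rather than running it; the resulting list could be passed to `subprocess.run` from a real checkout.

```python
import shlex
from typing import List, Optional


def build_run_benchmark_command(
    src: str,
    benchmark: str,
    browser: str,
    story_filter: Optional[str] = None,
    pageset_repeat: Optional[int] = None,
    reset_results: bool = False,
    results_label: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for tools/perf/run_benchmark."""
    cmd = [f"{src}/tools/perf/run_benchmark", "run", benchmark,
           "--browser", browser]
    if story_filter is not None:
        cmd += ["--story-filter", story_filter]
    if pageset_repeat is not None:
        cmd += ["--pageset-repeat", str(pageset_repeat)]
    if reset_results:
        cmd.append("--reset-results")
    if results_label is not None:
        cmd += ["--results-label", results_label]
    return cmd


command = build_run_benchmark_command(
    src="/path/to/chromium/src",      # assumed checkout location
    benchmark="system_health.memory_mobile",
    browser="android-chromium",
    story_filter="load:search:google",
    pageset_repeat=5,
    reset_results=True,
    results_label="with-my-patch",    # placeholder label
)
# Print a shell-quoted version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in command))
```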
For WebView make sure to [replace the system WebView][webview_install]
on your device and use `--browser android-webview`.

[memory-infra]: /docs/memory-infra/README.md
[webview_install]: https://www.chromium.org/developers/how-tos/build-instructions-android-webview

## Understanding memory metrics

There is a large number of [memory-infra][] metrics, breaking down usage
attributed to different components and processes.

![memory-infra metrics](https://storage.googleapis.com/chromium-docs.appspot.com/a73239c6367ed0f844500e51ce1e04556cb99b4f.png)

Most memory metrics have the form
`memory:{browser}:{processes}:{source}:{component}:{kind}`
where:

* browser: One of `chrome` or `webview`.
* processes: One of `browser_process`, `renderer_processes`,
  `gpu_process`, or `all_processes`.
* source: One of `reported_by_chrome` or `reported_by_os`.
* component: May be a Chrome component, e.g. `skia` or `sqlite`;
  details about a specific component, e.g. `v8:heap`; or a class of memory
  as seen by the OS, e.g. `system_memory:native_heap` or `gpu_memory`. If
  reported by Chrome, the metrics are gathered by `MemoryDumpProvider`s,
  probes placed in the specific components' codebase. For example, in
  `memory:chrome:all_processes:reported_by_chrome:net:effective_size_avg`,
  the component is `net`, which is Chrome's network stack, and
  `reported_by_chrome` means that this metric is gathered via probes in
  the network stack.
* kind: The kind of memory being reported. For metrics reported by
  Chrome this usually is `effective_size` (others are `locked_size`
  and `allocated_objects_size`); for metrics by the OS this usually is
  `proportional_resident_size` (others are `peak_resident_size` and
  `private_dirty_size`).

[memory-infra]: /docs/memory-infra/README.md

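To make the metric name layout concrete, here is a minimal parsing sketch. `MemoryMetric` and `parse_memory_metric` are hypothetical names introduced for illustration. The code assumes the form given above, treats every part between the source and the final part as the component (so that components such as `v8:heap` stay intact), and leaves any suffix such as `_avg` as part of the kind.

```python
from typing import NamedTuple


class MemoryMetric(NamedTuple):
    """Hypothetical container for the fields of a memory metric name."""
    browser: str
    processes: str
    source: str
    component: str
    kind: str


def parse_memory_metric(name: str) -> MemoryMetric:
    """Split 'memory:{browser}:{processes}:{source}:{component}:{kind}'."""
    parts = name.split(":")
    if len(parts) < 6 or parts[0] != "memory":
        raise ValueError(f"not a recognized memory metric name: {name!r}")
    browser, processes, source = parts[1:4]
    # The component may itself contain ':' (e.g. 'v8:heap'), so rejoin
    # everything between the source and the final part.
    component = ":".join(parts[4:-1])
    kind = parts[-1]
    return MemoryMetric(browser, processes, source, component, kind)


if __name__ == "__main__":
    print(parse_memory_metric(
        "memory:chrome:all_processes:reported_by_chrome:net:effective_size_avg"))
```

Since only most metrics follow this form, a caller would need to handle the `ValueError` for names that do not match.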
## Appendix

There are a few other benchmarks maintained by the memory-infra team.
These use the same set of metrics as system health, but differ in the kind
of stories that they run.

### memory.top_10_mobile

The [memory.top_10_mobile][memory_py] benchmark is in the process of being deprecated
in favor of system health benchmarks. This process, however, hasn't been
finalized and currently it is still the reference benchmark used for
decision making in the Android release process. Therefore, **it is important
to diagnose and fix regressions caught by this benchmark**.

The benchmark's workflow is:

- Cycle between:

  - load a page on Chrome, wait for it to load, [force garbage collection
    and measure memory][measure];
  - push Chrome to the background, force garbage collection and measure
    memory again.

- Repeat for each of 10 pages without closing the browser.

- Close the browser, re-open and repeat the full page set a total of 5 times.

- Story groups are either `foreground` or `background` depending on the state
  of the browser at the time of measurement.

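The order of measurements implied by this workflow can be sketched as a simple generator. This is an illustration only, not the benchmark's implementation: `measurement_plan` and the page names are hypothetical, and the code only enumerates which measurement happens when.

```python
from typing import Iterator, List, Tuple


def measurement_plan(pages: List[str],
                     repeats: int = 5) -> Iterator[Tuple[int, str, str]]:
    """Yield (pass index, page, story group) in the order measurements occur."""
    for run in range(repeats):
        # A single browser instance is shared by every page in this pass.
        for page in pages:
            yield run, page, "foreground"  # measured after the page loads
            yield run, page, "background"  # measured after backgrounding Chrome
        # The browser is closed and re-opened before the next pass.


if __name__ == "__main__":
    pages = [f"page{i}" for i in range(10)]  # placeholder page names
    plan = list(measurement_plan(pages))
    print(len(plan), plan[:2])
```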
The main difference to watch out for between this and system health benchmarks
is that, since a single browser instance is kept open and shared by many
individual stories, they are not independent of each other. In particular, **do
not use the `--story-filter` argument when trying to reproduce regressions**
on this benchmark, as doing so will affect the results.

[memory_py]: https://cs.chromium.org/chromium/src/tools/perf/benchmarks/memory.py
[measure]: https://github.com/catapult-project/catapult/blob/master/telemetry/telemetry/internal/actions/action_runner.py#L133

### Dual browser benchmarks

Dual browser benchmarks are intended to assess the memory implications of
shared resources between Chrome and WebView.

* [memory.dual_browser_test][memory_extra_py] - cycle between doing Google
  searches on a WebView-based browser (a stand-in for the Google Search app)
  and loading pages on Chrome. Runs on Android devices only.

  Story groups are either `on_chrome` or `on_webview`, indicating the browser
  in the foreground at the moment when the memory measurement was made.

* [memory.long_running_dual_browser_test][memory_extra_py] - same as above,
  but the test is run for 60 iterations keeping both browsers alive for the
  whole duration of the test and without forcing garbage collection. Intended
  as a last-resort net to catch memory leaks not apparent on shorter tests.

[memory_extra_py]: https://cs.chromium.org/chromium/src/tools/perf/contrib/memory_extras/memory_extras.py