# Linux Profiling

How to profile Chromium on Linux.

See
[Profiling Chromium and WebKit](https://sites.google.com/a/chromium.org/dev/developers/profiling-chromium-and-webkit)
for alternative discussion.

## CPU Profiling

gprof: reported not to work (taking an hour to load on our large binary).

oprofile: Dean uses it, says it's good. (As of September 16, 2009, oprofile only
supports timers on the new Z600 boxes, which doesn't give good granularity for
profiling startup.)

TODO(willchan): Talk more about oprofile, gprof, etc.

Also see
https://sites.google.com/a/chromium.org/dev/developers/profiling-chromium-and-webkit

### perf

`perf` is the successor to `oprofile`. It's maintained in the kernel tree and is
available on Ubuntu in the package `linux-tools`.

To capture data, use `perf record`. Some examples:

```shell
# Capture the full execution of the program.
perf record -f -g out/Release/chrome
# Capture a particular pid; you can start at the right time and stop with
# Ctrl-C.
perf record -f -g -p 1234
# Capture the whole system.
perf record -f -g -a
```

Some versions of the `perf` command can be confused by process renames. Affected
versions will be unable to resolve Chromium's symbols if it was started through
`perf`, as in the first example above. It should work correctly if you attach to
an existing Chromium process as shown in the second example. (This is known to
be broken as late as 3.2.5 and fixed as early as 3.11.rc3.g36f571. The actual
affected range is likely much smaller. You can download and build your own perf
from source.)

The last example (`-a`, whole-system capture) is useful on limited systems with
few cores and low memory bandwidth, where the CPU cycles are shared between
several processes (e.g. chrome browser, renderer, plugin, X, pulseaudio, etc.).

To look at the data, use:

    perf report

This will use the previously captured data (`perf.data`).

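As a rough sketch of the three capture modes described above, the helper below
builds the matching `perf record` command line. The function name
`perf_record_cmd` and its mode names are hypothetical, and the flags are copied
from the examples above, so they may need adjusting for your perf version. The
helper only prints the command so that you can inspect it before running it.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the perf record command for a mode.
#   launch <binary>  - profile a program for its whole execution
#   attach <pid>     - profile an already-running process
#   system           - profile every process on the machine
perf_record_cmd() {
  mode=$1
  target=$2
  case "$mode" in
    launch) echo "perf record -f -g $target" ;;
    attach) echo "perf record -f -g -p $target" ;;
    system) echo "perf record -f -g -a" ;;
    *) echo "usage: perf_record_cmd launch|attach|system [target]" >&2
       return 1 ;;
  esac
}
```

For example, `perf_record_cmd attach 1234` prints the second command shown
earlier. After stopping that capture with Ctrl-C, `perf report` reads the
resulting `perf.data`.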
### google-perftools

google-perftools code is enabled when the `use_allocator` variable in gyp is set
to `tcmalloc` (currently the default). That will build the tcmalloc library,
including the CPU profiling and heap profiling code, into Chromium. In order to
get stack traces in release builds on 64-bit, you will need to build with some
extra flags enabled by setting `profiling=1` in gyp.

If the stack traces in your profiles are incomplete, this may be due to missing
frame pointers in some of the libraries. A workaround is to use the
`linux_keep_shadow_stacks=1` gyp option. This will keep a shadow stack using the
`-finstrument-functions` option of gcc and consult the stack when unwinding.

In order to enable CPU profiling, run Chromium with the environment variable
`CPUPROFILE` set to a filename. For example:

    CPUPROFILE=/tmp/cpuprofile out/Release/chrome

After the program exits successfully, the CPU profile will be available at the
filename specified in the `CPUPROFILE` environment variable. You can then analyze
it using the `pprof` script (distributed with google-perftools, installed by
default on Googler Linux workstations). For example:

    pprof --gv out/Release/chrome /tmp/cpuprofile

This will generate a visual representation of the CPU profile as a PostScript
file and load it up using `gv`. For more powerful commands, please refer to the
pprof help output and the google-perftools documentation.

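The capture and analysis steps above can be sketched as a pair of commands. The
helper name `cpu_profile_cmds` is hypothetical, and `--text` (pprof's plain-text
report mode in google-perftools) is an assumption that does not appear in the
original instructions; it may be handy when `gv` is not installed. The helper
only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: print the two commands of the CPUPROFILE workflow.
#   $1 - path to the chrome binary
#   $2 - file the CPU profile is written to
cpu_profile_cmds() {
  binary=$1
  profile=$2
  # Step 1: run Chromium with CPUPROFILE set; the profile is written on exit.
  echo "CPUPROFILE=$profile $binary"
  # Step 2: summarize the profile as plain text instead of opening gv.
  echo "pprof --text $binary $profile"
}
```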
Note that due to the current design of google-perftools' profiling tools, it is
only possible to profile the browser process. You can also pass the
`--single-process` flag while profiling for a rough idea of what the render
process looks like, but keep in mind that you'll be seeing a mixed
browser/renderer codepath that is not used in production.

For further information, please refer to
http://google-perftools.googlecode.com/svn/trunk/doc/cpuprofile.html.

## Heap Profiling

### google-perftools

#### Turning on heap profiles

Follow the instructions for enabling profiling as described above in the
google-perftools section under CPU Profiling.

To turn on the heap profiler on a Chromium build with tcmalloc, use the
`HEAPPROFILE` environment variable to specify a filename for the heap profile.
For example:

    HEAPPROFILE=/tmp/heapprofile out/Release/chrome

After the program exits successfully, the heap profile will be available at the
filename specified in the `HEAPPROFILE` environment variable.

Some tests fork short-lived processes which have a small memory footprint. To
catch those, use the `HEAP_PROFILE_ALLOCATION_INTERVAL` environment variable.

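A minimal sketch of setting the allocation interval, assuming (per the
google-perftools heap profiler documentation) that the variable holds a byte
count and that a dump is written each time that many bytes have been allocated.
The helper names and the 16 MiB figure are illustrative only; pick a value small
enough that a short-lived process reaches it before exiting.

```shell
#!/bin/sh
# Convert a size in MiB to bytes, the unit assumed for the interval variable.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

# Hypothetical helper: print a heap-profiling command line that dumps a
# profile every <interval_mib> MiB of allocation.
#   $1 - path to the binary, $2 - profile filename, $3 - interval in MiB
heap_profile_cmd() {
  echo "HEAP_PROFILE_ALLOCATION_INTERVAL=$(mib_to_bytes "$3") HEAPPROFILE=$2 $1"
}
```

For example, `heap_profile_cmd out/Release/chrome /tmp/heapprofile 16` prints a
command line with the interval set to 16777216 bytes.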
#### Dumping a profile of a running process

To programmatically generate a heap profile before exit, use code like:

    #include "third_party/tcmalloc/chromium/src/google/heap-profiler.h"

    // "foobar" will be included in the message printed to the console
    HeapProfilerDump("foobar");

For example, you might hook that up to some action in the UI.

Or you can use gdb to attach at any point:

1. Attach gdb to the process: `$ gdb -p 12345`
1. Cause it to dump a profile: `(gdb) p HeapProfilerDump("foobar")`
1. The filename will be printed on the console you started Chrome from; e.g.
   "`Dumping heap profile to heap.0001.heap (foobar)`"

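The gdb steps above can probably be collapsed into one non-interactive command.
This is a sketch that relies on gdb's `-batch` and `-ex` options, which are not
part of the original instructions; the helper name `heap_dump_cmd` is
hypothetical, and it prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print a non-interactive gdb command that attaches to
# <pid>, calls HeapProfilerDump("<reason>"), and detaches when gdb exits.
#   $1 - pid of the running Chromium process
#   $2 - short reason string included in the dump message
heap_dump_cmd() {
  pid=$1
  reason=$2
  echo "gdb -p $pid -batch -ex 'p HeapProfilerDump(\"$reason\")'"
}
```

As with the interactive steps, the dump filename is printed on the console that
Chrome was started from, not by gdb.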
#### Analyzing dumps

You can then analyze dumps using the `pprof` script (distributed with
google-perftools, installed by default on Googler Linux workstations; on Ubuntu
it is called `google-pprof`). For example:

    pprof --gv out/Release/chrome /tmp/heapprofile

This will generate a visual representation of the heap profile as a PostScript
file and load it up using `gv`. For more powerful commands, please refer to the
pprof help output and the google-perftools documentation.

(pprof is slow. Googlers can try the not-open-source cpprof; Evan wrote an open
source alternative [available on github](https://github.com/martine/hp).)

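When several dumps have been written, it can help to pick the most recent one
automatically. The sketch below assumes dumps are named `<prefix>.NNNN.heap`
with a zero-padded counter, as in the `heap.0001.heap` console message quoted
earlier; the helper name `latest_heap_dump` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the highest-numbered dump matching
# <prefix>.NNNN.heap, or fail if none exists.
#   $1 - dump filename prefix, e.g. /tmp/heapprofile
latest_heap_dump() {
  latest=
  # The glob expands in sorted order, so the last match has the highest number.
  for f in "$1".*.heap; do
    [ -e "$f" ] && latest=$f
  done
  [ -n "$latest" ] && echo "$latest"
}
```

A possible use is
`pprof --gv out/Release/chrome "$(latest_heap_dump /tmp/heapprofile)"`.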
#### Sandbox

Sandboxed renderer subprocesses will fail to write out heap profiling dumps. To
work around this, turn off the sandbox (via `export CHROME_DEVEL_SANDBOX=`).

#### Troubleshooting

* "Hooked allocator frame not found": build with `-Dcomponent=static_library`.
  `tcmalloc` gets confused when the allocator routines are in a different
  `.so` than the rest of the code.

#### More reading

For further information, please refer to
http://google-perftools.googlecode.com/svn/trunk/doc/heapprofile.html.

## Paint profiling

You can use Xephyr to profile how chrome repaints the screen. Xephyr is a
virtual X server like Xnest, with debugging options that draw red rectangles
where applications are about to draw before drawing the actual content.

    export XEPHYR_PAUSE=10000
    Xephyr :1 -ac -screen 800x600 &
    DISPLAY=:1 out/Debug/chrome

When ready to start debugging, issue the following command, which will tell
Xephyr to start drawing red rectangles:

    kill -USR1 `pidof Xephyr`

For further information, please refer to
http://cgit.freedesktop.org/xorg/xserver/tree/hw/kdrive/ephyr/README.