# Chrome Network Bug Triage : Suggested Workflow

[TOC]

## Identifying unlabeled network bugs on the tracker

* Look at new unconfirmed bugs since noon PST on the last triager's rotation.
  [Use this issue tracker
  query](https://code.google.com/p/chromium/issues/list?can=2&q=status%3Aunconfirmed&sort=-id&num=1000).

* Read the title of the bug.

* If a bug looks like it might be network/download/safe-browsing related,
  middle click (or command-click on OS X) to open it in a new tab.

* If a user provides a crash ID for a crasher for a bug that could be
  net-related, look at the crash stack at
  [go/crash](https://goto.google.com/crash), and see if it looks to be network
  related. Be sure to check if other bug reports have that stack trace, and
  mark as a dupe if so. Even if the bug isn't network related, paste the stack
  trace in the bug, so no one else has to look up the crash stack from the ID.
    * If there's just a blank form and a crash ID, just ignore the bug.

* If network causes are possible, ask for a net-internals log (if it's not a
  browser crash) and attach the most specific Internals>Network subcomponent
  that's applicable. If there isn't an applicable narrower component or a clear
  owner for the issue, or if there are multiple possibilities, attach the
  Internals>Network component and proceed with further investigation.

* If non-network causes also seem possible, attach those components as well.

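The tracker link in the first bullet is an ordinary issue-list URL with a URL-encoded search. As a minimal sketch, assuming the `code.google.com` list endpoint that link points at, the same URL can be rebuilt for a different search string. The helper name `build_triage_query_url` is hypothetical and not part of any Chromium tooling.

```python
from urllib.parse import urlencode

# Assumption: the list endpoint that the tracker link above points at.
TRACKER_LIST_URL = "https://code.google.com/p/chromium/issues/list"


def build_triage_query_url(query="status:unconfirmed", sort="-id", num=1000):
    """Return an issue-list URL for a tracker search (hypothetical helper).

    can=2 restricts the search to open issues, as in the link above.
    """
    params = {"can": 2, "q": query, "sort": sort, "num": num}
    # urlencode percent-encodes ':' and turns spaces into '+'.
    return TRACKER_LIST_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_triage_query_url())
```

With the defaults this reproduces the query string used in the link above; passing a different `query` gives a bookmarkable URL for another search.
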
## Investigate UMA notifications

For each alert that fires, determine if it's a real alert and file a bug if so.

* Don't file if the alert is coincident with a major volume change. The volume
  at a particular date can be determined by hovering the mouse over the
  appropriate location on the alert line.

* Don't file if the alert is on a graph with very low volume (< ~200 data
  points); it's probably noise, and we probably don't care even if it isn't.

* Don't file if the graph is really noisy (but eyeball it to decide if there is
  an underlying important shift under the noise).

* Don't file if the alert is in the "Known Ignorable" list:
    * SimpleCache on Windows
    * DiskCache on Android

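The four "don't file" rules above amount to a short checklist. The sketch below encodes them, assuming the triager has already judged by eye whether there was a major volume change and whether the graph is too noisy. The `UmaAlert` type and its field names are hypothetical, and the "~200 data points" guideline is treated as a hard cutoff purely for illustration.

```python
from dataclasses import dataclass

# Entries from the "Known Ignorable" list above, as (graph family, platform).
KNOWN_IGNORABLE = {("SimpleCache", "Windows"), ("DiskCache", "Android")}

# "< ~200 data points" above; an approximate guideline, used as a cutoff here.
MIN_DATA_POINTS = 200


@dataclass
class UmaAlert:
    """Hypothetical record of one alert, filled in by the triager."""
    graph_family: str          # e.g. "SimpleCache"
    platform: str              # e.g. "Windows"
    data_points: int           # volume read off the alert line
    major_volume_change: bool  # judged by hovering over the alert line
    too_noisy: bool            # judged by eyeballing the graph


def should_file_bug(alert):
    """Return True only if none of the "don't file" rules apply."""
    if alert.major_volume_change:
        return False
    if alert.data_points < MIN_DATA_POINTS:
        return False
    if alert.too_noisy:
        return False
    if (alert.graph_family, alert.platform) in KNOWN_IGNORABLE:
        return False
    return True
```

The two boolean fields stay as human judgments on purpose: the text asks the triager to eyeball the graph rather than apply a formula.
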
## Investigating component=Internals>Network bugs

* It's recommended that while on triage duty, you subscribe to the
  Internals>Network component (but not its subcomponents). To do this, go
  to the issue tracker and then click "Saved Queries".
  Add a query with these settings:
    * Saved query name: Network Bug Triage
    * Project: chromium
    * Query: component=Internals>Network
    * Subscription options: Notify Immediately

* Look through unconfirmed and untriaged component=Internals>Network bugs,
  prioritizing those updated within the last week. [Use this issue tracker
  query](https://bugs.chromium.org/p/chromium/issues/list?can=2&q=component%3DInternals%3ENetwork+status%3AUnconfirmed,Untriaged+-label:Needs-Feedback&sort=-modified).

* If more information is needed from the reporter, ask for it and add the
  Needs-Feedback label.

* While investigating a new issue, change the status to Untriaged.

* If a bug is a potential security issue (allows code execution from a remote
  site, allows crossing security boundaries, unchecked array bounds, etc.),
  mark it Type-Bug-Security. If it has privacy implications (history or cookies
  discoverable by an entity that shouldn't be able to do so, incognito state
  being saved in memory or on disk beyond the lifetime of incognito tabs,
  etc.), mark it with component Privacy.

* For bugs that already have a more specific network component, go ahead and
  remove the Internals>Network component to get them off the next triager's
  radar and move on.

* Try to figure out if it's really a network bug. See the common non-network
  components section for descriptions of common components for issues
  incorrectly tagged as Internals>Network.

* If it's not, attach appropriate labels/components and go no further.

* If it may be a network bug, attach any additional components that may be
  relevant, and continue investigating. Once you either determine it's a
  non-network bug, or figure out accurate, more specific network components,
  your job is done, though you should still ask for a net-internals dump if it
  seems likely to be useful.

* Note that ChromeOS-specific network-related code (captive portal detection,
  connectivity detection, login, etc.) may not all have appropriate more
  specific subcomponents, but is not in areas handled by the network stack
  team. Just make sure those bugs have the OS-Chrome label, and any more
  specific labels if applicable, and then move on.

* Gather data and investigate.
    * Remember to add the Needs-Feedback label whenever waiting for the user to
      respond with more information, and remove it when not waiting on the
      user.
    * Try to reproduce locally. If you can, and it's a regression, use
      src/tools/bisect-builds.py to figure out when it regressed.
    * Ask for more data from the user as needed (net-internals dumps, repro
      case, crash ID from about:crashes, run tests, etc.).
    * If asking for an about:net-internals dump, provide this link:
      https://sites.google.com/a/chromium.org/dev/for-testers/providing-network-details.
      You can just grab the link from about:net-internals, as needed.

* Try to figure out what's going on, and which more specific network component
  is most appropriate.

* If it's a regression, browse through the git history of relevant files to try
  and figure out when it regressed. CC authors / primary reviewers of any
  strongly suspect CLs.

* If you are having trouble with an issue, particularly if you need help
  understanding net-internals logs, email the public [email protected] list for
  help debugging. If it's a crasher, or for some other reason discussion needs
  to be done in private, use chrome-network-debugging@google.com.
  TODO(mmenke): Write up a net-internals tips and tricks doc.

* If it appears to be a bug in the unowned core of the network stack (i.e. no
  subcomponent applies, or only the Internals>Network>HTTP subcomponent
  applies, and there's no clear owner), try to figure out the exact cause.

## Looking for new crashers

1. Go to [go/chromecrash](https://goto.google.com/chromecrash).

2. For each platform, look through the releases to decide which ones to
   investigate. As per [bug-triage.md](bug-triage.md), this should be the most
   recent canary, the previous canary (if the most recent is less than a day
   old), and any of dev/beta/stable that were released in the last couple of
   days.

3. For each release, in the "Process Type" frame, click on "browser".

4. At the bottom of the "Magic Signature" frame, click "limit 1000" (or reduce
   the limit to 100 first, as that's all the triager needs to look at).
   Reported crashers are sorted in decreasing order of the number of reports for
   that crash signature.

5. Search the page for *"net::"*.

6. For each found signature:
    * Ignore signatures that only occur once or twice, as memory corruption can
      easily cause one-off failures when the sample size is large enough. Also
      ignore crashers that are not in the top 100 for that platform / release.
    * If there is a bug already filed, make sure it correctly describes the
      current bug (e.g. not closed, or not describing a long-past issue), and
      make sure that if it is a *net* bug, it is labeled as such.
    * Ignore signatures that only come from one or two client IDs, as individual
      machine malware and breakage can cause one-off failures.
    * Click on the number of reports field to see details of the crash. Ignore
      it if it doesn't appear to be a network bug.
    * Otherwise, file a new bug directly from chromecrash.
    * For each bug you file, include the following information:
        * The backtrace. Note that the backtrace should not be added to the
          bug if Restrict-View-Google isn't set on the bug as it may contain
          PII. Filing the bug from the crash reporter should do this
          automatically, but check.
        * The channel in which the bug is seen (canary/dev/beta/stable), and its
          rank among crashers in the channel.
        * The frequency of this signature in recent releases. This information
          is available by:
            1. Clicking on the signature in the "Magic Signature" list.
            2. Clicking "Edit" on the dremel query at the top of the page.
            3. Removing the "product.version='X.Y.Z.W' AND" string and clicking
               "Update".
            4. Clicking "Limit 1000" in the Product Version list in the
               resulting page (without this, the listing will be restricted to
               the releases in which the signature is most common, which will
               often not include the canary/dev release being investigated).
            5. Choosing some subset of that list, or all of it, to include in
               the bug. Make sure to indicate if there is a defined point in the
               past before which the signature is not present.

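The filtering rules in step 6 can be summarized as a small predicate over the signature list. This is only a sketch: the `CrashSignature` record and its field names are hypothetical, the crash dashboard is not assumed to expose data in this shape, and "once or twice" / "one or two client IDs" are read as "fewer than three".

```python
from dataclasses import dataclass


@dataclass
class CrashSignature:
    """Hypothetical row copied by hand from the "Magic Signature" list."""
    signature: str        # e.g. "net::SomeClass::SomeMethod"
    rank: int             # 1-based position for this platform / release
    report_count: int     # number of crash reports
    client_id_count: int  # number of distinct client IDs


def signatures_to_review(signatures, top_n=100, min_reports=3, min_clients=3):
    """Keep net:: signatures that pass the "ignore" rules from step 6."""
    return [
        s for s in signatures
        if "net::" in s.signature              # step 5: search for "net::"
        and s.rank <= top_n                    # ignore crashers outside top 100
        and s.report_count >= min_reports      # ignore one-off failures
        and s.client_id_count >= min_clients   # ignore single-machine breakage
    ]
```

Anything this keeps still needs the manual checks from the list: look for an existing bug, then open the report details to confirm it really looks like a network bug.
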
## Investigating crashers

* Only investigate crashers that are still occurring, as identified by the
  above section. If a search on go/crash indicates a crasher is no longer
  occurring, mark it as WontFix.

* On Windows, you may want to look for weird DLLs associated with the crashes.
  This generally needs crashes from a fair number of different users to reach
  any conclusions.
    * To get a list of loaded modules in related crash dumps, select
      modules->3rd party in the left pane. It can be difficult to distinguish
      between safe DLLs and those likely to cause problems, but even if you're
      not that familiar with Windows, some may stick out. Anti-virus programs,
      download managers, and more gray hat badware often have meaningful DLL
      names or DLL paths (generally product names or company names). If you
      see one of these in a significant number of the crash dumps, it may well
      be the cause.
    * You can also try selecting the "has malware" option, though that's much
      less reliable than looking manually.

* See if the same users are repeatedly running into the same issue. This can
  be accomplished by searching for (or clicking on) the client ID associated
  with a crash report, and seeing if there are multiple reports for the same
  crash. If this is the case, it may also be malware, or an issue with an
  unusual system/chrome/network config.

* Dig through crash reports to figure out when the crash first appeared, and
  dig through revision history in related files to try and locate a suspect CL.
  TODO(mmenke): Add more detail here.

* Load crash dumps, try to figure out a cause. See
  http://www.chromium.org/developers/crash-reports for more information.
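
Two of the checks above are simple counting exercises once the data has been copied out of the crash reports: how often each third-party module shows up across dumps, and which client IDs appear more than once. The following is a rough sketch with hypothetical function names; it assumes you have the module lists and client IDs in hand as plain Python lists, which the crash tool itself may not provide directly.

```python
from collections import Counter


def module_frequencies(dumps):
    """Return (module, fraction of dumps containing it), most common first.

    `dumps` is a list with one iterable of module names per crash dump.
    Names are lowercased so "Foo.dll" and "foo.dll" count as one module.
    """
    if not dumps:
        return []
    counts = Counter()
    for modules in dumps:
        # A set ensures each module counts at most once per dump.
        counts.update({name.lower() for name in modules})
    total = len(dumps)
    return [(name, n / total) for name, n in counts.most_common()]


def repeat_clients(client_ids, min_reports=2):
    """Return {client_id: report count} for clients seen at least min_reports times."""
    counts = Counter(client_ids)
    return {cid: n for cid, n in counts.items() if n >= min_reports}
```

A module present in most dumps is only a lead, not proof: as noted above, it takes crashes from a fair number of different users before any conclusion is reasonable.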