# Network Traffic Annotations

[TOC]

This document presents a practical guide to using network traffic annotations in
Chrome.


## Problem Statement

To make Chrome’s network communication transparent, we need to be able to
answer the following questions:
* What is the intent behind each network request?
* What user data is sent in the request, and where does it go?

Besides these requirements, the following information helps users, admins, and
help desks:
* How can a network communication be stopped or controlled?
* What are the traces of the communication on the client?

The technical details of requests are not necessarily important to users, but
to provide the intended transparency, we need to show that we have covered all
bases and that there are no back doors.


## The Solution

We can provide up-to-date, in-line documentation on the origin, intent, payload,
and control mechanisms of each network communication. This is done by adding a
`NetworkTrafficAnnotationTag` to all network communication functions.
Because the goal is to specify the intent behind each network request and its
payload, this metadata does not need to be transmitted with the request at
runtime; it is sufficient to have it in appropriate positions in the code.
Requiring it as an argument of all network communication functions is a
mechanism to enforce its existence and to show users our intent to cover the
whole repository.


## Best Practices

### Where to add annotation?
All network requests ultimately send data through sockets or native API
functions, but the concern is the main intent of the communication, not the
implementation details. Therefore we do not need to specify this data
separately for each call to each function used in the process; it is sufficient
to annotate the most rational point of origin and pass the annotation through
the downstream steps.
Good candidates for the annotation code site are places where:
 1. The origin of the user’s intent or the internal requirement for the request
    is stated.
 2. The controls over stopping or limiting the request (Chrome settings or
    policies) are enforced.
 3. The data that is sent is specified.

If these criteria point to different places, please refer to the
`Partial Annotations` section for an approach to splitting the annotation.
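
As a rough illustration of annotating the point of origin and passing the tag downstream, here is a minimal, self-contained sketch. Everything in the `sketch` namespace is hypothetical: `AnnotationTag` only stands in for `net::NetworkTrafficAnnotationTag`, and the function names and URL are made up for the example rather than taken from Chromium.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-in for net::NetworkTrafficAnnotationTag.
struct AnnotationTag {
  std::string unique_id;
};

// Records what the lowest layer was asked to send (illustration only).
struct SentRequest {
  std::string url;
  std::string annotation_id;
};

// Lowest layer (think "socket write"): requires a tag but never defines one.
inline void WriteToNetwork(const std::string& url, const AnnotationTag& tag,
                           std::vector<SentRequest>& log) {
  assert(!tag.unique_id.empty());
  log.push_back({url, tag.unique_id});
}

// Intermediate layer: forwards the caller's tag unchanged.
inline void FetchUrl(const std::string& url, const AnnotationTag& tag,
                     std::vector<SentRequest>& log) {
  WriteToNetwork(url, tag, log);
}

// Point of origin: the feature knows the intent, so the tag is defined here.
inline void CheckSpelling(const std::string& text,
                          std::vector<SentRequest>& log) {
  const AnnotationTag tag{"spellcheck_lookup"};
  // The URL below is a placeholder, not a real endpoint.
  FetchUrl("https://spellcheck.example/lookup?q=" + text, tag, log);
}

}  // namespace sketch
```

Only `CheckSpelling` defines an annotation; the two lower layers merely demand one as an argument, which is the enforcement mechanism described above.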

### Merged Requests
There are cases where requests are received from multiple sources and merged
into one connection, for example when a socket merges several data frames and
sends them together, or when a device location is requested by different
components and just one network request is made to fetch it. In these cases, the
merge point can ensure that all received requests are properly annotated and
pass just one of them to the downstream step. It can also pass a local
annotation stating that it is a merged request on behalf of other requests of
type X, all of which were ensured to have annotations.
This decision is driven by the fact that we do not need to transmit the
annotation metadata at runtime; enforced annotation arguments exist only to
ensure that the request is annotated somewhere upstream.
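
The merge-point idea can be sketched as follows. This is an illustration under stated assumptions, not Chromium code: `AnnotationTag`, `LocationRequest`, `MergeLocationRequests`, and the id `"merged_location_request"` are all hypothetical names invented for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-in for net::NetworkTrafficAnnotationTag.
struct AnnotationTag {
  std::string unique_id;
};

// One upstream request for the device location, with its own annotation.
struct LocationRequest {
  std::string requester;
  AnnotationTag tag;
};

// Merge point: checks that every pending request is annotated, then hands a
// single local annotation to the downstream step. Returns false (and leaves
// *merged untouched) if there is nothing to merge or a request lacks a tag.
inline bool MergeLocationRequests(const std::vector<LocationRequest>& pending,
                                  AnnotationTag* merged) {
  assert(merged != nullptr);
  if (pending.empty()) {
    return false;
  }
  for (const LocationRequest& request : pending) {
    if (request.tag.unique_id.empty()) {
      return false;
    }
  }
  // Hypothetical id for the merged request made on behalf of the others.
  merged->unique_id = "merged_location_request";
  return true;
}

}  // namespace sketch
```

The individual tags are checked but not forwarded, which is acceptable because the annotation does not need to travel with the request at runtime.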


## Coverage
Network traffic annotations are currently enforced on all URL requests and
socket writes, except for code that is not compiled on Windows or Linux.
This effort may expand to ChromeOS in the future; currently there is no plan to
expand it to other platforms.


## Network Traffic Annotation Tag

`net::NetworkTrafficAnnotationTag` is the main definition for annotations. There
are a few variants of it, which are described in later sections. The goal is to
have one object of this type or its variants as an argument of all functions
that create a network request.

### Content of Annotation Tag
Each network traffic annotation should specify the following items, as defined
in the `NetworkTrafficAnnotation` message of
`chrome/browser/privacy/traffic_annotation.proto`:
* `unique_id`: A globally unique identifier that must stay unchanged while the
  network request carries the same semantic meaning. If the network request gets
  a new meaning, this ID needs to be changed. The purpose of this ID is to give
  humans a chance to reference NetworkTrafficAnnotations externally even when
  those change a little bit (e.g. adding a new piece of data that is sent along
  with a network request). IDs of one component should have a shared prefix so
  that sorting all NetworkTrafficAnnotations by unique_id groups those that
  belong to the same component together.
* `source`: This set of fields specifies the location of the annotation in
  the source code. These fields are set automatically and do not need
  specification.
* `semantics`: This set of fields specifies meta information about the
  network request’s content and reason.
  * `sender`: What component triggers the request. The components should be
    human readable and don’t need to reflect the components/ directory. Avoid
    abbreviations, and use a common value for all annotations in one component.
  * `description`: Plaintext description of the network request in language
    that is understandable by admins (ideally also users). Please avoid
    acronyms and describe the feature and the feature's value proposition as
    well.
  * `trigger`: What user action triggered the network request. Use a textual
    description. This should be a human readable string.
  * `data`: What nature of data is being sent. This should be a human readable
    string. Any user data and/or PII should be pointed out.
  * `destination`: Target of the network request. It can be either the website
    that the user visits and interacts with, a Google service, a request that
    does not go to the network and just fetches a local resource, or other
    endpoints like a service hosting PAC scripts. The distinction between a
    Google owned service and a website can be difficult when the user navigates
    to google.com or searches with the omnibox. Therefore use the following
    guideline: If the source code has hardcoded that the request goes to Google
    (e.g. for ZeroSuggest), use `GOOGLE_OWNED_SERVICE`. If the request can go to
    other domains and is perceived as a part of a website rather than a native
    browser feature, use `WEBSITE`. Use `LOCAL` if the request is processed
    locally and doesn't go to the network, otherwise use `OTHER`. If `OTHER` is
    used, please add a plain text description in the `destination_other`
    field.
  * `destination_other`: Human readable description in case the destination
    points to `OTHER`.
  * `contacts`: The email address of a person or team who is the
    point-of-contact for questions, issues, or bugs related to this network
    request. This field is meant for internal use and should not be used in any
    external reports.
* `policy`: This set of fields specifies the controls that a user may have
  on disabling or limiting the network request and its trace.
  * `cookies_allowed`: Specifies if this request stores and uses cookies or
    not. Use values `YES` or `NO`.
  * `cookies_store`: If a request sends or stores cookies/channel IDs/... (i.e.
    if `cookies_allowed` is `YES`), we want to know which cookie store is being
    used. The answer to this question can typically be derived from the
    URLRequestContext that is being used. The most common cases are:
    * If `cookies_allowed` is `NO`, leave this field unset.
    * If the profile's default URLRequestContext is being used (e.g. from
      `Profile::GetRequestContext()`), this means that the user's normal
      cookies are sent. In this case, put `user` here.
    * If the system URLRequestContext is being used (for example via
      `io_thread()->system_url_request_context_getter()`), put `system` here.
    * Otherwise, please explain (e.g. SafeBrowsing uses a separate cookie
      store).
  * `setting`: Human readable description of how a user can enable/disable the
    feature that triggers this network request (e.g. “Disable ‘Use a web
    service to help resolve spelling errors.’ in settings under Advanced”).
    Note that settings look different on different platforms, so make sure your
    description works everywhere!
  * `chrome_policy`: Policy configuration that disables or limits this network
    request. This would be a text serialized protobuf of any enterprise policy.
    See the policy list or
    `out/Debug/gen/components/policy/proto/chrome_settings.proto` for the full
    list of policies.
  * `policy_exception_justification`: If there is no policy to disable or limit
    this request, a justification can be presented here.
* `comments`: If required, any human readable extra comments.

### Format and Examples
Traffic annotations are kept in code as serialized protobuf. To define a
`NetworkTrafficAnnotationTag`, use the function
`net::DefineNetworkTrafficAnnotation` with two arguments: the unique id, and
all other fields bundled together as a serialized protobuf string.

#### Good Examples
```cpp
  net::NetworkTrafficAnnotationTag traffic_annotation =
      net::DefineNetworkTrafficAnnotation("spellcheck_lookup", R"(
        semantics {
          sender: "Online Spellcheck"
          description:
            "Chrome can provide smarter spell-checking by sending text you "
            "type into the browser to Google's servers, allowing you to use "
            "the same spell-checking technology used by Google products, such "
            "as Docs. If the feature is enabled, Chrome will send the entire "
            "contents of text fields as you type in them to Google along with "
            "the browser’s default language. Google returns a list of "
            "suggested spellings, which will be displayed in the context menu."
          trigger: "User types text into a text field or asks to correct a "
                   "misspelled word."
          data: "Text a user has typed into a text field. No user identifier "
                "is sent along with the text."
          destination: GOOGLE_OWNED_SERVICE
        }
        policy {
          cookies_allowed: NO
          setting:
            "You can enable or disable this feature via 'Use a web service to "
            "help resolve spelling errors.' in Chrome's settings under "
            "Advanced. The feature is disabled by default."
          chrome_policy {
            SpellCheckServiceEnabled {
                SpellCheckServiceEnabled: false
            }
          }
        })");
```

```cpp
  net::NetworkTrafficAnnotationTag traffic_annotation2 =
      net::DefineNetworkTrafficAnnotation(
          "safe_browsing_chunk_backup_request",
          R"(
        semantics {
          sender: "Safe Browsing"
          description:
            "Safe Browsing updates its local database of bad sites every 30 "
            "minutes or so. It aims to keep all users up-to-date with the same "
            "set of hash-prefixes of bad URLs."
          trigger:
            "On a timer, approximately every 30 minutes."
          data:
            "The state of the local DB is sent so the server can send just the "
            "changes. This doesn't include any user data."
          destination: GOOGLE_OWNED_SERVICE
        }
        policy {
          cookies_allowed: YES
          cookies_store: "Safe Browsing cookie store"
          setting:
            "Users can disable Safe Browsing by unchecking 'Protect you and "
            "your device from dangerous sites' in Chromium settings under "
            "Privacy. The feature is enabled by default."
          chrome_policy {
            SafeBrowsingEnabled {
              policy_options {mode: MANDATORY}
              SafeBrowsingEnabled: false
            }
          }
        })");
```

#### Bad Examples
```cpp
  net::NetworkTrafficAnnotationTag bad_traffic_annotation =
      net::DefineNetworkTrafficAnnotation(
        ...
        trigger: "Chrome sends this when [obscure event that is not related to "
                 "anything user-perceivable]."
        // Please specify the exact user action that results in this request.
        data: "This sends everything the feature needs to know."
        // Please be precise, name the data items. If they are too many, name
        // the sensitive user data and general classes of other data and refer
        // to a document specifying the details.
        ...
        policy_exception_justification: "None."
        // Check again! Most features can be disabled or limited by a policy.
        ...
      })");
```

#### Empty Template
You can copy/paste the following template to define an annotation.
```cpp
  net::NetworkTrafficAnnotationTag traffic_annotation =
      net::DefineNetworkTrafficAnnotation("...", R"(
        semantics {
          sender: "..."
          description: "..."
          trigger: "..."
          data: "..."
          destination: WEBSITE/GOOGLE_OWNED_SERVICE/LOCAL/OTHER
        }
        policy {
          cookies_allowed: NO/YES
          cookies_store: "..."
          setting: "..."
          chrome_policy {
            [POLICY_NAME] {
                [POLICY_NAME]: ...
            }
          }
          policy_exception_justification: "..."
        }
        comments: "..."
      )");
```


## Testing for errors

Several checks should be done on annotations before submitting a change list.
These checks include:
* The annotations are syntactically correct.
* They have all required fields.
* Partial annotations and completing parts match (please refer to the
  **Partial Annotations** section).
* Annotations are not incorrectly defined.
  * e.g., `traffic_annotation = NetworkTrafficAnnotation({1})`.
* All usages from Chrome have annotations.
* Unique ids are unique throughout history (even if an annotation gets
  deprecated, its unique id cannot be reused, to keep the stats sound).
* The annotation appears in
  `tools/traffic_annotation/summary/grouping.xml`. When adding a new annotation,
  it must also be included in `grouping.xml` for reporting purposes (please
  refer to **Annotations Review**).
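
To make the uniqueness requirement concrete, the following sketch reports every id that occurs more than once in a combined list of current and deprecated ids. It is only an illustration of the rule; the real checks are performed by the tooling described below, and the names `FindDuplicateIds` and `CheckIdsUnique` are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

namespace sketch {

// Returns each id that appears more than once in `ids`, listed once, in the
// order in which its first repetition is encountered. `ids` is assumed to
// hold both current and deprecated unique ids.
inline std::vector<std::string> FindDuplicateIds(
    const std::vector<std::string>& ids) {
  std::set<std::string> seen;
  std::set<std::string> reported;
  std::vector<std::string> duplicates;
  for (const std::string& id : ids) {
    const bool is_new = seen.insert(id).second;
    if (!is_new && reported.insert(id).second) {
      duplicates.push_back(id);
    }
  }
  return duplicates;
}

// Debug-time guard built on the function above.
inline void CheckIdsUnique(const std::vector<std::string>& ids) {
  assert(FindDuplicateIds(ids).empty());
}

}  // namespace sketch
```

Including deprecated ids in the input is what enforces uniqueness through history rather than only among live annotations.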


### Presubmit tests
To run tests before submitting, use the `auditor.py` script. It runs over the
whole repository, extracts all the annotations from C++ code, and then checks
them for correctness.

Running the `auditor.py` script requires a build directory in which you just
built the `chrome` target. You can invoke it like this:
`vpython3 tools/traffic_annotation/scripts/auditor/auditor.py
  --build-path=out/Default`

### Waterfall tests
Two commit queue trybots test traffic annotations on changed files using the
scripts in `tools/traffic_annotation/scripts`. To run these tests faster and to
avoid spamming the commit queue if an unforeseen error has happened in
downstream scripts or tools, they are run in error-resilient mode, only on
changed files, and using heuristics to decide which files to process.
An FYI bot runs more detailed tests on the whole repository and with different
switches, to make sure that the heuristics the trybot tests use and their
limited scope have not missed any issues.


## Annotations Review

Network traffic annotations require review before landing in code. This is
enforced by keeping a summary of annotations in
`tools/traffic_annotation/summary/annotations.xml`. Whenever an annotation is
added, updated, or deleted, this file should also be updated. To update the
`annotations.xml` file automatically, run `auditor.py` as specified in
presubmit tests. If that is not possible (e.g., if you are changing the code
from an unsupported platform or you don’t have a compiled build directory), the
code can be submitted to the trybot, and the test on the trybot will tell you
the required modifications.

To make external reports easier, annotation unique ids should be
mentioned in `tools/traffic_annotation/summary/grouping.xml`. Whenever a new
annotation is added, or a preexisting annotation's unique id changes, this file
should also be updated. When adding a new annotation, make sure it is placed
within an appropriate group of `grouping.xml`. In the rare case that none of
the groups are appropriate, you can create a new group for the annotation; the
arrangement of annotations and group names in `grouping.xml` may later be
updated by a technical writer to better match the external reports.

## Partial Annotations (Advanced)

There are cases where the network traffic annotation cannot be fully specified
in one place. For example, in one place we know the trigger of a network request
and in another place we know the data that will be sent. In these cases, we
prefer that both parts of the annotation appear in context so that they are
updated if the code changes. Partial annotations help split the network traffic
annotation into two pieces. We call the first part the partial annotation, and
the part that completes it the completing annotation. Partial annotations and
completing annotations do not need to have all annotation fields, but their
composition should have all required fields.
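
The composition rule can be illustrated with a small sketch that merges two field sets and checks the result against a caller-supplied list of required field names. This is not how Chromium's tooling is implemented; `Fields`, `Compose`, `HasAllFields`, and `ComposeChecked` are hypothetical, and keeping the partial annotation's value when both parts set the same field is an arbitrary choice made for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

namespace sketch {

// Field name -> value, e.g. {"sender", "Omnibox"}.
using Fields = std::map<std::string, std::string>;

// Combines a partial annotation with its completing annotation. If both set
// the same field, the partial annotation's value is kept (map::insert does
// not overwrite existing keys).
inline Fields Compose(const Fields& partial, const Fields& completing) {
  Fields combined = partial;
  combined.insert(completing.begin(), completing.end());
  return combined;
}

// True if every name in `required` is present with a non-empty value.
inline bool HasAllFields(const Fields& fields,
                         const std::vector<std::string>& required) {
  for (const std::string& name : required) {
    const auto it = fields.find(name);
    if (it == fields.end() || it->second.empty()) {
      return false;
    }
  }
  return true;
}

// Debug-time guard: composes and asserts that the result is complete.
inline Fields ComposeChecked(const Fields& partial, const Fields& completing,
                             const std::vector<std::string>& required) {
  Fields combined = Compose(partial, completing);
  assert(HasAllFields(combined, required));
  return combined;
}

}  // namespace sketch
```

Neither part needs to be complete on its own; only the combined result is checked.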

### Defining a Partial Annotation
To define a partial annotation, use the
`net::DefinePartialNetworkTrafficAnnotation` function. Besides the unique id and
annotation text, this function requires the unique id of the completing part.
For example, a partial annotation that only specifies the semantics part of a
request with unique id "omnibox_prefetch_image", and is completed later using an
annotation with unique id "bitmap_fetcher", can be defined as follows:

```cpp
  net::PartialNetworkTrafficAnnotationTag partial_traffic_annotation =
      net::DefinePartialNetworkTrafficAnnotation("omnibox_prefetch_image",
                                                 "bitmap_fetcher", R"(
        semantics {
          sender: "Omnibox"
          description: "..."
          trigger: "..."
          data: "..."
          destination: WEBSITE
        })");
```

### Nx1 Partial Annotations
The cases where several partial annotations may be completed by one completing
annotation are called Nx1. This also covers the case where N=1. To define a
completing annotation for such cases, use the
`net::CompleteNetworkTrafficAnnotation` function. This function receives a
unique id, the annotation text, and a `net::PartialNetworkTrafficAnnotationTag`
object. Here is an example of a completing part for the previous example:

```cpp
  net::NetworkTrafficAnnotationTag traffic_annotation =
      net::CompleteNetworkTrafficAnnotation("bitmap_fetcher",
                                            partial_traffic_annotation, R"(
        policy {
          cookies_allowed: YES
          cookies_store: "user"
          setting: "..."
          chrome_policy {...}
        })");
```

### 1xN Partial Annotations
There are cases where one partial traffic annotation may be completed by
different completing annotations. In these cases, the
`net::BranchedCompleteNetworkTrafficAnnotation` function can be used. This
function has an extra argument that is common between all branches and is
referred to by the partial annotation. For the above examples, if there were
two different ways of completing the received partial annotation, the following
definition could be used:

```cpp
  if (...) {
    return net::BranchedCompleteNetworkTrafficAnnotation(
        "bitmap_fetcher_type1", "bitmap_fetcher",
        partial_traffic_annotation, R"(
          policy {
            cookies_allowed: YES
            cookies_store: "user"
            setting: "..."
            chrome_policy {...}
          })");
  } else {
    return net::BranchedCompleteNetworkTrafficAnnotation(
        "bitmap_fetcher_type2", "bitmap_fetcher",
        partial_traffic_annotation, R"(
          policy {
            cookies_allowed: YES
            cookies_store: "system"
            setting: "..."
            chrome_policy {...}
          })");
  }
```

Please refer to `tools/traffic_annotation/sample_traffic_annotation.cc` for more
detailed examples.


## Mutable Annotations (Advanced)

`net::NetworkTrafficAnnotationTag` and `net::PartialNetworkTrafficAnnotationTag`
are defined with constant internal argument(s), so that once they are created,
they cannot be modified. There are very few exceptions that may require
modification of the annotation value, like the ones used by mojo interfaces
where, after serialization, the annotation object is first created and then
receives its value. In these cases, `net::MutableNetworkTrafficAnnotationTag`
and `net::MutablePartialNetworkTrafficAnnotationTag` can be used, which do not
have this limitation.

Mutable annotations have a run time check before being converted into normal
annotations to ensure their content is valid. Therefore they should be used
only if there is no other way around it. Use cases are checked with
`auditor.py` to ensure proper initialization values for the mutable
annotations.
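
The difference between the two kinds of tags, and the run time check at conversion, can be pictured with the sketch below. The types are hypothetical stand-ins and do not reproduce the members of `net::MutableNetworkTrafficAnnotationTag`; the sentinel value `kUnsetHash` is likewise an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Hypothetical sentinel meaning "no value assigned yet".
constexpr std::int32_t kUnsetHash = -1;

// Stand-in for an immutable tag: its value cannot change after creation.
struct Tag {
  const std::int32_t unique_id_hash;
};

// Stand-in for a mutable tag: created empty, filled in later.
struct MutableTag {
  std::int32_t unique_id_hash = kUnsetHash;

  bool IsValid() const { return unique_id_hash != kUnsetHash; }
};

// Converts a mutable tag into an immutable one, checking validity first.
// The check is a debug assertion in this sketch.
inline Tag ToImmutable(const MutableTag& mutable_tag) {
  assert(mutable_tag.IsValid());
  return Tag{mutable_tag.unique_id_hash};
}

}  // namespace sketch
```

The extra check exists only on the mutable path, which is one reason to prefer the immutable tags whenever the value is known at creation time.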


## Mojo Interfaces (Advanced)

For serialization of network traffic annotation and partial network traffic
annotation tags, you can use the mutable mojo interfaces defined in
`/services/network/public/mojom`.
Nicolas Arciniega15745e22019-10-09 20:42:35455`/services/network/public/mojom`.