# Network Traffic Annotations

[TOC]

This document presents a practical guide to using network traffic annotations in
Chrome.


## Problem Statement

To make Chrome’s network communication transparent, we need to be able to
answer the following questions:
* What is the intent behind each network request?
* What user data is sent in the request, and where does it go?

Besides these requirements, the following information helps users, admins, and
help desk:
* How can a network communication be stopped or controlled?
* What are the traces of the communication on the client?

The technical details of requests are not necessarily important to users, but
to provide the intended transparency, we need to show that we have covered all
bases and that there are no back doors.


## The Solution

We can provide up-to-date, in-line documentation on the origin, intent, payload,
and control mechanisms of each network communication. This is done by adding a
`NetworkTrafficAnnotationTag` to all network communication functions.
Because the goal is to specify the intent behind each network request and its
payload, this metadata does not need to be transmitted with the request at
runtime; it is sufficient to have it in appropriate positions in the code.
Requiring it as an argument of all network communication functions enforces its
existence and shows users our intent to cover the whole repository.
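
The enforcement idea can be sketched outside of Chrome as follows. This is a
minimal illustration under stated assumptions, not Chrome's implementation:
`AnnotationTag` is a stand-in for `net::NetworkTrafficAnnotationTag`, and
`SendRequest`, `FetchSuggestions`, and the example URL are hypothetical names
invented for this sketch.

```cpp
#include <cassert>
#include <string>

// Stand-in for net::NetworkTrafficAnnotationTag (assumption, illustration
// only). There is no default constructor, so a tag must be created on purpose.
struct AnnotationTag {
  explicit constexpr AnnotationTag(int hash) : unique_id_hash_code(hash) {}
  const int unique_id_hash_code;
};

// Hypothetical network entry point. The tag is a required parameter with no
// default value, so a call site without an annotation does not compile.
inline std::string SendRequest(const std::string& url,
                               const AnnotationTag& tag) {
  assert(!url.empty());
  // The tag is not written to the wire; it only documents the call site.
  (void)tag;
  return "GET " + url;
}

// Hypothetical intermediate layer. It forwards the tag it received instead of
// creating a new one, so the annotation stays at the point of origin.
inline std::string FetchSuggestions(const std::string& query,
                                    const AnnotationTag& tag) {
  return SendRequest("https://example.test/suggest?q=" + query, tag);
}
```

A caller would create one tag where the intent is known and pass it along, for
example `FetchSuggestions("a", AnnotationTag(42))`. Omitting the second argument
is a compile error, which is the enforcement mechanism described above.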


## Best Practices

### Where to add annotation?
All network requests ultimately send data through sockets or native API
functions, but the concern is the main intent of the communication and not the
implementation details. Therefore we do not need to specify this data
separately for each call to each function used in the process; it is
sufficient to annotate the most rational point of origin and pass the
annotation through the downstream steps.
Good candidates for the annotation code site are places where:
 1. The origin of user’s intent or internal requirement for the request is
    stated.
 2. The controls over stopping or limiting the request (Chrome settings or
    policies) are enforced.
 3. The data that is sent is specified.

If these criteria point to different places, please refer to the
`Partial Annotations` section for an approach to split the annotation.

### Merged Requests
There are cases where requests are received from multiple sources and merged
into one connection, like when a socket merges several data frames and sends
them together, or when a device location is requested by different components
and just one network request is made to fetch it. In these cases, the merge
point can ensure that all received requests are properly annotated and pass just
one of them to the downstream step. It can also pass a local annotation stating
that it is a merged request on behalf of other requests of type X, all of which
were ensured to have annotations.
This decision follows from the fact that we do not need to transmit the
annotation metadata at runtime; the enforced annotation arguments exist only to
ensure that the request is annotated somewhere upstream.
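
A merge point of this kind might look like the sketch below. All types and
names here (`AnnotationTag`, `PendingRequest`, `MergedRequest`,
`MergeRequests`) are hypothetical stand-ins and not part of Chrome. The sketch
takes the first option described above: every incoming request carries a tag,
and the tag of the first request is passed downstream.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for net::NetworkTrafficAnnotationTag (assumption).
struct AnnotationTag {
  int unique_id_hash_code;
};

// Hypothetical incoming request. The tag is a mandatory member, so every
// request that reaches the merge point is annotated.
struct PendingRequest {
  std::string payload;
  AnnotationTag tag;
};

// Hypothetical merged request handed to the downstream step.
struct MergedRequest {
  std::string payload;
  AnnotationTag tag;
};

// Concatenates the payloads and forwards the tag of the first request. The
// other tags are dropped here, which is fine because they are never sent over
// the network and only prove that each source was annotated upstream.
inline MergedRequest MergeRequests(
    const std::vector<PendingRequest>& requests) {
  assert(!requests.empty());
  MergedRequest merged{"", requests.front().tag};
  for (const PendingRequest& request : requests) {
    merged.payload += request.payload;
  }
  return merged;
}
```

The alternative described above, passing a local annotation that names the
merged request type, would replace `requests.front().tag` with a tag defined at
the merge point.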


## Coverage
Network traffic annotations are currently enforced on all URL requests and
socket writes, except for code that is not compiled on Windows or Linux.
This effort may expand to ChromeOS in the future; there is currently no plan to
expand it to other platforms.


## Network Traffic Annotation Tag

`net::NetworkTrafficAnnotationTag` is the main definition for annotations. There
are a few variants of it, which are specified in later sections. The goal is to
have one object of this type or one of its variants as an argument of all
functions that create a network request.

85### Content of Annotation Tag
Glenn Hartmanndb9723b32022-01-07 14:57:2586Each network traffic annotation should specify the following items, as defined
87in the `NetworkTrafficAnnotation` message of
88`chrome/browser/privacy/traffic_annotation.proto`:
Ramin Halavati2e7ffe4f2017-11-13 11:19:3589* `uniqueـid`: A globally unique identifier that must stay unchanged while the
90 network request carries the same semantic meaning. If the network request gets
91 a new meaning, this ID needs to be changed. The purpose of this ID is to give
92 humans a chance to reference NetworkTrafficAnnotations externally even when
93 those change a little bit (e.g. adding a new piece of data that is sent along
94 with a network request). IDs of one component should have a shared prefix so
95 that sorting all NetworkTrafficAnnotations by unique_id groups those that
96 belong to the same component together.
Glenn Hartmanndb9723b32022-01-07 14:57:2597* `source`: These set of fields specify the location of annotation in
Ramin Halavati2e7ffe4f2017-11-13 11:19:3598 the source code. These fields are automatically set and do not need
99 specification.
Glenn Hartmanndb9723b32022-01-07 14:57:25100* `semantics`: These set of fields specify meta information about the
Ramin Halavati2e7ffe4f2017-11-13 11:19:35101 network request’s content and reason.
102 * `sender`: What component triggers the request. The components should be
103 human readable and don’t need to reflect the components/ directory. Avoid
Ramin Halavati6c1526e2018-04-06 05:44:38104 abbreviations, and use a common value for all annotations in one component.
Ramin Halavati2e7ffe4f2017-11-13 11:19:35105 * `description`: Plaintext description of the network request in language
106 that is understandable by admins (ideally also users). Please avoid
107 acronyms and describe the feature and the feature's value proposition as
108 well.
109 * `trigger`: What user action triggered the network request. Use a textual
110 description. This should be a human readable string.
111 * `data`: What nature of data is being sent. This should be a human readable
112 string. Any user data and/or PII should be pointed out.
113 * `destination`: Target of the network request. It can be either the website
114 that user visits and interacts with, a Google service, a request that does
115 not go to network and just fetches a local resource, or other endpoints
116 like a service hosting PAC scripts. The distinction between a Google owned
117 service and website can be difficult when the user navigates to google.com
118 or searches with the omnibar. Therefore follow the following guideline: If
119 the source code has hardcoded that the request goes to Google (e.g. for
120 ZeroSuggest), use `GOOGLE_OWNED_SERVICE`. If the request can go to other
121 domains and is perceived as a part of a website rather than a native
Nicolas Arciniega15745e22019-10-09 20:42:35122 browser feature, use `WEBSITE`. Use `LOCAL` if the request is processed
Ramin Halavati2e7ffe4f2017-11-13 11:19:35123 locally and doesn't go to network, otherwise use `OTHER`. If `OTHER` is
124 used, please add plain text description in `destination_other`
125 field.
126 * `destination_other`: Human readable description in case the destination
127 points to `OTHER`.
Glenn Hartmanndb9723b32022-01-07 14:57:25128* `policy`: These set of fields specify the controls that a user may have
Ramin Halavati2e7ffe4f2017-11-13 11:19:35129 on disabling or limiting the network request and its trace.
Ramin Halavati2e7ffe4f2017-11-13 11:19:35130 * `cookies_allowed`: Specifies if this request stores and uses cookies or
131 not. Use values `YES` or `NO`.
132 * `cookies_store`: If a request sends or stores cookies/channel IDs/... (i.e.
133 if `cookies_allowed` is true), we want to know which cookie store is being
134 used. The answer to this question can typically be derived from the
135 URLRequestContext that is being used. The three most common cases will be:
136 * If `cookies_allowed` is false, leave this field unset.
137 * If the profile's default URLRequestContext is being used (e.g. from
138 `Profile::GetRequestContext())`, this means that the user's normal
139 cookies sent. In this case, put `user` here.
140 * If the system URLRequestContext is being used (for example via
141 `io_thread()->system_url_request_context_getter())`, put `system` here.
142 * Otherwise, please explain (e.g. SafeBrowsing uses a separate cookie
143 store).
144 * `setting`: Human readable description of how to enable/disable a feature
145 that triggers this network request by a user (e.g. “Disable ‘Use a web
146 service to help resolve spelling errors.’ in settings under Advanced”).
147 Note that settings look different on different platforms, make sure your
148 description works everywhere!
149 * `chrome_policy`: Policy configuration that disables or limits this network
150 request. This would be a text serialized protobuf of any enterprise policy.
Glenn Hartmanndb9723b32022-01-07 14:57:25151 See policy list or
152 `out/Debug/gen/components/policy/proto/chrome_settings.proto` for the full
153 list of policies.
Ramin Halavati2e7ffe4f2017-11-13 11:19:35154 * `policy_exception_justification`: If there is no policy to disable or limit
Ramin Halavatic7c0bb032018-01-30 16:44:48155 this request, a justification can be presented here.
Ramin Halavati2e7ffe4f2017-11-13 11:19:35156* `comments`: If required, any human readable extra comments.
157
### Format and Examples
Traffic annotations are kept in code as serialized protobuf. To define a
`NetworkTrafficAnnotationTag`, use the function
`net::DefineNetworkTrafficAnnotation` with two arguments: the unique id, and
all other fields bundled together as a serialized protobuf string.

#### Good examples
```cpp
  net::NetworkTrafficAnnotationTag traffic_annotation =
      net::DefineNetworkTrafficAnnotation("spellcheck_lookup", R"(
        semantics {
          sender: "Online Spellcheck"
          description:
            "Chrome can provide smarter spell-checking by sending text you "
            "type into the browser to Google's servers, allowing you to use "
            "the same spell-checking technology used by Google products, such "
            "as Docs. If the feature is enabled, Chrome will send the entire "
            "contents of text fields as you type in them to Google along with "
            "the browser’s default language. Google returns a list of "
            "suggested spellings, which will be displayed in the context menu."
          trigger: "User types text into a text field or asks to correct a "
                   "misspelled word."
          data: "Text a user has typed into a text field. No user identifier "
                "is sent along with the text."
          destination: GOOGLE_OWNED_SERVICE
        }
        policy {
          cookies_allowed: NO
          setting:
            "You can enable or disable this feature via 'Use a web service to "
            "help resolve spelling errors.' in Chrome's settings under "
            "Advanced. The feature is disabled by default."
          chrome_policy {
            SpellCheckServiceEnabled {
                SpellCheckServiceEnabled: false
            }
          }
        })");
```

```cpp
  net::NetworkTrafficAnnotationTag traffic_annotation2 =
      net::DefineNetworkTrafficAnnotation(
          "safe_browsing_chunk_backup_request",
          R"(
          semantics {
            sender: "Safe Browsing"
            description:
              "Safe Browsing updates its local database of bad sites every 30 "
              "minutes or so. It aims to keep all users up-to-date with the same "
              "set of hash-prefixes of bad URLs."
            trigger:
              "On a timer, approximately every 30 minutes."
            data:
              "The state of the local DB is sent so the server can send just the "
              "changes. This doesn't include any user data."
            destination: GOOGLE_OWNED_SERVICE
          }
          policy {
            cookies_allowed: YES
            cookies_store: "Safe Browsing cookie store"
            setting:
              "Users can disable Safe Browsing by unchecking 'Protect you and "
              "your device from dangerous sites' in Chromium settings under "
              "Privacy. The feature is enabled by default."
            chrome_policy {
              SafeBrowsingEnabled {
                policy_options {mode: MANDATORY}
                SafeBrowsingEnabled: false
              }
            }
          })");
```

#### Bad Examples
```cpp
  net::NetworkTrafficAnnotationTag bad_traffic_annotation =
      net::DefineNetworkTrafficAnnotation(
          ...
          trigger: "Chrome sends this when [obscure event that is not related to "
                   "anything user-perceivable]."
          // Please specify the exact user action that results in this request.
          data: "This sends everything the feature needs to know."
          // Please be precise, name the data items. If they are too many, name
          // the sensitive user data and general classes of other data and refer
          // to a document specifying the details.
          ...
          policy_exception_justification: "None."
          // Check again! Most features can be disabled or limited by a policy.
          ...
          })");
```

#### Empty Template
You can copy/paste the following template to define an annotation.
```cpp
  net::NetworkTrafficAnnotationTag traffic_annotation =
      net::DefineNetworkTrafficAnnotation("...", R"(
        semantics {
          sender: "..."
          description: "..."
          trigger: "..."
          data: "..."
          destination: WEBSITE/GOOGLE_OWNED_SERVICE/LOCAL/OTHER
        }
        policy {
          cookies_allowed: NO/YES
          cookies_store: "..."
          setting: "..."
          chrome_policy {
            [POLICY_NAME] {
                [POLICY_NAME]: ...
            }
          }
          policy_exception_justification: "..."
        }
        comments: "..."
      )");
```


## Testing for errors

There are several checks that should be done on annotations before submitting a
change list. These checks include:
* The annotations are syntactically correct.
* They have all required fields.
* Partial annotations and completing parts match (please refer to the
  **Partial Annotations** section).
* Annotations are not incorrectly defined.
  * e.g., `traffic_annotation = NetworkTrafficAnnotation({1})`.
* All usages from Chrome have annotations.
* Unique ids are unique throughout history (even if an annotation gets
  deprecated, its unique id cannot be reused, to keep the stats sound).
* The annotation appears in
  `tools/traffic_annotation/summary/grouping.xml`. When adding a new annotation,
  it must also be included in `grouping.xml` for reporting purposes (please
  refer to the **Annotations Review** section).
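
To make the uniqueness rule concrete, the following sketch flags unique ids that
are either duplicated among current annotations or reuse a deprecated id. It is
an illustration only and is not how the Chrome tooling implements the check;
the function name `FindUniqueIdConflicts` and its inputs are assumptions made
for this example.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Returns the ids in `current` that violate the uniqueness rule: ids that
// occur more than once, and ids that were already used by a deprecated
// annotation. The result is sorted and contains each offending id once.
inline std::vector<std::string> FindUniqueIdConflicts(
    const std::vector<std::string>& current,
    const std::set<std::string>& deprecated) {
  std::set<std::string> seen;
  std::set<std::string> conflicts;
  for (const std::string& id : current) {
    assert(!id.empty());
    // insert() reports through `.second` whether the id was new.
    const bool duplicate = !seen.insert(id).second;
    if (duplicate || deprecated.count(id) > 0) {
      conflicts.insert(id);
    }
  }
  return std::vector<std::string>(conflicts.begin(), conflicts.end());
}
```

Keeping deprecated ids in a separate set reflects the requirement that an id
stays reserved after its annotation is removed.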


### Presubmit tests
To run tests before submitting, use the `auditor.py` script. It runs over the
whole repository, extracts all the annotations from C++ code, and then checks
them for correctness.

Running the `auditor.py` script requires a build directory in which you just
built the `chrome` target. You can invoke it like this:
`vpython3 tools/traffic_annotation/scripts/auditor/auditor.py
  --build-path=out/Default`

### Waterfall tests
Two commit queue trybots test traffic annotations on changed files using the
scripts in `tools/traffic_annotation/scripts`. To run these tests faster and to
avoid spamming the commit queue if an unforeseen error happens in downstream
scripts or tools, they are run in error resilient mode, only on changed files,
and using heuristics to decide which files to process.
An FYI bot runs more detailed tests on the whole repository and with different
switches, to make sure that the heuristics used by the trybot tests and their
limited scope have not missed any issues.


## Annotations Review

Network traffic annotations require review before landing in code. This is
enforced by keeping a summary of annotations in
`tools/traffic_annotation/summary/annotations.xml`. Whenever an annotation is
added, updated, or deleted, this file should also be updated. To update the
`annotations.xml` file automatically, run `auditor.py` as specified in
presubmit tests. If that is not possible (e.g., if you are changing the code
from an unsupported platform or you don’t have a compiled build directory), the
code can be submitted to the trybot and the test on the trybot will tell you
the required modifications.

To make external reports easier, annotation unique ids should be
mentioned in `tools/traffic_annotation/summary/grouping.xml`. Whenever a new
annotation is added, or a preexisting annotation's unique id changes, this file
should also be updated. When adding a new annotation, make sure it is placed
within an appropriate group of `grouping.xml`. In the rare case that none of
the groups are appropriate, you can create a new group for the annotation; the
arrangement of annotations and group names in `grouping.xml` may later be
updated by a technical writer to better match the external reports.

## Partial Annotations (Advanced)

There are cases where the network traffic annotation cannot be fully specified
in one place. For example, in one place we know the trigger of a network request
and in another place we know the data that will be sent. In these cases, we
prefer that both parts of the annotation appear in context so that they are
updated if the code changes. Partial annotations help split the network traffic
annotation into two pieces. We call the first part the partial annotation, and
the part that completes it the completing annotation. Partial annotations and
completing annotations do not need to have all annotation fields, but their
composition should have all required fields.
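
The composition rule can be pictured as a set operation: a required field is
missing only if neither part provides it. The sketch below is a hypothetical
illustration and not the check performed by the Chrome tooling; `FieldSet`,
`MissingFields`, and the choice of which fields count as required are all
assumptions supplied by the caller.

```cpp
#include <cassert>
#include <set>
#include <string>

// Names of the annotation fields that a piece of annotation text provides.
using FieldSet = std::set<std::string>;

// Returns the required fields that appear in neither the partial annotation
// nor the completing annotation. An empty result means the composition is
// complete with respect to the `required` list passed in.
inline FieldSet MissingFields(const FieldSet& partial,
                              const FieldSet& completing,
                              const FieldSet& required) {
  assert(!required.empty());
  FieldSet missing;
  for (const std::string& field : required) {
    if (partial.count(field) == 0 && completing.count(field) == 0) {
      missing.insert(field);
    }
  }
  return missing;
}
```

For instance, if the partial annotation carries only `semantics` fields and the
completing annotation carries only `cookies_allowed`, a `required` list that
also contains `setting` would yield `setting` as the missing field.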

### Defining a Partial Annotation
To define a partial annotation, use the
`net::DefinePartialNetworkTrafficAnnotation` function. Besides the unique id and
annotation text, this function requires the unique id of the completing part.
For example, a partial annotation that only specifies the semantics part of a
request with unique id "omnibox_prefetch_image", and is completed later using an
annotation with unique id "bitmap_fetcher", can be defined as follows:

```cpp
  net::PartialNetworkTrafficAnnotationTag partial_traffic_annotation =
      net::DefinePartialNetworkTrafficAnnotation("omnibox_prefetch_image",
                                                 "bitmap_fetcher", R"(
        semantics {
          sender: "Omnibox"
          description: "..."
          trigger: "..."
          data: "..."
          destination: WEBSITE
        })");
```

### Nx1 Partial Annotations
The cases where several partial annotations may be completed by one completing
annotation are called Nx1. This also covers the case where N=1. To define a
completing annotation for such cases, use the
`net::CompleteNetworkTrafficAnnotation` function. This function receives a
unique id, the annotation text, and a
`net::PartialNetworkTrafficAnnotationTag` object. Here is an example of a
completing part for the previous example:

```cpp
  net::NetworkTrafficAnnotationTag traffic_annotation =
      net::CompleteNetworkTrafficAnnotation("bitmap_fetcher",
                                            partial_traffic_annotation, R"(
        policy {
          cookies_allowed: YES
          cookies_store: "user"
          setting: "..."
          chrome_policy {...}
        })");
```

### 1xN Partial Annotations
There are cases where one partial traffic annotation may be completed by
different completing annotations. In these cases, the
`net::BranchedCompleteNetworkTrafficAnnotation` function can be used. This
function has an extra argument that is common between all branches and is
referred to by the partial annotation. For the above examples, if there were
two different ways of completing the received partial annotation, the following
definition can be used:

```cpp
if (...) {
  return net::BranchedCompleteNetworkTrafficAnnotation(
      "bitmap_fetcher_type1", "bitmap_fetcher",
      partial_traffic_annotation, R"(
        policy {
          cookies_allowed: YES
          cookies_store: "user"
          setting: "..."
          chrome_policy {...}
        })");
} else {
  return net::BranchedCompleteNetworkTrafficAnnotation(
      "bitmap_fetcher_type2", "bitmap_fetcher",
      partial_traffic_annotation, R"(
        policy {
          cookies_allowed: YES
          cookies_store: "system"
          setting: "..."
          chrome_policy {...}
        })");
}
```

Please refer to `tools/traffic_annotation/sample_traffic_annotation.cc` for more
detailed examples.


## Mutable Annotations (Advanced)

`net::NetworkTrafficAnnotationTag` and `net::PartialNetworkTrafficAnnotationTag`
are defined with constant internal argument(s), so that once they are created,
they cannot be modified. There are very few exceptions that may require
modification of the annotation value, like the ones used by mojo interfaces
where after serialization, the annotation object is first created and then
receives its value. In these cases, `net::MutableNetworkTrafficAnnotationTag`
and `net::MutablePartialNetworkTrafficAnnotationTag` can be used, which do not
have this limitation.

Mutable annotations have a run time check before being converted into normal
annotations to ensure their content is valid. Therefore they should be used
only if there is no other way around it. Use cases are checked with
`auditor.py` to ensure proper initialization values for the mutable
annotations.
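
The relationship between a mutable tag and its run time check can be sketched
as below. This is a simplified stand-in written for this guide, not the
definition in Chrome's sources: the `sketch` namespace, the `kUninitialized`
sentinel, and the `ToImmutable` method are assumptions chosen for illustration.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical sentinel marking a mutable tag that has not received a value.
constexpr int kUninitialized = -1;

// Stand-in for the immutable tag: its content is fixed at creation.
struct AnnotationTag {
  const int unique_id_hash_code;
};

// Stand-in for the mutable tag: it can be created first and filled in later,
// for example after a message has been received and unpacked.
struct MutableAnnotationTag {
  int unique_id_hash_code = kUninitialized;

  bool is_valid() const { return unique_id_hash_code != kUninitialized; }

  // Converting to the immutable form checks validity at run time, which is
  // the extra cost that immutable tags avoid.
  AnnotationTag ToImmutable() const {
    assert(is_valid());
    return AnnotationTag{unique_id_hash_code};
  }
};

}  // namespace sketch
```

Because the check only fires at run time, a forgotten assignment is caught
later than it would be with an immutable tag, which is one reason to prefer
the immutable types where possible.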


## Mojo Interfaces (Advanced)

For serialization of network traffic annotation and partial network traffic
annotation tags, you can use the mutable mojo interfaces defined in
`/services/network/public/mojom`.
Nicolas Arciniega15745e22019-10-09 20:42:35451`/services/network/public/mojom`.