# Network Traffic Annotations

[TOC]

This document presents a practical guide to using network traffic annotations in
Chrome.


## Problem Statement

To make Chrome’s network communication transparent, we need to be able to
answer the following questions:
* What is the intent behind each network request?
* What user data is sent in the request, and where does it go?

Beyond these requirements, the following information helps users, admins, and
help desk:
* How can a network communication be stopped or controlled?
* What are the traces of the communication on the client?

The technical details of requests are not necessarily important to users, but
to provide the intended transparency, we need to show that we have covered all
bases and that there are no back doors.


## The Solution

We can provide up-to-date, in-line documentation on the origin, intent, payload,
and control mechanisms of each network communication. This is done by adding a
`NetworkTrafficAnnotationTag` to all network communication functions.
Because the goal is to specify the intent behind each network request and its
payload, this metadata does not need to be transmitted with the request at
runtime; it is sufficient to have it in appropriate positions in the code.
Requiring it as an argument of all network communication functions is a
mechanism to enforce its existence and to show users our intent to cover the
whole repository.


## Best Practices

### Where to add annotation?
All network requests ultimately send data through sockets or native API
functions, but the concern is the main intent of the communication, not the
implementation details. Therefore we do not need to specify this data separately
for each call to each function used in the process; it is sufficient to
annotate the most rational point of origin and pass the annotation through the
downstream steps.
Good candidates for the annotation code site are places where:
 1. The origin of the user’s intent or the internal requirement for the request
    is stated.
 2. The controls over stopping or limiting the request (Chrome settings or
    policies) are enforced.
 3. The data that is sent is specified.

If these criteria point to different places, please refer to the
`Partial Annotations` section for an approach to splitting the annotation.

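As a rough illustration of annotating at the point of origin and passing the tag through downstream steps, the sketch below uses hypothetical stand-ins (`AnnotationTag`, `FakeSocket`, `SendRequest`, `CheckSpelling`, and the `spellcheck.example` URL) rather than the real `net::` types. It only shows the shape of the call chain and is not Chrome's implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace passthrough_sketch {

// Hypothetical stand-in for net::NetworkTrafficAnnotationTag.
struct AnnotationTag {
  std::string unique_id;
};

// Hypothetical lowest layer. It records the tags it receives so that the
// example can be checked; a real socket would not store them.
struct FakeSocket {
  std::vector<std::string> writes;
  std::vector<std::string> tags;

  void Write(const std::string& data, const AnnotationTag& tag) {
    writes.push_back(data);
    tags.push_back(tag.unique_id);
  }
};

// Intermediate step: it does not define an annotation of its own, it only
// forwards the one it was given.
void SendRequest(FakeSocket& socket, const std::string& url,
                 const AnnotationTag& tag) {
  socket.Write("GET " + url, tag);
}

// Point of origin: the intent and the data are known here, so this is where
// the annotation is defined.
void CheckSpelling(FakeSocket& socket, const std::string& text) {
  const AnnotationTag tag{"spellcheck_lookup"};
  SendRequest(socket, "https://spellcheck.example/?q=" + text, tag);
}

}  // namespace passthrough_sketch
```

Because every downstream function takes the tag as a parameter, a caller cannot reach the socket without supplying one.
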
### Merged Requests
There are cases where requests are received from multiple sources and merged
into one connection, such as when a socket merges several data frames and sends
them together, or when a device location is requested by different components
and just one network request is made to fetch it. In these cases, the merge
point can ensure that all received requests are properly annotated and pass just
one of them to the downstream step. It can also pass a local annotation stating
that it is a merged request on behalf of other requests of type X, all of which
were ensured to have annotations.
This decision follows from the fact that we do not need to transmit the
annotation metadata at runtime; the enforced annotation arguments only ensure
that the request is annotated somewhere upstream.


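The merge-point idea can be sketched as follows. All names here (`FrameMerger`, `Queue`, `Flush`, the `merged_frames_sketch` id) are hypothetical and chosen only for illustration; the sketch assumes the merge point passes a local annotation downstream, which is one of the two options described above.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

namespace merge_sketch {

// Hypothetical stand-in for net::NetworkTrafficAnnotationTag.
struct AnnotationTag {
  std::string unique_id;
};

struct Frame {
  std::string payload;
  AnnotationTag tag;
};

struct MergedWrite {
  std::string payload;
  AnnotationTag tag;
};

class FrameMerger {
 public:
  // Every caller has to supply a tag, so each merged request is known to be
  // annotated upstream.
  void Queue(std::string payload, AnnotationTag tag) {
    frames_.push_back({std::move(payload), std::move(tag)});
  }

  // Concatenates the queued frames into one write and attaches a local
  // annotation that stands for the merged request.
  MergedWrite Flush() {
    MergedWrite out{"", AnnotationTag{"merged_frames_sketch"}};
    for (const Frame& frame : frames_) {
      out.payload += frame.payload;
    }
    frames_.clear();
    return out;
  }

 private:
  std::vector<Frame> frames_;
};

}  // namespace merge_sketch
```
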
## Coverage
Network traffic annotations are currently enforced on all URL requests and
socket writes, except for code that is not compiled on Windows or Linux.
This effort may expand to ChromeOS in the future; currently there is no plan to
expand it to other platforms.


## Network Traffic Annotation Tag

`net::NetworkTrafficAnnotationTag` is the main definition for annotations. There
are a few variants of it that are specified in later sections. The goal is to
have one object of this type or its variants as an argument of all functions
that create a network request.

### Content of Annotation Tag
Each network traffic annotation should specify the following items:
* `unique_id`: A globally unique identifier that must stay unchanged while the
  network request carries the same semantic meaning. If the network request gets
  a new meaning, this ID needs to be changed. The purpose of this ID is to give
  humans a chance to reference NetworkTrafficAnnotations externally even when
  those change a little bit (e.g. adding a new piece of data that is sent along
  with a network request). IDs of one component should have a shared prefix so
  that sorting all NetworkTrafficAnnotations by unique_id groups those that
  belong to the same component together.
* `TrafficSource`: This set of fields specifies the location of the annotation
  in the source code. These fields are set automatically and do not need
  specification.
* `TrafficSemantics`: This set of fields specifies meta information about the
  network request’s content and reason.
  * `sender`: What component triggers the request. The components should be
    human readable and don’t need to reflect the components/ directory. Avoid
    abbreviations, and use a common value for all annotations in one component.
  * `description`: Plaintext description of the network request in language
    that is understandable by admins (ideally also users). Please avoid
    acronyms and describe the feature and the feature's value proposition as
    well.
  * `trigger`: What user action triggered the network request. Use a textual
    description. This should be a human readable string.
  * `data`: What nature of data is being sent. This should be a human readable
    string. Any user data and/or PII should be pointed out.
  * `destination`: Target of the network request. It can be either the website
    that the user visits and interacts with, a Google service, a request that
    does not go to the network and just fetches a local resource, or other
    endpoints like a service hosting PAC scripts. The distinction between a
    Google owned service and a website can be difficult when the user navigates
    to google.com or searches with the omnibox. Therefore follow this
    guideline: If the source code has hardcoded that the request goes to Google
    (e.g. for ZeroSuggest), use `GOOGLE_OWNED_SERVICE`. If the request can go to
    other domains and is perceived as a part of a website rather than a native
    browser feature, use `WEBSITE`. Use `LOCAL` if the request is processed
    locally and doesn't go to the network, otherwise use `OTHER`. If `OTHER` is
    used, please add a plain text description in the `destination_other` field.
  * `destination_other`: Human readable description in case the destination
    points to `OTHER`.
* `TrafficPolicy`: This set of fields specifies the controls that a user may
  have on disabling or limiting the network request and its trace.
  * `cookies_allowed`: Specifies if this request stores and uses cookies or
    not. Use values `YES` or `NO`.
  * `cookies_store`: If a request sends or stores cookies/channel IDs/... (i.e.
    if `cookies_allowed` is `YES`), we want to know which cookie store is being
    used. The answer to this question can typically be derived from the
    URLRequestContext that is being used. The most common cases will be:
    * If `cookies_allowed` is `NO`, leave this field unset.
    * If the profile's default URLRequestContext is being used (e.g. from
      `Profile::GetRequestContext()`), this means that the user's normal
      cookies are sent. In this case, put `user` here.
    * If the system URLRequestContext is being used (for example via
      `io_thread()->system_url_request_context_getter()`), put `system` here.
    * Otherwise, please explain (e.g. SafeBrowsing uses a separate cookie
      store).
  * `setting`: Human readable description of how a user can enable/disable the
    feature that triggers this network request (e.g. “Disable ‘Use a web
    service to help resolve spelling errors.’ in settings under Advanced”).
    Note that settings look different on different platforms, so make sure your
    description works everywhere!
  * `chrome_policy`: Policy configuration that disables or limits this network
    request. This would be a text serialized protobuf of any enterprise policy.
    See policy list or chrome_settings.proto for the full list of policies.
  * `policy_exception_justification`: If there is no policy to disable or limit
    this request, a justification can be presented here.
* `comments`: If required, any human readable extra comments.

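The guideline for choosing a `destination` value can be read as a small decision procedure. The sketch below is one possible reading, not code from Chrome: the `RequestTraits` struct, its field names, and `ChooseDestination` are hypothetical, and checking the conditions in the order the guideline lists them is an assumption.

```cpp
#include <cassert>

namespace destination_sketch {

// Mirrors the four destination values named in the guideline.
enum class Destination { kWebsite, kGoogleOwnedService, kLocal, kOther };

// Hypothetical summary of what the author knows about a request.
struct RequestTraits {
  bool hardcoded_to_google = false;  // the source code fixes a Google endpoint
  bool part_of_website = false;      // can go to other domains and is perceived
                                     // as part of a website
  bool processed_locally = false;    // never reaches the network
};

// Applies the guideline's conditions in the order they are stated.
inline Destination ChooseDestination(const RequestTraits& traits) {
  if (traits.hardcoded_to_google) return Destination::kGoogleOwnedService;
  if (traits.part_of_website) return Destination::kWebsite;
  if (traits.processed_locally) return Destination::kLocal;
  // Anything else needs a plain text explanation in destination_other.
  return Destination::kOther;
}

}  // namespace destination_sketch
```
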
### Format and Examples
Traffic annotations are kept in code as serialized protobuf. To define a
`NetworkTrafficAnnotationTag`, you can use the function
`net::DefineNetworkTrafficAnnotation` with two arguments: the unique id, and
all other fields bundled together as a serialized protobuf string.

#### Good Examples
```cpp
net::NetworkTrafficAnnotationTag traffic_annotation =
    net::DefineNetworkTrafficAnnotation("spellcheck_lookup", R"(
      semantics {
        sender: "Online Spellcheck"
        description:
          "Chrome can provide smarter spell-checking by sending text you "
          "type into the browser to Google's servers, allowing you to use "
          "the same spell-checking technology used by Google products, such "
          "as Docs. If the feature is enabled, Chrome will send the entire "
          "contents of text fields as you type in them to Google along with "
          "the browser’s default language. Google returns a list of "
          "suggested spellings, which will be displayed in the context menu."
        trigger: "User types text into a text field or asks to correct a "
                 "misspelled word."
        data: "Text a user has typed into a text field. No user identifier "
              "is sent along with the text."
        destination: GOOGLE_OWNED_SERVICE
      }
      policy {
        cookies_allowed: NO
        setting:
          "You can enable or disable this feature via 'Use a web service to "
          "help resolve spelling errors.' in Chrome's settings under "
          "Advanced. The feature is disabled by default."
        chrome_policy {
          SpellCheckServiceEnabled {
            SpellCheckServiceEnabled: false
          }
        }
      })");
```

```cpp
net::NetworkTrafficAnnotationTag traffic_annotation2 =
    net::DefineNetworkTrafficAnnotation(
        "safe_browsing_chunk_backup_request",
        R"(
      semantics {
        sender: "Safe Browsing"
        description:
          "Safe Browsing updates its local database of bad sites every 30 "
          "minutes or so. It aims to keep all users up-to-date with the same "
          "set of hash-prefixes of bad URLs."
        trigger:
          "On a timer, approximately every 30 minutes."
        data:
          "The state of the local DB is sent so the server can send just the "
          "changes. This doesn't include any user data."
        destination: GOOGLE_OWNED_SERVICE
      }
      policy {
        cookies_allowed: YES
        cookies_store: "Safe Browsing cookie store"
        setting:
          "Users can disable Safe Browsing by unchecking 'Protect you and "
          "your device from dangerous sites' in Chromium settings under "
          "Privacy. The feature is enabled by default."
        chrome_policy {
          SafeBrowsingEnabled {
            policy_options {mode: MANDATORY}
            SafeBrowsingEnabled: false
          }
        }
      })");
```

#### Bad Examples
```cpp
net::NetworkTrafficAnnotationTag bad_traffic_annotation =
    net::DefineNetworkTrafficAnnotation(
      ...
        trigger: "Chrome sends this when [obscure event that is not related to "
                 "anything user-perceivable]."
        // Please specify the exact user action that results in this request.
        data: "This sends everything the feature needs to know."
        // Please be precise, name the data items. If they are too many, name
        // the sensitive user data and general classes of other data and refer
        // to a document specifying the details.
      ...
        policy_exception_justification: "None."
        // Check again! Most features can be disabled or limited by a policy.
      ...
      })");
```

#### Empty Template
You can copy/paste the following template to define an annotation.
```cpp
net::NetworkTrafficAnnotationTag traffic_annotation =
    net::DefineNetworkTrafficAnnotation("...", R"(
      semantics {
        sender: "..."
        description: "..."
        trigger: "..."
        data: "..."
        destination: WEBSITE/GOOGLE_OWNED_SERVICE/LOCAL/OTHER
      }
      policy {
        cookies_allowed: NO/YES
        cookies_store: "..."
        setting: "..."
        chrome_policy {
          [POLICY_NAME] {
            [POLICY_NAME]: ...
          }
        }
        policy_exception_justification: "..."
      }
      comments: "..."
    )");
```


## Testing for errors

There are several checks that should be done on annotations before submitting a
change list. These checks include:
* The annotations are syntactically correct.
* They have all required fields.
* Partial annotations and completing parts match (please refer to the
  **Partial Annotations** section).
* Annotations are not incorrectly defined.
  * e.g., `traffic_annotation = NetworkTrafficAnnotation({1})`.
* All usages from Chrome have annotations.
* Unique ids are unique throughout history (even if an annotation gets
  deprecated, its unique id cannot be reused, to keep the stats sound).
* The annotation appears in
  `tools/traffic_annotation/summary/grouping.xml`. When adding a new annotation,
  it must also be included in `grouping.xml` for reporting purposes (please
  refer to the **Annotations Review** section).


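To make the "unique throughout history" check concrete, here is a minimal sketch under stated assumptions. It is not how the auditing tools implement the check; `FindReusedIds` and its inputs (a list of ids currently in the code and a set of ids used by deprecated annotations) are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

namespace unique_id_sketch {

// Returns, in input order, every id in `current` that either repeats an
// earlier entry of `current` or was already used by a deprecated annotation.
std::vector<std::string> FindReusedIds(
    const std::vector<std::string>& current,
    const std::set<std::string>& deprecated) {
  std::set<std::string> seen;
  std::vector<std::string> reused;
  for (const std::string& id : current) {
    const bool first_occurrence = seen.insert(id).second;
    if (!first_occurrence || deprecated.count(id) > 0) {
      reused.push_back(id);
    }
  }
  return reused;
}

}  // namespace unique_id_sketch
```

An empty result means that no id is duplicated and no deprecated id has been brought back.
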
### Presubmit tests
To perform tests prior to submitting, one can use the
`traffic_annotation_auditor` binary. It runs over the whole repository, extracts
all the annotations using a Python script, and then checks that all of the above
items are correct. The latest executable for supported platforms can be found in
`tools/traffic_annotation/bin/[platform]`.

Running the `traffic_annotation_auditor` requires having a build directory and
can be done with the following syntax:
`tools/traffic_annotation/bin/[linux64/win32]/traffic_annotation_auditor
 --build-path=[out/Default]`

### Waterfall tests
Two commit queue trybots test traffic annotations on changed files using the
scripts in `tools/traffic_annotation/scripts`. To run these tests faster and to
avoid spamming the commit queue if an unforeseen error has happened in
downstream scripts or tools, they are run in error-resilient mode, only on
changed files, and using heuristics to decide which files to process.
An FYI bot runs more detailed tests on the whole repository and with different
switches, to make sure that the heuristics used by the trybot tests and their
limited scope have not missed any issues.


## Annotations Review

Network traffic annotations require review before landing in code. This is
enforced by keeping a summary of annotations in
`tools/traffic_annotation/summary/annotations.xml`. Whenever an annotation is
added, updated, or deleted, this file should also be updated. To update the
`annotations.xml` file automatically, one can run `traffic_annotation_auditor`
as specified in presubmit tests. If that is not possible (e.g., if you are
changing the code from an unsupported platform or you don’t have a compiled
build directory), the code can be submitted to the trybot, and the test on the
trybot will tell you the required modifications.

To make external reports easier, annotation unique ids should be
mentioned in `tools/traffic_annotation/summary/grouping.xml`. Whenever a new
annotation is added, or a preexisting annotation's unique id changes, this file
should also be updated. When adding a new annotation, make sure it is placed
within an appropriate group of `grouping.xml`. In the rare case that none of
the groups are appropriate, one can create a new group for the annotation; the
arrangement of annotations and group names in `grouping.xml` may later be
updated by a technical writer to better align with the external reports.

## Partial Annotations (Advanced)

There are cases where the network traffic annotation cannot be fully specified
in one place. For example, in one place we know the trigger of a network request
and in another place we know the data that will be sent. In these cases, we
prefer that both parts of the annotation appear in context so that they are
updated if the code changes. Partial annotations help split the network traffic
annotation into two pieces. We call the first part the partial annotation, and
the part that completes it the completing annotation. Partial annotations and
completing annotations do not need to have all annotation fields, but their
composition should have all required fields.

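The requirement that the composition has all required fields can be illustrated with a small sketch. The field map, the `Compose` and `HasAllRequired` helpers, and the particular set of required field names used in any call are hypothetical; the real check is performed by the auditing tools, not by code like this.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

namespace partial_sketch {

// Hypothetical flat view of an annotation: field name to field text.
using Fields = std::map<std::string, std::string>;

// Combines a partial annotation with its completing annotation. Fields already
// present in the partial part are kept; the completing part fills in the rest.
Fields Compose(const Fields& partial, const Fields& completing) {
  Fields merged = partial;
  merged.insert(completing.begin(), completing.end());  // never overwrites
  return merged;
}

// True if every name in `required` is present in `fields`.
bool HasAllRequired(const Fields& fields,
                    const std::set<std::string>& required) {
  for (const std::string& name : required) {
    if (fields.count(name) == 0) return false;
  }
  return true;
}

}  // namespace partial_sketch
```

Neither part has to pass `HasAllRequired` on its own; only the composed result does.
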
### Defining a Partial Annotation
To define a partial annotation, one can use the
`net::DefinePartialNetworkTrafficAnnotation` function. Besides the unique id and
annotation text, this function requires the unique id of the completing part.
For example, a partial annotation that only specifies the semantics part of a
request with unique id "omnibox_prefetch_image", and is completed later using an
annotation with unique id "bitmap_fetcher", can be defined as follows:

```cpp
net::PartialNetworkTrafficAnnotationTag partial_traffic_annotation =
    net::DefinePartialNetworkTrafficAnnotation("omnibox_prefetch_image",
                                               "bitmap_fetcher", R"(
      semantics {
        sender: "Omnibox"
        description: "..."
        trigger: "..."
        data: "..."
        destination: WEBSITE
      })");
```

### Nx1 Partial Annotations
The cases where several partial annotations may be completed by one completing
annotation are called Nx1. This also covers the case where N=1. To define a
completing annotation for such cases, one can use the
`net::CompleteNetworkTrafficAnnotation` function. This function receives a
unique id, the annotation text, and a `net::PartialNetworkTrafficAnnotationTag`
object. Here is an example of a completing part for the previous example:

```cpp
net::NetworkTrafficAnnotationTag traffic_annotation =
    net::CompleteNetworkTrafficAnnotation("bitmap_fetcher",
                                          partial_traffic_annotation, R"(
      policy {
        cookies_allowed: YES
        cookies_store: "user"
        setting: "..."
        chrome_policy {...}
      })");
```

### 1xN Partial Annotations
There are cases where one partial traffic annotation may be completed by
different completing annotations. In these cases, the
`net::BranchedCompleteNetworkTrafficAnnotation` function can be used. This
function has an extra argument that is common between all branches and is
referred to by the partial annotation. For the above examples, if there were
two different ways of completing the received partial annotation, the following
definition can be used:

```cpp
if (...) {
  return net::BranchedCompleteNetworkTrafficAnnotation(
      "bitmap_fetcher_type1", "bitmap_fetcher",
      partial_traffic_annotation, R"(
        policy {
          cookies_allowed: YES
          cookies_store: "user"
          setting: "..."
          chrome_policy {...}
        })");
} else {
  return net::BranchedCompleteNetworkTrafficAnnotation(
      "bitmap_fetcher_type2", "bitmap_fetcher",
      partial_traffic_annotation, R"(
        policy {
          cookies_allowed: YES
          cookies_store: "system"
          setting: "..."
          chrome_policy {...}
        })");
}
```

Please refer to `tools/traffic_annotation/sample_traffic_annotation.cc` for more
detailed examples.


## Mutable Annotations (Advanced)

`net::NetworkTrafficAnnotationTag` and `net::PartialNetworkTrafficAnnotationTag`
are defined with constant internal argument(s), so that once they are created,
they cannot be modified. There are very few exceptions that may require
modification of the annotation value, like the ones used by mojo interfaces
where, after serialization, the annotation object is first created and then
receives its value. In these cases, `net::MutableNetworkTrafficAnnotationTag`
and `net::MutablePartialNetworkTrafficAnnotationTag` can be used, which do not
have this limitation.
Mutable annotations have a runtime check before being converted into normal
annotations to ensure their content is valid. Therefore it is suggested that
they be used only if there is no other way around it. Use cases are
checked with the `traffic_annotation_auditor` to ensure proper initialization
values for the mutable annotations.


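The difference between the immutable and mutable variants, and the runtime check on conversion, can be sketched as follows. The types below are hypothetical stand-ins, not the real `net::` classes; in particular, storing the id as a string and treating an empty string as "not yet set" are assumptions made only for this illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>

namespace mutable_sketch {

// Stand-in for an immutable tag: its content cannot change after creation.
struct Tag {
  const std::string unique_id;
};

// Stand-in for a mutable tag: it can be created empty and filled in later,
// for example after a message has been received.
struct MutableTag {
  std::string unique_id;  // empty means "not yet set" in this sketch

  bool is_valid() const { return !unique_id.empty(); }

  // Runtime check before conversion: an unset mutable tag does not convert.
  std::optional<Tag> ToTag() const {
    if (!is_valid()) return std::nullopt;
    return Tag{unique_id};
  }
};

}  // namespace mutable_sketch
```

The extra validity check happens at runtime rather than when the annotation is defined, which is one way to see why the immutable variants are preferred when they are sufficient.
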
## Mojo Interfaces (Advanced)

For serialization of network traffic annotation and partial network traffic
annotation tags, you can use the mutable mojo interfaces defined in
`/services/network/public/mojom`.
Nicolas Arciniega15745e22019-10-09 20:42:35448`/services/network/public/mojom`.