# Network Traffic Annotations

[TOC]

This document presents a practical guide to using network traffic annotations in
Chrome.


## Problem Statement

To make Chrome's network communication transparent, we need to be able to
answer the following questions:
* What is the intent behind each network request?
* What user data is sent in the request, and where does it go?

In addition, the following information helps enterprise admins and help desk
staff:
* How can a network communication be stopped or controlled?
* What are the traces of the communication on the client?

The technical details of requests are not necessarily important to users, but
to provide the intended transparency, we need to show that we have covered all
bases and that there are no back doors.


## The Solution

We can provide up-to-date, in-line documentation on the origin, intent, payload,
and control mechanisms of each network communication. This is done by adding a
`NetworkTrafficAnnotationTag` to all network communication functions.
Because the goal is to specify the intent behind each network request and its
payload, this metadata does not need to be transmitted with the request at
runtime; it is sufficient to have it in appropriate positions in the code.
Making it an argument of all network communication functions is a mechanism to
enforce its existence and to show users our intent to cover the whole
repository.


## Best Practices

### Where to add annotation?
All network requests ultimately send data through sockets or native API
functions, but the concern is the main intent of the communication, not its
implementation details. Therefore we do not need to specify this data
separately for each call to each function that is used in the process. It is
sufficient to annotate the most rational point of origin and pass the
annotation through the downstream steps.
The best code sites for an annotation are where:
  1. The origin of the user's intent or the internal requirement for the request
     is stated.
  2. The controls for stopping or limiting the request (Chrome settings or
     policies) are enforced.
  3. The data that is sent is specified.

If these criteria point to different places, please refer to the
`Partial Annotations` section for an approach to split the annotation.

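The idea of annotating the point of origin and passing the annotation
downstream can be sketched as follows. This is a minimal, self-contained
illustration and not Chromium code: `AnnotationTag`, `CheckSpelling`,
`StartUrlRequest`, `SendOverSocket`, and the example URL are hypothetical
stand-ins for the real types and functions.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for net::NetworkTrafficAnnotationTag.
struct AnnotationTag {
  std::string unique_id;
};

// Lowest layer: takes the tag only as proof that a caller upstream annotated
// the request. It does not define an annotation of its own.
std::string SendOverSocket(const std::string& payload,
                           const AnnotationTag& tag) {
  assert(!tag.unique_id.empty());
  return tag.unique_id + ":" + payload;
}

// Intermediate layer: forwards the tag unchanged.
std::string StartUrlRequest(const std::string& url, const AnnotationTag& tag) {
  return SendOverSocket("GET " + url, tag);
}

// Point of origin: the intent and the data are known here, so this is where
// the annotation is defined.
std::string CheckSpelling(const std::string& text) {
  const AnnotationTag tag{"spellcheck_lookup"};
  return StartUrlRequest("https://spellcheck.example/?q=" + text, tag);
}
```

Only `CheckSpelling` knows why the request is made, so only it creates a tag.
The two lower layers take the tag as a required argument, so a caller cannot
compile without supplying one.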
### Merged Requests
There are cases where requests are received from multiple sources and merged
into one connection, like when a socket merges several data frames and sends
them together, or when a device location is requested by different components
and just one network request is made to fetch it. In these cases, the merge
point can ensure that all received requests are properly annotated and pass
just one of the annotations to the downstream step.
This decision is driven by the fact that we do not need to transmit the
annotation metadata at runtime; enforced annotation arguments only ensure that
the request is annotated somewhere upstream.


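A merge point along these lines might look like the sketch below. The types
`AnnotationTag` and `LocationRequest` and the function `MergeLocationRequests`
are hypothetical and exist only for illustration. Picking the first tag is an
arbitrary choice made here; the text above only requires that one of them is
passed on.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for net::NetworkTrafficAnnotationTag.
struct AnnotationTag {
  std::string unique_id;
};

// Hypothetical request made by one component that wants the device location.
struct LocationRequest {
  std::string requester;
  AnnotationTag tag;
};

// Merge point: verifies that every incoming request is annotated, then hands
// a single tag to the downstream step that performs the one network fetch.
AnnotationTag MergeLocationRequests(
    const std::vector<LocationRequest>& pending) {
  assert(!pending.empty());
  for (const LocationRequest& request : pending) {
    assert(!request.tag.unique_id.empty());
  }
  return pending.front().tag;
}
```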
## Coverage
Network traffic annotations are currently enforced on all URL requests on
Windows and Linux, and are expanding to sockets and native API functions in
2017 Q4 to 2018 Q1.
Currently there is no plan to expand the task to other platforms.


## Network Traffic Annotation Tag

`net::NetworkTrafficAnnotationTag` is the main definition for annotations. There
are a few variants of it that are specified in later sections. The goal is to
have one object of this type or its variants as an argument of all functions
that create a network request.

### Content of Annotation Tag
Each network traffic annotation should specify the following items:
* `unique_id`: A globally unique identifier that must stay unchanged while the
  network request carries the same semantic meaning. If the network request gets
  a new meaning, this ID needs to be changed. The purpose of this ID is to give
  humans a chance to reference NetworkTrafficAnnotations externally even when
  those change a little bit (e.g. adding a new piece of data that is sent along
  with a network request). IDs of one component should have a shared prefix so
  that sorting all NetworkTrafficAnnotations by `unique_id` groups those that
  belong to the same component together.
* `TrafficSource`: This set of fields specifies the location of the annotation
  in the source code. These fields are set automatically and do not need to be
  specified.
* `TrafficSemantics`: This set of fields specifies meta information about the
  network request's content and reason.
  * `sender`: What component triggers the request. The components should be
    human readable and don't need to reflect the components/ directory. Avoid
    abbreviations.
  * `description`: Plaintext description of the network request in language
    that is understandable by admins (ideally also users). Please avoid
    acronyms and describe the feature and the feature's value proposition as
    well.
  * `trigger`: What user action triggered the network request. Use a textual
    description. This should be a human readable string.
  * `data`: What nature of data is being sent. This should be a human readable
    string. Any user data and/or PII should be pointed out.
  * `destination`: Target of the network request. It can be either the website
    that the user visits and interacts with, a Google service, a request that
    does not go to the network and just fetches a local resource, or other
    endpoints like a service hosting PAC scripts. The distinction between a
    Google owned service and a website can be difficult when the user navigates
    to google.com or searches with the omnibox. Therefore follow this
    guideline: If the source code has hardcoded that the request goes to Google
    (e.g. for ZeroSuggest), use `GOOGLE_OWNED_SERVICE`. If the request can go to
    other domains and is perceived as a part of a website rather than a native
    browser feature, use `WEBSITE`. Use `LOCAL` if the request is processed
    locally and doesn't go to the network, otherwise use `OTHER`. If `OTHER` is
    used, please add a plain text description in the `destination_other` field.
  * `destination_other`: Human readable description in case the destination
    points to `OTHER`.
* `TrafficPolicy`: This set of fields specifies the controls that a user may
  have for disabling or limiting the network request and its trace.
  * `empty_policy_justification`: If the traffic policy cannot be specified,
    like when the request originates from inside the network stack and none of
    the control policies below applies to it, a plain text explanation can be
    given here.
  * `cookies_allowed`: Specifies if this request stores and uses cookies or
    not. Use values `YES` or `NO`.
  * `cookies_store`: If a request sends or stores cookies/channel IDs/... (i.e.
    if `cookies_allowed` is `YES`), we want to know which cookie store is being
    used. The answer to this question can typically be derived from the
    URLRequestContext that is being used. The most common cases will be:
    * If `cookies_allowed` is `NO`, leave this field unset.
    * If the profile's default URLRequestContext is being used (e.g. from
      `Profile::GetRequestContext()`), this means that the user's normal
      cookies are sent. In this case, put `user` here.
    * If the system URLRequestContext is being used (for example via
      `io_thread()->system_url_request_context_getter()`), put `system` here.
    * Otherwise, please explain (e.g. SafeBrowsing uses a separate cookie
      store).
  * `setting`: Human readable description of how a user can enable or disable
    the feature that triggers this network request (e.g. "Disable 'Use a web
    service to help resolve spelling errors.' in settings under 'Advanced'").
    Note that settings look different on different platforms, so make sure
    your description works everywhere!
  * `chrome_policy`: Policy configuration that disables or limits this network
    request. This would be a text serialized protobuf of any enterprise policy.
    See policy list or chrome_settings.proto for the full list of policies.
  * `policy_exception_justification`: If there is no policy to disable or limit
    this request, a justification can be presented here. If
    `empty_policy_justification` is given, this field is not required.
* `comments`: If required, any human readable extra comments.

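The guideline for choosing a `destination` value can be read as a small
decision procedure. The sketch below is only an illustration of that reading:
the `Destination` enum mirrors the four values named above, while
`RequestTraits` and `ChooseDestination` are hypothetical helpers that do not
exist in Chromium.

```cpp
#include <cassert>

// Mirrors the destination values named in this document.
enum class Destination { WEBSITE, GOOGLE_OWNED_SERVICE, LOCAL, OTHER };

// Hypothetical summary of what the author knows about a request.
struct RequestTraits {
  bool hardcoded_to_google;   // source code fixes the endpoint to Google
  bool perceived_as_website;  // can go to other domains, part of a website
  bool processed_locally;     // never reaches the network
};

// Applies the guideline in the order in which the text states it.
Destination ChooseDestination(const RequestTraits& traits) {
  if (traits.hardcoded_to_google) return Destination::GOOGLE_OWNED_SERVICE;
  if (traits.perceived_as_website) return Destination::WEBSITE;
  if (traits.processed_locally) return Destination::LOCAL;
  return Destination::OTHER;  // remember to fill in destination_other
}
```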
### Format and Examples
Traffic annotations are kept in code as serialized protobuf. To define a
`NetworkTrafficAnnotationTag`, use the function
`net::DefineNetworkTrafficAnnotation` with two arguments: the unique id, and
all other fields bundled together as a serialized protobuf string.

#### Good examples
```cpp
  net::NetworkTrafficAnnotationTag traffic_annotation =
      net::DefineNetworkTrafficAnnotation("spellcheck_lookup", R"(
        semantics {
          sender: "Online Spellcheck"
          description:
            "Chrome can provide smarter spell-checking by sending text you "
            "type into the browser to Google's servers, allowing you to use "
            "the same spell-checking technology used by Google products, such "
            "as Docs. If the feature is enabled, Chrome will send the entire "
            "contents of text fields as you type in them to Google along with "
            "the browser’s default language. Google returns a list of "
            "suggested spellings, which will be displayed in the context menu."
          trigger: "User types text into a text field or asks to correct a "
                   "misspelled word."
          data: "Text a user has typed into a text field. No user identifier "
                "is sent along with the text."
          destination: GOOGLE_OWNED_SERVICE
        }
        policy {
          cookies_allowed: NO
          setting:
            "You can enable or disable this feature via 'Use a web service to "
            "help resolve spelling errors.' in Chrome's settings under "
            "Advanced. The feature is disabled by default."
          chrome_policy {
            SpellCheckServiceEnabled {
                SpellCheckServiceEnabled: false
            }
          }
        })");
```

```cpp
  net::NetworkTrafficAnnotationTag traffic_annotation2 =
      net::DefineNetworkTrafficAnnotation(
          "safe_browsing_chunk_backup_request",
          R"(
        semantics {
          sender: "Safe Browsing"
          description:
            "Safe Browsing updates its local database of bad sites every 30 "
            "minutes or so. It aims to keep all users up-to-date with the same "
            "set of hash-prefixes of bad URLs."
          trigger:
            "On a timer, approximately every 30 minutes."
          data:
            "The state of the local DB is sent so the server can send just the "
            "changes. This doesn't include any user data."
          destination: GOOGLE_OWNED_SERVICE
        }
        policy {
          cookies_allowed: YES
          cookies_store: "Safe Browsing cookie store"
          setting:
            "Users can disable Safe Browsing by unchecking 'Protect you and "
            "your device from dangerous sites' in Chromium settings under "
            "Privacy. The feature is enabled by default."
          chrome_policy {
            SafeBrowsingEnabled {
              policy_options {mode: MANDATORY}
              SafeBrowsingEnabled: false
            }
          }
        })");
```

#### Bad Examples
```cpp
  net::NetworkTrafficAnnotationTag bad_traffic_annotation =
      net::DefineNetworkTrafficAnnotation(
        ...
          trigger: "Chrome sends this when [obscure event that is not related to "
                   "anything user-perceivable]."
          // Please specify the exact user action that results in this request.
          data: "This sends everything the feature needs to know."
          // Please be precise, name the data items. If they are too many, name
          // the sensitive user data and general classes of other data and refer
          // to a document specifying the details.
          ...
          policy_exception_justification: "None."
          // Check again! Most features can be disabled or limited by a policy.
          ...
        })");
```

#### Empty Template
You can copy/paste the following template to define an annotation.
```cpp
  net::NetworkTrafficAnnotationTag traffic_annotation =
      net::DefineNetworkTrafficAnnotation("...", R"(
        semantics {
          sender: "..."
          description: "..."
          trigger: "..."
          data: "..."
          destination: WEBSITE/GOOGLE_OWNED_SERVICE/LOCAL/OTHER
        }
        policy {
          empty_policy_justification: "..."
          cookies_allowed: NO/YES
          cookies_store: "..."
          setting: "..."
          chrome_policy {
            [POLICY_NAME] {
                [POLICY_NAME]: ...
            }
          }
          policy_exception_justification: "..."
        }
        comments: "..."
      )");
```


## Testing for errors

There are several checks that should be done on annotations before submitting a
change list. These checks include:
* The annotations are syntactically correct.
* They have all required fields.
* Partial annotations and completing parts match (please refer to the
  `Partial Annotations (Advanced)` section).
* Annotations are not incorrectly defined.
  * e.g., `traffic_annotation = NetworkTrafficAnnotation({1})`.
* All usages from Chrome have annotation.
* Unique ids are unique, through history (even if an annotation gets deprecated,
  its unique id cannot be reused to keep the stats sound).

To run these tests, the `traffic_annotation_auditor` binary runs over the whole
repository and, using a clang tool, checks that all of the above items are
correct.
Running the `traffic_annotation_auditor` requires an existing compiled build
directory and can be done with the following syntax:
`tools/traffic_annotation/bin/[linux64/windows32/mac]/traffic_annotation_auditor
 --build-path=[out/Default]`
If you are running the auditor on Windows, please refer to extra instructions in
`tools/traffic_annotation/auditor/README.md`.
As this test is slow, it is not a mandatory step of the presubmit checks on
clients, but it can be run manually. The test is also run on trybots as a commit
queue step.


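To make the "unique through history" check concrete, here is a minimal sketch
of that one rule. It does not reflect how `traffic_annotation_auditor` is
actually implemented; `FindReusedIds` and its inputs are hypothetical, and the
sketch assumes that the ids of deprecated annotations are available as a set.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Returns every id in `current_ids` that either appears more than once or
// collides with the id of a deprecated annotation.
std::vector<std::string> FindReusedIds(
    const std::vector<std::string>& current_ids,
    const std::set<std::string>& deprecated_ids) {
  std::set<std::string> seen;
  std::vector<std::string> reused;
  for (const std::string& id : current_ids) {
    assert(!id.empty());  // a missing id is a different error
    const bool first_time = seen.insert(id).second;
    if (!first_time || deprecated_ids.count(id) > 0) {
      reused.push_back(id);
    }
  }
  return reused;
}
```

An empty result means that no id is duplicated and no deprecated id has been
reused.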
## Annotations Review

Network traffic annotations require review by the privacy, enterprise, and legal
teams. To shorten the review process, only the privacy review is a blocking
step; review by the other two teams is done after code submission.
Privacy reviews are enforced by keeping a summary of annotations in
`tools/traffic_annotation/summary/annotations.xml`, which is owned by the
privacy team. Whenever an annotation is added, updated, or deleted, this file
should also be updated. To update the file automatically, run
`traffic_annotation_auditor` as specified in the `Testing for errors` section.
If that is not possible (e.g., if you are changing the code from an unsupported
platform or you don't have a compiled build directory), the code can be
submitted to the trybot, and the test on the trybot will tell you the required
modifications.


## Partial Annotations (Advanced)

There are cases where the network traffic annotation cannot be fully specified
in one place. For example, in one place we know the trigger of a network request
and in another place we know the data that will be sent. In these cases, we
prefer that both parts of the annotation appear in context so that they are
updated if the code changes. Partial annotations help split the network traffic
annotation into two pieces. We call the first part the partial annotation, and
the part that completes it the completing annotation. Partial annotations and
completing annotations do not need to have all annotation fields, but their
composition should have all required fields.

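The requirement that the composition has all required fields can be pictured
with the sketch below. It is a simplified model, not the real data structure:
`AnnotationFields`, `IsComplete`, and `Compose` are hypothetical, and the
annotation is reduced to two text fields in which an empty string means "not
specified in this part".

```cpp
#include <cassert>
#include <string>

// Simplified, hypothetical model of one part of an annotation.
struct AnnotationFields {
  std::string semantics;  // empty means "not specified in this part"
  std::string policy;     // empty means "not specified in this part"
};

// True if every required field has a value.
bool IsComplete(const AnnotationFields& fields) {
  return !fields.semantics.empty() && !fields.policy.empty();
}

// Combines a partial annotation with its completing annotation. Each field
// is taken from whichever part specifies it; the result must be complete.
AnnotationFields Compose(const AnnotationFields& partial,
                         const AnnotationFields& completing) {
  AnnotationFields merged;
  merged.semantics =
      partial.semantics.empty() ? completing.semantics : partial.semantics;
  merged.policy = partial.policy.empty() ? completing.policy : partial.policy;
  assert(IsComplete(merged));
  return merged;
}
```

Neither part has to pass `IsComplete` on its own. Only the merged result is
held to that standard.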
### Defining a Partial Annotation
To define a partial annotation, use the
`net::DefinePartialNetworkTrafficAnnotation` function. Besides the unique id and
annotation text, this function requires the unique id of the completing part.
For example, a partial annotation that only specifies the semantics part of a
request with unique id "omnibox_prefetch_image", and is completed later using an
annotation with unique id "bitmap_fetcher", can be defined as follows:

```cpp
  net::PartialNetworkTrafficAnnotationTag partial_traffic_annotation =
      net::DefinePartialNetworkTrafficAnnotation("omnibox_prefetch_image",
                                                 "bitmap_fetcher", R"(
        semantics {
          sender: "Omnibox"
          description: "..."
          trigger: "..."
          data: "..."
          destination: WEBSITE
        })");
```

### Nx1 Partial Annotations
The cases where several partial annotations may be completed by one completing
annotation are called Nx1. This includes the case where N=1. To define a
completing annotation for such cases, use the
`net::CompleteNetworkTrafficAnnotation` function. This function receives a
unique id, the annotation text, and a `net::PartialNetworkTrafficAnnotationTag`
object. Here is an example of a completing part for the previous example:

```cpp
  net::NetworkTrafficAnnotationTag traffic_annotation =
      net::CompleteNetworkTrafficAnnotation("bitmap_fetcher",
                                            partial_traffic_annotation, R"(
        policy {
          cookies_allowed: YES
          cookies_store: "user"
          setting: "..."
          chrome_policy {...}
        })");
```

### 1xN Partial Annotations
There are cases where one partial traffic annotation may be completed by
different completing annotations. In these cases, the
`net::BranchedCompleteNetworkTrafficAnnotation` function can be used. This
function has an extra argument that is common between all branches and is
referred to by the partial annotation. For the above examples, if there were
two different ways of completing the received partial annotation, the following
definition can be used:

```cpp
if (...) {
  return net::BranchedCompleteNetworkTrafficAnnotation(
      "bitmap_fetcher_type1", "bitmap_fetcher",
      partial_traffic_annotation, R"(
        policy {
          cookies_allowed: YES
          cookies_store: "user"
          setting: "..."
          chrome_policy {...}
        })");
} else {
  return net::BranchedCompleteNetworkTrafficAnnotation(
      "bitmap_fetcher_type2", "bitmap_fetcher",
      partial_traffic_annotation, R"(
        policy {
          cookies_allowed: YES
          cookies_store: "system"
          setting: "..."
          chrome_policy {...}
        })");
}
```

Please refer to `tools/traffic_annotation/sample_traffic_annotation.cc` for more
detailed examples.


## Mutable Annotations (Advanced)

`net::NetworkTrafficAnnotationTag` and `net::PartialNetworkTrafficAnnotationTag`
are defined with constant internal argument(s), so that once they are created,
they cannot be modified. There are very few exceptions that may require
modification of the annotation value, like the ones used by mojo interfaces
where, after serialization, the annotation object is first created and then
receives its value. In these cases, `net::MutableNetworkTrafficAnnotationTag`
and `net::MutablePartialNetworkTrafficAnnotationTag` can be used, which do not
have this limitation. It is strongly suggested that mutable annotations be used
only if there is no other way around it. Use cases are checked with the
`traffic_annotation_auditor` to ensure proper values for the mutable
annotations.
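
The difference between the constant and the mutable variants can be sketched
with stand-in types. Everything below is an assumption made for illustration:
`AnnotationTag`, `MutableAnnotationTag`, `ReadFromWire`, `ToImmutable`, the
`unique_id_hash_code` member, and the `kUninitialized` sentinel are
hypothetical and are not taken from the Chromium headers.

```cpp
#include <cassert>

// Assumed sentinel meaning "no value has been received yet".
constexpr int kUninitialized = -1;

// Stand-in for the constant tag: the member is fixed at construction.
struct AnnotationTag {
  const int unique_id_hash_code;
};

// Stand-in for the mutable tag: it can exist before its value is known.
struct MutableAnnotationTag {
  int unique_id_hash_code = kUninitialized;
  bool is_valid() const { return unique_id_hash_code != kUninitialized; }
};

// Mimics a deserializer: the object is created first and filled in later.
void ReadFromWire(int wire_value, MutableAnnotationTag* out) {
  out->unique_id_hash_code = wire_value;
}

// Converts back to the constant form once a value has arrived.
AnnotationTag ToImmutable(const MutableAnnotationTag& tag) {
  assert(tag.is_valid());
  return AnnotationTag{tag.unique_id_hash_code};
}
```

Keeping the mutable type confined to the deserialization boundary and
converting back to the constant type afterwards limits the places where a tag
can be changed.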