# Chrome Sync's Model API

Chrome Sync operates on discrete, explicitly defined model types (bookmarks,
preferences, tabs, etc). These model types are individually responsible for
implementing their own local storage and responding to remote changes. This
guide is for developers interested in syncing data for their model type to the
cloud using Chrome Sync. It describes the newest version of the API, known as
Unified Sync and Storage (USS). There is also the deprecated [SyncableService
API] (aka Directory), which as of early 2016 is still used by most model types.

[SyncableService API]: https://www.chromium.org/developers/design-documents/sync/syncable-service-api

[TOC]

## Overview

To correctly sync data, USS requires that sync metadata be stored alongside your
model data in a way such that they are written together atomically. **This is
very important!** Sync must be able to update the metadata for any local data
changes as part of the same write to disk. If you attempt to write data to disk
and only notify sync afterwards, a crash in between the two writes can result in
changes being dropped and never synced to the server, or data being duplicated
due to being committed more than once.
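
The atomicity requirement can be illustrated with a toy in-memory store. This is
a minimal sketch, not Chromium code: `ToyStore`, `ToyBatch`, the `data:` and
`meta:` key prefixes, and the `simulate_crash` flag are all hypothetical names
invented for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy types for illustration; not part of the sync API.
struct ToyBatch {
  // Pending key/value writes, applied all-or-nothing by ToyStore::Commit.
  std::vector<std::pair<std::string, std::string>> writes;

  void WriteData(const std::string& key, const std::string& value) {
    writes.emplace_back("data:" + key, value);
  }
  void WriteMetadata(const std::string& key, const std::string& value) {
    writes.emplace_back("meta:" + key, value);
  }
};

class ToyStore {
 public:
  // Applies every write in the batch, or none if `simulate_crash` is set.
  bool Commit(const ToyBatch& batch, bool simulate_crash) {
    if (simulate_crash) return false;
    for (const auto& write : batch.writes) disk_[write.first] = write.second;
    return true;
  }
  bool Has(const std::string& key) const { return disk_.count(key) > 0; }

 private:
  std::map<std::string, std::string> disk_;
};

// Writes an entity and its sync metadata in a single batch, so the two can
// never get out of step on disk.
bool WriteEntityAtomically(ToyStore& store, const std::string& key,
                           const std::string& value, bool simulate_crash) {
  assert(!key.empty());
  ToyBatch batch;
  batch.WriteData(key, value);
  batch.WriteMetadata(key, "needs_commit");
  return store.Commit(batch, simulate_crash);
}
```

Because both writes travel in one batch, a simulated crash leaves neither the
data nor the metadata behind, rather than one without the other.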

[`ModelTypeSyncBridge`][Bridge] is the interface the model code must implement.
The bridge tends to be either a [`KeyedService`][KeyedService] or owned by one.
The correct place for the bridge generally lies as close to where your model
data is stored as possible, as the bridge needs to be able to inject metadata
updates into any local data changes that occur.

The bridge has access to a [`ModelTypeChangeProcessor`][MTCP] object, which it
uses to communicate local changes to sync using the `Put` and `Delete` methods.
The processor will communicate remote changes from sync to the bridge using the
`ApplySyncChanges` method. [`MetadataChangeList`][MCL] is the way sync will
communicate metadata changes to the storage mechanism. Note that it is typically
implemented on a per-storage basis, not a per-type basis.

[Bridge]: https://cs.chromium.org/chromium/src/components/sync/model/model_type_sync_bridge.h
[KeyedService]: https://cs.chromium.org/chromium/src/components/keyed_service/core/keyed_service.h
[MTCP]: https://cs.chromium.org/chromium/src/components/sync/model/model_type_change_processor.h
[MCL]: https://cs.chromium.org/chromium/src/components/sync/model/metadata_change_list.h

## Data

### Specifics

Model types will define a proto that contains the necessary fields of the
corresponding native type (e.g. [`TypedUrlSpecifics`][TypedUrlSpecifics]
contains a URL and a list of visit timestamps) and include it as a field in the
generic [`EntitySpecifics`][EntitySpecifics] proto. This is the form that all
communications with sync will use. This proto form of the model data is referred
to as the specifics.

[TypedUrlSpecifics]: https://cs.chromium.org/chromium/src/components/sync/protocol/typed_url_specifics.proto
[EntitySpecifics]: https://cs.chromium.org/search/?q="message+EntitySpecifics"+file:sync.proto

### Identifiers

There are two primary identifiers for entities: **storage key** and **client
tag**. The bridge will need to take an [`EntityData`][EntityData] object (which
contains the specifics) and be able to generate both of these from it. For
non-legacy types without significant performance concerns, these will generally
be the same.

The storage key is used to uniquely identify entities locally within a client.
It's what's used to refer to entities most of the time and, as its name implies,
the bridge needs to be able to look up local data and metadata entries in the
store using it. Because it is a local identifier, it can change as part of
database migrations, etc. This may be desirable for efficiency reasons.

The client tag is used to generate the **client tag hash**, which will identify
entities **across clients**. This means that its implementation can **never
change** once entities have begun to sync, without risking massive duplication
of entities. It must therefore be generated using only immutable data in the
specifics. If your type does not have any immutable fields to use, you will need
to add one (e.g. a GUID, though be wary as they have the potential to conflict).
While the hash gets written to disk as part of the metadata, the tag itself is
never persisted locally.
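
As a rough illustration of the two identifiers, the sketch below derives both
from a GUID. `ToySpecifics` and `ToyEntityData` are hypothetical stand-ins for a
real specifics proto and `EntityData`, and the free functions are only an
assumed shape for the logic a bridge would need to provide.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a model type's specifics; not a real proto.
struct ToySpecifics {
  std::string guid;   // Immutable: assigned once when the entity is created.
  std::string title;  // Mutable: the user can edit it at any time.
};

// Hypothetical stand-in for EntityData, which carries the specifics.
struct ToyEntityData {
  ToySpecifics specifics;
};

// Client tag: may depend only on immutable fields, because its hash
// identifies the entity across clients.
std::string GetClientTag(const ToyEntityData& entity_data) {
  assert(!entity_data.specifics.guid.empty());
  return entity_data.specifics.guid;
}

// Storage key: a purely local identifier. For a simple type it can be the
// same value as the client tag.
std::string GetStorageKey(const ToyEntityData& entity_data) {
  return GetClientTag(entity_data);
}
```

Editing the mutable `title` leaves both identifiers unchanged, which is the
property the client tag depends on.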

[EntityData]: https://cs.chromium.org/chromium/src/components/sync/model/entity_data.h

## Storage

A crucial requirement of USS is that the model must add support for keeping
sync's metadata in the same storage as its normal data. The metadata consists of
one [`EntityMetadata`][EntityMetadata] proto for each data entity, and one
[`ModelTypeState`][ModelTypeState] proto containing metadata pertaining to the
state of the entire type (the progress marker, for example). This typically
requires two extra tables in a database (one for each type of proto).

Since the processor doesn't know anything about the store, the bridge provides
it with an implementation of the [`MetadataChangeList`][MCL] interface. The
change processor writes metadata through this interface when changes occur, and
the bridge simply has to ensure it gets passed along to the store and written
along with the data changes.

[EntityMetadata]: https://cs.chromium.org/chromium/src/components/sync/protocol/entity_metadata.proto
[ModelTypeState]: https://cs.chromium.org/chromium/src/components/sync/protocol/model_type_state.proto

### ModelTypeStore

While the model type may store its data however it chooses, many types use
[`ModelTypeStore`][Store], which was created specifically to provide a
convenient persistence solution. It's backed by a [LevelDB] to store serialized
protos to disk. `ModelTypeStore` provides two `MetadataChangeList`
implementations for convenience, both accessed via
[`ModelTypeStore::WriteBatch`][WriteBatch]. One passes metadata changes directly
into an existing `WriteBatch` and another caches them in memory until a
`WriteBatch` exists to consume them.
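
The difference between the two variants can be sketched as follows. Everything
here is a simplified assumption for illustration: `ToyMetadataChangeList`,
`ToyWriteBatch`, and both implementations are hypothetical, and the real
interface deals in protos and has more methods than the single one shown.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified interface standing in for MetadataChangeList.
class ToyMetadataChangeList {
 public:
  virtual ~ToyMetadataChangeList() = default;
  virtual void UpdateMetadata(const std::string& storage_key,
                              const std::string& serialized_metadata) = 0;
};

// Hypothetical write batch holding pending metadata writes.
struct ToyWriteBatch {
  std::map<std::string, std::string> metadata_writes;
};

// Variant 1: forwards each change straight into an existing batch.
class BatchBackedChangeList : public ToyMetadataChangeList {
 public:
  explicit BatchBackedChangeList(ToyWriteBatch* batch) : batch_(batch) {
    assert(batch_ != nullptr);
  }
  void UpdateMetadata(const std::string& storage_key,
                      const std::string& serialized_metadata) override {
    batch_->metadata_writes[storage_key] = serialized_metadata;
  }

 private:
  ToyWriteBatch* batch_;  // Not owned; must outlive this object.
};

// Variant 2: buffers changes in memory until a batch exists to consume them.
class InMemoryChangeList : public ToyMetadataChangeList {
 public:
  void UpdateMetadata(const std::string& storage_key,
                      const std::string& serialized_metadata) override {
    pending_[storage_key] = serialized_metadata;
  }
  // Copies all buffered changes into `batch`, then clears the buffer.
  void TransferChangesTo(ToyWriteBatch* batch) {
    for (const auto& entry : pending_) {
      batch->metadata_writes[entry.first] = entry.second;
    }
    pending_.clear();
  }
  bool empty() const { return pending_.empty(); }

 private:
  std::map<std::string, std::string> pending_;
};
```

The buffering variant is useful when sync hands over metadata changes before
the storage code has opened a batch; the direct variant avoids the extra copy
when a batch is already in hand.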

The store interface abstracts away the type and will handle setting up tables
for the type's data, so multiple `ModelTypeStore` objects for different types
can share the same LevelDB backend just by specifying the same path and task
runner. Sync already has a backend it uses for DeviceInfo that can be shared by
other types via the
[`ProfileSyncService::GetModelTypeStoreFactory`][StoreFactory] method.

[Store]: https://cs.chromium.org/chromium/src/components/sync/model/model_type_store.h
[LevelDB]: http://leveldb.org/
[WriteBatch]: https://cs.chromium.org/search/?q="class+WriteBatch"+file:model_type_store.h
[StoreFactory]: https://cs.chromium.org/search/?q=GetModelTypeStoreFactory+file:profile_sync_service.h

## Implementing ModelTypeSyncBridge

### Initialization

The bridge is required to load all of the metadata for its type from storage and
provide it to the processor via the [`ModelReadyToSync`][ModelReadyToSync]
method **before any local changes occur**. This can be tricky if the thread the
bridge runs on is different from the storage mechanism. No data will be synced
with the server if the processor is never informed that the model is ready.

Since the tracking of changes and updating of metadata is completely
independent, there is no need to wait for the sync engine to start before
changes can be made. This prevents the need for an expensive association step in
the initialization.
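
One possible way to respect this ordering when the metadata is loaded
asynchronously is to queue local changes until the load finishes. The sketch
below assumes that approach; `ToyProcessor`, `ToyBridge`, and
`OnMetadataLoaded` are hypothetical names, and a real bridge might instead
prevent the model from being modified until loading completes.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the change processor.
class ToyProcessor {
 public:
  void ModelReadyToSync(std::vector<std::string> metadata) {
    metadata_ = std::move(metadata);
    ready_ = true;
  }
  void Put(const std::string& storage_key) {
    assert(ready_);  // Local changes must not arrive before the metadata.
    puts_.push_back(storage_key);
  }
  bool ready() const { return ready_; }
  const std::vector<std::string>& puts() const { return puts_; }

 private:
  bool ready_ = false;
  std::vector<std::string> metadata_;
  std::vector<std::string> puts_;
};

// Hypothetical bridge that queues local writes until its metadata has loaded.
class ToyBridge {
 public:
  explicit ToyBridge(ToyProcessor* processor) : processor_(processor) {}

  // Called when the (possibly asynchronous) metadata read finishes.
  void OnMetadataLoaded(std::vector<std::string> metadata) {
    processor_->ModelReadyToSync(std::move(metadata));
    for (const std::string& key : queued_) processor_->Put(key);
    queued_.clear();
  }

  void WriteLocalChange(const std::string& storage_key) {
    if (!processor_->ready()) {
      queued_.push_back(storage_key);  // Defer until metadata is available.
      return;
    }
    processor_->Put(storage_key);
  }

 private:
  ToyProcessor* processor_;  // Not owned.
  std::vector<std::string> queued_;
};
```

A change made before `OnMetadataLoaded` runs is held back and replayed
immediately after the processor has been told the model is ready.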

[ModelReadyToSync]: https://cs.chromium.org/search/?q=ModelReadyToSync+file:/model_type_change_processor.h

### MergeSyncData

This method is called only once, when a type is first enabled. Sync will
download all the data it has for the type from the server and provide it to the
bridge using this method. Sync filters out any tombstones for this call, so
`EntityData::is_deleted()` will never be true for the provided entities. The
bridge must then examine the sync data and the local data and merge them
together:

* Any remote entities that don't exist locally must be written to local
  storage.
* Any local entities that don't exist remotely must be provided to sync via
  [`ModelTypeChangeProcessor::Put`][Put].
* Any entities that appear in both sets must be merged and the model and sync
  informed accordingly. Decide which copy of the data to use (or a merged
  version or neither) and update the local store and sync as necessary to
  reflect the decision. How the decision is made can vary by model type.

The [`MetadataChangeList`][MCL] passed into the function is already populated
with metadata for all the data passed in (note that neither the data nor the
metadata have been committed to storage yet at this point). It must be given to
the processor for any `Put` or `Delete` calls so the relevant metadata can be
added/updated/deleted, and then passed to the store for persisting along with
the data.

Note that if sync gets disabled and the metadata cleared, entities that
originated from other clients will exist as local entities the next time sync
starts and merge is called. Since tombstones are not provided for merge, this
can result in reviving the entity if it had been deleted on another client in
the meantime.
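
The three merge cases can be sketched over plain maps. This is an illustrative
assumption rather than the real method: `ToyMerge`, `MergeResult`, and
`PickWinner` are hypothetical, entities are reduced to strings, and the
"longer value wins" policy is an arbitrary placeholder for a type-specific
decision.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

using EntityMap = std::map<std::string, std::string>;  // storage key -> value

// Hypothetical result of a merge. A real bridge would write to its store and
// call the processor instead of returning these collections.
struct MergeResult {
  EntityMap local_writes;          // Entities to write to local storage.
  std::set<std::string> put_keys;  // Entities to hand to the processor's Put.
};

// Hypothetical type-specific policy: the longer value wins, and ties go to
// the remote copy.
const std::string& PickWinner(const std::string& local,
                              const std::string& remote) {
  return local.size() > remote.size() ? local : remote;
}

MergeResult ToyMerge(const EntityMap& local, const EntityMap& remote) {
  MergeResult result;
  for (const auto& entry : remote) {
    const std::string& key = entry.first;
    const std::string& remote_value = entry.second;
    auto it = local.find(key);
    if (it == local.end()) {
      result.local_writes[key] = remote_value;  // Remote only: store locally.
    } else if (it->second != remote_value) {
      // Present on both sides with different contents: resolve, then update
      // whichever side does not already hold the winning value.
      const std::string& winner = PickWinner(it->second, remote_value);
      if (winner != it->second) result.local_writes[key] = winner;
      if (winner != remote_value) result.put_keys.insert(key);
    }
  }
  for (const auto& entry : local) {
    // Local only: sync has never seen this entity, so it must be Put.
    if (remote.count(entry.first) == 0) result.put_keys.insert(entry.first);
  }
  return result;
}
```

Entities that are identical on both sides produce no work at all, which keeps
the first sync cheap when the local and remote data already agree.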

[Put]: https://cs.chromium.org/search/?q=Put+file:/model_type_change_processor.h

### ApplySyncChanges

While `MergeSyncData` provides the state of sync data using `EntityData`
objects, `ApplySyncChanges` provides changes to the state using
[`EntityChange`][EntityChange] objects. These changes must be applied to the
local state.

Here's an example implementation of a type using `ModelTypeStore`:

```cpp
base::Optional<ModelError> DeviceInfoSyncBridge::ApplySyncChanges(
    std::unique_ptr<MetadataChangeList> metadata_change_list,
    EntityChangeList entity_changes) {
  std::unique_ptr<WriteBatch> batch = store_->CreateWriteBatch();
  // Stage every remote change in the batch; nothing hits disk yet.
  for (const EntityChange& change : entity_changes) {
    if (change.type() == EntityChange::ACTION_DELETE) {
      batch->DeleteData(change.storage_key());
    } else {
      // `your_type()` stands for the accessor of your type's specifics field.
      batch->WriteData(change.storage_key(),
                       change.data().specifics.your_type().SerializeAsString());
    }
  }

  // Attach the metadata so it is committed atomically with the data.
  batch->TakeMetadataChangesFrom(std::move(metadata_change_list));
  store_->CommitWriteBatch(std::move(batch), base::Bind(...));
  NotifyModelOfChanges();
  return {};  // No error.
}
```

A conflict can occur when an entity has a pending local commit and an update
for the same entity arrives from another client. In this case, the bridge's
[`ResolveConflict`][ResolveConflict] method will have been called prior to the
`ApplySyncChanges` call in order to determine what should happen. This method
defaults to having the remote version overwrite the local version unless the
remote version is a tombstone, in which case the local version wins.
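
The default policy described above amounts to a single check. The sketch below
is only an approximation under stated assumptions: `Resolution` and `ToyEntity`
are hypothetical, and the real `ResolveConflict` works with richer types than
the single flag modelled here.

```cpp
// Hypothetical outcome of a conflict; the real method returns a richer type.
enum class Resolution { kUseLocal, kUseRemote };

// Hypothetical, minimal view of an entity for conflict purposes.
struct ToyEntity {
  bool is_deleted;  // True for a tombstone.
};

// Mirrors the default policy: the remote version wins unless it is a
// tombstone, in which case the local version is kept.
constexpr Resolution ResolveConflict(const ToyEntity& /*local*/,
                                     const ToyEntity& remote) {
  return remote.is_deleted ? Resolution::kUseLocal : Resolution::kUseRemote;
}

// Compile-time checks of both branches.
static_assert(ResolveConflict(ToyEntity{false}, ToyEntity{false}) ==
                  Resolution::kUseRemote,
              "a live remote entity overwrites the local one");
static_assert(ResolveConflict(ToyEntity{false}, ToyEntity{true}) ==
                  Resolution::kUseLocal,
              "a remote tombstone loses to the local entity");
```

A type that needs different behaviour, such as merging fields from both
versions, would replace this check with its own logic.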

[EntityChange]: https://cs.chromium.org/chromium/src/components/sync/model/entity_change.h
[ResolveConflict]: https://cs.chromium.org/search/?q=ResolveConflict+file:/model_type_sync_bridge.h

### Local changes

The [`ModelTypeChangeProcessor`][MTCP] must be informed of any local changes via
its `Put` and `Delete` methods. Since the processor cannot do any useful
metadata tracking until `MergeSyncData` is called, the `IsTrackingMetadata`
method is provided. It can be checked as an optimization to avoid needlessly
preparing the parameters for a `Put` or `Delete` call.

Here's an example of handling a local write using `ModelTypeStore`:

```cpp
void WriteLocalChange(std::string key, ModelData data) {
  std::unique_ptr<WriteBatch> batch = store_->CreateWriteBatch();
  // Skip building EntityData when sync is not tracking metadata yet.
  if (change_processor()->IsTrackingMetadata()) {
    change_processor()->Put(key, ModelToEntityData(data),
                            batch->GetMetadataChangeList());
  }
  // `specifics` is assumed to be the proto form of `data`; how it is obtained
  // is type-specific and not shown here.
  batch->WriteData(key, specifics->SerializeAsString());
  // Data and metadata are committed together in the same batch.
  store_->CommitWriteBatch(std::move(batch), base::Bind(...));
}
```

## Error handling

If any errors occur during store operations that could compromise the
consistency of the data and metadata, the processor's
[`ReportError`][ReportError] method should be called. The only exceptions are
errors during `MergeSyncData` or `ApplySyncChanges`, which should instead return
a [`ModelError`][ModelError].

Either path will inform sync of the error, which will stop all communications
with the server so bad data doesn't get synced. Since the metadata might no
longer be valid, the bridge will asynchronously receive a `DisableSync` call
(this is implemented by the abstract base class; subclasses don't need to do
anything). All the metadata will be cleared from the store (if possible), and
the type will be started again from scratch on the next client restart.
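
The two reporting paths can be contrasted in a small sketch. The names
`ToyModelError`, `ToyProcessor`, `ApplyChanges`, and `OnCommitDone` are
hypothetical, and `std::optional` is used here as a stand-in for the
`base::Optional` return type seen in the earlier example.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for ModelError.
struct ToyModelError {
  std::string message;
};

// Hypothetical stand-in for the change processor.
class ToyProcessor {
 public:
  void ReportError(const ToyModelError& error) { last_error_ = error; }
  bool has_error() const { return last_error_.has_value(); }

 private:
  std::optional<ToyModelError> last_error_;
};

// Path 1: inside MergeSyncData or ApplySyncChanges, return the error to the
// caller instead of reporting it.
std::optional<ToyModelError> ApplyChanges(bool store_write_ok) {
  if (!store_write_ok) return ToyModelError{"Failed to write batch."};
  return std::nullopt;
}

// Path 2: anywhere else, such as a store callback, report the error through
// the processor.
void OnCommitDone(ToyProcessor& processor, bool store_write_ok) {
  if (!store_write_ok) processor.ReportError({"Failed to commit batch."});
}
```

Keeping the two paths separate avoids reporting the same failure twice, once
through the return value and once through the processor.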

[ReportError]: https://cs.chromium.org/search/?q=ReportError+file:/model_type_change_processor.h
[ModelError]: https://cs.chromium.org/chromium/src/components/sync/model/model_error.h

## Sync Integration Checklist

* Define your specifics proto in [`//components/sync/protocol/`][protocol].
* Add a field for it to [`EntitySpecifics`][EntitySpecifics].
* Add it to the [`ModelType`][ModelType] enum and
  [`kModelTypeInfoMap`][info_map].
* Add it to the [proto value conversions][conversions] files.
* Register a [`ModelTypeController`][ModelTypeController] for your type in
  [`ProfileSyncComponentsFactoryImpl::RegisterDataTypes`][RegisterDataTypes].
* Tell sync how to access your `ModelTypeSyncBridge` in
  [`ChromeSyncClient::GetSyncBridgeForModelType`][GetSyncBridge].
* Add your KeyedService dependency to
  [`ProfileSyncServiceFactory`][ProfileSyncServiceFactory].
* Add it to the [start order list][kStartOrder].
* Add a field for encrypted data to [`NigoriSpecifics`][NigoriSpecifics].
* Add it to the two encrypted types translation functions in
  [`nigori_util.cc`][nigori_util].
* Add a [preference][pref_names] for tracking whether your type is enabled.
* Map your type to the pref in [`GetPrefNameForDataType`][GetPrefName].
* Check whether you should be part of a [pref group][RegisterPrefGroup].
* Add it to the `SyncModelTypes` enum and `SyncModelType` suffix in
  [`histograms.xml`][histograms].
* Add it to the [`SYNC_DATA_TYPE_HISTOGRAM`][DataTypeHistogram] macro.

[protocol]: https://cs.chromium.org/chromium/src/components/sync/protocol/
[ModelType]: https://cs.chromium.org/chromium/src/components/sync/base/model_type.h
[info_map]: https://cs.chromium.org/search/?q="kModelTypeInfoMap%5B%5D"+file:model_type.cc
[conversions]: https://cs.chromium.org/chromium/src/components/sync/protocol/proto_value_conversions.h
[ModelTypeController]: https://cs.chromium.org/chromium/src/components/sync/driver/model_type_controller.h
[RegisterDataTypes]: https://cs.chromium.org/search/?q="ProfileSyncComponentsFactoryImpl::RegisterDataTypes"
[GetSyncBridge]: https://cs.chromium.org/search/?q=GetSyncBridgeForModelType+file:chrome_sync_client.cc
[ProfileSyncServiceFactory]: https://cs.chromium.org/search/?q=:ProfileSyncServiceFactory%5C(%5C)
[kStartOrder]: https://cs.chromium.org/search/?q="kStartOrder[]"
[NigoriSpecifics]: https://cs.chromium.org/chromium/src/components/sync/protocol/nigori_specifics.proto
[nigori_util]: https://cs.chromium.org/chromium/src/components/sync/syncable/nigori_util.cc
[pref_names]: https://cs.chromium.org/chromium/src/components/sync/base/pref_names.h
[GetPrefName]: https://cs.chromium.org/search/?q=::GetPrefNameForDataType+file:sync_prefs.cc
[RegisterPrefGroup]: https://cs.chromium.org/search/?q=::RegisterPrefGroups+file:sync_prefs.cc
[histograms]: https://cs.chromium.org/chromium/src/tools/metrics/histograms/histograms.xml
[DataTypeHistogram]: https://cs.chromium.org/chromium/src/components/sync/base/data_type_histogram.h

## Testing

The [`TwoClientUssSyncTest`][UssTest] suite is probably a good place to start
for integration testing. Especially note the use of a `StatusChangeChecker` to
wait for events to happen.

[UssTest]: https://cs.chromium.org/chromium/src/chrome/browser/sync/test/integration/two_client_uss_sync_test.cc
300[UssTest]: https://cs.chromium.org/chromium/src/chrome/browser/sync/test/integration/two_client_uss_sync_test.cc