# Chrome Sync's Model API

Chrome Sync operates on discrete, explicitly defined model types (bookmarks,
preferences, tabs, etc.). These model types are individually responsible for
implementing their own local storage and responding to remote changes. This
guide is for developers interested in syncing data for their model type to the
cloud using Chrome Sync. It describes the newest version of the API, known as
Unified Sync and Storage (USS). There is also the deprecated [SyncableService
API] (aka Directory), which as of early 2016 is still used by most model types.

[SyncableService API]: https://www.chromium.org/developers/design-documents/sync/syncable-service-api

[TOC]

## Overview

To correctly sync data, USS requires that sync metadata be stored alongside your
model data so that the two are written together atomically. **This is
very important!** Sync must be able to update the metadata for any local data
changes as part of the same write to disk. If you attempt to write data to disk
and only notify sync afterwards, a crash in between the two writes can result in
changes being dropped and never synced to the server, or data being duplicated
due to being committed more than once.
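
The requirement can be pictured with a toy in-memory store. This is only a
minimal sketch and not the real storage API: `ToyBatch`, `ToyStore`,
`WriteEntity`, the `simulate_crash` flag, and the `data:`/`meta:` key prefixes
are all hypothetical names invented for illustration. It shows the data write
and the metadata write travelling in one batch, so that either both land or
neither does.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical batch: collects data and metadata writes for one commit.
struct ToyBatch {
  std::vector<std::pair<std::string, std::string>> writes;

  void WriteData(const std::string& key, const std::string& value) {
    writes.emplace_back("data:" + key, value);
  }
  void WriteMetadata(const std::string& key, const std::string& value) {
    writes.emplace_back("meta:" + key, value);
  }
};

// Hypothetical store: a commit applies every write in the batch or none.
class ToyStore {
 public:
  // Returns false and leaves the store untouched if `simulate_crash` is set.
  bool Commit(const ToyBatch& batch, bool simulate_crash = false) {
    if (simulate_crash) return false;  // Nothing from the batch is applied.
    for (const auto& write : batch.writes) {
      records_[write.first] = write.second;
    }
    return true;
  }
  bool Has(const std::string& key) const { return records_.count(key) > 0; }

 private:
  std::map<std::string, std::string> records_;
};

// Writes an entity and its sync metadata together in a single batch.
bool WriteEntity(ToyStore& store, const std::string& key,
                 const std::string& data, const std::string& metadata,
                 bool simulate_crash = false) {
  assert(!key.empty());
  ToyBatch batch;
  batch.WriteData(key, data);
  batch.WriteMetadata(key, metadata);
  return store.Commit(batch, simulate_crash);
}
```

With this shape there is no window in which the data exists on disk without its
metadata, which is the failure mode described above.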

[`ModelTypeSyncBridge`][Bridge] is the interface the model code must implement.
The bridge tends to be either a [`KeyedService`][KeyedService] or owned by one.
The correct place for the bridge generally lies as close to where your model
data is stored as possible, as the bridge needs to be able to inject metadata
updates into any local data changes that occur.

The bridge has access to a [`ModelTypeChangeProcessor`][MTCP] object, which it
uses to communicate local changes to sync using the `Put` and `Delete` methods.
The processor will communicate remote changes from sync to the bridge using the
`ApplySyncChanges` method. [`MetadataChangeList`][MCL] is the way sync will
communicate metadata changes to the storage mechanism. Note that it is typically
implemented on a per-storage basis, not a per-type basis.

[Bridge]: https://cs.chromium.org/chromium/src/components/sync/model/model_type_sync_bridge.h
[KeyedService]: https://cs.chromium.org/chromium/src/components/keyed_service/core/keyed_service.h
[MTCP]: https://cs.chromium.org/chromium/src/components/sync/model/model_type_change_processor.h
[MCL]: https://cs.chromium.org/chromium/src/components/sync/model/metadata_change_list.h

## Data

### Specifics

Model types will define a proto that contains the necessary fields of the
corresponding native type (e.g. [`TypedUrlSpecifics`][TypedUrlSpecifics]
contains a URL and a list of visit timestamps) and include it as a field in the
generic [`EntitySpecifics`][EntitySpecifics] proto. This is the form that all
communications with sync will use. This proto form of the model data is referred
to as the specifics.

[TypedUrlSpecifics]: https://cs.chromium.org/chromium/src/components/sync/protocol/typed_url_specifics.proto
[EntitySpecifics]: https://cs.chromium.org/search/?q="message+EntitySpecifics"+file:sync.proto

### Identifiers

There are two primary identifiers for entities: **storage key** and **client
tag**. The bridge will need to take an [`EntityData`][EntityData] object (which
contains the specifics) and be able to generate both of these from it. For
non-legacy types without significant performance concerns, these will generally
be the same.

The storage key is used to uniquely identify entities locally within a client.
It’s what’s used to refer to entities most of the time and, as its name implies,
the bridge needs to be able to look up local data and metadata entries in the
store using it. Because it is a local identifier, it can change as part of
database migrations, etc. This may be desirable for efficiency reasons.

The client tag is used to generate the **client tag hash**, which will identify
entities **across clients**. This means that its implementation can **never
change** once entities have begun to sync, without risking massive duplication
of entities. It must therefore be generated using only immutable data in the
specifics. If your type does not have any immutable fields to use, you will need
to add one (e.g. a GUID, though be wary as they have the potential to conflict).
While the hash gets written to disk as part of the metadata, the tag itself is
never persisted locally.
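
As a rough illustration of the immutability rule, here is a minimal sketch
using a made-up specifics struct. `ExampleSpecifics` and its fields are
hypothetical, and the two free functions only stand in for whatever the bridge
does with an `EntityData`; they are not the real bridge signatures.

```cpp
#include <cassert>
#include <string>

// Hypothetical specifics for an example type; not a real sync proto.
struct ExampleSpecifics {
  std::string guid;   // Immutable: assigned once when the entity is created.
  std::string title;  // Mutable: may be edited at any time.
};

// Client tag: derived only from immutable fields, because it identifies the
// entity across clients and its derivation can never change.
std::string GetClientTag(const ExampleSpecifics& specifics) {
  assert(!specifics.guid.empty());
  return specifics.guid;
}

// Storage key: a purely local identifier. For simple types it can simply be
// the same value as the client tag.
std::string GetStorageKey(const ExampleSpecifics& specifics) {
  return GetClientTag(specifics);
}
```

Editing `title` leaves both identifiers unchanged. Had the tag been built from
`title`, every rename would look like a brand new entity to other clients.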

[EntityData]: https://cs.chromium.org/chromium/src/components/sync/model/entity_data.h

## Storage

A crucial requirement of USS is that the model must add support for keeping
sync’s metadata in the same storage as its normal data. The metadata consists of
one [`EntityMetadata`][EntityMetadata] proto for each data entity, and one
[`ModelTypeState`][ModelTypeState] proto containing metadata pertaining to the
state of the entire type (the progress marker, for example). This typically
requires two extra tables in a database to do (one for each type of proto).

Since the processor doesn’t know anything about the store, the bridge provides
it with an implementation of the [`MetadataChangeList`][MCL] interface. The
change processor writes metadata through this interface when changes occur, and
the bridge simply has to ensure it gets passed along to the store and written
along with the data changes.
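
To make that division of labor concrete, the following is a minimal sketch of
a change list that caches metadata changes in memory and later replays them
onto a store. Everything here is an assumption made for illustration:
`ToyMetadataChangeList` is a simplified stand-in rather than the real
interface, `InMemoryChangeList` and `ApplyTo` are hypothetical, and plain
strings stand in for serialized metadata protos.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Simplified, hypothetical stand-in for the interface the processor writes to.
class ToyMetadataChangeList {
 public:
  virtual ~ToyMetadataChangeList() = default;
  virtual void UpdateMetadata(const std::string& storage_key,
                              const std::string& serialized_metadata) = 0;
  virtual void ClearMetadata(const std::string& storage_key) = 0;
};

// Caches changes in memory until the bridge is ready to write them.
class InMemoryChangeList : public ToyMetadataChangeList {
 public:
  void UpdateMetadata(const std::string& storage_key,
                      const std::string& serialized_metadata) override {
    assert(!storage_key.empty());
    pending_[storage_key] = std::make_pair(true, serialized_metadata);
  }
  void ClearMetadata(const std::string& storage_key) override {
    assert(!storage_key.empty());
    pending_[storage_key] = std::make_pair(false, std::string());
  }

  // Replays the cached changes onto the store's metadata table. Only the last
  // change recorded for each key is applied.
  void ApplyTo(std::map<std::string, std::string>& metadata_table) const {
    for (const auto& change : pending_) {
      if (change.second.first) {
        metadata_table[change.first] = change.second.second;
      } else {
        metadata_table.erase(change.first);
      }
    }
  }

 private:
  // Key -> (is_update, serialized metadata). `false` means "clear".
  std::map<std::string, std::pair<bool, std::string>> pending_;
};
```

The processor only ever sees the abstract interface, while the bridge decides
when the cached changes reach the store, ideally in the same write as the
corresponding data.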

[EntityMetadata]: https://cs.chromium.org/chromium/src/components/sync/protocol/entity_metadata.proto
[ModelTypeState]: https://cs.chromium.org/chromium/src/components/sync/protocol/model_type_state.proto

### ModelTypeStore

While the model type may store its data however it chooses, many types use
[`ModelTypeStore`][Store], which was created specifically to provide a
convenient persistence solution. It’s backed by [LevelDB] and stores serialized
protos on disk. `ModelTypeStore` provides two `MetadataChangeList`
implementations for convenience, both accessed via
[`ModelTypeStore::WriteBatch`][WriteBatch]. One passes metadata changes directly
into an existing `WriteBatch` and the other caches them in memory until a
`WriteBatch` exists to consume them.

The store interface abstracts away the type and will handle setting up tables
for the type’s data, so multiple `ModelTypeStore` objects for different types
can share the same LevelDB backend just by specifying the same path and task
runner. Sync already has a backend it uses for DeviceInfo that can be shared by
other types via the
[`ProfileSyncService::GetModelTypeStoreFactory`][StoreFactory] method.

[Store]: https://cs.chromium.org/chromium/src/components/sync/model/model_type_store.h
[LevelDB]: http://leveldb.org/
[WriteBatch]: https://cs.chromium.org/search/?q="class+WriteBatch"+file:model_type_store.h
[StoreFactory]: https://cs.chromium.org/search/?q=GetModelTypeStoreFactory+file:profile_sync_service.h

## Implementing ModelTypeSyncBridge

### Initialization

The bridge is required to load all of the metadata for its type from storage and
provide it to the processor via the [`ModelReadyToSync`][ModelReadyToSync]
method **before any local changes occur**. This can be tricky if the bridge runs
on a different thread from the storage mechanism. No data will be synced
with the server if the processor is never informed that the model is ready.

Since the tracking of changes and updating of metadata is completely
independent, there is no need to wait for the sync engine to start before
changes can be made. This avoids the need for an expensive association step
during initialization.

[ModelReadyToSync]: https://cs.chromium.org/search/?q=ModelReadyToSync+file:/model_type_change_processor.h

### MergeSyncData

This method is called only once, when a type is first enabled. Sync will
download all the data it has for the type from the server and provide it to the
bridge using this method. Sync filters out any tombstones for this call, so
`EntityData::is_deleted()` will never be true for the provided entities. The
bridge must then examine the sync data and the local data and merge them
together:

*   Any remote entities that don’t exist locally must be written to local
    storage.
*   Any local entities that don’t exist remotely must be provided to sync via
    [`ModelTypeChangeProcessor::Put`][Put].
*   Any entities that appear in both sets must be merged and the model and sync
    informed accordingly. Decide which copy of the data to use (or a merged
    version or neither) and update the local store and sync as necessary to
    reflect the decision. How the decision is made can vary by model type.
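
The three cases listed above can be sketched with plain maps. This is a
deliberately simplified illustration and not bridge code: `EntityMap`,
`MergeResult`, and `Merge` are hypothetical names, string values stand in for
specifics, and the "remote wins" choice for entities present on both sides is
just one possible policy.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using EntityMap = std::map<std::string, std::string>;  // storage key -> value

struct MergeResult {
  EntityMap local;                       // Local store contents after merging.
  std::vector<std::string> keys_to_put;  // Keys to hand to sync via Put.
};

// Merges remote data into local data. Assumed policy: remote wins on overlap.
MergeResult Merge(EntityMap local, const EntityMap& remote) {
  MergeResult result;
  // Remote-only entities get written locally. For entities on both sides this
  // sketch simply lets the remote copy overwrite the local one.
  for (const auto& entry : remote) {
    local[entry.first] = entry.second;
  }
  // Local-only entities must be reported to sync.
  for (const auto& entry : local) {
    if (remote.count(entry.first) == 0) {
      result.keys_to_put.push_back(entry.first);
    }
  }
  result.local = std::move(local);
  assert(result.local.size() >= remote.size());
  return result;
}
```

A real bridge would perform the local writes and the `Put` calls against the
same batch so the resulting data and metadata are persisted together.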

The [`MetadataChangeList`][MCL] passed into the function is already populated
with metadata for all the data passed in (note that neither the data nor the
metadata have been committed to storage yet at this point). It must be given to
the processor for any `Put` or `Delete` calls so the relevant metadata can be
added/updated/deleted, and then passed to the store for persisting along with
the data.

Note that if sync gets disabled and the metadata cleared, entities that
originated from other clients will exist as “local” entities the next time sync
starts and merge is called. Since tombstones are not provided for merge, this
can result in reviving the entity if it had been deleted on another client in
the meantime.

[Put]: https://cs.chromium.org/search/?q=Put+file:/model_type_change_processor.h

### ApplySyncChanges

While `MergeSyncData` provides the state of sync data using `EntityData`
objects, `ApplySyncChanges` provides changes to the state using
[`EntityChange`][EntityChange] objects. These changes must be applied to the
local state.

Here’s an example implementation of a type using `ModelTypeStore`:

```cpp
base::Optional<ModelError> DeviceInfoSyncBridge::ApplySyncChanges(
    std::unique_ptr<MetadataChangeList> metadata_change_list,
    EntityChangeList entity_changes) {
  std::unique_ptr<WriteBatch> batch = store_->CreateWriteBatch();
  // Stage every remote change in the batch: deletions remove the entry,
  // everything else overwrites it with the serialized specifics.
  for (const EntityChange& change : entity_changes) {
    if (change.type() == EntityChange::ACTION_DELETE) {
      batch->DeleteData(change.storage_key());
    } else {
      batch->WriteData(change.storage_key(),
                       change.data().specifics.your_type().SerializeAsString());
    }
  }

  // Fold the metadata changes into the same batch so that data and metadata
  // are committed together.
  batch->TakeMetadataChangesFrom(std::move(metadata_change_list));
  store_->CommitWriteBatch(std::move(batch), base::Bind(...));
  NotifyModelOfChanges();
  return {};
}
```

A conflict can occur when an entity has a pending local commit and an update
for the same entity arrives from another client. In this case, the bridge’s
[`ResolveConflict`][ResolveConflict] method will have been called prior to the
`ApplySyncChanges` call in order to determine what should happen. This method
defaults to having the remote version overwrite the local version unless the
remote version is a tombstone, in which case the local version wins.
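
The default policy described above is small enough to write out. The sketch
below is illustrative only: `EntityVersion`, `Resolution`, and
`ResolveConflictDefault` are hypothetical names, not the real `ResolveConflict`
signature or its return type.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified view of one side of a conflict.
struct EntityVersion {
  std::string storage_key;
  bool is_deleted;  // True if this version is a tombstone.
};

enum class Resolution { kUseLocal, kUseRemote };

// Mirrors the default described above: the remote version wins unless it is a
// tombstone, in which case the local version wins.
Resolution ResolveConflictDefault(const EntityVersion& local,
                                  const EntityVersion& remote) {
  // Both versions must describe the same entity for this to be a conflict.
  assert(local.storage_key == remote.storage_key);
  return remote.is_deleted ? Resolution::kUseLocal : Resolution::kUseRemote;
}
```

A type that needs something else, such as a field-by-field merge, would
override the method rather than rely on this default.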

[EntityChange]: https://cs.chromium.org/chromium/src/components/sync/model/entity_change.h
[ResolveConflict]: https://cs.chromium.org/search/?q=ResolveConflict+file:/model_type_sync_bridge.h

### Local changes

The [`ModelTypeChangeProcessor`][MTCP] must be informed of any local changes via
its `Put` and `Delete` methods. Since the processor cannot do any useful
metadata tracking until `MergeSyncData` is called, the `IsTrackingMetadata`
method is provided. It can be checked as an optimization to avoid unnecessary
work preparing the parameters to a `Put` or `Delete` call.

Here’s an example of handling a local write using `ModelTypeStore`:

```cpp
void WriteLocalChange(std::string key, ModelData data) {
  std::unique_ptr<WriteBatch> batch = store_->CreateWriteBatch();
  // Only involve the processor once it is tracking metadata. Its metadata
  // changes go into the same batch as the data write below.
  if (change_processor()->IsTrackingMetadata()) {
    change_processor()->Put(key, ModelToEntityData(data),
                            batch->GetMetadataChangeList());
  }
  // `specifics` stands for the proto form of `data`; the conversion is
  // type-specific and is not shown in this example.
  batch->WriteData(key, specifics->SerializeAsString());
  store_->CommitWriteBatch(std::move(batch), base::Bind(...));
}
```

## Error handling

If any errors occur during store operations that could compromise the
consistency of the data and metadata, the processor’s
[`ReportError`][ReportError] method should be called. The only exception to this
is errors during `MergeSyncData` or `ApplySyncChanges`, which should just return
a [`ModelError`][ModelError].

This will inform sync of the error, which will stop all communications with the
server so bad data doesn’t get synced. Since the metadata might no longer be
valid, the bridge will asynchronously receive a `DisableSync` call (this is
implemented by the abstract base class; subclasses don’t need to do anything).
All the metadata will be cleared from the store (if possible), and the type will
be started again from scratch on the next client restart.

[ReportError]: https://cs.chromium.org/search/?q=ReportError+file:/model_type_change_processor.h
[ModelError]: https://cs.chromium.org/chromium/src/components/sync/model/model_error.h

## Sync Integration Checklist

*   Define your specifics proto in [`//components/sync/protocol/`][protocol].
*   Add a field for it to [`EntitySpecifics`][EntitySpecifics].
*   Add it to the [`ModelType`][ModelType] enum and
    [`kModelTypeInfoMap`][info_map].
*   Add it to the [proto value conversions][conversions] files.
*   Register a [`ModelTypeController`][ModelTypeController] for your type in
    [`ProfileSyncComponentsFactoryImpl::RegisterDataTypes`][RegisterDataTypes].
*   Tell sync how to access your `ModelTypeSyncBridge` in
    [`ChromeSyncClient::GetSyncBridgeForModelType`][GetSyncBridge].
*   Add your KeyedService dependency to
    [`ProfileSyncServiceFactory`][ProfileSyncServiceFactory].
*   Add it to the [start order list][kStartOrder].
*   Add a field for encrypted data to [`NigoriSpecifics`][NigoriSpecifics].
*   Add it to the two encrypted types translation functions in
    [`nigori_util.cc`][nigori_util].
*   Add a [preference][pref_names] for tracking whether your type is enabled.
*   Map your type to the pref in [`GetPrefNameForDataType`][GetPrefName].
*   Check whether your type should be part of a [pref group][RegisterPrefGroup].
*   Add it to the `SyncModelTypes` enum and `SyncModelType` suffix in
    [`histograms.xml`][histograms].
*   Add it to the [`SYNC_DATA_TYPE_HISTOGRAM`][DataTypeHistogram] macro.

[protocol]: https://cs.chromium.org/chromium/src/components/sync/protocol/
[ModelType]: https://cs.chromium.org/chromium/src/components/sync/base/model_type.h
[info_map]: https://cs.chromium.org/search/?q="kModelTypeInfoMap%5B%5D"+file:model_type.cc
[conversions]: https://cs.chromium.org/chromium/src/components/sync/protocol/proto_value_conversions.h
[ModelTypeController]: https://cs.chromium.org/chromium/src/components/sync/driver/model_type_controller.h
[RegisterDataTypes]: https://cs.chromium.org/search/?q="ProfileSyncComponentsFactoryImpl::RegisterDataTypes"
[GetSyncBridge]: https://cs.chromium.org/search/?q=GetSyncBridgeForModelType+file:chrome_sync_client.cc
[ProfileSyncServiceFactory]: https://cs.chromium.org/search/?q=:ProfileSyncServiceFactory%5C(%5C)
[kStartOrder]: https://cs.chromium.org/search/?q="kStartOrder[]"
[NigoriSpecifics]: https://cs.chromium.org/chromium/src/components/sync/protocol/nigori_specifics.proto
[nigori_util]: https://cs.chromium.org/chromium/src/components/sync/syncable/nigori_util.cc
[pref_names]: https://cs.chromium.org/chromium/src/components/sync/base/pref_names.h
[GetPrefName]: https://cs.chromium.org/search/?q=::GetPrefNameForDataType+file:sync_prefs.cc
[RegisterPrefGroup]: https://cs.chromium.org/search/?q=::RegisterPrefGroups+file:sync_prefs.cc
[histograms]: https://cs.chromium.org/chromium/src/tools/metrics/histograms/histograms.xml
[DataTypeHistogram]: https://cs.chromium.org/chromium/src/components/sync/base/data_type_histogram.h

## Testing

The [`TwoClientUssSyncTest`][UssTest] suite is probably a good place to start
for integration testing. Especially note the use of a `StatusChangeChecker` to
wait for events to happen.

[UssTest]: https://cs.chromium.org/chromium/src/chrome/browser/sync/test/integration/two_client_uss_sync_test.cc