As I continue to plan apps for iOS (both future apps and new features for my existing P2R app), one thing I keep coming back to is the need for data synchronization. In this day of multiple devices and platforms, it is increasingly necessary to ensure users have access to their data no matter which platform or device they are using. Having an app that a user can use on both his iPad and iPhone isn’t much use if the data on each is kept separate.
Creating a method to synchronize data is not trivial. There are numerous sync approaches out there, each with its plusses and minuses. In the case of a sync server for iOS apps, the approach I’m taking is fairly straightforward. Known as Optimistic Concurrency Control, this approach relies on the assumption that data conflicts are likely to be relatively minimal and not likely to result from multiple clients attempting to synchronize at the same time. Since we’re dealing with iOS apps (and possibly a desktop version), the assumption is fairly safe. There will not be multiple users trying to sync the same data at the same time. Worst case scenario, a user tells his iPhone and iPad to sync at the same time.
The following process is used for synchronization:
- For each user/app combination, there is a global revision id and global revision number. If John Smith starts to synchronize P2R, all the data he syncs is collectively tracked with a global revision id and revision number.
- The revision id never changes. The revision number is incremented any time the user synchronizes data that causes a change on the server. If the global revision number is 17 and the user uploads some new notes, the global revision number increments to 18 to reflect that changes have been made.
- Every piece of data should be tracked separately. In the case of P2R, every user, every collection, every note, every review flag should be stored as its own discrete data unit with its own unique revision id and its own distinct revision number. Keeping each item separate will increase the number of things that need to be tracked but should also reduce the amount of information that has to be resolved in case of conflict.
- In addition to the global revision id and number, there is also a revision tracking field. Every piece of data has its own unique revision id, which never changes. The revision tracking field stores the revision ids of any data units that changed in a given global revision number. If a user uploads changes to the data units with revision ids 4, 6, and 18 in global revision number 8, then these ids are stored in the revision tracking field for revision 8.
- Whenever any data changes on the client, a flag is set on the data unit telling the client that this data unit will need to be sent to the server during the next synchronization.
- When the client sends a sync request to the server, in addition to any changes, it also sends the global revision number it received from the server during the last sync. The server looks to see which revision ids have been changed since the client last synced. After updating its data with the new data from the client, the server sends any changed data back to the client.
- Data sent from the server to the client will include any data units the client uploaded to the server. When the client reloads this data, it will reset the flag indicating the data was changed. This is done to help guarantee the changes were actually loaded on the server. If the flag does not get reset on the client, then next time a sync takes place it will again try to upload any data with the change flag set. This has the potential of creating unnecessary conflict, but has the advantage of providing a way of ensuring client data makes it to the server.
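The steps above can be sketched in a few lines of Python. This is only a minimal in-memory illustration, not the actual sync server: the names `SyncStore`, `Client`, `changes`, and `dirty` are made up for the example, payloads are plain strings, and conflicts are ignored entirely (an upload simply overwrites what the server has).

```python
from dataclasses import dataclass, field


@dataclass
class DataUnit:
    unit_id: str       # per-unit revision id; never changes
    revision: int = 0  # per-unit revision number
    payload: str = ""


@dataclass
class SyncStore:
    """Hypothetical server-side state for one user/app combination."""
    global_revision: int = 0
    units: dict = field(default_factory=dict)    # unit_id -> DataUnit
    # The "revision tracking field": global revision -> ids changed in it.
    changes: dict = field(default_factory=dict)

    def sync(self, last_seen_revision, uploads):
        """Apply uploads, then return (global revision, payloads to send back)."""
        # Ids that changed on the server after the client's last sync.
        changed_since = set()
        for revision, unit_ids in self.changes.items():
            if revision > last_seen_revision:
                changed_since |= unit_ids

        applied = set()
        for unit_id, payload in uploads.items():
            unit = self.units.get(unit_id)
            if unit is None:
                unit = self.units[unit_id] = DataUnit(unit_id)
            elif unit.payload == payload:
                continue  # nothing actually changed on the server
            unit.payload = payload
            unit.revision += 1
            applied.add(unit_id)

        # Bump the global revision number only if something really changed.
        if applied:
            self.global_revision += 1
            self.changes[self.global_revision] = applied

        # Echo uploaded units back so the client can clear its change flags.
        to_send = changed_since | set(uploads)
        return self.global_revision, {u: self.units[u].payload for u in to_send}


@dataclass
class Client:
    """Hypothetical client holding local data and per-unit change flags."""
    last_revision: int = 0
    local: dict = field(default_factory=dict)  # unit_id -> payload
    dirty: set = field(default_factory=set)    # ids flagged for upload

    def edit(self, unit_id, payload):
        self.local[unit_id] = payload
        self.dirty.add(unit_id)

    def sync_with(self, store):
        uploads = {u: self.local[u] for u in self.dirty}
        revision, returned = store.sync(self.last_revision, uploads)
        for unit_id, payload in returned.items():
            self.local[unit_id] = payload
            self.dirty.discard(unit_id)  # cleared only once the server echoes it
        self.last_revision = revision
```

If a store starts at global revision 17 and a phone client uploads a new note, the store moves to revision 18 and records the note's id under that revision. A tablet that last synced at revision 17 then receives the note on its next sync without bumping the revision number again.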
Those are the basic steps for synchronization and revision. Optimally, the server will be written with a web interface that gives users access to old versions of their data, providing a kind of backup to their information. Also, the web interface should provide a means for users to download all of their data.
So far I’ve not said anything about conflict management. Conflict occurs when one client connects to the server with an old revision number and uploads new data. Following that, another client connects to the server with an old revision number and tries to upload data for the same data unit(s) as the first client. This creates competing sets of data. Which data should be stored on the server?
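Detecting this situation falls out of the revision tracking field: a conflict exists for any uploaded data unit that also changed on the server after the revision number the client last saw. Here is a rough sketch, assuming the tracking data is available as a mapping from global revision number to the set of ids changed in that revision. The function name and the mapping shape are my own for illustration.

```python
def find_conflicts(client_revision, uploaded_ids, changes):
    """Return uploaded ids that also changed on the server after client_revision.

    `changes` is assumed to map global revision number -> set of unit ids
    changed in that revision.
    """
    changed_since = set()
    for revision, unit_ids in changes.items():
        if revision > client_revision:
            changed_since |= unit_ids
    return changed_since & set(uploaded_ids)


# The iPhone already pushed note-6 in revision 18; the iPad is still at 17.
changes = {17: {"note-4"}, 18: {"note-6", "flag-2"}}
conflicts = find_conflicts(17, ["note-6", "note-9"], changes)
```

Here only `note-6` is in conflict. `note-9` was not touched on the server after revision 17, so it can be stored as-is.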
The process for handling conflict would look something like:
- Conflict will be resolved automatically by the server. No user input should be required.
- Conflict resolution will vary depending on data type. The following items use P2R as an example.
- When synchronizing collections, don’t store any text collection provided by the client. Instead, always point the user to the latest version of the text collection already on the server.
- If synchronizing a custom collection, always use the collection with the most recent timestamp.
- If synchronizing users (username), always use the user entry with the most recent timestamp.
- If synchronizing weekly reviews, don’t allow duplicate entries but otherwise do no conflict resolution.
- If synchronizing notes and the notes differ, concatenate the notes with the newest copy on top.
- If synchronizing which collection/week a user is currently viewing, use the most recent.
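One way to express the per-type rules above is a small table of resolver functions keyed by data type. The sketch below is an assumption about how this might look, not the finished server code: the type names, the `Version` class, and the blank-line separator for merged notes are all invented for the example, and it assumes device timestamps are comparable (clock skew between devices is not handled).

```python
from dataclasses import dataclass


@dataclass
class Version:
    value: str
    timestamp: float  # seconds since epoch; assumed comparable across devices


def keep_server(server, client):
    """Text collections: always keep what is already on the server."""
    return server


def newest_wins(server, client):
    """Custom collections, users, current view: most recent timestamp wins."""
    return client if client.timestamp > server.timestamp else server


def merge_notes(server, client):
    """Notes: if they differ, concatenate with the newest copy on top."""
    if server.value == client.value:
        return server
    if client.timestamp > server.timestamp:
        newer, older = client, server
    else:
        newer, older = server, client
    return Version(newer.value + "\n\n" + older.value, newer.timestamp)


def merge_reviews(server_entries, client_entries):
    """Weekly reviews: keep every entry once, with no other resolution."""
    merged = list(server_entries)
    for entry in client_entries:
        if entry not in merged:
            merged.append(entry)
    return merged


# Hypothetical data type names mapped to their resolution rule.
RESOLVERS = {
    "text_collection": keep_server,
    "custom_collection": newest_wins,
    "user": newest_wins,
    "note": merge_notes,
    "current_view": newest_wins,
}


def resolve(kind, server, client):
    """Pick the version to store for a conflicting data unit of type `kind`."""
    return RESOLVERS[kind](server, client)
```

Keeping the rules in a lookup table means another app could reuse the same server with a different set of resolvers, and no rule requires user input.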
That should handle synchronization and conflict resolution for P2R and provide some examples when considering sync methods for other apps. Currently, the code is still in the (very) early stages. As I write the sync server I may find that some of the methods proposed in this post simply don’t work. I’ll provide updates as I go. In the meantime, let me know if you see any holes in my plan or know of a better way of doing this!



