The Mental Blog

Software with Intellect

Under the Sheets with iCloud and Core Data: How it Works

Apple’s documentation for Core Data syncing via iCloud offers very little insight into how it actually works. This is probably quite deliberate: Apple sees this as an implementation detail that developers should not concern themselves with. Unfortunately, if you have no concept of what is happening behind the scenes, at least in vague terms, it is difficult to grasp what you should expect during the syncing process.

To give a concrete example, I was initially under the impression that importing of transaction logs took place at a low level, independent of my application code. This is not actually the case. The import uses the same model classes that the rest of the app uses, and calls the same custom accessor methods. If those accessors cause side effects, such as the creation of new objects, the import fails. Not understanding this ‘implementation detail’ ended up costing me a couple of weeks of head scratching.

We will return to side effects in a future post, but today we will focus on how iCloud syncing works behind the scenes.

Pushing Bits

At its heart, iCloud is not much more than a folder — the iCloud container — that syncs its contents between devices, just like Dropbox. The folder is not readily visible to an end-user — it is hidden away at ~/Library/Mobile Documents — but it is there, and if you drop a file in, it will magically appear on your other iCloud devices.

So how does Core Data fit in? At first you may be tempted to think Core Data is talking to an intelligent server in the cloud, one that understands your entity relationships. This is how Core Data syncing worked under MobileMe. But iCloud syncing is an altogether simpler affair on the server side: the server has no understanding of Core Data at all; it just pushes bits from one device to another.

The intelligence lies entirely in the Core Data framework on each client device. The framework registers changes when your app saves, and writes them out as transaction log files in the iCloud container. iCloud copies the files to the cloud, and then to your other devices. The Core Data framework on these devices notices the new log files, and imports them.

In other words, where MobileMe had a single authoritative database in the cloud, there is now just a collection of transaction logs from different devices, and a number of clients doing their best to reconcile those logs independently. As long as the updates are reasonably distinct, it works well, but when changes overlap considerably, you can get quite tricky situations. Much of the discussion in this series of posts is about working around the conflicts that can arise.
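The log-shipping arrangement described above can be pictured with a toy model. This is only a sketch of the idea, not how Core Data is implemented: `Container` and `Device` are hypothetical names, the "store" is a plain dictionary, and applying changes with `dict.update` is a stand-in that says nothing about how Core Data actually reconciles overlapping edits.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    """Stands in for the synced folder: an append-only list of log files."""
    logs: list = field(default_factory=list)


@dataclass
class Device:
    name: str
    container: Container
    store: dict = field(default_factory=dict)
    examined: int = 0  # how many logs in the container this device has looked at

    def save(self, changes):
        # A save updates the local store and writes a transaction log.
        self.store.update(changes)
        self.container.logs.append((self.name, dict(changes)))

    def import_new_logs(self):
        # Apply logs written by other devices; the "server" did nothing but copy them.
        for author, changes in self.container.logs[self.examined:]:
            if author != self.name:
                self.store.update(changes)
        self.examined = len(self.container.logs)


cloud = Container()
mac = Device("mac", cloud)
phone = Device("phone", cloud)

mac.save({"note-1": "Hello"})
phone.save({"note-2": "World"})

mac.import_new_logs()
phone.import_new_logs()
```

With distinct updates like these, both toy stores end up with the same two notes. If both devices had saved a different value for the same key, the outcome would depend on the order in which logs were applied, which hints at why overlapping changes are the hard case.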

Transaction Logs

When you hear the term transaction log, you may be led to think that it is some low level log file generated by the underlying SQLite database, but the logs are actually just property lists. If you dig into the iCloud container of your app, you will find a number of Core Data Transaction files with the extension .cdt. Each one is a zipped plist file; if you unzip one of them, you should find it contains a file called contents, and this can be opened as a property list in a text editor.
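If you would rather inspect a log from a script than unzip it by hand, something like the following sketch may work. It assumes the archive member is named `contents` (as described above) and that the property list is in XML or binary form, which is what Python's `plistlib` can read; if the file turns out to be an old-style text property list, `plistlib` will reject it. The function name `read_transaction_log` is made up for this example.

```python
import plistlib
import zipfile


def read_transaction_log(path):
    """Return the property list stored in the 'contents' member of a .cdt file."""
    with zipfile.ZipFile(path) as archive:
        # Accept 'contents' at the top level or inside a folder in the archive.
        member = next(
            name for name in archive.namelist()
            if name.rsplit("/", 1)[-1] == "contents"
        )
        return plistlib.loads(archive.read(member))
```

The result is an ordinary dictionary, so the arrays and dictionaries discussed below can be explored with normal indexing.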

The iCloud container is a subfolder of ~/Library/Mobile Documents; it has the same reverse-DNS name used in the application code, but with all periods replaced by tildes (e.g. ~com~mentalfaculty~mentalcase). Inside the container, you should find one subfolder for each user-device combination. If you dig even deeper into the folder of one of the devices, you will see a folder for each Core Data store that is being synced, and — a few levels deeper — the CDT files themselves.
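As a rough sketch of the folder layout just described, the snippet below derives a container folder name from an identifier and lists any CDT files beneath it. The identifier `ABCDE12345.com.example.myapp` is a made-up placeholder, and the function names are hypothetical; the search is recursive precisely because the exact number of intermediate folders is not something to rely on.

```python
from pathlib import Path

MOBILE_DOCUMENTS = Path.home() / "Library" / "Mobile Documents"


def container_folder_name(identifier):
    """Replace every period in the container identifier with a tilde."""
    return identifier.replace(".", "~")


def find_transaction_logs(container):
    """Return all .cdt files anywhere below the container folder, sorted by path."""
    return sorted(Path(container).rglob("*.cdt"))


# Placeholder identifier: substitute the one your app actually uses.
container = MOBILE_DOCUMENTS / container_folder_name("ABCDE12345.com.example.myapp")
if container.is_dir():
    for log in find_transaction_logs(container):
        print(log.relative_to(container))
```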

It is not really necessary to understand the contents of a CDT file, because it is an implementation detail that can change, but in my view it does help to dispel some of the mystery of how syncing Core Data via iCloud does its thing, so I want to go through it in some detail.

The structure of the CDT property list is reasonably transparent. It begins with an array mapping each object in the log file to an entity, and a primary key in the local database.

{
    compressedGlobalIDs = ( "0:0:0", "1:1:0", "2:2:0", "3:3:0", "4:4:0", "5:0:0", "4:0:0", "1:5:0", "0:6:0",
        "0:7:0", "5:8:0", "0:8:0", "5:4:0", "5:9:0", "1:4:0", "2:10:0", "3:11:0", "4:8:0",
        "1:12:0", "0:13:0", "4:14:0", "1:15:0", ...

The first number in each tuple is the index of an entity in the entityNames array.

entityNames = ( "MCNoteFacet", "MCNote", "MCNoteFacetRole", "MCScalableText", "MCFacetPermutation",
    "MCMediaFile", "MCNoteOccurrence", "MCManualCase", "MCCollectionCase", "MCNoteTemplate",
    "MCSlideshow", "MCScheduleManager"
);

The second number in each tuple is an index into the primaryKeys array, which holds the primary keys used in each entity’s SQLite table. (I have not been able to determine what exactly the third number is for.)

primaryKeys = ( "p1", "p8", "p4", "p32", "p27", "p12", "p48", "p18", "p2", "p46", "p10", "p52", "p9",
    "p15", "p3", "p26", "p45", "p58", "p6", "p5", "p24", "p71", "p11", "p29", "p41", ...

Core Data keeps track of objects across different stores by assigning unique identifiers to inserted objects.

externalDataReferencesInfo = { inserted = ( "1EB4B045-FD76-4791-937B-96BEF29462AD", "BEDDFF4B-6C77-43D3-977E-D1B6734CCFAE", 
"29E8179F-A896-4775-9782-597C2ED92B82", "EF4733C4-C317-41B8-91D8-A4E64BC5D067", ...

Dictionaries are included for all deletions, insertions, and updates. The keys for the dictionaries are the indexes of the corresponding objects in the compressedGlobalIDs array. These indexes are also used to refer to objects that are the target of relationships.

For example, in the insertion dictionary that follows, the key is 0, which means the inserted object corresponds to the first item in the compressedGlobalIDs array above. That object has tuple “0:0:0”. The first 0 is the index of the entity in the entityNames array, which is MCNoteFacet. The second 0 is the index into the primaryKeys array, whose first entry, “p1”, is the object’s primary key in the MCNoteFacet SQLite table.

inserted = {
    0 = {
        appearsInSlideshows = 1;
        backgroundColor = <312E3030 30303030 20312E30 30303030 3020312E 30303030 30302031 2E303030 303030>;
        canBeCombinedOnSlide = 0;
        canBePrompt = 0;
        isFactual = 1;
        note = "1";
        noteFacetRole = "2";
        orderIndex = 1;
        promptFacetPermutations = ( );
        responseFacetPermutations = ( "4" );
        scalableText = "3";
        slideFacetOccurrences = ( );
        uniqueId = "2BAC1A44-77AA-466C-A64D-4C0426CCCA07-44703-000132BEDC7A394E";
    };

You can see that the value corresponding to the key is a dictionary of properties, with relationships given as indexes. For example, the property note is a to-one relationship, and has the value “1”. The related object is thus the second object in the compressedGlobalIDs array. To-many relationships are given as arrays of indexes.
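The lookup chain can be written out as a short sketch. The data below is copied from the first few entries of the arrays shown above; the function name `resolve` is made up, and the reading of the second tuple field as an index into primaryKeys is inferred from the worked example rather than from any documentation. The third field is ignored, since its meaning is unknown.

```python
log = {
    "compressedGlobalIDs": ["0:0:0", "1:1:0", "2:2:0", "3:3:0", "4:4:0"],
    "entityNames": ["MCNoteFacet", "MCNote", "MCNoteFacetRole",
                    "MCScalableText", "MCFacetPermutation"],
    "primaryKeys": ["p1", "p8", "p4", "p32", "p27"],
    "inserted": {
        "0": {"note": "1", "responseFacetPermutations": ["4"], "orderIndex": 1},
    },
}


def resolve(log, index):
    """Map an index into compressedGlobalIDs to (entity name, primary key)."""
    entity_index, key_index, _unknown = log["compressedGlobalIDs"][int(index)].split(":")
    return log["entityNames"][int(entity_index)], log["primaryKeys"][int(key_index)]


for index, properties in log["inserted"].items():
    inserted_object = resolve(log, index)
    # 'note' is a to-one relationship: a single index.
    note = resolve(log, properties["note"])
    # To-many relationships are arrays of indexes.
    permutations = [resolve(log, i) for i in properties["responseFacetPermutations"]]
    print(inserted_object, "note ->", note, "permutations ->", permutations)
```

Note that the log itself does not say which properties are relationships and which are plain attributes; that knowledge comes from the data model, which is why the sketch names the relationship keys explicitly.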

(A big thanks to Christian Beer for helping me understand the property list structure.)

Importing Transactions

When transaction logs are transferred to a device, Core Data attempts to import them into the local store. The exact details are not documented: the process is private, and Apple do not provide any hooks into the import procedure.

It seems that Core Data sets up a private NSPersistentStoreCoordinator, and applies the changes in an associated private NSManagedObjectContext. The managed object context saves, pushing the changes into the store, before posting a notification to alert other contexts to refresh the updated objects. (Thanks to Marcus Zarra for clarifying aspects of this procedure for me.)

You might think that, with the import being a private affair, your own code could play no role. It is important to realize, though, that the custom accessors and validation methods of your NSManagedObject subclasses are called in the background context, and so have a very important influence on whether a transaction import succeeds or fails. We will discuss the practicalities of this in a future post.

Posted by mentalfaculty