Importing Data in Background With CoreData

It’s very common for an application to import data from and export data to other programs, which gives it interoperability and flexibility. This kind of operation should run in the background, so that the application UI remains responsive. Under Cocoa and, specifically, with CoreData there are many ways to implement background operations, as well as a lot of literature and best practices on the subject.

Here is the pattern I chose for my application, and the reasons why.

Problem description

First of all, my application runs on macOS, is backed by CoreData and is based on NSPersistentDocument, so my considerations and needs are strictly tied to this context: the same problem would be addressed differently on iOS or with different requirements.

In a subview of my application, the user can drop one or more files onto a drop view, for example by dragging them from the Finder. These files can be of different kinds (images, documents, …) but they must all be handled in the same way and, if possible, without any user interaction. The drop view, which receives the drop, asks the import manager to handle these files and returns immediately, keeping the UI responsive. The import manager should also be able to import files concurrently and to abort one or more operations if the user asks for it.

The “Import Manager”

As different file types must be handled, there is an importer for each of them and one more class to manage them all. The import manager creates an operation queue on which it enqueues the import operations: it waits for the drop view to ask for new resources to be imported and creates as many operations as requested.

The drop view calls the import manager’s importItems:intoDocument: method, passing an array of items to import and the current document, which will own the newly created resources. Note that items is an array of DZImportItem, a wrapper class that represents a single file to import.

@implementation DZImporterManager

- (void) importItems:(NSArray *)items intoDocument:(NSPersistentDocument *)aDocument
{
    dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_BACKGROUND, 0);

    dispatch_async(queue, ^{
        [items enumerateObjectsWithOptions:NSEnumerationConcurrent
                                usingBlock:^(DZImportItem *item, NSUInteger idx, BOOL *stop) {
                       // Search for a handler class that can open the current file
                       DZImportOperation *operation = nil;
                       for (Class handlerClass in _importers) {
                           if ([handlerClass canOpenURL:item.url]) {
                               // Use alloc here: +new would already call -init before the designated initializer
                               operation = [[handlerClass alloc] initWithItem:item
                                                                      context:aDocument.managedObjectContext];
                               break;
                           }
                       }
                       // If a handler was found, enqueue the import operation
                       if (operation) {
                           [_queue addOperation:operation];
                       } else {
                           DDLogError(@"No registered handler for %@", item.url);
                           [item setStatus:DZStatusAborted
                               withMessage:Localize(@"No registered handler for this file type")];
                       }
                   }];
    });
}

@end

Wrapping the work in GCD’s dispatch_async ensures that it runs on a background queue rather than on the main queue, so the UI is not blocked.
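The listing above uses two instance variables, _queue and _importers, without showing how they are set up. What follows is only a minimal sketch of how that setup might look: the instance variable names come from the listing, while the queue name, the cancelAllImports method and the way the handler class is looked up are my assumptions for illustration.

```objc
#import <Foundation/Foundation.h>

// Sketch only: cancelAllImports is a hypothetical method name
@interface DZImporterManager : NSObject
- (void)cancelAllImports;
@end

@implementation DZImporterManager {
    NSOperationQueue *_queue;   // runs the import operations
    NSArray *_importers;        // handler classes, asked in order via +canOpenURL:
}

- (instancetype)init
{
    if ((self = [super init])) {
        _queue = [[NSOperationQueue alloc] init];
        _queue.name = @"DZImporterManager.queue";   // assumed name, handy when debugging
        // Let the system decide how many imports run at the same time
        _queue.maxConcurrentOperationCount = NSOperationQueueDefaultMaxConcurrentOperationCount;

        // NSClassFromString keeps this sketch free of compile-time dependencies;
        // a real project would more likely write [DZImageImportOperation class]
        Class imageImporter = NSClassFromString(@"DZImageImportOperation");
        _importers = imageImporter ? @[imageImporter] : @[];
    }
    return self;
}

// Cancels every operation in the queue. Operations that are already
// running only stop if their -main checks -isCancelled.
- (void)cancelAllImports
{
    [_queue cancelAllOperations];
}

@end
```

Keeping the handler classes in an array means that supporting a new file type only requires adding one more class to the list.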

Each data importer is a subclass of NSOperation whose main method does the job. The following example imports image files.

@implementation DZImageImportOperation

+ (BOOL)canOpenURL:(NSURL *)anUrl
{
    BOOL retValue = NO;
    NSString *uti;
    NSError *error;

    if ([anUrl getResourceValue:&uti forKey:NSURLTypeIdentifierKey error:&error]) {
        retValue = UTTypeConformsTo((__bridge CFStringRef)(uti), (__bridge CFStringRef) @"public.image");
    }

    return retValue;
}

- (id)initWithItem:(DZImportItem *)item context:(NSManagedObjectContext *)moc
{
    if ((self = [super init])) {
        _importItem = item;
        _context = [moc createPrivateSubcontext]; /* we'll see later this method */
    }
    return self;
}

- (void)main
{
    [self.context performBlockAndWait:^{
        @autoreleasepool {
            // Create new managed objects in self.context
            // (an instance of NSManagedObjectContext)
          
            [self saveContext];
        }
    }];
}

@end

The canOpenURL: method checks whether the given file conforms to the UTI public.image, which identifies an image file, while main does the job of opening the file and creating a corresponding managed object in the context. The last step is saving the context: we’ll look at this part in more detail.
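Since the import manager should be able to abort operations, main also needs to notice a cancellation. The following is a minimal sketch of how that could be done, not the code of my importers: DZCancellableImportOperation is a hypothetical class used only for illustration, and it assumes that its context was created with a queue-based concurrency type.

```objc
#import <CoreData/CoreData.h>

// Hypothetical operation, used only for this sketch
@interface DZCancellableImportOperation : NSOperation
@property (nonatomic, strong) NSManagedObjectContext *context;
@end

@implementation DZCancellableImportOperation

- (void)main
{
    // The operation may have been cancelled while it was waiting in the queue
    if ([self isCancelled]) {
        return;
    }

    [self.context performBlockAndWait:^{
        @autoreleasepool {
            // ... create managed objects in self.context here ...

            if ([self isCancelled]) {
                // Discard the unsaved changes instead of saving them
                [self.context rollback];
                return;
            }

            NSError *error = nil;
            if (![self.context save:&error]) {
                NSLog(@"Import failed: %@", error);
            }
        }
    }];
}

@end
```

Cancelling an NSOperation only sets a flag, so a long import should check isCancelled at regular intervals, for example once per imported object.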

Managed Object Contexts

The pattern I chose is nested managed object contexts: in this scenario, you create new contexts that are children of the document’s context, and all the changes made in a child context are pushed to its parent whenever the child is saved.

Let’s see the required steps to use them.

As you have already seen, the importer uses -[NSManagedObjectContext performBlockAndWait:], which implies that the managed object context has an associated queue. By default, an application based on NSPersistentDocument creates a managed object context initialized with NSConfinementConcurrencyType, which specifies that the context uses the thread confinement pattern: calling performBlock: or performBlockAndWait: on a context of this type raises an exception. We have to replace this default by overriding -[NSPersistentDocument setManagedObjectContext:].

- (void)setManagedObjectContext:(NSManagedObjectContext *)managedObjectContext
{
    if (managedObjectContext.concurrencyType != NSMainQueueConcurrencyType) {
        NSManagedObjectContext *context = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSMainQueueConcurrencyType];
        context.persistentStoreCoordinator = managedObjectContext.persistentStoreCoordinator;
        [super setManagedObjectContext:context];
    } else {
        [super setManagedObjectContext:managedObjectContext];
    }
}

In this way we get a managed object context with concurrency type NSMainQueueConcurrencyType, which allows us to create subcontexts. The initializer of each NSOperation subclass creates a new subcontext that is set up as a child of the document’s main context. This method is provided by a category on NSManagedObjectContext and initializes the context with NSPrivateQueueConcurrencyType, so that it is associated with a private dispatch queue and won’t block the main context.

- (NSManagedObjectContext*)createPrivateSubcontext
{
    NSManagedObjectContext* context = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    context.parentContext = self;
    context.undoManager = nil;

    return context;
}

Note: in my case the import operation never reads any existing object from the parent context, i.e. it doesn’t issue a fetch request: a fetch would block the parent context even if it was issued by the child.
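The import operation calls saveContext, which is not shown in this post. A minimal sketch of what saving a child context involves could look like the following; the category and the method name dz_saveToParent: are hypothetical, and the code assumes a context created with a queue-based concurrency type.

```objc
#import <CoreData/CoreData.h>

// Hypothetical category, only for this sketch; returns YES on success
@interface NSManagedObjectContext (DZSaveSketch)
- (BOOL)dz_saveToParent:(NSError **)outError;
@end

@implementation NSManagedObjectContext (DZSaveSketch)

- (BOOL)dz_saveToParent:(NSError **)outError
{
    __block BOOL saved = YES;
    __block NSError *saveError = nil;

    // Always touch the context on its own queue
    [self performBlockAndWait:^{
        if ([self hasChanges]) {
            // For a child context this pushes the changes one level up,
            // into parentContext; nothing is written to disk at this point
            saved = [self save:&saveError];
        }
    }];

    if (!saved && outError) {
        *outError = saveError;
    }
    return saved;
}

@end
```

After the child is saved, the new objects live in the document’s context and reach the disk only when the document itself is saved.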

Considerations

From a performance point of view, this doesn’t seem to be the best approach, as nested contexts spend a lot of time synchronizing with each other and do a lot of disk access; see the excellent article by Florian Kugler on this topic.

According to his measurements, the best solution is to create another context that shares the same persistent store. On macOS, however, it’s not unusual for an application to create a new document and work on it before it has ever been saved to disk, i.e. without a persistent store. In this scenario I had a lot of problems and very often got this exception: “Object’s persistent store is not reachable from this NSManagedObjectContext’s coordinator”.

With nested contexts there is also no need to observe the context’s save notification and call mergeChangesFromContextDidSaveNotification:.
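For comparison, this is roughly the plumbing that the non-nested approach would need. It is only a sketch under my own assumptions: DZObserveSaves is a hypothetical helper, and mainContext is assumed to be a queue-based context, so that performBlock: can be used.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: forwards every save of backgroundContext to mainContext.
// The caller must keep the returned token and pass it to
// -[NSNotificationCenter removeObserver:] when the import is over.
static id DZObserveSaves(NSManagedObjectContext *backgroundContext,
                         NSManagedObjectContext *mainContext)
{
    NSNotificationCenter *center = [NSNotificationCenter defaultCenter];
    return [center addObserverForName:NSManagedObjectContextDidSaveNotification
                               object:backgroundContext
                                queue:nil
                           usingBlock:^(NSNotification *note) {
        // The notification arrives on the thread that performed the save,
        // so hop onto the main context's queue before merging
        [mainContext performBlock:^{
            [mainContext mergeChangesFromContextDidSaveNotification:note];
        }];
    }];
}
```

With nested contexts none of this is necessary, because saving the child is what delivers the changes to the parent.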
