(For those impatient with my long-windedness: “You talk too much. Just gimme some code!”)
A few weeks ago it came to my attention that the best way to handle a certain problem was to monitor a file for changes and reload it when it changed. Now in general, when dealing with Cocoa, most things you want to do are relatively simple once you know how and where to look for things. But every now and again, figuring out where to even start involves digging deep into arcane areas, cursing the gods of Apple, and wondering what idiot thought it was a good idea to hide such important APIs under so many layers of obscurity.
This, in my experience, is the case with kqueue. Now what, you ask, is kqueue?
Well, technically, kqueue is a mostly fine-grained kernel notification queue for watching events on files in OS X and the BSDs. Which, to put it simply, means that kqueue is something like a primitive NSNotificationCenter for changes to files. At first you might think, but doesn’t NSWorkspace already have a real notification center for this? But actually, it doesn’t. It only posts about changes that are made through the workspace, such as changes made in Finder. And it only covers normal files, which is a limitation because OS X is built on Unix principles, if not design, so everything is a file.
So if, as in my case, the file in question isn’t being changed via NSWorkspace, or isn’t a “normal” file, or is being changed in ways that the workspace doesn’t post about, you have to dig deeper. Into kernel API territory – a dark place where programmers don’t use cross-linked web pages for documentation, but scary things called “man pages”. Okay, okay, so I actually used Google to find and look at the man page, so technically it was still a cross-linked web page, but definitely far outside the scope of most of the rest of the documentation.
Now using kqueue itself is relatively simple, in that there isn’t much to it API-wise: two functions, one struct, one macro, and a handful of constants. However, because of its low-level nature, it bypasses all of the high-level APIs you expect to use in a Cocoa application, where “low level” usually means that annoying bit of CoreFoundation code which Apple hasn’t gotten around to making wrappers or toll-free bridges for.
The general approach to using kqueue is simple: get a handle to a kqueue, set up a struct with some flags to watch the file that matters to you, add that struct to the kqueue, and poll the kqueue for changes.
Wait, what? Poll? If it is a notification center, why would I have to poll? Aha. The gotcha: it isn’t a notification center, or even a notification queue like NSNotificationQueue. Rather, it is a queue of notifications. Which means it is up to you to actually check the queue for new messages. Constantly. As long as you want updates.
So bypassing semantics of use for the moment, and going to the Cocoa-relevant question – if it requires polling, and thus its own loop, how on earth am I supposed to use it inside a Cocoa application and an NSRunLoop? The usual answer is that you don’t use it in an NSRunLoop; you set up a separate thread to do it for you. Which, of course, requires either spawning a new thread for everything you want to watch, or managing thread synchronization issues whenever you need to change what is being watched. It also makes actually getting notified about the changes a pain if you want to do your work on your main application thread (such as notifying the user).
But if that is the way it is, that is the way it is, right? So my first attempt involved taking some sample code and shoving it into an NSThread detached selector. It worked, barely, but how to get notified about changes? I played with a notification center, but if you haven’t used them much before, you quickly realize that NSNotificationCenter doesn’t help across threads: observers are called on whichever thread posted the notification (obvious once you understand how it works). Now Apple has NSNotificationQueue for helping with this problem, but I chose the more immediate NSDistributedNotificationCenter, which basically means I sent my message to another program, which then sent it back to me on my other thread. Ridiculous. But it worked for basic testing.
Then came the synchronization issues – if I wanted to add (or remove!) a file I was watching from the queue, I had to add all manner of @try/@finally NSLock code or @synchronized blocks. Not pretty, not clean, and for most programmers used to simple Cocoa thread cases, very confusing.
But I finally got it working, and everything was fine and dandy, and I could finally actually try and figure out how to use kqueue to do what I wanted. What a pain, right? I knew there had to be a better way, and so I went digging into the Apple APIs and googling for common words.
What I found was a relatively obscure CoreFoundation API called CFFileDescriptor. Now CFFileDescriptor is actually not much good by itself. In fact it exists for one reason, and one reason only – to wrap a file descriptor in a CFRunLoopSource so that a run loop can watch it for activity. To clarify this -
An NSRunLoop is a system which waits for events and triggers code when they happen. The obvious example of this is the GUI: the run loop receives window events and notifies the views to redraw themselves, or notifies your code that a button was pressed, etc.
So you are dealing with kqueue here, which is itself a queue of notifications, and you need to be notified when events are waiting for you to read. This sounds like just the sort of thing for an NSRunLoop, right? So you dig and discover that NSRunLoop doesn’t have a way to add sources of events, but that CFRunLoop does.
Once you realize this, you figure out that you don’t need to do thread management yourself at all! You can just tell the runloop to watch a file descriptor for you. This will spawn the thread for you, watch the file for you, and call back your code when events happen, on your own runloop. No fuss, no muss, no threading, no synchronization.
Because kqueue() returns a low-level file descriptor (everything in Unix is a file, remember), you can wrap it in a CFFileDescriptor, create a source, and you are off to the races! And since the watching happens on a thread the runloop manages, you can add things to the kqueue safely at any time, because while your code is being called, the runloop isn’t polling.
So now we know how to use kqueue within a runloop. We have pushed the kernel API problem out of the way by wrapping the dirty details in a CoreFoundation API, and we are in familiar territory again. Now we can do what every good Cocoa programmer tries to do – wrap all this nasty low-level code in a Cocoa class.
So now back to the problem at hand: watching files for changes. The nice way, as already attempted, is to post notifications through a notification center from the current run loop. So what we need is a class that starts up kqueue, wraps it in a CFFileDescriptor, adds it as a source to the current runloop, and translates the kqueue events into notifications for the rest of your classes.
Now to start clarifying all this with some actual code. First, how to start up kqueue? That is trivial enough, just -
int kq = kqueue();
Then comes wrapping this handle with a CFFileDescriptor, which, in CoreFoundation style, is obscure and unintuitive -
CFFileDescriptorContext context = {0, self, NULL, NULL, NULL };
CFFileDescriptorRef kqueueFD = CFFileDescriptorCreate(kCFAllocatorDefault, kq, false, (CFFileDescriptorCallBack)kqueueEventCallback, &context);
CFFileDescriptorEnableCallBacks(kqueueFD, kCFFileDescriptorReadCallBack);
Basically we set up our file descriptor for kqueue, provide a callback function, along with a context defining user data passed back to the callback function, and then we enable our callback.
Then adding this FileDescriptor to a runloop -
CFRunLoopSourceRef source = CFFileDescriptorCreateRunLoopSource(kCFAllocatorDefault, kqueueFD, 0);
CFRunLoopAddSource(CFRunLoopGetCurrent(), source, kCFRunLoopDefaultMode);
CFRelease(source);
Because CF doesn’t have a concept of autorelease (come on Apple, where is the consistency?), we have to release the source ourselves once it is added, but that is all for the runloop side of things.
Now our callback will be triggered once kqueue actually has something to tell us, so let’s look at what that callback will look like, with the actual down and dirty details of kqueue -
static void kqueueEventCallback(CFFileDescriptorRef fdref,
                                CFOptionFlags callBackTypes,
                                FSWatcher *self)
{
    struct kevent event;
    int status;
    struct timespec timeout = {0, 0};
    int kq = CFFileDescriptorGetNativeDescriptor(fdref);
    while (true)
    {
        /* Read Next Event */
        status = kevent(kq, NULL, 0, &event, 1, &timeout);
        /* -1 is an error, 0 is no more events */
        if (status <= 0)
            break;
        /* Propagate the event if it is a file change */
        if (event.filter == EVFILT_VNODE)
            [self propogateKQueueEvent:event];
    }
    /* Re-enable kqueue callback since this is a one-shot callback */
    CFFileDescriptorEnableCallBacks(fdref, kCFFileDescriptorReadCallBack);
}
Basically, our callback gets called, we read anything queued, and pass each event individually back to our actual class. The timeout controls how long kevent() blocks while waiting for new events, but since we are using a runloop, we don’t want to block, thus 0 for the timeout. Since kqueue supports multiple different types of events, we want to ignore anything that isn’t an EVFILT_VNODE, which is the type of filter used for normal file events. Other filters can be used for things like monitoring the status of another application, or watching for new information on a socket.
Once we have read and processed all our events, we re-enable the callback since it is disabled after every trigger, and we are done.
So next up, how do we actually add files to the kqueue to watch? Actually fairly trivially, if a bit obscurely, because it is such a low-level API. Barring any other logic, it looks like -
struct kevent change;
int handle = open([path fileSystemRepresentation], O_EVTONLY, 0);
/* Add file to kqueue */
EV_SET(&change, handle, EVFILT_VNODE,
       EV_ADD | EV_ENABLE | EV_CLEAR,
       NOTE_RENAME | NOTE_WRITE | NOTE_DELETE | NOTE_EXTEND,
       0, NULL);
kevent(kq, &change, 1, NULL, 0, NULL);
So we open a file path as “Event Only”, since we only want notifications and don’t actually want to read or change the file. Then we set up an event structure with the parameters we want to watch for, and add it to the queue.
To explain a little bit further – the first set of EV_* options are general flags, where Add means we are adding this event, Enable means enable notifications for this file, and Clear means the event’s state is reset once we have read it, so we only hear about the file again when something new happens. Note that EV_ERROR is not something you ask for: it is a flag that kevent sets on the events it hands back when it needs to report a problem.
The other set is the actual file events we wish to be notified of – Rename, Write, and Delete mean just what they say, and Extend means the file has grown. Other things that can be watched for include permission and owner changes (NOTE_ATTRIB).
Something else I wish I had known early on: as far as I could tell, rename events don’t actually occur on the file but on the folder that contains the file. So the above won’t actually work for listening for rename events unless you add two sets of events, one for the file and one for the folder containing the file.
So onward, how to remove a file from the queue? You simply close the file handle you opened.
close(handle);
Now this of course means you have to keep track of all the handles you open, so you have to wrap your file handles in an NSNumber and add them to a mutable dictionary, keyed by path name. But we also want to be able to get the path from the fd, so that we can pass it back in a notification, so we actually need two dictionaries, keyed as follows -
/* The dictionaries retain the number themselves, so no extra retain is needed */
NSNumber *fd = [NSNumber numberWithInteger:handle];
[fds setObject:fd forKey:path];
[files setObject:path forKey:fd];
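The same two-way bookkeeping can be sketched in plain C if you want to see it without the Cocoa collections. This is an illustration only: the watch_table names and the fixed capacity are invented here, and it uses one small array searched in both directions rather than two dictionaries -

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define WATCH_MAX 64 /* arbitrary capacity for this sketch */

struct watch_entry { int fd; char *path; };
struct watch_table { struct watch_entry entries[WATCH_MAX]; int count; };

/* Record that `fd` watches `path`. Returns 0 on success, -1 if full or out of memory. */
int watch_table_add(struct watch_table *t, int fd, const char *path)
{
    assert(t != NULL && path != NULL);
    if (t->count >= WATCH_MAX)
        return -1;
    size_t len = strlen(path) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, path, len);
    t->entries[t->count].fd = fd;
    t->entries[t->count].path = copy;
    t->count++;
    return 0;
}

/* fd -> path: what you need when an event arrives carrying only the descriptor. */
const char *watch_table_path_for_fd(const struct watch_table *t, int fd)
{
    for (int i = 0; i < t->count; i++)
        if (t->entries[i].fd == fd)
            return t->entries[i].path;
    return NULL;
}

/* path -> fd: what you need when asked to stop watching a path. Returns -1 if absent. */
int watch_table_fd_for_path(const struct watch_table *t, const char *path)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].path, path) == 0)
            return t->entries[i].fd;
    return -1;
}

/* Forget `fd`. The caller is still responsible for close(fd). Returns 0 if removed. */
int watch_table_remove_fd(struct watch_table *t, int fd)
{
    for (int i = 0; i < t->count; i++) {
        if (t->entries[i].fd == fd) {
            free(t->entries[i].path);
            /* Fill the hole with the last entry; order does not matter here. */
            t->entries[i] = t->entries[t->count - 1];
            t->count--;
            return 0;
        }
    }
    return -1;
}
```

A linear search is fine for a handful of watched files. Two dictionaries, as in the Cocoa version, buy you faster lookups at the cost of keeping both in sync by hand.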
So we have a rough idea of how to add, remove, and watch for changes, but what about the actual propagation? In the callback we simply passed the event to our class, which, at a bare minimum, will look up the path and file event type and post a notification, which should look something like this -
- (void) propogateKQueueEvent:(struct kevent)event
{
    NSString *path = [[files objectForKey:[NSNumber numberWithInteger:event.ident]] retain];
    if (path)
    {
        NSString *eventType = @"Unknown";
        if (event.fflags & NOTE_RENAME)
            eventType = @"Rename";
        else if ((event.fflags & NOTE_WRITE) ||
                 (event.fflags & NOTE_EXTEND))
            eventType = @"Modified";
        else if (event.fflags & NOTE_DELETE)
            eventType = @"Delete";
        // Post notification
        [[NSNotificationCenter defaultCenter] postNotificationName:@"FSWatcherFileChangedEvent"
                                                            object:nil
                                                          userInfo:[NSDictionary dictionaryWithObjectsAndKeys:path, @"File Path",
                                                                       eventType, @"File Event", nil]];
    }
    [path release];
}
Not so hard, right? Mostly normal Cocoa code here. There is a specific gotcha, however, which makes this particular bit of code unreliable. Namely, an evil, evil idea that still persists for reasons I cannot fathom – atomic writes. And for that we need a bit of a hackish workaround. You see, with an atomic write, you write changes to an entirely new file, delete the old one, and rename the new file back to the original file name.
This causes kqueue to drop the file handle when the original file gets deleted, because as far as it is concerned you were watching not the path, but the specific file itself. To cope with this, we have to handle deletes a little specially, so that if the file was atomically rewritten, we can re-add the file to the queue. Roughly, this looks like the following -
else if (event.fflags & NOTE_DELETE)
{
    eventType = @"Delete";
    /* HACK ALERT - Try and watch out for Atomic Writes */
    NSNumber *fd = [fds objectForKey:path];
    /* Close old FD */
    [files removeObjectForKey:fd];
    close((int)event.ident);
    /* Try and re-open; open() returns an int, and event.ident is unsigned,
       so test the int for -1 */
    int newHandle = open([path fileSystemRepresentation], O_EVTONLY, 0);
    if (newHandle == -1)
        [fds removeObjectForKey:path];
    else
    {
        struct kevent update;
        /* Update fd <-> path dictionaries */
        fd = [NSNumber numberWithInteger:newHandle];
        [fds setObject:fd forKey:path];
        [files setObject:path forKey:fd];
        /* Re-Add file to kqueue */
        EV_SET(&update, newHandle, EVFILT_VNODE,
               EV_ADD | EV_ENABLE | EV_CLEAR,
               NOTE_RENAME | NOTE_WRITE | NOTE_DELETE | NOTE_EXTEND,
               0, NULL);
        kevent(CFFileDescriptorGetNativeDescriptor(kqueueFD), &update, 1, NULL, 0, NULL);
        eventType = @"Modified";
    }
}
Pretty evil, right? But now we have all the pieces for a simple FSWatcher that will watch for file changes and notify us about them. It needs some more logic to handle multiple watchers of the same file, and the retain counting that goes with it, and of course, we need cleanup.
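To see why the re-open is needed at all, here is a minimal plain C sketch (not part of FSWatcher; the function names and the ".tmp" suffix are made up for illustration). It shows that after an atomic write, a descriptor you opened earlier still points at the old, now unlinked file rather than at whatever currently lives at the path -

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write `text` to `path` in place, creating or truncating it. Returns 0 on success. */
int write_file(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;
    size_t len = strlen(text);
    ssize_t written = write(fd, text, len);
    close(fd);
    return (written == (ssize_t)len) ? 0 : -1;
}

/* Atomic write: put the new contents in a temporary sibling file, then
   rename it over the original. Returns 0 on success. */
int atomic_replace(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);
    char tmp[1024];
    if (snprintf(tmp, sizeof tmp, "%s.tmp", path) >= (int)sizeof tmp)
        return -1;
    if (write_file(tmp, text) != 0)
        return -1;
    if (rename(tmp, path) != 0) {
        unlink(tmp); /* do not leave the temporary file behind */
        return -1;
    }
    return 0;
}

/* Returns 1 if `fd` still refers to the file currently at `path`,
   0 if the path now names a different file, -1 on error. */
int fd_matches_path(int fd, const char *path)
{
    struct stat by_fd, by_path;
    if (fstat(fd, &by_fd) != 0 || stat(path, &by_path) != 0)
        return -1;
    return by_fd.st_dev == by_path.st_dev && by_fd.st_ino == by_path.st_ino;
}
```

If you open a file, call atomic_replace() on its path, and then ask fd_matches_path(), the answer should be 0, whereas after a plain write_file() it stays 1. That mismatch is exactly the situation the NOTE_DELETE branch above is working around.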
Cleanup is pretty straightforward, so let’s take a quick look at that -
- (void) dealloc
{
    /* Close file handles */
    NSEnumerator *enumerator = [files keyEnumerator];
    NSNumber *fd;
    while ((fd = [enumerator nextObject]))
        close([fd intValue]);
    /* Remove maps; releasing the dictionaries releases the numbers and paths they hold */
    [files release];
    [fds release];
    /* Close kqueue */
    CFFileDescriptorInvalidate(kqueueFD);
    CFRelease(kqueueFD);
    close(kq);
    [super dealloc];
}
Basically we close all the handles we opened, then cleanup the CFFileDescriptor, then close kqueue.
One other gotcha I ran into that is a little bit more obvious – just like NSWorkspace, we have to have an absolute file path, so no ~s, no file:// URLs, and no symlinks. To solve this, a simple method to expand a path can be used, either as a string category or as a method on your class, as follows -
- (NSString*) stringByExpandingPath:(NSString*)filePath
{
    NSString *result = filePath;
    // file://
    if ([result hasPrefix:@"file:"])
        result = [[NSURL URLWithString:result] path];
    // ~
    if ([result hasPrefix:@"~"])
        result = [result stringByExpandingTildeInPath];
    return [result stringByResolvingSymlinksInPath];
}
And hey presto.
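If you ever need the same cleanup below the Foundation layer, a rough plain C equivalent might look like this. It is only a sketch under some assumptions: expand_watch_path is a made-up name, it does no percent-decoding of file:// URLs, it does not handle ~user, and it returns NULL for paths that don’t exist, because realpath() requires the path to be there -

```c
#define _XOPEN_SOURCE 700 /* ask for realpath() and setenv() declarations */
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Strip a leading "file://", expand a leading "~", then resolve symlinks.
   Returns a malloc'd absolute path the caller must free(), or NULL if the
   path cannot be resolved (for example because it does not exist). */
char *expand_watch_path(const char *input)
{
    assert(input != NULL);
    char buffer[PATH_MAX];

    /* file:///Users/me/x -> /Users/me/x (hosts and escapes are not handled) */
    if (strncmp(input, "file://", 7) == 0)
        input += 7;

    /* "~" or "~/..." -> $HOME or $HOME/... */
    if (input[0] == '~' && (input[1] == '/' || input[1] == '\0')) {
        const char *home = getenv("HOME");
        if (home == NULL)
            return NULL;
        if (snprintf(buffer, sizeof buffer, "%s%s", home, input + 1) >= (int)sizeof buffer)
            return NULL;
        input = buffer;
    }

    /* With a NULL second argument, realpath() allocates the result itself. */
    return realpath(input, NULL);
}
```

The result is what you would hand to open() with O_EVTONLY. Remember to free() it once you have stored your own copy.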
This solution is still, relatively speaking, primitive; as mentioned earlier, it needs code to handle renames. And if you want to watch an entire file tree instead of individual files, then you are getting involved in a whole different API – FSEvents.
But for now this is a good base, and much saner than any other example I found.
The full class, including an ugly retain workaround for watching the same file multiple times, is available here.
For more information on kqueue you can read the man page.
And for more information on CFFileDescriptor you can see the Reference.