libxml2 + xmlTextReader on Macs
I’ve used a few different XML parsers on Mac OS X—including CoreFoundation’s XML parser and NSXMLDocument.
But recently, for technical reasons, I couldn’t use any of the XML parsers I was already using. I also wanted a stream-based parser rather than one that builds a tree, for better performance and lower memory use.
I figured that probably meant using a SAX-ish API. (SAX == Simple API for XML) But I’ve never wanted to deal with SAX because it meant writing a bunch of code to deal with state, and that’s just a pain. (Honestly. No matter what Gus says.)
So I found my way to libxml2 and its SAX2 module. Eh, okay, I’ll do this, I guess. Maybe it’ll even be fun! (Really thinking, probably not fun.)
Then somehow I ran across the xmlreader module. It turns out to be exactly what I wanted—stream-based and fast—without being a big pain like SAX.
(It’s a clone of the .NET XmlTextReader interface, so it may well be commonly used in the Windows world.)
xmlTextReader works like this:
loop until done
    GetTheNextBitOfXML
    DoSomethingWithItIfYouWant
Right. No callback functions (as in SAX). Just loop through the XML until you’re done.
And here’s a demo project (BSTweetParser) that downloads the Twitter public timeline and parses it into Cocoa objects, an array of dictionaries. (Twitter stuff makes for great sample code.)