inessential by Brent Simmons

NetNewsWire Diary #1: Automatic Hashing and Performance

I like Swift’s recent addition of automatic hashing support — in many cases you can declare conformance to Hashable and let the compiler do the rest.

This let me delete a bunch of code, and I love deleting code.

* * *

I noticed a regression the other day: for some reason, fetching articles from the database and populating the timeline view got noticeably slower when the results were fairly large.

I worried that this was because my articles database is over a year old, and that fetch times get longer as it grows.

So I used the Time Profiler instrument to see what was going on during a fetch — and I found that most of the time was being spent in hash(into:) in two of my structs: Article and DatabaseArticle.

And of course fetching articles means creating a whole bunch of these structs. Hundreds or thousands, even, depending.

Those were two cases where I had adopted automatic hashing. The hash(into:) method was generated by the compiler.
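For anyone who hasn't used it, here is a minimal sketch of what adopting automatic hashing looks like. SampleArticle and its properties are made up for illustration; they are not the real Article struct.

```swift
// Hypothetical stand-in for a struct with several properties;
// not NetNewsWire's actual Article.
struct SampleArticle: Hashable {
    let articleID: String
    let title: String
    let body: String
    let url: String
    // No hash(into:) or == is written here. The compiler synthesizes both,
    // and the synthesized versions take every stored property into account.
}

let a = SampleArticle(articleID: "1", title: "Hello", body: "Text", url: "https://example.com/1")
let b = SampleArticle(articleID: "1", title: "Hello", body: "Text", url: "https://example.com/1")

// Equal values hash the same, so a set keeps only one of them.
let unique = Set([a, b])
print(unique.count)
```

The convenience is real, but every insertion into a set or dictionary has to feed all of those properties to the hasher.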

So I thought about what to do. I wanted a hash that’s unique, or close enough, and I wanted it to be fast.

The solution, in both of these cases, was obvious — each has an articleID property that is unique per database, which is close enough. That means just hashing one property rather than (presumably) all of them.

So I made hashValue a computed property in each of those structs, as in:

var hashValue: Int {
    return articleID.hashValue
}

I built and ran the app — and the performance issue was fixed.

I put some (temporary) timing code around the code that fetches all unread articles, and the time went from 0.37 with automatic hashing to 0.07 with my computed hashValue, roughly five times faster.
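Temporary timing code of this kind can be very small. The following is only a sketch, not the code from the app: the measure helper is a hypothetical name, and the workload is a stand-in (building a set of strings) rather than a real database fetch.

```swift
import Foundation

// Hypothetical helper: runs a block and returns its result
// along with the elapsed wall-clock time in seconds.
func measure<T>(_ work: () -> T) -> (result: T, seconds: TimeInterval) {
    let start = Date()
    let result = work()
    return (result, Date().timeIntervalSince(start))
}

// Stand-in workload: building a large set, which hashes every element.
let timed = measure { Set((0..<10_000).map { "article-\($0)" }) }
print("Built \(timed.result.count) items in \(timed.seconds) seconds")
```

This is good enough for a quick before-and-after comparison. For finding out where the time actually goes, the Time Profiler is still the right tool.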

That’s huge!

I realize I could have written a hash(into:) function instead. Maybe I should? I’m not sure that it matters one way or the other. Possibly by the time you read this I will have switched the implementation.
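For reference, the hash(into:) version might look something like this. It is a sketch under assumptions: ExampleArticle is a made-up struct, and articleID is assumed to be a String.

```swift
// Hypothetical struct; the real Article has more properties than shown here.
struct ExampleArticle: Hashable {
    let articleID: String   // assumed to be a String for this sketch
    let title: String
    let body: String

    // Feed only the ID to the hasher. The synthesized == still compares
    // every property, and that stays consistent with this hash: equal
    // values share an articleID, so they always produce equal hashes.
    func hash(into hasher: inout Hasher) {
        hasher.combine(articleID)
    }
}

let first = ExampleArticle(articleID: "42", title: "Old title", body: "...")
let edited = ExampleArticle(articleID: "42", title: "New title", body: "...")

// Same ID means same hash, but the two values are still not equal,
// so a set keeps both of them.
let both = Set([first, edited])
print(both.count)
```

The tradeoff is that values sharing an ID but differing elsewhere collide in the same hash bucket. That is fine as long as such values are rare, which is what "close enough" means here.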

The point still stands, though, that automatic hashing for types with lots of properties might be a performance hit. As always: use the profiler.

PS Hashing is important in NetNewsWire because I use sets frequently. In general I make arrays at the UI level, when populating a timeline (for instance), and use sets when fetching from the database, etc.
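As a rough illustration of that split, here is a sketch with made-up names (TimelineItem, datePublished as a plain Int). It merges two overlapping fetch results as sets and only makes an array at the point where order matters.

```swift
// Hypothetical type for illustration only.
struct TimelineItem: Hashable {
    let articleID: String
    let datePublished: Int   // hypothetical: a plain number used for sorting
}

// Two hypothetical fetch results that overlap on article "b".
let unread: Set<TimelineItem> = [
    TimelineItem(articleID: "a", datePublished: 3),
    TimelineItem(articleID: "b", datePublished: 1)
]
let starred: Set<TimelineItem> = [
    TimelineItem(articleID: "b", datePublished: 1),
    TimelineItem(articleID: "c", datePublished: 2)
]

// A set union merges the results without duplicates; this hashes the elements.
let merged = unread.union(starred)

// The UI needs a stable order, so convert to an array only at that point.
let timeline = merged.sorted { $0.datePublished > $1.datePublished }
print(timeline.map { $0.articleID })
```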