this is aaronland

sea marshmallows

go-iiif

For a whole bunch of reasons I've found myself thinking lately about the International Image Interoperability Framework, which is often just referred to as IIIF. If you've never heard of IIIF, it is a standard developed principally by the library and archives community with three principal areas of interest: images, publications and search.

The first (images) is a standardized URI-based syntax for common operations around image manipulation. The second (publications) is a declarative syntax for essentially defining learning modules around the idea of the slideshow. The third (search) always seems to stray quickly into territory labeled metadata which... well, is not my jam but neither is it my party so I just try to maintain a healthy distance.
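To make the "URI-based syntax" concrete: an Image API request packs the region, size, rotation, quality and format into the URL path, in that order, after the image identifier. Here is a minimal Go sketch that assembles such a URL. The base URL `https://example.org/iiif`, the identifier `example.jpg` and the `ImageRequest` type are all made up for illustration; none of them come from go-iiif itself.

```go
package main

import (
	"fmt"
	"net/url"
)

// ImageRequest is a hypothetical container for the path parameters of an
// IIIF Image API request. It is an illustration, not a type from go-iiif.
type ImageRequest struct {
	Identifier string // name of the source image
	Region     string // "full" or "x,y,w,h" in source pixels
	Size       string // for example "full" or "256," (width only)
	Rotation   string // degrees clockwise, for example "0" or "90"
	Quality    string // for example "default", "color" or "gray"
	Format     string // for example "jpg" or "png"
}

// URL joins the parameters in the order the Image API expects:
// {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
func (r ImageRequest) URL(base string) string {
	return fmt.Sprintf("%s/%s/%s/%s/%s/%s.%s",
		base,
		url.PathEscape(r.Identifier), // identifiers may contain reserved characters
		r.Region, r.Size, r.Rotation, r.Quality, r.Format)
}

func main() {
	// The whole image, scaled to 256 pixels wide, unrotated, as a JPEG.
	req := ImageRequest{
		Identifier: "example.jpg",
		Region:     "full",
		Size:       "256,",
		Rotation:   "0",
		Quality:    "default",
		Format:     "jpg",
	}
	fmt.Println(req.URL("https://example.org/iiif"))
}
```

Cropping is the same request with a different region, say `1024,1024,512,512` in place of `full`.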

The IIIF Image API is the thing that's been coming up a lot in a variety of museum-related conversations. Images, and more generally digital assets, have been a bit of an albatross around the neck of the cultural heritage sector for... basically, forever. The problem has been made worse year over year as museums embark on ever more ambitious digitization projects that lend themselves to ever more sophisticated tools without really bothering to distinguish the layers of concern (storage, search, processing and delivery) or the mechanics, and more importantly the economics, of how they all fit together.

Historically the solution has been, and continues to be, outsourcing the problem to so-called Digital Asset Management System (DAMS) and more recently Image Delivery System (IDS) vendors. There is a much larger discussion to be had about that but this is not the place, right now. Suffice it to say that if the cultural heritage community wants to take on the challenge of standardizing on some basic image-related tasks and functionality, and even endeavour to write software common to most institutions, then that is an unqualified Good Thing.

Which of course means I had a little bit of a freak out the first time I read the API spec, over coffee one morning. The details of the freak out aren't really important. I can be pretty impatient about these things the first time around, not always in a good way.

The relevant bit, for me, is that I kept asking questions and badgering the people I knew who are involved with the IIIF project. So many times in fact that eventually it seemed like the best thing to do to understand the decisions I was questioning and to test whether my criticisms passed muster would be to write an implementation of the IIIF Image API. So I did.

One of the convenient side-effects of a service that standardizes on operations like image resizing and cropping is that it doubles as a tiled image server. Think a traditional slippy map but instead of zooming in and out of geography you are zooming in and out of really big pictures of culture. It is hard to explain to people outside the cultural heritage sector just how anxious, defeated and envious the sector has been since the Google Art Project rolled into town with their fancy gigapixel cameras and the ability to do to works of art what they had previously done to maps.
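The reason cropping plus resizing adds up to a tile server is that a tile is just a crop of the source image (the region) scaled down to a fixed size. Here is a rough Go sketch of how a viewer might enumerate the region and size pairs for one zoom level. The function name `TilePaths`, the 2048x1536 image and the 512 pixel tile size are assumptions made up for this example, and it is not how go-iiif itself is written.

```go
package main

import "fmt"

// minInt returns the smaller of two ints.
func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

// TilePaths is a hypothetical helper that lists the "region/size" path
// fragments covering an image at one zoom level. At scale factor s each
// tile covers up to tileSize*s source pixels per side and is delivered
// at 1/s of that width. Tiles on the right and bottom edges are clipped
// to the image bounds.
func TilePaths(width, height, tileSize, scale int) []string {
	step := tileSize * scale
	var paths []string
	for y := 0; y < height; y += step {
		for x := 0; x < width; x += step {
			w := minInt(step, width-x)
			h := minInt(step, height-y)
			scaledW := (w + scale - 1) / scale // round up
			paths = append(paths, fmt.Sprintf("%d,%d,%d,%d/%d,", x, y, w, h, scaledW))
		}
	}
	return paths
}

func main() {
	// Scale factor 1 is full resolution; each doubling zooms out one level.
	for _, scale := range []int{1, 2, 4} {
		paths := TilePaths(2048, 1536, 512, scale)
		fmt.Printf("scale %d: %d tiles, first is %s\n", scale, len(paths), paths[0])
	}
}
```

Each fragment slots into the region and size positions of an Image API URL, so zooming in a viewer amounts to asking for smaller regions at a lower scale factor.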

Some museums have cobbled together their own solutions for making zoom-able images available on their websites; it was definitely one of the things lacking from the Cooper Hewitt collections website during my time there. Some have even open-sourced their toolkits for making image-based slippy maps but nothing has seemed to stick across the sector.

This was the bias that I approached IIIF from. After all, I like maps. The result is go-iiif and you can see a live demo of some of what it does over here:

There is also a local copy at http://www.aaronland.info/weblog/2016/09/18/marshmallows/go-iiif/ for when that link inevitably breaks...

go-iiif began life as a fork of Yoan Blanc's iiif Go server. Almost immediately it morphed into something different and we quickly agreed that the two code bases should continue independently of one another. Here's the not-so-short short version of what go-iiif does.

The performance of go-iiif is best described as pretty fast to very fast. When generating tiles offline, the bottlenecks are CPU usage and disk I/O, with the potential of Go making either of those things worse by trying to do too many things at once. The performance and load testing docs go on to say:

[O]n a machine with 8 CPUs and 32GB RAM I was able to run the machine hot with all the CPUs pegged at 100% usage and seed 100,000 (2048x pixel) images yielding a little over 3 million, or approximately 70GB of, tiles in around 24 hours. Some meaningful but not overwhelming amount of time was spent fetching source images across the network so presumably things would be faster reading from a local filesystem. Memory usage across all the iiif-tile-seed processes never went above 5GB and, in the end, I ran out of inodes.
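For a sense of scale, those figures can be turned into per-image and per-second rates. The Go sketch below does the arithmetic, treating "a little over 3 million" as exactly 3 million tiles, "approximately 70GB" as 70 billion bytes and "around 24 hours" as exactly 24. Those are all rounding assumptions, so the results are ballpark numbers rather than measurements, and the `Rates` function is made up for this example.

```go
package main

import "fmt"

// Rates is a hypothetical helper that derives rough averages from the
// totals of a tile-seeding run: tiles per source image, kilobytes
// (1000 bytes) per tile, and tiles written per second.
func Rates(images, tiles, totalBytes, hours float64) (perImage, kbPerTile, perSecond float64) {
	perImage = tiles / images
	kbPerTile = totalBytes / tiles / 1000
	perSecond = tiles / (hours * 3600)
	return perImage, kbPerTile, perSecond
}

func main() {
	// Rounded figures from the load testing notes quoted above.
	perImage, kbPerTile, perSecond := Rates(100000, 3000000, 70e9, 24)
	fmt.Printf("tiles per image:  %.0f\n", perImage)
	fmt.Printf("KB per tile:      %.1f\n", kbPerTile)
	fmt.Printf("tiles per second: %.1f\n", perSecond)
}
```

That works out to roughly 30 tiles per image, a bit over 23KB per tile and about 35 tiles per second. Every tile is its own file, which is why three million of them can exhaust a filesystem's inodes long before its disk space.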

So, it's a start. I still have a number of questions about IIIF and pretty serious concerns about ever running a public IIIF server (even this one) in front of a general audience of strangers-on-the-internet so it's not perfect. But, all in all, it feels better than yesterday.

If nothing else it will be useful for Parallel Flickr.