this is aaronland

numbers for the record

go-iiif, in production

On Friday Micah flipped the switch to enable zoomable images on the collections website at the Cooper Hewitt. Despite not being employed by the museum anymore, this is a project I've been helping out with in the margins of the day. It's been a good way to test a bunch of choices and decisions I've made around the go-iiif project that I started last year.

Rather than deploying an on-the-fly IIIF server all of the zoomable-image tiles were pre-rendered. What follows are copies of two emails I posted on the iiif-discuss mailing list, describing some of the nuts and bolts of tiling all those images, and then a third email sent privately explaining why to pre-tile instead of running a production server.

I just finished tiling a large museum collection that was recently digitized in its entirety at a minimum of 4096 pixels per side. I did this to test a bunch of questions I had/have about IIIF as well as the go-iiif implementation that I wrote to investigate those questions.

Here are the numbers, passed along as an FYI:

  • 289,208 images – as measured by the number of "info.json" files produced
  • 58,977,784 tiles produced (for scale factors 8, 4, 2, 1)
  • 518 GB of (tile) data
  • 45 days of processing / transfer time (averaging out to ~ 13 seconds per image; the arithmetic is sketched below)
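
Here's the back-of-the-envelope arithmetic behind that last number, assuming the 45 days is treated as straight wall-clock time and using only the figures above:

    package main

    import "fmt"

    func main() {
            images := 289208
            tiles := 58977784
            seconds := 45 * 24 * 60 * 60 // 45 days of wall-clock time

            fmt.Printf("%.1f seconds per image\n", float64(seconds)/float64(images))         // ~ 13.4
            fmt.Printf("%.1f tiles per second\n", float64(tiles)/float64(seconds))           // ~ 15.2
            fmt.Printf("%.0f tiles per image, on average\n", float64(tiles)/float64(images)) // ~ 204
    }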

The bottleneck in the process was always CPU, coupled with disk I/O initially (when writing to a local disk) and then with some amount of network transfer once we started writing directly to S3.

To be honest, by the time we got around to writing directly to S3 I was less interested in endless performance tweaks (this was a mornings-and-weekends project) and reasonably certain that any non-CPU-related improvements would be incidental, since the real problem is that crunching pixels remains CPU-expensive.

Over the course of the 45 days, using an 8-core machine with 30 GB of RAM, all of the CPUs were pegged at 80-100% usage and memory usage never went above ~ 8 GB. CPU cost was probably exacerbated by the (seeming?) inability of the image processing library to return a new, smaller image when cropping the source. Dunno - mornings and weekends, right :-)

To put the 518 GB of tile data in perspective, that's 518 GB worth of bytes, which doesn't take into account filesystem specifics like block size or inode count. For example, before switching to S3 we had filled up 619 GB on a 1 TB disk, using 40% of the inodes (ext4 with a block size of 2048 bytes), and had tiled less than 50% of the total images.
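
If you want to see why those filesystem details matter, the back-of-the-envelope version is that the tiles are tiny, so the per-file overhead adds up quickly. A rough sketch, using only the numbers above plus the 2048-byte block size (and treating 518 GB as 518 x 10^9 bytes):

    package main

    import "fmt"

    func main() {
            tiles := 58977784
            totalBytes := 518.0 * 1e9 // 518 GB of tile data
            blockSize := 2048.0       // bytes, as configured on the ext4 volume

            avg := totalBytes / float64(tiles) // average bytes per tile, ~ 8.8 KB
            blocks := int(avg)/int(blockSize) + 1
            onDisk := float64(blocks) * blockSize

            fmt.Printf("a tile of average size (~ %.0f bytes) occupies %d blocks, or %.0f bytes on disk\n", avg, blocks, onDisk)

            // Every tile (plus every info.json) also consumes an inode,
            // which is why the inode count fills up alongside the bytes.
            fmt.Printf("inodes needed for the tiles alone: %d\n", tiles)
    }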

Finally, when writing directly to disk on an EC2 machine, disk I/O was brutal, to put it mildly. The software had to be repeatedly throttled to account for the fact that AWS would limit writes (on this particular instance type) to about 120 MB/second.
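
For what it's worth, the throttling itself was nothing fancy. Here's a minimal sketch of the general idea, capping write throughput with a token-bucket rate limiter; the wrapper type is mine, for illustration, and not something lifted out of go-iiif:

    package main

    import (
            "context"
            "io"

            "golang.org/x/time/rate"
    )

    // throttledWriter caps the rate at which bytes reach the underlying
    // writer, in order to stay under an instance's I/O ceiling.
    type throttledWriter struct {
            w       io.Writer
            limiter *rate.Limiter
    }

    func newThrottledWriter(w io.Writer, bytesPerSec int) *throttledWriter {
            return &throttledWriter{
                    w:       w,
                    limiter: rate.NewLimiter(rate.Limit(bytesPerSec), bytesPerSec),
            }
    }

    func (t *throttledWriter) Write(p []byte) (int, error) {
            // Block until the token bucket has room for len(p) bytes.
            if err := t.limiter.WaitN(context.Background(), len(p)); err != nil {
                    return 0, err
            }
            return t.w.Write(p)
    }

    func main() {
            w := newThrottledWriter(io.Discard, 120*1024*1024) // ~ 120 MB/second cap
            _, _ = w.Write(make([]byte, 1024*1024))            // a single 1 MB write, paced by the limiter
    }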

And then the follow-up to that email:

A quick follow-up based on some comments I've received about this post, specifically about the time (45 days) to process all the images:

The bottleneck, as I mentioned, was some combination of disk I/O, network and CPU. AWS aggressively throttles I/O on EC2 machines so there's not much to do there. Bandwidth inside AWS (from EC2 to S3) is pretty fast but, again, I expect that is throttled too. The CPU issue is an open question. Or, more specifically, CPU and a gazillion concurrent processes.

The libvips author mentioned that he's written a "deep zoom" tiling client that is impressively fast. It's entirely possible that some combination of the Go bindings and my code on top of that have introduced problems. I would never rule that out, especially for a "mornings and weekends" project.

Like I said, it's not clear to me how/whether the generic libvips API allows you to crop an existing image without also transforming it, meaning that every crop requires processing the same bag of original pixels (even if you keep them hanging around in memory).

The other thing I didn't mention in my note to the mailing list is that I introduced some very aggressive blocking throttles in my code, mostly in an effort to keep the OS from freaking out and imposing its own limits, or simply grinding to a halt. These limits were pretty arbitrary and once they achieved something approaching "good enough is perfect" I just let things run their course.
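
Concretely, "blocking throttle" here just means a counting semaphore sitting in front of the workers, something like the sketch below, where the cap of 8 is every bit as arbitrary as the real limits were:

    package main

    import (
            "fmt"
            "sync"
    )

    func main() {
            images := []string{"object-1.jpg", "object-2.jpg", "object-3.jpg"} // placeholder list

            // Arbitrary cap on in-flight tiling jobs, chosen to keep the OS
            // happy rather than derived from careful measurement.
            throttle := make(chan struct{}, 8)

            var wg sync.WaitGroup

            for _, im := range images {
                    wg.Add(1)
                    throttle <- struct{}{} // blocks once 8 jobs are already running

                    go func(path string) {
                            defer wg.Done()
                            defer func() { <-throttle }()

                            fmt.Println("tiling", path) // stand-in for the actual tiling work
                    }(im)
            }

            wg.Wait()
    }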

I expect that accounts for a meaningful portion of the 45 days but more testing would be good, for sure, even if it was just spreading the load across (n) machines in parallel.

Finally, it seems pretty clear to me that the problem of pre-tiling IIIF images has "please write an AWS Lambda function" written all over it. We can debate the relative merits of abstracting general-purpose computing further and further into commercial services but, for a small library or museum, having the ability to put a bag of images in one S3 bucket and generate another bag of (tiled) images in another S3 bucket for a one-time fixed cost is pretty attractive.

To be clear: I have no idea what it would cost to tile 280K images using a Lambda function (and the PUT costs in S3 will start to add up... like 58M times) but maybe it's still less than the compute time on a big EC2 machine, and certainly not having to deal with the hassle of things like "installing libvips" is worth time and money, especially for institutions with limited staff.
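
Just for a sense of scale on the PUT side: assuming S3 standard PUT requests still cost something on the order of $0.005 per 1,000 requests (a number worth double-checking against current pricing), the request charges alone work out to a few hundred dollars:

    package main

    import "fmt"

    func main() {
            puts := 58977784     // one PUT per tile
            perThousand := 0.005 // assumed USD per 1,000 PUT requests; check current S3 pricing

            cost := float64(puts) / 1000.0 * perThousand
            fmt.Printf("~ $%.0f in PUT requests\n", cost) // ~ $295, before storage and compute
    }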

Speaking of which, since publishing this post I've learned that Roger Howard has written an AWS Lambda function which converts uploaded images to the Deep Zoom tiled image format so that's quite good.

And then finally this email, which was sent privately, on the rationale for pre-tiling over running a live server against production traffic (read: strangers on the Internet):

I would not run any IIIF server live in front of production traffic, including mine.

I was planning to write a long and twisty museum paper about just this subject but now I am not going to, so I may just write a long and twisty blog post instead.

Before I get into why you shouldn't run an IIIF server in production, I am not sure I completely understand your setup. Are you saying that the source image lives on S3? If that's the case then that would explain why "info.json" takes as long as it does to generate.

In order to generate the info.json file you need to know the dimensions of the image itself, which means the file needs to be loaded (from whatever its source is) and then read into memory, specifically this bit here:

https://github.com/thisisaaronland/go-iiif/blob/master/cmd/iiif-server.go#L232-L255

If that's the case then I'd start by checking how long it takes to download the image itself from S3 and compare it to the XHR request for info.json. How big is the source image? If the image is being loaded locally that's weird. (Note: it wasn't.)
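
To make that concrete, here is a rough sketch of the work involved using nothing but the Go standard library. go-iiif itself goes through libvips, so this is the shape of the problem rather than its actual code, and the file name and endpoint are made up. Even if you only decode the header to get the dimensions, the bytes still have to be fetched from wherever they live first:

    package main

    import (
            "encoding/json"
            "image"
            _ "image/jpeg" // register the decoders you expect to encounter
            _ "image/png"
            "os"
    )

    // iiifInfo is the handful of IIIF Image API 2.x fields needed for a
    // minimal info.json response.
    type iiifInfo struct {
            Context  string   `json:"@context"`
            ID       string   `json:"@id"`
            Protocol string   `json:"protocol"`
            Width    int      `json:"width"`
            Height   int      `json:"height"`
            Profile  []string `json:"profile"`
    }

    func main() {
            // In the S3 scenario this Open would be a GET of the source
            // object, which is where the time goes.
            r, err := os.Open("example.jpg") // hypothetical local file
            if err != nil {
                    panic(err)
            }
            defer r.Close()

            // DecodeConfig reads just enough of the file to get dimensions,
            // but the bytes still have to come from somewhere.
            cfg, _, err := image.DecodeConfig(r)
            if err != nil {
                    panic(err)
            }

            info := iiifInfo{
                    Context:  "http://iiif.io/api/image/2/context.json",
                    ID:       "https://example.com/iiif/example.jpg", // hypothetical endpoint
                    Protocol: "http://iiif.io/api/image",
                    Width:    cfg.Width,
                    Height:   cfg.Height,
                    Profile:  []string{"http://iiif.io/api/image/2/level2.json"},
            }

            enc := json.NewEncoder(os.Stdout)
            enc.SetIndent("", "  ")
            enc.Encode(info)
    }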

Anyway, and specifically for your stuff, I would pre-cache all the tiles up front. There are only a few thousand images so it shouldn't take very long. To give you some sense of how long it takes with a larger dataset there's this:

https://groups.google.com/forum/#!topic/iiif-discuss/uI5k_3NsYv4

(Note: this is just a link to the two email messages transcribed above.)

Again, it's not really the total number of images that's the problem so much as how many any one class of EC2 machine can handle at once.

I also may have downloaded all 250GB worth of the Met's public domain images recently and plan to test things again with their dataset...

The issue with pre-tiling will be I/O and CPU contention but again for a small dataset it's not a big deal.

Pre-cached tiles + Leaflet + Jack Reed's thing to deal with the IIIF nonsense is great because it's simple and easy and pretty much bog-standard not-bleeding-edge JavaScript that has a hope of lasting more than 6 months before it needs to be fixed. And doing incremental updates to images as they change is trivial.

Running an IIIF tiling server in production is kind of madness, though, because of the I/O contention and the fact that any given server has a finite amount of RAM.

If you implement the spec without thinking about server-side caching then you end up loading the same multi-MB (sometimes even multi-GB) file into memory for every 256 x 256 pixel tile that is requested, and doing some fairly CPU-intensive pixel-crunching for each one, all at the same time. With a scenario like that it's only a question of how soon your server will melt down and stop being able to serve traffic.
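
To make the failure mode concrete, here is a deliberately naive sketch of that pattern (again, not go-iiif's code, just the shape of the problem): the full source is re-read and re-decoded for every tile request, and nothing is cached anywhere.

    package main

    import (
            "image"
            "image/jpeg"
            "net/http"
            "os"
    )

    func main() {
            http.HandleFunc("/tile", func(w http.ResponseWriter, r *http.Request) {
                    // The whole multi-MB (or multi-GB) source is read and
                    // decoded into memory for every single tile requested...
                    f, err := os.Open("source.jpg") // hypothetical source image
                    if err != nil {
                            http.Error(w, err.Error(), http.StatusInternalServerError)
                            return
                    }
                    defer f.Close()

                    img, _, err := image.Decode(f)
                    if err != nil {
                            http.Error(w, err.Error(), http.StatusInternalServerError)
                            return
                    }

                    // ...only to throw away everything outside one 256 x 256 window.
                    sub, ok := img.(interface {
                            SubImage(image.Rectangle) image.Image
                    })
                    if !ok {
                            http.Error(w, "unsupported image type", http.StatusInternalServerError)
                            return
                    }

                    jpeg.Encode(w, sub.SubImage(image.Rect(0, 0, 256, 256)), nil)
            })

            http.ListenAndServe(":8080", nil)
    }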

Even if you cache source images aggressively in memory eventually you will run out of memory, given enough traffic.

The go-iiif server (and friends) were a way to work through these concerns and see if maybe I was missing something. I don't think I am. A live IIIF server is great for you and your 4 closest librarian friends but actually scaling this thing in any meaningful way (without just burning cash on hardware) appears to have been an afterthought.

Anyway, if your source images are on S3 then you can get the iiif-tile-seed program to fetch them by defining your "source" config like this:

https://github.com/thisisaaronland/go-iiif#s3

And you can write the tiled images _back_ to S3 by defining your "derivatives" config like this:

https://github.com/thisisaaronland/go-iiif#s3-1

Just make sure you write the tiles to a different directory, or to a child directory of the source directory. If you don't then AWS will happily overwrite your original source files with the directory (which shares the same name as the original file) containing your derivatives. Because [redacted].

But anyway, zoomable images! One of the nice things about the zoomable images on the Cooper Hewitt website is the ability to generate and download a static image of the area you've zoomed in on. This was something I tried to get included in the scope for the interactive tables when the museum reopened, but it never made the cut, which is too bad. We had everything in place to allow you to upload those crops to your visit pages, and everything would have been properly cross-referenced with the original object.

That activity would have also been a valuable metric for user engagement (...we'll save a longer discussion of that subject for another day). People save and collect things for later for all kinds of reasons, many of them absent-minded, but taking the trouble to crop and save a detail requires a certain amount of deliberation. The temptation to data-mine those image crops would have been strong, but doing so would be creepy, so just knowing which images had been cropped and saved, and how many times, seems like a reasonable compromise.

I don't know whether that feature will ever get built into the tables. It should, and if any museum is in a position to void the warranty, with confidence, on their third-party work it is the Cooper Hewitt. In the meantime, because those interactive tables are just consumers of an API that the museum controls, and because the infrastructure to save objects already exists on the collections website itself, there is only a small amount of time-and-typing necessary to implement saving an image detail to your online shoebox.

Maybe the museum will do that. That would be good.