this is aaronland

things I have written about elsewhere #20241002

Small focused tools for visualizing geographic data

Coin bank: various airlines. Metal. Gift of Thomas G. Dragges, SFO Museum Collection. 1999.061.085

This was originally published on the SFO Museum Mills Field weblog, in October 2024.

This is a blog post about a series of small, purpose-fit web applications that can be run from your local desktop (or a remote server, if necessary) to visualize geographic data on a map. The first application lets you visualize GeoJSON files, the second SQLite databases containing MBTiles data and the third GeoParquet files. These tools were written to address specific use cases for SFO Museum but, even though each only tries to do one thing well, they are sufficiently generalized that we are sharing them in the spirit of generosity.

go-geojson-show

Menu: Delta Air Lines. Paper, metal, ink. Gift of Adan Wong, SFO Museum Collection. 2006.034.101

SFO Museum uses GeoJSON files extensively. Every record we publish to the sfomuseum-data GitHub repository is a GeoJSON file. Often, the easiest way to look at a single GeoJSON file is to use Mapbox's geojson.io web service. geojson.io is great. Years of effort have gone into developing its features and "polishing its doorknobs" to make it a fantastic tool for visualizing and debugging GeoJSON data.

Recently, the airport completed construction of the "North" section of Harvey Milk Terminal 1. That triggered a "phase-shift", a meaningful change in how passengers travel through and experience the airport, so I set about updating all those records in the sfomuseum-data/sfomuseum-data-architecture repository as well as their dependencies (like exhibition or public art records). You can read more about this idea of "phase-shifts" and having multiple representations of the airport over time in the Who's On First at SFO Museum blog post.

These updates produced a number of intermediary GeoJSON files that I wanted to quickly "spot check" for errors. In the past, I've used geojson.io (or a more sophisticated application like QGIS) but these applications generally require a number of steps to select and load a file on the map. I wanted something that I could automatically open in a local web browser as part of a pipeline of commands run from my computer's terminal.

Thus was born go-geojson-show. It is, as the name suggests, a command-line tool for serving GeoJSON features from an on-demand web server. It has a number of features but the simplest way to use it is to call the package's show command with the path to one or more GeoJSON files. For example:

$> ./bin/show \
	/usr/local/data/sfomuseum-data-architecture/data/102/527/513/102527513.geojson
	
2024/08/13 13:01:44 Features are viewable at http://localhost:55799

The application will read the GeoJSON data, serve it from a local web server (which it starts internally) and instruct the operating system to open a web browser to the page where you can see the data on a map. Because the application is written in Go the whole thing can be compiled down to a single standalone binary application that has no external dependencies and can run on a variety of operating systems.
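The "serve it from a local web server on a port nobody else is using" step can be sketched in a few lines of Go. This is a minimal illustration using only the standard library, not code taken from go-geojson-show: the function names and the /features.geojson path are made up for the example, and where the real tool opens a web browser this sketch simply requests the data back from itself.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net"
	"net/http"
)

// serveOnFreePort starts serving body at /features.geojson (a hypothetical
// path chosen for this sketch) on a port picked by the operating system.
// It returns the running server and its base URL.
func serveOnFreePort(body []byte) (*http.Server, string, error) {

	// Port 0 asks the operating system for any unused port.
	listener, err := net.Listen("tcp", "127.0.0.1:0")

	if err != nil {
		return nil, "", err
	}

	mux := http.NewServeMux()

	mux.HandleFunc("/features.geojson", func(rsp http.ResponseWriter, req *http.Request) {
		rsp.Header().Set("Content-Type", "application/geo+json")
		rsp.Write(body)
	})

	server := &http.Server{Handler: mux}
	go server.Serve(listener)

	return server, fmt.Sprintf("http://%s", listener.Addr()), nil
}

// fetchOnce serves body, requests it back over HTTP and then shuts the
// server down. It stands in for the "open a web browser" step.
func fetchOnce(body []byte) (string, error) {

	server, url, err := serveOnFreePort(body)

	if err != nil {
		return "", err
	}

	defer server.Close()

	rsp, err := http.Get(url + "/features.geojson")

	if err != nil {
		return "", err
	}

	defer rsp.Body.Close()

	data, err := io.ReadAll(rsp.Body)
	return string(data), err
}

func main() {

	got, err := fetchOnce([]byte(`{"type":"FeatureCollection","features":[]}`))

	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(got)
}
```

Letting the operating system choose the port is what makes it safe to have several of these servers running at once, which would explain why the port number is different in each of the example log lines in this post.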

As mentioned, there are a number of additional features (reading data from STDIN, custom map tiles including tiles served from a local Protomaps database, custom map styles and so on) but it is otherwise a small focused tool that tries to only do one thing and do it well.

go-mbtiles-show

Set of coasters: United Airlines. Plastic, paper, elastic, ink. Anonymous gift, SFO Museum Collection. 2019.043.001 a e

Shortly after the go-geojson-show application was written the Museum received updated aerial imagery of the airport for the year 2024. These tiles are included on the Mills Field Map page and as part of the interactive installation in the Terminal 2 SkyTerrace Observation Deck.

When the installation was originally conceived the idea was that it would run entirely offline powered by a single iPad. After some trial and error we decided that the tile data for the maps would be bundled as a collection of MBTiles SQLite databases, one for each year's worth of tiles we wanted to display. The reason for this choice was pragmatic: It is easier and faster to copy 43GB of data to an iPad in 40 individual SQLite databases than it is to copy the same amount of data contained in millions of individual raster tiles.

Fast-forward to 2024 and I have tiled the new aerial imagery we've received from the GIS department, created a new MBTiles database, loaded it on the iPad I use to develop the SkyTerrace application and... nothing. As with the GeoJSON files I was producing, I needed something that I could use to quickly visualize an MBTiles database and make sure the problem was with the new database I'd created (it was) and not an error in another part of the code. I needed go-geojson-show but for MBTiles so now go-mbtiles-show exists.

Like its counterpart, it too has a number of features, but the simplest way to use it is to call the package's show command with the path to one or more MBTiles databases. For example:

$> ./bin/show \
	-map-provider leaflet \
	-raster 2024=/usr/local/sfomuseum/tiles/sqlite/2024.db

2024/08/30 18:16:14 INFO Server is ready and features are viewable url=http://localhost:64115

The go-mbtiles-show tool supports rendering MBTiles databases containing either raster data or vector data (with a very limited set of styles) as well as most of the other options supported by the go-geojson-show tool.
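If you have never looked inside an MBTiles database, the part that most often trips people up is that the MBTiles specification stores tile rows using the TMS scheme, where row 0 is at the bottom of the map, while most web maps ask for tiles using the XYZ scheme, where row 0 is at the top. The following is a rough sketch of the lookup a tile server has to do. It is not taken from the go-mbtiles-show source, the function names are invented for illustration, and it stops short of opening a database so that it only needs the standard library.

```go
package main

import "fmt"

// tileQuery reads a single tile from the "tiles" table (or view) defined by
// the MBTiles specification.
const tileQuery = "SELECT tile_data FROM tiles WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?"

// tmsRow converts a Y coordinate in the XYZ scheme (origin at the top) to
// the TMS row (origin at the bottom) for zoom level z.
func tmsRow(z uint, y uint) uint {
	return (uint(1) << z) - 1 - y
}

// queryArgs returns the values to bind to tileQuery for an XYZ tile request.
func queryArgs(z uint, x uint, y uint) [3]uint {
	return [3]uint{z, x, tmsRow(z, y)}
}

func main() {

	// An arbitrary tile address, as a web map would request it.
	z, x, y := uint(12), uint(655), uint(1585)

	fmt.Println(tileQuery)
	fmt.Println(queryArgs(z, x, y))
}
```

Forgetting the row flip is a common reason for a tile database that looks fine on disk and renders as nothing, or as an upside-down jumble, on a map.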

go-geoparquet-show

Coaster: Japan Air Lines. Paper, ink. Gift of Thomas G. Dragges, SFO Museum Collection. 2002.035.229

GeoParquet is a relatively new file format developed to take advantage of the functionality that column-oriented and cloud-hosted databases enable. SFO Museum is not using GeoParquet files in production yet but there has been some thought that it might be an efficient way to make the over 7 million records of flights in and out of SFO available to people, allowing them to query GeoParquet files hosted on our servers without the need to download all those files locally.

So, in the margins of the day when everything else was done, I started writing some tools to populate GeoParquet databases with our flight data and once that was done... you guessed it: I made go-geojson-show but for GeoParquet files! Like all the other tools, there are specialized features, but the simplest way to use it is to call the package's show command with the path (or URI) to one or more GeoParquet files. For example:

$> ./bin/show \
	-data-source /usr/local/data/arch.geoparquet

2024/08/20 17:37:53 INFO Server is ready and features are viewable url=http://localhost:58296

go-geoparquet-show is the least mature of the three tools. Depending on the size and shape of the data encoded in a GeoParquet file the tool can be noticeably slow. For example, loading all the flight paths in and out of SFO during March, 2024 at zoom level 1 or 2 can take upwards of a minute for each map tile to render.

On the other hand, loading all the administrative points and polygons for Canada, using the Who's On First GeoParquet files produced by Geocode.earth, only takes a couple of seconds.

I haven't determined whether this has something to do with how I am creating the GeoParquet files, whether it's just the volume of data or something else. There is more investigating and digging to do here (help is welcomed and encouraged). In the meantime, the go-geoparquet-show tool satisfies the basic goal which is a simple single-purpose tool to quickly (sort of) visualize one or more GeoParquet files.
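For a sense of scale, and without claiming this is the explanation, it can help to remember how much of the world a single tile covers at low zoom levels. The sketch below uses the standard Web Mercator tile arithmetic to compute the bounding box of a tile. The function name is made up for this example and nothing here is taken from go-geoparquet-show.

```go
package main

import (
	"fmt"
	"math"
)

// tileBounds returns the longitude/latitude bounding box, in degrees, of a
// Web Mercator map tile addressed using the XYZ scheme.
func tileBounds(z, x, y uint) (minLon, minLat, maxLon, maxLat float64) {

	// The number of tiles along each axis at this zoom level.
	n := math.Exp2(float64(z))

	lon := func(col float64) float64 {
		return col/n*360.0 - 180.0
	}

	lat := func(row float64) float64 {
		return math.Atan(math.Sinh(math.Pi*(1-2*row/n))) * 180.0 / math.Pi
	}

	// Rows count down from the top so the tile's upper edge is row y.
	return lon(float64(x)), lat(float64(y + 1)), lon(float64(x + 1)), lat(float64(y))
}

func main() {

	for z := uint(0); z <= 2; z++ {
		minLon, minLat, maxLon, maxLat := tileBounds(z, 0, 0)
		fmt.Printf("zoom %d, tile 0/0: lon %.1f to %.1f, lat %.1f to %.1f\n", z, minLon, maxLon, minLat, maxLat)
	}
}
```

At zoom level 1 there are only four tiles for the whole world, so any query scoped to a tile's bounding box is, in effect, a query against a quarter of the planet. Whether that accounts for the slowness described above is exactly the kind of thing still to be investigated.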

go-www-show

Menu: Pan American World Airways. Paper, ink. Gift of Serge Mitt, SFO Museum Collection. 1993.01.096

All of these tools follow a common pattern: There is a simple HTML + CSS + JavaScript web application, which may or may not be enhanced by some server-based capabilities, that you want to be able to run locally and open at a unique URL in a web browser. We have taken the "serve a web application from a unique URL and open it in a web browser" piece and moved it into its own dedicated Go package called go-www-show. It can be used like this:

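package main

import (
	"context"
	"log"
	"net/http"

	// Imported under an explicit name so the "show." references below resolve.
	show "github.com/sfomuseum/go-www-show"
)

func main() {

	ctx := context.Background()
	mux := http.NewServeMux()

	// Configure handlers for mux here
	// This is your application

	browser, err := show.NewBrowser(ctx, "web://")

	if err != nil {
		log.Fatalf("Failed to create browser, %v", err)
	}

	show_opts := &show.RunOptions{
		Browser: browser,
		Mux:     mux,
	}

	// main can not return a value so report any error here instead.
	err = show.RunWithOptions(ctx, show_opts)

	if err != nil {
		log.Fatalf("Failed to run application, %v", err)
	}
}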

The most important part is this:

mux := http.NewServeMux()

// Configure handlers for mux here
// This is your application

An http.ServeMux is the Go-language construct used to configure web applications. "ServeMux" is an unfortunate name but this is where the logic that looks at a URL and figures out which piece of code to invoke lives.
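To make "looks at a URL and figures out which piece of code to invoke" a little more concrete, here is a minimal sketch using only the Go standard library. The two handlers and their paths are hypothetical stand-ins, not the handlers any of the tools above actually register, and the request is routed in memory rather than over the network.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newMux registers two hypothetical handlers: one for the page with the map
// and one for the data to show on it.
func newMux() *http.ServeMux {

	mux := http.NewServeMux()

	mux.HandleFunc("/", func(rsp http.ResponseWriter, req *http.Request) {
		fmt.Fprint(rsp, "map page")
	})

	mux.HandleFunc("/features.geojson", func(rsp http.ResponseWriter, req *http.Request) {
		rsp.Header().Set("Content-Type", "application/geo+json")
		fmt.Fprint(rsp, `{"type":"FeatureCollection","features":[]}`)
	})

	return mux
}

// route sends an in-memory GET request for path through mux and returns the
// body written by whichever handler the mux selected.
func route(mux *http.ServeMux, path string) string {

	req := httptest.NewRequest(http.MethodGet, path, nil)
	rec := httptest.NewRecorder()

	mux.ServeHTTP(rec, req)
	return rec.Body.String()
}

func main() {

	mux := newMux()

	fmt.Println(route(mux, "/"))
	fmt.Println(route(mux, "/features.geojson"))
}
```

The mux itself knows nothing about maps or GeoJSON. It only matches paths to handlers, which is why the same "serve and open" code can sit underneath three different applications.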

The three different sets of "handlers" in the three different ServeMux instances are what distinguish go-geojson-show, go-mbtiles-show and go-geoparquet-show from one another. Everything else, though, is shared code and that functionality is exposed to each of the three applications through their use of the go-www-show package. If you have a simple web application that you'd like to deploy (serve) and launch from a binary tool then this package might be of interest to you.

No one is coming to save us

Digital photograph: Virgin America. Digital file. Gift of Captain Karen Lee McNally, SFO Museum Collection. 2018.139.004

Long-time readers will know that I am an enthusiastic cheerleader for the idea of "small focused tools", by and for the cultural heritage sector:

The cultural heritage sector needs as many small, focused tools as it can produce. It needs them in the long-term to finally reach the goal of a common infrastructure that can be employed sector-wide. It needs them in the short-term to develop the skill and the practice required to make those tools successful. We need to learn how to scope the purpose of and our expectations of any single tool so that we can be generous of, and learn from, the inevitable missteps and false starts that will occur along the way.

As of this writing, these four tools might be more SFO Museum specific than not. Geography is the through-line for everything in our collection but "place" is a part of every museum's collection so I think these tools might become useful to others sooner than people think. Even if they don't, we are committed to the practice of releasing these "small focused tools" because the economic and staffing constraints facing museums, in 2024, mean we need to be proactive about developing and sharing tools with one another.

This is one of the things I'll be talking about during the Show Off Your Stack: Approaches to Building and Documenting Systems that Work panel, along with Andrea Ledesma, Ronald Czik and Tyson Wilday at the Museum Computer Network (MCN) conference, in Lawrence, Kansas later this month. If you're at MCN drop by and say hello!

Hostess scarf: Hughes Airwest. Polyester. SFO Museum, SFO Museum Collection. 2022.026.003