Cloud-Friendly Geodata Formats and When to Use Them

FlatGeobuf, GeoParquet, PMTiles, whatnot—I find it hard to understand when and how to use new cloud-native data formats. Brandon Liu has answers.

For analytics, use FlatGeobuf or GeoParquet:

FlatGeobuf and GeoParquet are analysis-focused formats. They’re useful for answering queries like “What is the sum of attribute A over features that overlap this polygon?” But their design does not enable cloud-native visualization like COG does.
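To make that kind of query concrete, here is a minimal sketch in plain Python. It is not from the post and it does not read a real FlatGeobuf or GeoParquet file: the `Feature` class, the sample values, and the use of bounding boxes as a stand-in for the query polygon are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    # Hypothetical in-memory feature: a bounding box plus one numeric attribute.
    bbox: tuple  # (min_x, min_y, max_x, max_y)
    a: float     # "attribute A" from the example query


def bboxes_overlap(p, q):
    """Return True if two (min_x, min_y, max_x, max_y) boxes intersect."""
    return p[0] <= q[2] and q[0] <= p[2] and p[1] <= q[3] and q[1] <= p[3]


def sum_attribute(features, query_bbox):
    """Sum attribute A over features whose bbox overlaps the query bbox."""
    return sum(f.a for f in features if bboxes_overlap(f.bbox, query_bbox))


features = [
    Feature((0, 0, 2, 2), 10.0),
    Feature((1, 1, 3, 3), 5.0),
    Feature((5, 5, 6, 6), 7.0),  # outside the query area
]

total = sum_attribute(features, (1.5, 1.5, 4, 4))
print(total)  # 15.0
```

A real analysis would test against the actual polygon geometry rather than its bounding box, and would let the file's spatial index or column statistics skip most of the data instead of scanning every feature.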

You can convert FlatGeobuf and GeoParquet data into cloud-friendly vector tiles using tools like Tippecanoe:

The best-in-class tool for creating vector tiles from datasets like FlatGeobuf and GeoParquet is tippecanoe, originally developed by Mapbox, but since v2.0 maintained by Felt. Tippecanoe doesn’t just slice features into tiles, it generates smart overviews for every zoom level matching a typical web mapping application. It adaptively simplifies and discards features, using many configuration options, to assemble a coherent overview of entire datasets with minimal tile size.
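As a rough picture of what "tiles for every zoom level" means, the sketch below computes which tile contains a point at a given zoom. It assumes the standard Web Mercator z/x/y scheme used by typical web maps. The function name `lonlat_to_tile` is made up for this example and is not part of Tippecanoe.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return (zoom, x, y) of the Web Mercator tile containing a lon/lat point."""
    n = 2 ** zoom  # tiles per side at this zoom; n * n tiles in total
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edge still land in a valid tile.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return zoom, x, y


print(lonlat_to_tile(0.0, 0.0, 0))  # (0, 0, 0): one tile covers the world
print(lonlat_to_tile(0.0, 0.0, 1))  # (1, 1, 1): one of 4 tiles at zoom 1
```

Each zoom level quadruples the number of tiles, which is why a tiler has to simplify and drop features at low zooms to keep individual tiles small.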

The output from Tippecanoe can be PMTiles, a format that can be read directly in the browser:

The last missing piece is a cloud-friendly organization of tiles enabling efficient spatial operations. This is the focus of my PMTiles project, an open specification for COG-like pyramids of tiled data, suited to planet-scale vector mapping.

The post doesn’t go into any technical details. I enjoyed it as a short and sweet overview of these new(ish) formats and how they relate to each other.