Standards

Tim Schaub built and ran a crawler on public STAC catalogs to better understand the current adoption of the standard. A couple of figures stand out:

  • About a third of the catalogs are static, but API-based catalogs provide access to disproportionally more items and assets.
  • The vast majority of catalogs already implement STAC 1.0.0, which was only finalised in May.
  • Almost half of all items across all catalogs are not satellite images or other spatial data; they are additional meta represented in various formats such as JSON, XML or HTML.

PMTiles version 3 introduces a new tile-addressing schema. Instead of using Z,X,Y tile coordinates, the new schema uses tile IDs based on the tile’s position within a series of Hilbert curves:

The TileId 36052 corresponds to the Z,X,Y position of 8,40,87. The calculation of ID uses a pyramid of Hilbert curves starting at TileId=0 for zoom level 0. The next zoom level, a 2x2 square, occupies the next four IDs in the ID space TileId=(1,2,3,4), the next level being the next 16 IDs, and so on.

And to not duplicate tiles that contain virtually no information (for example, tiles just showing water) the RunLength indicates how many times a tile will be repeated within the Hilbert curve, so vast areas of the ocean can be represented with just one tile.

Ocean tiles are not only repetitive, but sparse and often contiguous in Hilbert space. This entry:

TileID=2578427,RunLength=107977,Offset=3650651795,Length=42

means that the 44 byte vector tile with a single square in the layer ocean is repeated over 100,000 times, starting at Z,X,Y=11,285,1311 and ending at 11,19,1304.

Neat.

Update: A new post outlines the disk layout and compression approach of PMTiles version 3. (15 August 2022)

OGC Endorses Zarr 2.0 Community Standard

The Open Geospatial Consortium (OGC), the standards body of the geospatial world, has endorsed the Zarr 2.0 specification as a community standard.

Zarr logo
Image source: zarr.dev

Zarr originated in genomics research but has since been adopted by the geospatial community because of its ability to quickly access multi-dimensional data in chunks. Zarr allows accessing a window of its data without having to first download the entire data set and dissect it locally. Think of a thirty-year time series of a grid of ocean-surface temperature data, where you can just retrieve the area around the Canary Islands for the last three years.

Zarr data can be stored in a wide range of storage systems, including object stores, such as AWS S3 or Google Cloud Storage, or in storage accessible via HTTP APIs. This makes Zarr the ideal candidate for cloud-native storage and processing of large, multi-dimensional datasets.

Community standards are a way for the OGC to formally adopt specifications developed outside the OGC standardisation process. A community-standard endorsement signifies that a specification is mature, established, widely used, and implemented into reference software. This is a big step for Zarr 2.0, showing that it is now a de-facto way to access multi-dimensional data sets over the Web.

A community standard usually represents a snapshot of a specification under constant development. The Zarr community already works on advancements to the existing standard, eventually resulting in a new Zarr 3.0 specification and proposed to the OGC as a new community standard. Other work includes an extension to the Zarr 2.0 specification formilising how georefrenced grids should be represented in Zarr.