July 2023

Overture Maps Foundation Releases First Data

The Overture Maps Foundation has released the first version of a global map data set, which includes data for places of interest, buildings, transportation and administrative boundaries.

The extent of the Overture Maps Foundation's places data across the globe. (Source: Overture Maps Foundation)

The four layers combine data from different sources into one uniform data schema. The data has been sourced from commercial data providers, all members of the foundation, and OpenStreetMap:

  • Meta contributed to places of interest,
  • Microsoft contributed to places of interest and buildings,
  • Esri contributed to buildings, and further
  • buildings and transportation data were sourced from OpenStreetMap.

Accordingly, the licenses vary for each data layer:

  • Places of interest and administrative boundaries are licensed under CDLA Permissive v2.0; the data can be modified and shared as long as the original license is included.
  • The buildings and transportation layers are available under ODbL, which is the same as OpenStreetMap’s license.

The data is hosted on AWS S3 and Azure Blob Storage and is available for download in Parquet, a cloud-native format. You can download the data to your local machine using the AWS CLI, Azure Storage Explorer, or AzCopy.

Alternatively, to make use of Parquet’s cloud-native capabilities to download only the data you need, you can fetch data via SQL queries in AWS Athena, Microsoft Synapse, or DuckDB.
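To illustrate the idea of fetching only what you need, here is a minimal sketch in Python that builds a bounding-box query for DuckDB. The source path is a hypothetical placeholder, and the column names (id, names, categories, and a bbox struct with minx, maxx, miny, maxy) and the S3 region are assumptions about the release; check Overture's documentation for the actual values before running it.

```python
from __future__ import annotations


def build_places_query(
    source: str,
    bbox: tuple[float, float, float, float],
    limit: int = 100,
) -> str:
    """Return a DuckDB SQL string that selects places inside a bounding box.

    `source` is a Parquet path or glob. The column names below are
    assumptions about the schema, not confirmed by this post.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    return (
        "SELECT id, names, categories, bbox\n"
        f"FROM read_parquet('{source}', hive_partitioning=1)\n"
        f"WHERE bbox.minx >= {min_lon} AND bbox.maxx <= {max_lon}\n"
        f"  AND bbox.miny >= {min_lat} AND bbox.maxy <= {max_lat}\n"
        f"LIMIT {limit}"
    )


def run_query(sql: str, s3_region: str = "us-west-2"):
    """Execute the query with DuckDB, reading remote Parquet via httpfs.

    Requires the third-party `duckdb` package and network access.
    The default region is an assumption.
    """
    import duckdb  # imported lazily so the query builder works without it

    con = duckdb.connect()
    con.execute("INSTALL httpfs")
    con.execute("LOAD httpfs")
    con.execute(f"SET s3_region='{s3_region}'")
    return con.execute(sql).fetchall()


# Hypothetical placeholder path; replace it with the real release location.
EXAMPLE_SOURCE = "s3://example-bucket/release/theme=places/*/*"

# Bounding box as (min_lon, min_lat, max_lon, max_lat), roughly around Berlin.
query = build_places_query(EXAMPLE_SOURCE, (13.0, 52.3, 13.8, 52.7), limit=10)
print(query)
```

Because the filter only touches the bbox columns, a Parquet reader can skip row groups that fall outside the area of interest instead of downloading the full dataset. Pass the resulting string to run_query once the path and column names have been verified.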

The data release is the first major sign that the Overture Maps Foundation is alive and producing results. Combining sources from major providers could yield one of the most complete and up-to-date data sources we currently have; it will be interesting to see how it compares to OpenStreetMap. Plus, the use of modern, cloud-native formats makes this vast dataset manageable without clogging up your hard drive.

Gregor MacLennan gives a nice overview of the current state of Mapeo and what we can expect from future releases.

We developed Mapeo over 8 years through a co-design process with local partners, and have learned a huge amount about the challenges and opportunities of peer-to-peer technology along the way. This post shares some technical details about these challenges and how the solutions are guiding our work on “Mapeo Next”.

I’ve always admired the work of Digital Democracy, especially on Mapeo. They’ve built a great product for collaborative and participatory mapping, something we didn’t quite pull off when I worked at ExCiteS and Cadasta.

My colleagues at Development Seed have released eoAPI:

Say hello 👋 to eoAPI, a cloud-native backend for standing up a modern, open geospatial data infrastructure. Built around the STAC specification, eoAPI makes massive earth observation (EO) data archives discoverable and interoperable. EO data is accessible through open community standards for data discovery, allowing your data to connect seamlessly to scientific notebooks, AI pipelines, and dashboard interfaces.

eoAPI bundles open-source software into a package, which simplifies standing up modern geospatial data infrastructure to aid discovery and visualisation of geospatial vector and raster data and make it available through open standards:

  • pg-STAC is an optimized Postgres schema to index and search large-scale STAC collections.
  • stac-fastapi is an Open Geospatial Consortium (OGC) Features API compliant FastAPI application for STAC metadata search.
  • titiler-pgSTAC is a TiTiler extension that connects to pgSTAC to support large-scale dynamic mosaic tiling for visualizing STAC collections and items.
  • tipg is an Open Geospatial Consortium (OGC) Features and Tiles API for vector datasets.
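As a rough illustration of what querying such a stack looks like, the sketch below builds a STAC API item search request in Python using only the standard library. The parameters (collections, bbox, datetime, limit) come from the STAC API specification; the API root URL and the collection name are hypothetical, and the request is only constructed, not sent.

```python
import json
from urllib import request


def build_search_request(api_root, collections, bbox, datetime_range, limit=10):
    """Build a POST request for the /search endpoint of a STAC API.

    `api_root` is the base URL of the STAC API, which depends on how
    the service was deployed (an assumption in this sketch).
    """
    body = {
        "collections": collections,   # collection ids to search in
        "bbox": list(bbox),           # [min_lon, min_lat, max_lon, max_lat]
        "datetime": datetime_range,   # RFC 3339 instant or "start/end" interval
        "limit": limit,               # page size
    }
    return request.Request(
        url=api_root.rstrip("/") + "/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical API root and collection id, for illustration only.
req = build_search_request(
    "https://stac.example.com/",
    ["my-collection"],
    (13.0, 52.3, 13.8, 52.7),
    "2023-07-01T00:00:00Z/2023-07-31T23:59:59Z",
    limit=5,
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending the request with urllib.request.urlopen(req) against a real deployment should return a GeoJSON FeatureCollection of matching STAC items.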

Getting started is easy with infrastructure-as-code templates allowing you to deploy eoAPI with opinionated but reasonable defaults:

  • eoapi-cdk - A set of AWS CDK constructs to deploy eoAPI services.
  • eoapi-template - An AWS CDK app that shows how to configure the eoapi-cdk constructs.
  • eoapi-k8s - IaC and Helm charts for deploying eoAPI services on AWS and GCP.

The software is open source and freely available under the permissive MIT licence.

A Building Dataset for the Global South

Google unveiled a dataset containing 1.8 billion building footprints in Central and South America, Africa, the Indian Subcontinent, and islands in Southern Asia.

The coverage of the building dataset depicted on top of a world map.

The data was created from high-resolution satellite images using machine learning. As such, it contains only the building geometry, data that can be derived from the building footprint (centroid, area, and the Plus code), and the level of confidence in the mapping. No further information about a building is available, such as its height, building type, or address.

You can download the data in CSV format, one file per S2 level 4 cell, with the building polygons in WKT. Other common geospatial formats are not available, so additional processing and data ingestion may be required for many use cases.
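The kind of extra processing this implies can be sketched in a few lines of Python. The example below reads CSV rows with WKT polygons, drops rows below a confidence threshold, and converts the rest to GeoJSON-style features using only the standard library. The column names (area_in_meters, confidence, geometry, full_plus_code) are assumptions about the file layout, the sample rows and the 0.75 threshold are made up, and the parser only handles plain POLYGON geometries.

```python
import csv
import io
import re


def wkt_polygon_to_coords(wkt):
    """Parse a simple 'POLYGON((x y, ...))' string into GeoJSON-style rings."""
    match = re.fullmatch(r"\s*POLYGON\s*\((.*)\)\s*", wkt, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError(f"Unsupported WKT geometry: {wkt[:30]!r}")
    rings = []
    # Each ring is a parenthesised, comma-separated list of "x y" pairs.
    for ring_text in re.findall(r"\(([^()]*)\)", match.group(1)):
        ring = [[float(v) for v in point.split()] for point in ring_text.split(",")]
        rings.append(ring)
    return rings


def rows_to_features(csv_text, min_confidence=0.75):
    """Convert CSV rows into GeoJSON features, skipping low-confidence rows."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        confidence = float(row["confidence"])
        if confidence < min_confidence:
            continue
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Polygon",
                "coordinates": wkt_polygon_to_coords(row["geometry"]),
            },
            "properties": {
                "confidence": confidence,
                "area_in_meters": float(row["area_in_meters"]),
                "full_plus_code": row["full_plus_code"],
            },
        })
    return features


# Made-up sample rows; the WKT field is quoted because it contains commas.
SAMPLE = (
    "latitude,longitude,area_in_meters,confidence,geometry,full_plus_code\n"
    '0.5,0.5,12.5,0.9,"POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))",EXAMPLE+01\n'
    '0.5,2.5,8.0,0.6,"POLYGON((2 0, 2 1, 3 1, 3 0, 2 0))",EXAMPLE+02\n'
)

features = rows_to_features(SAMPLE)
print(len(features), "feature(s) kept")
```

For real workloads a geometry library with a complete WKT parser would be the safer choice; the hand-rolled parser here is only meant to show the shape of the conversion.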

The data is available under two licenses: Creative Commons Attribution 4.0 (CC BY 4.0) and the Open Data Commons Open Database License (ODbL) v1.0, which makes it compatible with OpenStreetMap. If anyone wants to kick up a stink with parts of the OSM community, this is your chance. Go on and import the whole dataset in one big changeset.

Mapbox GL marked a paradigm shift in web mapping: away from pre-rendered tiled raster maps and towards more dynamic vector maps rendered in the client.

Konstantin Käfer looks back at the early days of Mapbox GL:

Luckily, the time was right for a new approach. Several things fell into place that enabled the creation of Mapbox GL:

  • We had just developed the Mapbox Vector Tile format, enabling efficient delivery of small chunks of geodata to the client. Over the past decade, this format has become tremendously successful and is now an industry standard that is used across the entire geospatial community.
  • WebGL was becoming widely available, having been standardized just two years earlier.

Mapbox GL was a game-changer. Too bad they decided to switch to a proprietary license.

I reported on the release of STAC API 1.0 earlier this year. (I’m tempted to say exclusively, but this is just a blog, not the New York Times.) Today, Radiant Earth, the shepherd of STAC, published an official announcement.

On April 25, with the help of 47 contributors and 2,790 commits, the STAC API specification reached its 1.0.0 version release. With this release, the STAC API specification is fully aligned with OGC API - Features Version 1.0 standard and the project aims to maintain alignment with OGC standards as they mature.

Following this milestone for STAC, the community is now working to align STAC extensions with the API spec so each of the extensions reaches 1.0 at some point.

A comprehensive overview of the current statuses of these extensions can be accessed at stac-api-extensions.github.io. At the time of writing this blog post, none of the extensions have reached the 1.0.0 milestone yet. However, no significant changes are expected for the Fields, Sort, Transaction, Filter, and Query extensions, and they are anticipated to attain the 1.0.0 status in the near future.

Kyle Barron muses on different approaches for bringing high-performance geometry libraries written in C/C++ and Rust to the Web using WebAssembly.

It’s my belief that for any project beyond a certain complexity, there should only be three core implementations:

  • One in C/C++ because C/C++ is today’s de-facto performance-critical language, and it can bind to almost any other language.
  • One in Rust because removing memory errors brings so much potential and Rust’s ergonomics bring impressive development velocity to low-level code. I believe it’s tomorrow’s performance-critical language.
  • One in Java because the Java Virtual Machine makes it hard to interface with external C libraries (and it’s yesterday’s performance-critical language?).

The best code is the code that is never written, or so they say. Turf has served my modest needs well in the past, but something as fundamental as geometry operations doesn’t need to be rewritten if we have battle-tested libraries in other languages that we can bind with WebAssembly, yielding similar, often better, performance. The JavaScript world has a weird habit of reinventing the wheel, solving the same problems with slightly different approaches. We end up with lots and lots of software that basically does the same thing. Having fewer but more stable options would be a good thing.