Knowledge

Darren Wiens of SparkGeo demonstrates an AI-driven approach to generating audio descriptions for interactive web maps. The overall process is simple yet effective:

  1. Convert the current map view to an image.
  2. Upload the image to OpenAI’s API to return a text description of the images.
  3. Upload the text description to OpenAI’s API to receive the audio of the text description.

It’s one example of how new generative AI technologies can be put to good use. It is simply impossible for humans to provide alt-text for large interactive maps and AI is the one (the only?) way to make textual descriptions of maps scalable.

You can serve PMTiles directly from a cloud object storage but in some cases, you want to control who accesses data and how often—and you need a server for that. Craig Kochis wrote up two examples of how to serve PMTiles using a NodeJS server application from a co-located file and S3.

A webinar on cloud-native geospatial technologies and their applications in the pacific region features four speakers:

  • Wei Ji Leong of Development Seed introduces cloud-optimised data formats for geospatial data, focussing on imaging, multi-dimensional data cubes, point clouds and vector data.
  • Alex Leith of Auspatious talked through the history of Digital Earth platforms, and how cloud-native spatial data have shaped and influenced their development.
  • Leo Ghignone from the University of Tasmania explained how IMOS, the Integrated Marine Observing System, use cloud-optimised data to support oceanographic research around the Great Barrier Reef.
  • Fang Yuan of Frontier SI shared a perspective from application developers who used cloud-native data to implement scalable and performant geospatial solutions.

The two-hour recording is now available online.

Estimating the Cost of Hosting a Global PMTiles Dataset

In his NACIS conference talk, Brandon Liu positions Protomaps as an altenative to what he call scarcity maps: Tile services offered by commercial companies that cost a small fortune once your project becomes popular and exceeds the number of tile requests in the free tier.

Nothing is free in this world, even hosting PMTiles yourself isn’t. If you want to convince someone that hosting Protomaps is a financially viable alternative then you need to compare numbers.

So let’s do some quick math and compare a rough estimate of the costs for hosting PMTiles on S3 to the monthly costs of Mapbox Vector tiles.

For the sake of simplicity, let’s assume that clients make 1.5 million tile requests per month. The costs incurred on S3 fall into two categories. Data storage and transfer.

On 3 November, the size of a PMTiles dataset based on OpenStreetMap covering the whole world was 107.62 GB. AWS charges $0.023 per GB and month to store data in S3, so the cost to store a global map is $2.47.

To estimate the transfer costs, we need to know the average size of a PMTile that is delivered over the network. The Protomaps website conveniently has an example that shows size of each tile response. I zoomed and panned around on the map and logged the individual size of about two hundred requests. The average size per tile in my sample was 68.88KB. 1.5 million tile requests at 68.88KB rack up about 103GB in transferred data. AWS charges $0.09 per transferred GB from S3 to the internet, so the overall data-transfer cost is $9.27.

The cost to host and serve a world-wide map dataset is about $12. But here’s a catch. If you put a Cloudfront CDN in front of your S3 bucket (which you probably want to do), then data transfer from S3 to Cloudfront is free, so is the first terra-byte from Cloudfront to the internet. Chances are your can host your PMTiles for less than $5.

The same 1.5 million vector-tile requests on Mapbox will cost you $325; a significant difference. Even considering the labour costs of setting up the infrastructure and data on AWS, and making the occasional update, PMTiles will save money. Like a lot of money.

Disclaimer: This is an informed estimate not a scientific study. I literally did this on the back of an envelope. It’s not my fault, if you take these numbers to your boss to convince them to adopt Protomaps and it turns out you’re paying $25 per month.

Very Spatial has compiled a collection of free and open books on spatial analysis.

I am teaching a straight forward, stand-alone Spatial Analysis class for the first time in a couple of decades. That means that I have been looking at resources to share with the class, especially reference materials that they can access given that they will mostly forget what I tell them by February once the next semester is in swing.

FlatGeobuf, GeoParquet, PMTiles, whatnot—I find it hard to understand when and how to use new cloud-native data formats. Brandon Liu has answers.

For analytics use FlatGeobuf or GeoParquet:

FlatGeobuf and GeoParquet are analysis-focused formats. They’re useful for answering queries like What is the sum of attribute A over features that overlap this polygon? But their design does not enable cloud-native visualization like COG does.

You can convert FlatGeoBuf and GeoParquet data into cloud-friendly formats using tools like Tippecanoe:

The best-in-class tool for creating vector tiles from datasets like FlatGeobuf and GeoParquet is tippecanoe, originally developed by Mapbox, but since v2.0 maintained by Felt. Tippecanoe doesn’t just slice features into tiles, it generates smart overviews for every zoom level matching a typical web mapping application. It adaptively simplifies and discards features, using many configuration options, to assemble a coherent overview of entire datasets with minimal tile size.

The output from Tippecanoe can be PMTiles a format that can be read in the browser:

The last missing piece is a cloud-friendly organization of tiles enabling efficient spatial operations. This is the focus of my PMTiles project, an open specification for COG-like pyramids of tiled data, suited to planet-scale vector mapping.

The post doesn’t go into any technical details. I enjoyed as a short and sweet overview of these new(ish) formats and how they are related.

If you have projects that still use OpenStreetMap map tiles with the deprecated URL schema i.e., {a,b,c}.tiles.openstreetmap.org, do upgrade to the newer schema.

tile.openstreetmap.org supports HTTP/2 and HTTP/3 which no longer require the old (a|b|c) aliases to increase browser connection concurrency. Using a single URL improves performance and ability to cache.

It will make your app faster and lowers the burden on maintainers.

If you’re working with STAC or want to learn about it, consider following the STAC Google Group for regular news and invitations to join community meetings.

The technical how-to describes how you can use AWS Athena to query OpenStreetMap data from Parquet files on S3. Athena is an analysis layer sitting on top of data source to simplify data access for application such as machine-learning tools or data dashboards. Using analysis-ready OSM data removes storage- and computation-heavy steps to obtain and convert the data into the desired format from the processing pipeline. The example use data from the Daylight Map, which is available from AWS’ Open Data Registry.

Circles are only supported in a few geo-data formats because most of today’s formats are based on the Simple Features specification, which doesn’t define circles.

Tom MacWright, writing on the Placemark blog, explores why circles are so hard to implement into geo-data applications and why Placemark ended up with three circle definitions: geodesic, degree and Mercator circles.

A quick tutorial by Bert Temme about how to turn a shape file into PMTiles using Tippecanoe:

 In this blog we created in a few easy steps vector tiles from shapefile of worldwide railroads in PMTile format using Tippecanoe, and deployed to a standard webserver. No complicated backend WMS/WFS mapservers are needed anymore to get this working.

Iván Sánchez Ortega reporting from his activities during the latest OGC code sprint:

when pygeoapi is requested a coverage from GIS client (preferring image/tiff or application/ld+json or the like), the raw data is returned. But when it’s a web browser (preferring text/html), then a webpage with a small viewer is returned.

It’s an interesting deep-dive into HTTP content negotiation, how it relates to geo-data problems and what OGC API implementations could do better.