A Way Forward for OpenStreetMap’s Data Model

Jochen Topf has published a preliminary report on the shortcomings of OpenStreetMap’s data model. It’s a very in-the-weeds document, providing an in-depth discussion of the current state of OSM’s data model and ways to improve and future-proof it.

The report vets the data model against the backdrop of anticipated growth of the OSM data set and its implications for data processing. It looks at the lack of a native polygon geometry type (a classic!), limitations related to mapping large objects and those with fuzzy boundaries, the model’s incompatibility with standard GIS software, and many other issues. Topf also suggests solutions addressing some of the problems, including removing untagged nodes, introducing an area type, limiting the length of tag values, changing the database management system, and offering different data formats via the API and for exports.
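To make the missing polygon type concrete: OSM has only nodes, ways, and relations, so a data consumer has to guess whether a closed way is an area or just a loop (a roundabout, say). The sketch below is a deliberately simplified illustration of that guesswork, not how any particular tool does it. The `Way` class, the `looks_like_area` function, and the short `AREA_KEYS` list are hypothetical names made up for this example; real software relies on much longer tag lists and also has to handle multipolygon relations.

```python
from dataclasses import dataclass, field


@dataclass
class Way:
    """Hypothetical stand-in for an OSM way: ordered node IDs plus tags."""
    node_refs: list[int]
    tags: dict[str, str] = field(default_factory=dict)


# Illustrative subset only (an assumption for this sketch, not an official list).
AREA_KEYS = {"building", "landuse", "leisure"}


def looks_like_area(way: Way) -> bool:
    """Guess whether a way represents an area, since the model cannot say so."""
    # A ring needs at least three distinct nodes, with the first one repeated at the end.
    closed = len(way.node_refs) >= 4 and way.node_refs[0] == way.node_refs[-1]
    if not closed:
        return False
    # An explicit area tag overrides the guess based on other tags.
    if way.tags.get("area") == "no":
        return False
    if way.tags.get("area") == "yes":
        return True
    # Otherwise fall back to tags that usually imply an area.
    return any(key in way.tags for key in AREA_KEYS)


building = Way([1, 2, 3, 4, 1], {"building": "yes"})
roundabout = Way([5, 6, 7, 8, 5], {"highway": "primary", "junction": "roundabout"})
print(looks_like_area(building), looks_like_area(roundabout))
```

Both ways have exactly the same shape; only the tags tell them apart, which is the ambiguity a native area type would remove.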

How many of the proposed changes will be implemented remains to be seen. Topf himself is cautious:

I am not proposing any action or, at most, minor steps. This is not because those are not important issues, but because I cannot see a clear path to any improvements. Often the goals are too vague to be actionable.

Implementing just some of the proposed changes would be a big lift. Every tool interacting with the OSM data and API would be affected: every editor, every command-line script that converts data, every export tool. It would require constant engagement with the community and strong technical leadership.


On a side note, it’s almost disappointing how few arguments about the paper there have been so far, compared to the huge stir the announcement of Topf’s review caused a few months back.