YAOM Specification

Author: Santiago Crespo

Date: 2022-10-30

Change Summary

Version: 0.1

Date: 2022-10-30

Author/Editor: Santiago Crespo

Summary: Initial draft. Incomplete.

Abstract

A curated collection of authoritative and non-authoritative datasets in an OpenStreetMap-compatible format and data model, along with the scripts and documentation to transform and use this data.

OpenStreetMap (OSM) was a reaction to the lack of open data. Yet Another Open Map (YAOM) is a reaction to the abundance of it.

Purpose

The objective is to have a spatial database made up of automated imports from the best open datasets available for each feature and area, curated by volunteers and organizations worldwide. It leverages the great ecosystem of tools around OpenStreetMap by using the same format and a compatible data model.

Worldwide public ways from OpenStreetMap are used in combination with many other features from hundreds of local, regional, national and international open data providers to produce data compatible with OSM software. This data can be used for many purposes, such as:

YAOM will promote the use of its data model as an international de facto standard that open data producers can use to publish their data in addition to their traditional formats. For example, software designed for the needs of one city that analyzes specific attributes of trees would work in any other city that also publishes its tree inventory according to the YAOM data model. This benefits the wider data community.

OSM and YAOM

Key differences

Community
  OSM: Contribute and maintain data
  YAOM: Contribute and maintain scripts and the specification

Sources
  OSM: Emphasizes local knowledge. Contributors use aerial imagery, GPS devices, and low-tech field maps to verify that OSM is accurate and up to date.
  YAOM: Emphasizes already existing open data. Contributors find suitable open datasets and follow the guidelines to improve the YAOM database.

Licenses
  OSM: Database: ODbL. Source (Rails Port): GNU GPL 2. Documentation (wiki): CC BY-SA 2.0
  YAOM: Database rights, scripts and documentation: CC0 1.0. Datasets: various open licenses, listed on the Sources page

Legal
  OSM: Operated by the OpenStreetMap Foundation (OSMF) on behalf of the community
  YAOM: Operated by Santiago Crespo (for now) on behalf of the community

Data
  OSM: Different kinds of objects may share part or all of their geometries
  YAOM: Layered GIS-Format approach

Benefits to OSM

Risks to OSM

Risks or benefits to OSM

Licensing

The YAOM database, scripts and documentation are available under the CC0 1.0 license. If you don't like it, feel free to republish under any license you want! But remember to respect the licenses of the individual datasets by giving them proper attribution and letting users know the license for each dataset you use.

The scripts are dual-licensed under the Zero-Clause BSD (0BSD) and CC0 1.0, as having the source published under an OSI-approved license is a requirement to submit this project to the Sustainable Smart Cities challenge.

Some data providers use licenses that require YAOM to give attribution to the sources and to let you know what license applies. YAOM gives credit to all data sources, no matter the license, and kindly asks its users to do the same. Users can also find the license for each dataset on the Sources page, in the Downloads section and on the data itself.

Data model

Figure: xkcd 927 - Standards

Introduction

The YAOM data model follows the basics of the OSM data model: geometries are represented as nodes, ways and relations; and attributes as tags (key=value).
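
As an illustration of the three element types and their tags, here is a minimal sketch in Python. The class and field names (Node, Way, Relation, Member, node_refs) are assumptions made for this example and are not defined by this specification.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A single point with optional tags."""
    id: int
    lat: float
    lon: float
    tags: dict[str, str] = field(default_factory=dict)


@dataclass
class Way:
    """An ordered list of node references with optional tags."""
    id: int
    node_refs: list[int]
    tags: dict[str, str] = field(default_factory=dict)

    def is_closed(self) -> bool:
        # A closed way needs at least 4 references: a triangle plus
        # the repeated first node.
        return len(self.node_refs) >= 4 and self.node_refs[0] == self.node_refs[-1]


@dataclass
class Member:
    """One member of a relation."""
    type: str  # "node", "way" or "relation"
    ref: int
    role: str = ""


@dataclass
class Relation:
    """A group of elements, each with an optional role, plus tags."""
    id: int
    members: list[Member]
    tags: dict[str, str] = field(default_factory=dict)
```

For example, Way(id=10, node_refs=[1, 2, 3, 1], tags={"natural": "water"}) would describe a closed way tagged as a body of water.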

While the OSM data model has important flaws, it has the backing of valuable open-source software projects that can work with OSM data for a variety of purposes. If and when the OSM data model evolves, YAOM shall adapt to the changes to continue benefiting from the ecosystem of compatible software.

The tagging tries to be compatible with the "de facto" tags and values used in OSM, to benefit the most from current and future compatible software. But instead of an "open tagging" approach, all tags are unambiguously defined in the documentation after being discussed and agreed on by the community.

Data formats

All the strings use the UTF-8 character encoding.

The coordinate system used is WGS84.

The individual datasets and the planet file are distributed as PBF files.

Numbering

In order to merge all the datasets into one coherent database, unique IDs are required. The merge script takes care of it before merging all datasets.

All IDs on YAOM can change. Future versions of this specification could make it mandatory for features to have a permanent and unique ID that could be linked to.
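
The renumbering step could look roughly like the following sketch. It is not the actual merge script: the dataset layout (plain dictionaries with "nodes" and "ways") and the function name renumber are hypothetical, and relations are left out for brevity.

```python
from itertools import count


def renumber(datasets):
    """Give every node and way an ID that is unique across all datasets.

    Each dataset is a dict with:
      "nodes": {local_id: (lat, lon)}
      "ways":  {local_id: [local_node_id, ...]}
    This layout is an assumption made for the example.
    """
    next_node_id = count(1)
    next_way_id = count(1)
    merged = {"nodes": {}, "ways": {}}
    for dataset in datasets:
        # Map this dataset's local node IDs to fresh global IDs.
        node_map = {}
        for local_id, coords in dataset["nodes"].items():
            node_map[local_id] = next(next_node_id)
            merged["nodes"][node_map[local_id]] = coords
        # Rewrite the node references of each way with the new IDs.
        for refs in dataset["ways"].values():
            merged["ways"][next(next_way_id)] = [node_map[ref] for ref in refs]
    return merged
```

Because the new IDs depend on the order in which the datasets are processed, they can differ between runs, which is consistent with IDs not being permanent.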

Tagging

When choosing a name for a tag, the tag names already in use in OSM for that type of object or attribute shall always be considered first.

If there are competing tagging schemes in OSM for the same kind of object or attribute, the YAOM community will discuss and choose one, ideally after reading the previous relevant discussions in the OSM community, checking the real usage in the OSM database and the software that supports each alternative. The YAOM community may decide to use different tags than OSM, if there is an agreed good reason.

Values have a 255 character limit to be compatible with the OSM API 0.6. This limit could be increased for all values or just for certain tags such as description=* and opening_hours=* in a future version of this specification.
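
A check for this limit can be sketched as follows. It assumes the limit is counted in Unicode characters rather than bytes; the function name overlong_tags is hypothetical.

```python
MAX_VALUE_LENGTH = 255  # limit stated in this specification


def overlong_tags(tags):
    """Return {key: length} for every value longer than the limit.

    len() counts Unicode code points, not bytes (an assumption about
    how the limit is meant to be measured).
    """
    return {
        key: len(value)
        for key, value in tags.items()
        if len(value) > MAX_VALUE_LENGTH
    }
```

An import script could call this on each element and either reject the dataset or report the offending keys.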

As there are available open data sets with a large variety of attributes, it is probable this specification will add new tags for specific attributes that are not used (yet) in OSM.

Distances

Distances are always represented in meters, without specifying the unit and using the period as decimal separator, with the exception of the maxspeed=* tag that has its own rules.
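
The rule can be illustrated with a small sketch that validates and formats distance values. The function names are hypothetical, and rejecting negative numbers is an assumption made for the example.

```python
import re

# Digits with an optional decimal part separated by a period; no unit.
_DISTANCE = re.compile(r"[0-9]+(\.[0-9]+)?")


def is_valid_distance(value: str) -> bool:
    """True if the value is a plain number of meters such as "3.5"."""
    return _DISTANCE.fullmatch(value) is not None


def format_distance(meters: float, decimals: int = 2) -> str:
    """Format a number of meters without unit, using a period."""
    text = f"{meters:.{decimals}f}"
    if "." in text:
        # Drop trailing zeros and a dangling period: "12.00" becomes "12".
        text = text.rstrip("0").rstrip(".")
    return text
```

With these rules, "3.5" and "12" are accepted, while "3,5" and "3.5 m" are not.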

Dates

Dates are always represented using the ISO 8601 standard: yyyy-mm-dd. Future YAOM specifications could add short variations for elements where only part of the date is known.
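
A strict check for the yyyy-mm-dd form can be sketched like this. The function name parse_date is hypothetical; the regular expression is used because some date parsers also accept other ISO 8601 variants.

```python
import re
from datetime import date

# Exactly four, two and two digits separated by hyphens.
_ISO_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def parse_date(value):
    """Return a date for a valid yyyy-mm-dd string, otherwise None."""
    match = _ISO_DATE.fullmatch(value)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Well-formed but impossible, for example 2022-02-30.
        return None
```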

Mandatory tags

The source=*, source_link=*, license:url=*, license:code=* and upload=no tags are present in all tagged nodes, ways and relations, so data users can easily find this information and give the proper credit.

source=*

Its value indicates the name of the source of the data in exactly the way that the licensor wants to be attributed. This value can usually be found in the license of the source data. If it is not clear in the license, refer to the website where the source dataset is available. If it is still not clear, ask the organization that provides the source data.

source_link=*

Its value indicates the URL that the licensor wants its name to be linked to when giving attribution. If it is not specified by the licensor, the value is the URL for the open data portal where the data is sourced. It is source_link instead of source:link to preserve the source: namespace for its future use as a way to reference the source of a specific attribute.

license:url=*

Its value is the URL to the license, exactly as it is linked by the licensor.

license:code=*

A short code for the license of the data and optionally the version, such as "CC0", "ODbL" or "CC BY-SA 4.0". A list of the codes is maintained on the licenses page.

upload=no

This tag is to prevent the accidental uploading of YAOM data to OSM using JOSM.
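
A test for the mandatory tags could be sketched as follows. The key names are taken from the tag definitions in this section (including license:url); the tuple should be adjusted if the agreed key set differs. The function name and the example values are hypothetical.

```python
# Keys that must be present with a non-empty value (taken from the
# definitions in this section; adjust if the agreed set differs).
MANDATORY_KEYS = ("source", "source_link", "license:url", "license:code")


def missing_mandatory_tags(tags):
    """List the mandatory tags missing from a tagged element.

    Untagged elements (for example plain way nodes) are exempt.
    """
    if not tags:
        return []
    missing = [key for key in MANDATORY_KEYS if not tags.get(key)]
    # upload must be present with exactly the value "no".
    if tags.get("upload") != "no":
        missing.append("upload=no")
    return missing


# Hypothetical example element.
example = {
    "natural": "tree",
    "source": "Example City Council",
    "source_link": "https://example.org/opendata",
    "license:url": "https://example.org/license",
    "license:code": "CC0",
    "upload": "no",
}
print(missing_mandatory_tags(example))
```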

Optional tags

name=*

Its value is the main endonym (native name) of the object.

name:*=*

To specify an exonym, the language is specified after the colon with either:

natural=coastline

It is used in closed ways (areas), representing the Mean High Water along the coastline. The way must be drawn so that the land is on the left side and the water on the right side of the way.
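
For a closed way that encloses land, such as an island, having the land on the left means the way runs counterclockwise. The following sketch checks this with the shoelace formula. It treats longitude and latitude as planar coordinates and assumes the ring does not cross the antimeridian; the function names are hypothetical.

```python
def signed_area(ring):
    """Signed planar area of a closed ring of (lon, lat) pairs.

    The first and last pair must be identical. The result is positive
    for counterclockwise rings and negative for clockwise rings.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def land_is_on_left(ring):
    """True if the enclosed area lies to the left of the way direction."""
    return signed_area(ring) > 0


# A small square island drawn counterclockwise.
island = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
print(land_is_on_left(island))
```

Reversing the node order of a ring flips the sign, so a way that fails the check can be fixed by reversing it.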

natural=water

A closed way (area) representing a body of water with the exception of the oceans, glaciers, wetlands, salt ponds and swimming pools.

highway=*

All linear public ways and crossing nodes worldwide are sourced from OSM. Closed ways such as pedestrian areas and other nodes such as traffic signals are not included by default, as there are good authoritative datasets for those features. The next versions of this specification shall try to unambiguously define as many highway=* features as possible after a broad discussion. It could be agreed by the community to change the category of some ways according to their physical characteristics if present, or to use different definitions for different parts of the world.

maxspeed=*

The maximum legal speed for vehicles circulating on a way. It defaults to km/h without specifying the units. It can also be specified in mph or knots by adding the unit after the value, separated by a space.
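
The value format described above can be parsed with a sketch like the following. It only covers the numeric forms mentioned in this section; the function names are hypothetical.

```python
import re

# A number, optionally followed by one space and "mph" or "knots".
_MAXSPEED = re.compile(r"([0-9]+(?:\.[0-9]+)?)(?: (mph|knots))?")

# Conversion factors to km/h.
_KMH_PER_UNIT = {"km/h": 1.0, "mph": 1.609344, "knots": 1.852}


def parse_maxspeed(value):
    """Return (number, unit) or None if the value does not match."""
    match = _MAXSPEED.fullmatch(value)
    if match is None:
        return None
    number, unit = match.groups()
    # No unit in the value means km/h.
    return float(number), unit or "km/h"


def to_kmh(value):
    """Convert a maxspeed value to km/h, or None if it is invalid."""
    parsed = parse_maxspeed(value)
    if parsed is None:
        return None
    number, unit = parsed
    return number * _KMH_PER_UNIT[unit]
```

For example, "50" is read as 50 km/h and "30 mph" as 30 miles per hour, while "30mph" (no space) is rejected.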

Lifecycle namespace prefix

Used to identify objects that are not in their normal state of operation.

proposed:*=*

Not constructed yet.

construction:*=*

Being constructed.

disused:*=*

Not in operation but could be operating next month.

abandoned:*=*

Not in operation and won't be operating next month.

ruins:*=*

In ruins.

demolished:*=*

Removed and no longer exists. Some remains may still be present.

was:*=*

The object is not this kind of feature or does not have this attribute any more.
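
Software that reads YAOM data may want to separate the lifecycle prefix from the rest of the key. A minimal sketch, with a hypothetical function name:

```python
# Lifecycle prefixes listed in this section.
LIFECYCLE_PREFIXES = (
    "proposed", "construction", "disused", "abandoned",
    "ruins", "demolished", "was",
)


def split_lifecycle(key):
    """Split a key into (lifecycle_prefix, base_key).

    Keys without a known lifecycle prefix return (None, key).
    """
    prefix, separator, rest = key.partition(":")
    if separator and rest and prefix in LIFECYCLE_PREFIXES:
        return prefix, rest
    return None, key
```

With this, disused:amenity=* is recognized as a disused amenity, while keys such as name:es or a bare construction key are left untouched.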

Data sources

YAOM feeds from an increasing variety of sources. Parts of a dataset can be filtered out for specific areas, if there is a better dataset for an area that is part of the first, bigger one.

Some examples of the worldwide features and source providers:

National and regional data sources include:

And some local examples:

In areas where the best source for a particular feature is OSM, its use will be encouraged for that particular area and feature.

Adding new data sources

A contributor who wants to add a new data source has to:

  1. Read and accept the YAOM contributor agreement (only for new contributors)
  2. Review the license of the original data and determine if it is compatible.
  3. Write the documentation and the scripts for the import (the "project") according to this YAOM specification.
  4. Share the project with the local community.
  5. Improve the project according to the feedback until there are no more concerns.
  6. Share the project with the wider community.
  7. Improve the project according to the feedback until there are no more concerns.
  8. Create a pull request in the Git repository and wait for it to be accepted and automatically published if it passes all the tests.

The source data will be automatically downloaded, processed, and added to the YAOM database, if it passes all the tests. Whenever the source provides an update, data will be automatically downloaded and processed, archiving the old transformed version of the dataset.

Source Data Licensing

If the license requires anything other than reasonable attribution and letting the users know the license, it probably won't be compatible.

Compatible licenses: CC0, PDDL, ODbL, CC BY-SA, ...

Not compatible: CC BY-NC, ...
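
A first automated check against these lists could be sketched as follows. Only the licenses named above are included, so anything else is flagged for manual review. The function names and the handling of version numbers are assumptions made for the example.

```python
import re

# Only the licenses explicitly listed in this section.
COMPATIBLE = {"CC0", "PDDL", "ODbL", "CC BY-SA"}
NOT_COMPATIBLE = {"CC BY-NC"}


def license_family(code):
    """Strip an optional trailing version, e.g. "CC BY-SA 4.0"."""
    return re.sub(r"\s+[0-9]+(\.[0-9]+)*$", "", code.strip())


def compatibility(code):
    """Classify a license code as listed in this section."""
    family = license_family(code)
    if family in COMPATIBLE:
        return "compatible"
    if family in NOT_COMPATIBLE:
        return "not compatible"
    # Not listed here: a person has to read the license.
    return "needs review"
```

The result is only a hint: the license text of each source still has to be reviewed by the contributor.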

Attribution

Feel free to link to the YAOM Sources page or to build your own according to the datasets you use.

Please give attribution to OpenStreetMap and YAOM with this text and links:

Public ways data from OpenStreetMap. Other data from YAOM.

Or if you don't have much space:

Public ways © OpenStreetMap. Other data: YAOM.

Source data quality

To be written

Source data coverage (Who's on First?)

To be written

Project Governance

To be written

Code of conduct

Be nice.