“OpenStreetMap Without Servers [Part 2]: A peer-to-peer OSM database”


author: James Halliday
date: 2016-06-09
slug:
title: "OpenStreetMap Without Servers [Part 2]: A peer-to-peer OSM database"
wordpress_id:
categories: – blog
image: https://images.digital-democracy.org/assets/substack-bkd-notes-1600@2x.jpg

This is a more technical post by our lead programmer, James Halliday, about the core technology that powers OpenStreetMap Without Servers

osm-p2p is a decentralized peer-to-peer database for storing and editing OpenStreetMap nodes, ways, and relations. It includes a node.js server that implements the core functionality of the OSM API, or it can be used completely in the browser using IndexedDB for persistent storage.

osm-p2p development was driven by the needs of the Amazonian indigenous communities with whom we work. Our partners want everybody to be able to participate and collaborate in the process of creating a map, in remote regions without an internet connection.

Read more about why we built osm-p2p

These requirements are a good match for a p2p, distributed data model. Each
instance of the osm-p2p database can replicate with any other instance, so
replication can happen offline with whichever data transfer methods are
available.

The osm-p2p project is split into many different packages to make it easier to
adapt the pieces for different problems.

Here’s what we have now:

Here’s what we would like to have soon, to better interoperate with the rest of
the Open Street Map ecosystem:

  • import public osm data from a region into osm-p2p
  • export osm-p2p edits back to public open street map

Here is a demo using the mapeo-desktop to synchronize data between two
osm-p2p databases. In our work in remote areas, it’s more useful and reliable to
use a USB drive for a replication medium than a network connection.

Someone can make edits, replicate to a USB drive, bring the USB drive to another
village, replicate, and then bring the USB drive back to replicate again. The
replication copies data both ways, so after this procedure both villages will
have the complete dataset.

p2p sync screencast

Architecture

osm-p2p implements a kappa architecture where all updates are written to an
append-only log. This log is the "source of truth" that populates indexes called
materialized views. These materialized views are only meant to answer questions
faster than reading the whole log. If the views need to change in the future to
accommodate different queries, they can be regenerated from the log without
having the migrate the log schema.

Each update is written to an append-only log provided by hyperlog, a module
from the dat project. This log has indexes which provide different materialized views of the data.

One view (hyperkv) provides a key/value interface mapping OSM ids to
documents. Another view creates a spatial index to answer bounding box queries
(hyperlog-kdb-index).

During replication, each side requests the documents from the log that it
doesn’t have and appends those documents to its own log. The indexes watch these
inserts and buffer requests until the they are caught up with the latest known
document in the log.

You can read about the osm-p2p architecture in more detail in
this architecture document.

Getting Started

There are many ways to get started. One way is to install the
osm-p2p-server package as a standalone command with npm:

$ npm install -g osm-p2p-server
$ osm-p2p-server
http://127.0.0.1:5000
database location: /tmp/osm-p2p.db

You may need to run the first command with sudo if you haven’t configured npm
to install into a custom location.

If these commands succeeded, you should now be running an http server locally
backed by osm-p2p-db that speaks the Open Street Map HTTP API.

To start using the osm-p2p API directly from node.js or the browser, check out
the osm-p2p repository. First install the library locally into a project
directory:

$ npm install osm-p2p

To create a database:

var osmdb = require("osm-p2p");
var osm = osmdb("/tmp/osm.db");

In the browser, you can omit the directory. In your code, you can now use
osm-p2p methods. Here are a few of the methods you can call:

  • osm.create(doc, cb) creates a new document
  • osm.put(id, doc, cb) updates an existing document
  • osm.get(id, cb) retreives existing documents by their id
  • osm.query(bbox, cb) retreives documents by a bounding box
  • osm.log.replicate() create a duplex stream for replication

To use osm-p2p in the browser, compile your program with browserify.

Get Involved

If you are a developer and would like to support our work, or if you have ideas about how to use and adapt osm-p2p for your own project, then dive into the code on Github. Open an issue with a bug report or feature request, or send us a pull request with a bug-fix or new feature. We need help right now adding tests and fixing edge-cases with osm-p2p-server and increasing compatibility with other OSM tools such as JOSM.

osm-p2p development was supported by a grant from the Knight Prototype Fund