Using OSM Data with PostGIS
Lars Ahlzen
3/24/2016
+
Setting up PostgreSQL/PostGIS
Linux: Use package manger. E.g.
apt-get install postgresql
Win/Mac: Download installer from postgresql.org/download/
Use pgAdmin (GUI) or psql (command line) to create database and install extensions. Example:
lars@geovm1:~$ psql -U lars -h localhost postgres
psql (9.3.11)
Type "help" for help.
postgres=# create database gis with owner=lars encoding='UTF8';
CREATE DATABASE
postgres=# \q
lars@geovm1:~$ psql -U lars -h localhost gis
gis=# create extension postgis;
CREATE EXTENSION
gis=# create extension hstore;
CREATE EXTENSION
Control access with pg_hba.conf
Tuning the database:
Downloading OSM data
- Small areas: Directly from osm.org export tab. Real-time.
- Larger areas: From the Overpass API (OSM mirror database). Link at osm.org
- Regional extracts: GeoFabrik. Updated daily. Has shapefiles, too!
- Whole planet: at planet.osm.org: 48 GB as of March 2016.
OSM Data File Formats
- .osm - XML based format.
- .osm.gz - compressed OSM XML.
- .osm.bz2 - compressed OSM XML.
- .osm.pbf - binary (protobuf-based) format.
- .osc - OSM "diff" file for replication.
Recap: The OSM data model
Three geometry types:
- Node (Point)
- Way (LineString or Polygon)
- Relation (collections of Nodes, Ways and other Relations)
Example
Example of OSM XML file.
More info about the OSM file format on the
OSM Wiki
Importing OSM data into PostGIS
Need to translate from the OSM data model to postgis. Challenges include:
- Node/way/relation-based geometry.
- Freeform key-value tags.
- No obvious distinction between linear and area features.
Several different tools for different purposes.
Overview of tools at:
http://wiki.openstreetmap.org/wiki/PostgreSQL
Schemas
Common database schemas for PostGIS:
- OSM apidb - The schema of the main OSM DB. No direct geometries.
- pgsnapshot - Somewhat easier to work with than apidb.
- osm2pgsql - Useful for rendering, misc. Easy to work with.
More info at:
http://wiki.openstreetmap.org/wiki/Databases_and_data_access_APIs#Database_Schemas
Osmosis
Osmosis is a tool to load, process, convert and save OSM data. Supports the pgsnapshot schema. Useful for analysis.
Details at:
wiki.openstreetmap.org/wiki/Osmosis
http://wiki.openstreetmap.org/wiki/Osmosis/Detailed_Usage_0.44
Example: Setting it up from geovm1 (Ubuntu) on the postgresql server deltaiii:
lars@geovm1:/usr/share/doc/osmosis/examples$ cat \
pgsnapshot_schema_0.6.sql | psql -h deltaiii -U lars gis
lars@geovm1:~/Downloads$ osmosis --read-xml file=BostonArea.osm \
--write-pgsql user=lars host=deltaiii database=gis
lars@geovm1:~/Downloads$ psql -U lars -h deltaiii gis
psql (9.4.6, server 9.5.1)
Type "help" for help.
gis=# \dt
List of relations
Schema | Name | Type | Owner
--------+--------------------+-------+-------
public | nodes | table | lars
public | relation_members | table | lars
public | relations | table | lars
public | schema_info | table | lars
public | users | table | lars
public | way_nodes | table | lars
public | ways | table | lars
public | spatial_ref_sys | table | lars
...
(13 rows)
gis=# select count(*) from nodes;
count
--------
503297
(1 row)
osm2pgsql
Creates database tables and geometries (LineStrings, Polygons, etc) with attribute columns from OSM tags.
Info and installation instructions at:
http://wiki.openstreetmap.org/wiki/Osm2pgsql
Install tl;dr
*Binaries for Windows exist, but are generally out of date
For the latest version, don't be afraid to compile yourself:
sudo apt-get install make cmake g++ libboost-dev libboost-system-dev \
libboost-filesystem-dev libexpat1-dev zlib1g-dev \
libbz2-dev libpq-dev libgeos-dev libgeos++-dev libproj-dev lua5.2 \
liblua5.2-dev
git clone https://github.com/openstreetmap/osm2pgsql.git
cd osm2pgsql
mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=$HOME
make
make install
More info in the osm2psql readme on Github.
Import example
osm2pgsql -c --prefix=boston -d gis -U lars -H localhost BostonArea.osm
Command line explained:
- -c Create new tables
- --prefix= Prefix for table names
- -d Name of the database
- -U Database user name
- -H Database host
osm2pgsql -h
osm2pgsql -h -v
After import
lars@geovm1:~$ psql -U lars -h localhost gis
psql (9.3.11)
Type "help" for help.
gis=> \dt
List of relations
Schema | Name | Type | Owner
-------+-----------------------------------+-------+-------
public | boston_line | table | lars
public | boston_points | table | lars
public | boston_polygon | table | lars
public | boston_roads | table | lars
...
public | spatial_ref_sys | table | lars
(23 rows)
gis=>\d boston_line
Table "public.boston_line"
Column | Type | Modifiers
--------------------+-----------------------------+-----------
osm_id | bigint |
access | text |
addr:housename | text |
addr:housenumber | text |
addr:interpolation | text |
admin_level | text |
aerialway | text |
aerialway_class | text |
aeroway | text |
amenity | text |
area | text |
barrier | text |
...
z_order | integer |
way_area | real |
way | geometry(LineString,900913) |
Indexes:
"boston_line_index" gist (way) WITH (fillfactor='100')
gis=>
What data/tags get imported?
Style file determines:
- What features are imported.
- What OSM tags become columns in the database tables.
- What ways become polygons (rather than circular linestrings).
Generally: One column per tag.
If you're missing a tag, copy the default style, modify it and use the --style option (or use hstore (see below)).
osm2pgsql -c --prefix=boston --style=MyExtraDetailed.style -d gis -U lars -H localhost BostonArea.osm
Default style file is well commented
Other useful osm2pgsql options
- -a Append to existing tables, instead of creating new.
- --slim Reduce memory usage (slower; needed for very large imports).
- --cache=N Use N MB of RAM for cache. For performance tuning.
- --hstore Import all tags into an hstore column.
- --latlong Use Lat/Lon (4326) instead of web mercator (3857/900913).
- --keep-coastlines Coastlines are imported (special case in OSM).
- --tag-transform-script=FILE Use a lua script to pre-process tags.
Coastline
The coastline is a special case in OSM because of its length.
Import with the --keep-coastlines option, or (better) use
coastline/land/ocean shapefiles from openstreetmapdata.com.
Using hstore
osm2pgsql -c --prefix=boston --hstore \
-d gis -U lars -H localhost BostonArea.osm
lars@geovm1:~$ psql -U lars -h localhost gis
psql (9.3.11)
Type "help" for help.
gis=>\d boston_line
Table "public.boston_line"
Column | Type | Modifiers
--------------------+-----------------------------+-----------
osm_id | bigint |
access | text |
addr:housename | text |
addr:housenumber | text |
addr:interpolation | text |
admin_level | text |
...
z_order | integer |
way_area | real |
tags | hstore |
way | geometry(LineString,900913) |
Indexes:
"boston_line_index" gist (way) WITH (fillfactor='100')
gis=>
Example query using hstore tags:
Select all polygons - and their centerpoint lon/lat - that have an ICAO code:
gis=> SELECT osm_id, ST_AsText(ST_Transform(ST_Centroid(way), 4326)),
gis-> (tags->'iata') AS iata, name
gis-> FROM boston_polygon WHERE (tags->'iata') IS NOT NULL;
osm_id | st_astext | iata | name
-----------+---------------------------+------+----------------------------
79839974 | POINT(-71.17363 42.18940) | OWD | Norwood Memorial Airport
95408743 | POINT(-71.28741 42.46888) | BED | Laurence G. Hanscom Field
291828579 | POINT(-71.01184 42.36406) | BOS | Logan International Airport
(3 rows)
gis=>
Using lua scripts
Supply a lua script to pre-process tags using
the command line.
Example from my
Hikemap project:
Indexes
Key to performance!
osm2pgsql creates a GIST spatial index on the geometry (way) column by default.
If you filter common queries based on other columns, add spatial indexes
that includes those columns as well. For example:
CREATE INDEX ON boston_point USING gist(way) WHERE "natural" = 'peak';
Speeds up this type of query by orders of magnitude:
SELECT way, name, ... FROM boston_point
WHERE "natural" = 'peak'
AND way && ST_MakeEnvelope(-72.0, 42.0, -71.0, 43.0);
Good overview on spatial indexes:
http://revenant.ca/www/postgis/workshop/indexing.html
QGIS
Example: Select all public transit stations/stops within a mile of Mapkin HQ:
WITH here AS (SELECT ST_Transform(ST_SetSRID(
ST_MakePoint(-71.05940, 42.35348), 4326), 900913) AS p)
SELECT osm_id, name, ST_Distance(way, here.p) * 3.28 AS distance
FROM here, boston_point
WHERE public_transport IN ('station', 'stop_position')
AND ST_DWithin(way, here.p, 1609) ORDER BY distance;
(cast geometry to geography for the most accurate result)
Simplify usage with a view:
CREATE VIEW nearby_stations AS
(WITH here AS (SELECT ST_Transform(ST_SetSRID(
ST_MakePoint(-71.05940, 42.35348), 4326), 900913) AS p)
SELECT way, osm_id, name, ST_Distance(way, here.p) * 3.28 AS distance
FROM here, boston_point
WHERE public_transport IN ('station', 'stop_position')
AND ST_DWithin(way, here.p, 1609));
Mapnik
Mapnik XML example
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE Map>
<Map srs="+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0
+lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over">
<Style name="BostonBuildings">
<Rule> ... </Rule>
</Style>
...
<Layer name="BostonBuildings" status="on"
srs="+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0
+lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs +over">
<StyleName>BostonBuildings</StyleName>
<Datasource>
<Parameter name="type">postgis</Parameter>
<Parameter name="password">****</Parameter>
<Parameter name="host">localhost</Parameter>
<Parameter name="port">5432</Parameter>
<Parameter name="user">lars</Parameter>
<Parameter name="dbname">gis</Parameter>
<Parameter name="table">
(SELECT way, building, name,
(tags->'height') AS height,
(tags->'building:levels') AS levels
FROM boston_polygon
WHERE building IS NOT NULL
) AS buildings
</Parameter>
<Parameter name="estimate_extent">false</Parameter>
<Parameter name="extent">-20037508,-19929239,20037508,19929239</Parameter>
</Datasource>
</Layer>
</Map>
Other links:
Good site for resources:
switch2osm.org/loading-osm-data
(not exclusively for PostGIS, but has lots of useful tips for setting up rendering with OSM and PostGIS)
Lars Ahlzen
lars@ahlzen.com
OpenStreetMap Boston Meetup