Mathare Building import

From Map Kibera

(Difference between revisions)
Jump to: navigation, search
m (Processing the data)
m (Processing the data)
Line 243: Line 243:
         'AAAS building extraction from satellite imagery "MM128982" for Mathare 2011' as source,
         'AAAS building extraction from satellite imagery "MM128982" for Mathare 2011' as source,
         g.the_geom
         g.the_geom
-
  FROM groundtruthed84 as g;
+
  FROM "groundtruthed-wgs84" as g;
== Find collisions between OSM and the import candidate ==
== Find collisions between OSM and the import candidate ==

Revision as of 12:46, 18 May 2011

In this page, I (Sébastien) will describe how to import the data of the building extraction of Mathare that we've got from AAAS. See Primož's blog for the initial analysis.


Contents

Changing the SRS

The data was made available as shapefiles in WGS84 Web Mercator Auxiliary Sphere projection, which is known as Web Mercator, spherical Mercator, Google Mercator and is referenced by various ids: EPSG:900913, EPSG:3785 (obsolete), EPSG:3857, ESRI WKID 102100... In QGIS, it's known as Google Mercator, EPSG:900913, ID 3644.

Before working with the data, we need to change the Spatial Reference System (SRS) to be compatible with OSM, that is moving to non-projected WGS84.
In QGIS: menu Vector -> Data Management Tools -> Export to new projection. Then choose the output CRS to be WGS84, EPSG:4326, ID 3452.
The two files that we're going to import are GroundTruthed-wgs84.shp and BuildingsOnTop-wgs84.shp

Setup PostGIS database

PostGIS is relatively easy to install on Ubuntu. I might describe it further later. Anyway, the current environment is PostgreSQL 8.4, PostGIS 1.5.
I found this script from GeoDjango very convenient to create a postgis db template:

$ ./create_template_postgis-debian.sh
$ createdb -T template_postgis mathare

Import the shapefiles to the db

Each shapefile will be imported a separate table with shp2pgsql:

shp2pgsql -s 4326 -c -g the_geom -I -S -W UTF-8 -N skip Groundtruthed-wgs84.shp | psql mathare
shp2pgsql -s 4326 -c -g the_geom -I -S -W UTF-8 -N skip BuildingsOnTop-wgs84.shp | psql mathare

Import OSM data to the db

In this case, we're only interested in buildings from OSM.

Get fresh data from OSM, keep only ways tagged with 'building' and their associated ways:

osmosis \
--read-api left=36.8425655 right=36.881876 top=-1.2480512 bottom=-1.2696752999999998 outPipe.0=1 \
--way-key keyList=building inPipe.0=1 outPipe.0=2 \
--used-node inPipe.0=2 outPipe.0=3 \
--write-xml file=mathare_building.osm inPipe.0=3

Import it to the database:

osm2pgsql -c -d mathare -l -s mathare_building.osm

The following tables will be created:

  • planet_osm_line
  • planet_osm_nodes
  • planet_osm_point
  • planet_osm_polygon
  • planet_osm_rels
  • planet_osm_roads
  • planet_osm_ways

Tagging scheme

Every geometry imported will be tagged with:

 source=AAAS_satellite_extraction_for_Mathare_2011
 

Shapefile column OSM tags
id ignored, since not used in shapefiles
designated see table below
type=* building:type=*
name=* name=*
Designated
Shapefile
Designated=*
OSM tags
Toilet building=yes
amenity=toilet
NGO/CBO building=office
office=ngo
Administration building=office
office=government
CBO building=office
office=ngo
Self-Help Group
building=yes
note=Self-Help Group
fixme=Find an appriopriate tag for "Self help group"
Medical Clinic building=yes
amenity=hospital
health_facility:type=medical_clinic
Business building=commercial
Petrol Station building=yes
amenity=fuel
motor_vehicle=yes
House building=residential
Apartment Building building=apartments
Hardware Shop building=commercial
shop=hardware
Church building=church
amenity=place_of_worship
religion=christian
Bar building=yes
amenity=bar
Bathroom building=yes
amenity=shower
Private Toilet building=yes
amenity=toilet
access=private
fixme=mathare preset scheme
Garage building=garage
Public Toilet building=yes
amenity=toilet
access=public
Market Place

building=yes
amenity=market_place

Cyber Cafe building=yes
amenity=cyber_cafe


Keys:

  • access
  • amenity
  • building
  • building:type
  • fixme
  • health_facility:type
  • motor_vehicle
  • name
  • note
  • office
  • religion
  • shop
  • source

Preparing the database for the new data

DROP TABLE IF EXISTS mathare_building_import;

CREATE TABLE mathare_building_import
(
    shp_gid         integer,
    osm_id          integer,
    layer           integer,
    "access"        text,
    "amenity"       text,
    "building"      text,
    "building:type" text,
    "fixme"         text,
    "health_facility:type" text,
    "motor_vehicle" text,
    "name"          text,
    "note"          text,
    "office"        text,
    "religion"      text,
    "shop"          text,
    "source"        text,
    the_geom        geometry,
    CONSTRAINT enforce_dims_the_geom CHECK (st_ndims(the_geom) = 2),
    CONSTRAINT enforce_geotype_the_geom CHECK (geometrytype(the_geom) = 'POLYGON'::text OR the_geom IS NULL),
    CONSTRAINT enforce_srid_the_geom CHECK (st_srid(the_geom) = 4326)
) WITH ( OIDS=FALSE );

CREATE INDEX build_the_geom_gist ON mathare_building_import USING gist(the_geom);

Processing the data

INSERT INTO mathare_building_import (
   shp_gid,
   "building",
   "building:type",
   "access",
   "amenity",
   "fixme",
   "health_facility:type",
   "motor_vehicle",
   "name",
   "note",
   "office",
   "religion",
   "shop",
   "source",
   the_geom
) 
SELECT g.gid,
       ( CASE WHEN g.designated IN ('NGO/CBO', 'Administration', 'CBO') THEN 'office'
              WHEN g.designated IN ('Business', 'Hardware Shop' ) THEN 'commercial'
              WHEN g.designated = 'House' THEN 'residential'
              WHEN g.designated = 'Apartment Building' THEN 'apartments'
              WHEN g.designated = 'Church' THEN 'church'
              WHEN g.designated = 'Garage' THEN 'garage'
              ELSE 'yes'
         END
       ) as building,
       lower(replace(g.type, ' ', '_')) as building_type,
       ( CASE WHEN g.designated = 'Public Toilet' THEN 'public'
              WHEN g.designated = 'Private Toilet' THEN 'private'
              ELSE NULL
         END
       ) as access,
       ( CASE WHEN g.designated IN ('Toilet', 'Private Toilet', 'Public Toilet') THEN 'toilet'
              WHEN g.designated = 'Petrol Station' THEN 'fuel'
              WHEN g.designated = 'Medical Clinic' THEN 'hospital'
              WHEN g.designated = 'Church' THEN 'place_of_worship'
              WHEN g.designated = 'Bar' THEN 'bar'
              WHEN g.designated = 'Bathroom' THEN 'shower'
              WHEN g.designated = 'Cyber Cafe' THEN 'cyber_cafe'
              ELSE NULL
          END
        ) as amenity,
        ( CASE WHEN g.designated = 'Self-Help Group' THEN 'Find an appriopriate tag for "Self help group"'
               WHEN g.designated IN ('Public Toilet', 'Private Toilet', 'Toilet', 'Bathroom') THEN 'MapMathare WATSAN-preset scheme missing'
               ELSE NULL
          END
        ) as fixme,
        ( CASE WHEN g.designated = 'Medical Clinic' THEN 'medical_clinic' ELSE NULL END ) as health,
        ( CASE WHEN g.designated = 'Petrol Station' THEN 'yes' ELSE NULL END ) as motor_vehicle,
        g.name as name,
        ( CASE WHEN g.designated = 'Self-Help Group' THEN 'Self-Help Group' ELSE NULL END ) as note,
        ( CASE WHEN g.designated = 'Administration' THEN 'government'
               WHEN g.designated in ( 'NGO/CBO', 'CBO') THEN 'ngo'
               ELSE NULL
          END
        ) as office,
        ( CASE WHEN g.designated = 'Church' THEN 'christian' ELSE NULL END ) as religion,
        ( CASE WHEN g.designated = 'Hardware Shop' THEN 'hardware' ELSE NULL END) as shop,
        'AAAS building extraction from satellite imagery "MM128982" for Mathare 2011' as source,
        g.the_geom
FROM "groundtruthed-wgs84" as g;

Find collisions between OSM and the import candidate

Very few buildings have already been mapped in OSM in Mathare (139 as of 2011.05.09). Yet, we should be careful whilst importing the dataset from aerial imagery extraction. While we can discard the geometry of the existing buildings, we shall preserve their attributes.

Basically, we shall check for geometries that overlaps and apply the attributes of the OSM element to the imported one.

Note: I can see two tricky corner cases, which I don't need to worry about right now since this import does not feature them. However, overlooking these issues in another import will likely result in a loss of data:

  • nodes of the buildings bear attributes
  • multipolygons

We can find the overlapping elements with this query:

mathare#= select g.gid, o.osm_id, from "groundtruthed-wgs84" as g, planet_osm_polygon as o where ST_Intersects(g.the_geom, o.way) ORDER BY g.gid;

This script removes the columns, which are empty:

#!/usr/bin/python

import psycopg2

conn = psycopg2.connect("dbname=mathare")
cur= conn.cursor()
cur2 = conn.cursor()
cur3 = conn.cursor()
cur.execute("select attname from pg_attribute, pg_type where typname='planet_osm_polygon' and typrelid=attrelid ;")
for record in cur:
	column = record[0]
	print column
	query = cur2.mogrify("SELECT count(*) from planet_osm_polygon WHERE \"" + column + "\" IS NOT NULL" )
 	cur2.execute(query)

	for c in cur2:
		if c[0] == 0L:
			query = "ALTER TABLE planet_osm_polygon DROP COLUMN \"" + column + "\" CASCADE;"
			print query
			cur3.execute(query)
			print repr(cur3.statusmessage)
conn.commit()
cur.close()
cur2.close()
cur3.close()
conn.close()


It turns out only one of the pre existing buildings that overlap carries information that should be kept:

mathare=# SELECT DISTINCT m.shp_gid, o.osm_id, o.name, o.amenity, o.religion 
          FROM planet_osm_polygon AS o, mathare_building_import AS m
          WHERE (o.name IS NOT NULL OR o.religion IS NOT NULL or o.amenity IS NOT NULL)
          AND ST_Intersects(m.the_geom, o.way);
 shp_gid |  osm_id  |              name              |     amenity      | religion  
---------+----------+--------------------------------+------------------+-----------
     517 | 96051625 | St. John's & St. Paul's Church | place_of_worship | christian
(1 row)

Buildings on top of other

Generate the OSM data to be imported

Personal tools