osmenrich is an R package to easily enrich geocoded data (latitude/longitude) with geographic features from OpenStreetMap (OSM). This package is designed to work with the sf and osmdata packages. It builds on the work provided in
sf for the manipulation of simple features (i.e. real-world objects), and
osmdata for querying OpenStreetMap data (i.e. geographical data).
Often a user is interested in retrieving information about the location and the closeness of real-world objects. If the objects in a dataset have geocoded data (latitude/longitude), then this package enables the user to interact with these objects and enrich them with information about other objects around them. We call the objects in the dataset "reference points", and the objects we are interested in retrieving "feature points".
Therefore, if a dataset contains geocoded data, one can use this package to extract information about the real-world objects around each of the objects in the data, compute their distance/duration from those objects, and then enrich the dataset with this information. The result is a tidy
To do this, the package needs to connect to a server containing OpenStreetMap data and one (or more) servers containing routing engines - used to compute durations and distances.
If you already have the remotes package installed, then due to recent changes in GitHub's naming of branches, please make sure you have the latest version of remotes or at least version
Once you have done this, continue the installation of the osmenrich package by running:
osmenrich can be installed with the remotes package from GitHub with
and then load it in the usual way:
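The installation steps above can be sketched as follows. The GitHub path `sodascience/osmenrich` and the `main` branch are assumptions here, since the original commands are not reproduced in this section; adjust them to the repository you actually install from.

```r
# Install remotes first if it is not available yet
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

# Assumption: the package is hosted at sodascience/osmenrich, branch "main"
remotes::install_github("sodascience/osmenrich@main")

# Load the package in the usual way
library(osmenrich)
```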
Out of the box, osmenrich uses public remote servers to retrieve OSM data and to compute distances/durations from reference points to feature points.
As stated above, osmenrich makes use of an OSM server and one or more OSRM servers to retrieve OSM data (feature points) and to calculate metrics such as distances and durations. The OSM feature points available can be found by:
1. Visiting the OSM wiki: https://wiki.openstreetmap.org/wiki/Map_features.
2. Loading the osmdata package (library(osmdata)) and calling the function
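As a minimal sketch of the second option: osmdata exports `available_features()`, which lists the OSM keys, and `available_tags()`, which lists the values documented for a single key. Both calls look up the OSM wiki, so they need an internet connection, and the exact shape of the returned object may differ between osmdata versions.

```r
library(osmdata)

# List the OSM keys (features) that can be queried, e.g. "amenity"
features <- available_features()
head(features)

# List the values (tags) documented for a single key
amenity_tags <- available_tags("amenity")
head(amenity_tags)
```

A key/value pair found this way (for example `amenity` and `waste_basket`) is what the enrichment examples in this document pass as `key` and `value`.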
The basic data enrichment will work without having to set up any of these servers locally, thanks to publicly available servers. However, a local setup of one or more of these servers is required for large data enrichment tasks, for tasks that compute durations between reference points and feature points, and for tasks that compute custom distances or durations between these points (such as walking or cycling distances).
We created a GitHub repository hosting the instructions and the docker_compose.yml files needed to set up these servers.
To help users choose the right setup for their needs, we provide some use cases and their respective recommended setups:
overpass (OSM) server. The OSRM connection will rely on public servers (only car distances available!)
docker_compose.yml to set up both the
overpass (OSM) and all three
Let's enrich a spatial (sf) dataset (sf_example) with the number of waste baskets within a radius of 500 meters of each point in the dataset:
    # Import libraries
    library(tidyverse)
    library(sf)
    library(osmenrich)

    # Create an example dataset to enrich
    sf_example <- tribble(
      ~person, ~lat,  ~lon,
      "Alice", 52.12, 5.09,
      "Bob",   52.13, 5.08
    ) %>%
      sf::st_as_sf(
        coords = c("lon", "lat"),
        crs = 4326
      )

    # Print it
    sf_example
    #> Simple feature collection with 2 features and 1 field
    #> geometry type:  POINT
    #> dimension:      XY
    #> bbox:           xmin: 5.08 ymin: 52.12 xmax: 5.09 ymax: 52.13
    #> CRS:            EPSG:4326
    #> # A tibble: 2 x 2
    #>   person     geometry
    #> * <chr>   <POINT [°]>
    #> 1 Alice  (5.09 52.12)
    #> 2 Bob    (5.08 52.13)
To enrich the sf_example dataset with "waste baskets" in a 500 m radius, we create a query using the enrich_osm() function. This function uses the bounding box created by the points present in the example dataset and searches for the specified key = "amenity" and value = "waste_basket". We also add a custom name for the newly created column and specify the radius (r) used in the search.
    # Simple osmenrich query
    # (the column name matches the "waste_baskets" column in the output below)
    sf_example_enriched <- sf_example %>%
      enrich_osm(
        name = "waste_baskets",
        key = "amenity",
        value = "waste_basket",
        r = 500
      )
    #> Downloading data for waste_baskets... Done.
    #> Downloaded 147 points, 0 lines, 0 polygons, 0 mlines, 0 mpolygons.
    #> Computing distance matrix for waste_baskets...Done.

    sf_example_enriched
    #> Simple feature collection with 2 features and 2 fields
    #> geometry type:  POINT
    #> dimension:      XY
    #> bbox:           xmin: 5.08 ymin: 52.12 xmax: 5.09 ymax: 52.13
    #> geographic CRS: WGS 84
    #> # A tibble: 2 x 3
    #>   person     geometry waste_baskets
    #> * <chr>   <POINT [°]>         <int>
    #> 1 Alice  (5.09 52.12)            75
    #> 2 Bob    (5.08 52.13)             1
The waste_baskets column now holds, for Alice and for Bob, the number of waste baskets found within a 500-meter radius.
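To make the counting step concrete, here is a minimal sketch that reproduces the idea with sf alone. It is not how enrich_osm() is implemented: the three feature points are made-up coordinates, the column name `n_features` is hypothetical, and `sf::st_is_within_distance()` stands in for the package's own distance computation.

```r
library(sf)

# Reference points (same coordinates as sf_example)
ref <- st_as_sf(
  data.frame(
    person = c("Alice", "Bob"),
    lon = c(5.09, 5.08),
    lat = c(52.12, 52.13)
  ),
  coords = c("lon", "lat"), crs = 4326
)

# Hypothetical feature points: two close to Alice, one far from both
feat <- st_as_sf(
  data.frame(
    lon = c(5.091, 5.089, 5.20),
    lat = c(52.121, 52.119, 52.30)
  ),
  coords = c("lon", "lat"), crs = 4326
)

# For each reference point, find the feature points within 500 meters
within_r <- st_is_within_distance(ref, feat, dist = 500)

# Count the matches and attach them as a new column
ref$n_features <- lengths(within_r)
ref
```

With these made-up coordinates, Alice has two feature points within 500 meters and Bob has none, so the new column mirrors what the enriched column expresses for real OSM data.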
Using the example dataset sf_example from the previous example, we continue with a more advanced enrichment. Here, we use a number of additional arguments to refine our initial "waste_baskets" query. We add the following:
type = "points": we specify that we are interested only in retrieving points from OSM. In this example it makes no difference, but when querying other types of objects it can reduce the amount of data retrieved.
distance = "distance_by_foot": we are no longer interested in just the number of points in a certain area (given by the radius r); we now want to summarize the walking distances from each reference point to all the waste baskets within radius r.
kernel = "uniform": we can specify the kernel function used to summarize the features retrieved (in this example waste baskets). Kernels convert distance or duration vectors to single numbers, with a certain weight for certain distances. This package also supports the use of custom kernel functions.
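To illustrate what a kernel does, the sketch below defines two stand-alone functions that reduce a vector of distances to one number. The names `uniform_kernel_sketch` and `parabola_kernel_sketch`, their `(d, r)` signature, and the parabola formula are illustrative assumptions, not the package's own implementation or its required interface for custom kernels.

```r
# Uniform kernel: every feature within radius r gets weight 1 (a plain count)
uniform_kernel_sketch <- function(d, r) {
  sum(d <= r)
}

# Parabola-shaped kernel: weight is 1 at distance 0 and falls to 0 at distance r;
# features beyond r get weight 0
parabola_kernel_sketch <- function(d, r) {
  w <- 1 - (d / r)^2
  sum(pmax(w, 0))
}

d <- c(0, 50, 100, 250)  # distances in meters

uniform_kernel_sketch(d, r = 100)   # counts the three distances <= 100
parabola_kernel_sketch(d, r = 100)  # 1 + 0.75 + 0 + 0 = 1.75
```

The uniform version treats near and far features alike, while the parabola version lets nearby features contribute more, which is the "certain weight for certain distances" idea described above.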
In this example, we make use of a local instance of the OSRM server to query walking distances (distance = "distance_by_foot"). Follow the instructions in the section "osmenrich Docker repository" to set it up. Out of the box, this package supports querying only driving distances; if you are interested in distances or durations for other means of transportation, you will need to set up local OSRM instances.
    # Specify the address of local OSRM instance
    # options(osrm.server = "http://localhost:<port>/")
    options(osrm.server = "http://localhost:8080/")

    # You can specify also the address of the Overpass (OSM) instance
    # osmdata::set_overpass_url("http://localhost:<port>/api/interpreter")
    osmdata::set_overpass_url("http://localhost:8888/api/interpreter")

    # Advanced osmenrich query
    sf_example_advanced <- sf_example %>%
      enrich_osm(
        name = "waste_baskets",
        key = "amenity",
        value = "waste_basket",
        type = "points",
        distance = "distance_by_foot",
        kernel = "uniform",
        r = 100
      )

    sf_example_advanced
    #> Simple feature collection with 2 features and 4 fields
    #> geometry type:  POINT
    #> dimension:      XY
    #> bbox:           xmin: 5.08 ymin: 52.12 xmax: 5.09 ymax: 52.13
    #> CRS:            EPSG:4326
    #> # A tibble: 2 x 5
    #>   person    id   val     geometry waste_baskets
    #> * <chr>  <dbl> <int>  <POINT [°]>         <int>
    #> 1 Alice      1     5 (5.09 52.12)             1
    #> 2 Bob        2     2 (5.08 52.13)             0
For a more advanced example in which
osmenrich is put to use with other packages, please refer to this tutorial.