Skip to content
Snippets Groups Projects
02-data_acquisition.qmd 10.2 KiB
Newer Older
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
---
bibliography: references.bib
---
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
# Data Acquisition

## Online databases

## OpenStreetMap

::: {style="float: right"}
<img src="img/Openstreetmap_logo.svg" width="150px" padding="1em"/>
:::

Lea's avatar
Lea committed
[OpenStreetMap (OSM)](https://www.openstreetmap.org){target="_blank"} is a participatory mapping project that aims to built a free geographic database on a global scale. OpenStreetMap lets you view, edit and use geographic data around the world.
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed

**Terms of use**

> OpenStreetMap is open data : you are free to use it for ant purpose as long as you credit OpenStreetMap and its contributers. If you modify or rely data in any way, you may distribute the result only under the same license. (...)

**Contributors**

> (...) Our contributors incloude enthusiastic mapmakers, GIS professional, engineers running OSM servers, humanitarians mapping disaster-stricken areas and many mmore.(...)

### Display and interactive map

The two main packages that allow to display as interactive map based on OSM are `leaflet` [@leaflet] and `mapview` [@mapview].

#### `leaflet`

<img src="img/logo_leaflet.png" align="right" width="200"/> `leaflet` uses the javascript library Leaflet [@JS-Leaflet] to create interactive maps.

```{r leaflet, eval=TRUE, cache = FALSE, class.output="code-out", warning=FALSE, message=FALSE}
library(sf)
library(leaflet)

district <- st_read("data_cambodia/cambodia.gpkg", layer = "district", quiet = TRUE)
hospital <- st_read("data_cambodia/cambodia.gpkg", layer = "hospital", quiet = TRUE)


banan <- district[district$ADM2_PCODE == "KH0201", ]     #Select one district (Banan district: KH0201)
health_banan <- hospital[hospital$DCODE == "201", ]      #Select Health centers in Banan

banan <- st_transform(banan, 4326)                       #Transform coordinate system to WGS84
health_banan <- st_transform(health_banan, 4326)

banan_map <- leaflet(banan) %>%                          #Create interactive map
  addTiles() %>%
  addPolygons() %>%
  addMarkers(data = health_banan)
banan_map
```

::: {.callout-note appearance="minimal" icon="false"}
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
Website of `leaflet`\
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
[Leaflet for R](https://rstudio.github.io/leaflet/)
:::

#### `mapview`

<img src="img/logo_mapview.gif" align="right" width="150"/> `mapview` relies on `leaflet` to create interactive maps, its use is easier and its documentation is a bit dense.

```{r mapview, cache=FALSE, class.output="code-out", warning=FALSE, message=FALSE}
library(mapview)
mapview(banan) + mapview(health_banan)
```

::: {.callout-note appearance="minimal" icon="false"}
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
Website of `mapview`\
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
[`mapview`](https://r-spatial.github.io/mapview/)
:::

### Import basemaps

The package `maptiles` [@maptiles] allows downlaoding and displaying raster basemaps.\
The function `get_tiles()` allow you to download OSM background maps and the function `plot_tiles()` allows to display them.\
Renders are better if the input data used the same coordinate system as the tiles ([EPSG:3857](https://github.com/riatelab/maptiles#projection)).

```{r display_point, fig.width=6, fig.height=6, eval=TRUE, class.output="code-out", warning=FALSE, message=FALSE}
library(sf)
library(maptiles)
district <- st_read("data_cambodia/cambodia.gpkg", layer = "district", quiet = TRUE)
district <- st_transform(district, 3857)
osm_tiles <- get_tiles(x = district, zoom = 10, crop = TRUE)
plot_tiles(osm_tiles)
plot(st_geometry(district), border = "grey20", lwd = .7, add = TRUE)
mtext(side = 1, line = -2, text = get_credit("OpenStreetMap"), col="tomato")
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
```

### Import OSM data

#### `osmdata`

<img src="img/logo_osmdata.png" align="right" height="140" width="122/"/> The package `osmdata` [@osmdata] allows extracting vector data from OSM using the [Overpass turbo](https://wiki.openstreetmap.org/wiki/Overpass_turbo) API.

```{r osmdata, eval=TRUE, class.output="code-out", warning=FALSE, message=FALSE}
library(sf)
library(osmdata)
library(sf)

country <- st_read("data_cambodia/cambodia.gpkg", layer = "country", quiet = TRUE)
ext <- opq(bbox = st_bbox(st_transform(country, 4326)))                    #Define the bounding box
query <- add_osm_feature(opq = ext, key = 'amenity', value = "hospital")   #Health Center Extraction
hospital <- osmdata_sf(query)
hospital <- unique_osmdata(hospital)                                       #Result reduction (points composing polygon are detected)
```

lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
The result contains a point layer and a polygon layer. The polygon layer contains polygons that represent hospitals. To obtain a coherent point layer we can use the centroids of the polygons.
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed

```{r osmdata2, class.output="code-out", warning=FALSE, message=FALSE}
hospital_point <- hospital$osm_points
hospital_poly <- hospital$osm_polygons                                                             #Extracting centroids of polygons
hospital_poly_centroid <- st_centroid(hospital_poly)

cambodia_point <- intersect(names(hospital_point), names(hospital_poly_centroid))                  #Identify fields in Cambodia boundary
hospitals <- rbind(hospital_point[, cambodia_point], hospital_poly_centroid[, cambodia_point])     #Gather the 2 objects
```

**Result display**

```{r osmdatadisplay, cache=FALSE, class.output="code-out", warning=FALSE, message=FALSE}
library(mapview)
mapview(country) + mapview(hospitals)
```

::: {.callout-note appearance="minimal" icon="false"}
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
Website of `osmdata`\
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
[`osmdata`](https://docs.ropensci.org/osmdata/)
:::

#### `osmextract`

<img src="img/logo_osmextract.svg" align="right" width="120"/> The package `osmextract` [@osmextract] allows to extract data from an OSM database directly. This package make it possible to work on very large volumes of data.

::: {.callout-note appearance="minimal" icon="false"}
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
Website of `osmextract`\
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
[`osmextract`](https://docs.ropensci.org/osmextract/)
:::

For administrative boundaries, check [here](https://wiki.openstreetmap.org/wiki/Tag:boundary%3Dadministrative#10_admin_level_values_for_specific_countries) the administrative levels by country:

```{r, class.output="code-out", warning=FALSE, message=FALSE}
library(osmextract)
library(mapsf)
province <- oe_get(
  place = "Cambodia",
  download_directory = "data_cambodia/",
  layer = "multipolygons",
  extra_tags = c("wikidata", "ISO3166-2", "wikipedia", "name:en"),
  vectortranslate_options = c(
    "-t_srs", "EPSG:32648",
    "-nlt", "PROMOTE_TO_MULTI",
    "-where", "type = 'boundary' AND boundary = 'administrative' AND admin_level = '4'"
  ))

mf_map(x = province)
```

```{r, class.output="code-out", warning=FALSE, message=FALSE}
roads <- oe_get(
  place = "Cambodia",
  download_directory = "data_cambodia/",
  layer = "lines",
  extra_tags = c("access", "service", "maxspeed"),
  vectortranslate_options = c(
  "-t_srs", "EPSG:32648",
  "-nlt", "PROMOTE_TO_MULTI",
  "-where", "
    highway IS NOT NULL
    AND
    highway NOT IN (
    'abandonded', 'bus_guideway', 'byway', 'construction', 'corridor', 'elevator',
    'fixme', 'escalator', 'gallop', 'historic', 'no', 'planned', 'platform',
    'proposed', 'cycleway', 'pedestrian', 'bridleway', 'footway',
    'steps', 'path', 'raceway', 'road', 'service', 'track'
    )
    "
),
  boundary = subset(province, name_en == "Phnom Penh"),
  boundary_type = "clipsrc"
)
mf_map(x = roads)
```

lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
## Import from lat / long file

The function `st_as_sf()` makes it possible to transform a `data.frame` container of geographic coordinates into an object `sf`. Here we use the `data.frame` `places2` created in the previous point.

```{r build_sf, eval = TRUE, echo = TRUE, nm = TRUE, fig.width=8, class.output="code-out", warning=FALSE, message=FALSE}
library(sf)
place_sf <- st_as_sf(read.csv("data_cambodia/adress.csv"), coords = c("long", "lat"), crs = 4326)
place_sf
```

```{r echo=FALSE, eval=TRUE}
sf_use_s2(FALSE)
```

To create a `sf` POINT type object with only one pair of coordinate (WGS84, longitude=0.5, latitude = 45.5) :

```{r build_sf2, eval = TRUE, echo = TRUE, nm = TRUE, fig.width=8, class.output="code-out", warning=FALSE, message=FALSE}
library(sf)
test_point <- st_as_sf(data.frame(x = 0.5, y = 45.5), coords = c("x", "y"), crs = 4326)
test_point
```

We can display this object `sf` on an [OpenStreetMap](https://www.openstreetmap.org/) basesmap with the package maptiles `maptiles` [@maptiles].

```{r display_sf, eval = TRUE, echo = TRUE, nm = TRUE, out.width="788px", warning=FALSE, message=FALSE}
library(maptiles)
osm <- get_tiles(x = place_sf, zoom = 12)
plot_tiles(osm)
plot(st_geometry(place_sf), pch = 2, cex = 2, col = "red", add = TRUE)
```

lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
## Geocoding

Serveral pakages alow you to geocode addresses. <img src="img/logo_tidygeocoder.png" align="right" width="120"/> The package `tidygeocoder` [@tidygeocoder] allow the use of a [large number of online geocoding sevices](https://jessecambon.github.io/tidygeocoder/articles/geocoder_services.html). The package `banR` [@banR], which is based on the [National Address Base](https://adresse.data.gouv.fr/), is the particularly suitable for geocoding addresses in France.

### `tidygeocoder`

```{r tidygeocoder, eval=TRUE, class.output="code-out", warning=FALSE, message=FALSE}
library(tidygeocoder)
test_adresses <- data.frame(
  address = c("Phnom Penh International Airport, Phnom Penh, Cambodia",
              "Khmer Soviet Friendship Hospital, Phnom Penh, Cambodia"))
places1 <- geocode(test_adresses, address)
places1
```

::: {.callout-note appearance="minimal" icon="false"}
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
Website by `tidygeocoder` :\
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
[tidygeocoder](https://jessecambon.github.io/tidygeocoder/)
:::

### `banR` (Base Adresse Nationale)

```{r banR, eval=TRUE, class.output="code-out", warning=FALSE, message=FALSE}
# remotes::install_github("joelgombin/banR")
library(banR)
mes_adresses <- data.frame(
  address = c("19 rue Michel Bakounine, 29600 Morlaix, France",
              "2 Allee Emile Pouget, 920128 Boulogne-Billancourt")
)
places2 <- geocode_tbl(tbl = mes_adresses, adresse = address)
places2
```

::: {.callout-note appearance="minimal" icon="false"}
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
Website of `banR` :\
lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
[An R client for the BAN API](http://joelgombin.github.io/banR/)
:::

## Digitization

The package `mapedit` [@mapedit] allows you to digitize base map directly in R. Although it can be practical in some cases, in package cannot replace the functionalities of a GIS for important digitization tasks.

lucas.longour_ird.fr's avatar
lucas.longour_ird.fr committed
![<small>Gif taken from [`mapedit` website](https://r-spatial.org/r/2019/03/31/mapedit_leafpm.html)</small>](img/mapedit-leafpm-1.gif)