Skip to content
Snippets Groups Projects
03-vector_data.qmd 13.05 KiB

# Vector Data

## Importing and exporting data

The `st_read()` and `st_write()` function are used to import and export many types of files. The following lines import the administrative data in district level layer located in the **cambodia.gpkg** [geopackage](https://www.geopackage.org/) file.

```{r import, class.output="code-out", warning=FALSE, message=FALSE}
library(sf)

district = st_read("data_cambodia/cambodia.gpkg", layer = "district")   #import district data
```

The following lines export the **district object** to a **data** folder in geopackage and shapefile format.

```{r export, class.output="code-out", warning=FALSE, message=FALSE}

st_write(obj = district, dsn = "data_cambodia/district.gpkg", delete_layer = TRUE)
st_write(obj = district, "data_cambodia/district.shp", layer_options = "ENCODING=UTF-8", delete_layer = TRUE)

```

## Display

**Preview of the variables** via the function `head()` and `plot()`.

```{r aff1, nm = TRUE, class.output="code-out", warning=FALSE, message=FALSE}
head(district)
plot(district)
```

**for Geometry display** only.

```{r aff2, nm=TRUE, class.output="code-out", warning=FALSE, message=FALSE}
plot(st_geometry(district))
```

## Coordinate systems

### Look up the coordinate system of an object

The function `st_crs()` makes it possible to consult the system of coordinates used and object sf.

```{r proj1, proj1, class.output="code-out", warning=FALSE, message=FALSE}
st_crs(district)
```

### Changing the coordinate system of an object

The function `st_transform()` allows to change the coordinate system of an sf object, to re-project it.

```{r, proj2, sm=TRUE, class.output="code-out", warning=FALSE, message=FALSE}
plot(st_geometry(district))
title("WGS 84 / UTM zone 48N")
dist_reproj <- st_transform(district, "epsg:4326")
plot(st_geometry(dist_reproj))
title("WGS84")
```

The [Spatial Reference](http://spatialreference.org/){target="_blank"} site provides reference for a large number of coordinate systems.

## Selection by attributes

The object `sf` **are** `data.frame`, so you can select their rows and columns in the same way as `data.frame`.

```{r selectAttr, class.output="code-out", warning=FALSE, message=FALSE}
# row Selection
district[1:2, ]

district[district$ADM1_EN == "Phnom Penh", ]

# column selection
district[district$ADM1_EN == "Phnom Penh", 1:4] 
```

## Spatial selection

### Intersections

Selection of roads that are intersecting **dangkao** district

```{r intersects, nm=TRUE, class.output="code-out", warning=FALSE, message=FALSE}
road <- st_read("data_cambodia/cambodia.gpkg",  layer = "road", quiet = TRUE)
dangkao <-  district[district$ADM2_EN == "Dangkao", ]
inter <- st_intersects(x = road, y = dangkao, sparse = FALSE)
head(inter)
dim(inter)
```

The **inter** object is a matrix which indicates for each of element of the **road** object (6 elements) whether it intersects each elements the **dangkao** object (1 element). The dimension of the matrix is therefore indeed 6 rows \* 1 column. Note the use of the parameter `sparse = FALSE` here. It is then possible to create a column from this object:

```{r intersects2, nm =TRUE, class.output="code-out", warning=FALSE, message=FALSE}
road$intersect_dangkao <- inter
plot(st_geometry(dangkao), col = "lightblue")
plot(st_geometry(road), add = TRUE)
plot(st_geometry(road[road$intersect_dangkao, ]),
      col = "tomato", lwd = 1.5, add = TRUE)
```

#### Difference between `sparse = TRUE` and `sparse = FALSE`

```{r st_intersparse, echo =FALSE, eval = TRUE, class.output="code-out", warning=FALSE, message=FALSE}
library(sf)
set.seed(41)
x <- mapsf::mf_get_mtq()
grid <- st_make_grid(x = x, n = c(2,2))
grid <- st_sf(id = 1:4, geom = grid)
pt <- st_sample(grid, size = 8)
pt <- st_sf(id = letters[1:length(pt)], geom = pt)

library(mapsf)
mf_init(grid, expandBB = c(0, 0, 0, .3))
mf_map(grid, add = T)
mf_label(grid, "id", cex = 2, pos = 3)
mf_map(pt, add = T, pch = 4, cex = 1.2, lwd = 2, col = "red")
mf_label(pt, "id", cex = 1, col = "red", font = 3, pos = 2, overlap = F)
legend(x = "topright", 
       col = c("black", NA, "red"), 
       legend = 
         c("    grid  ", NA, "    pt  "), 
       pch = c(22,NA,4), 
       pt.bg = "grey", 
       pt.cex = c(6,NA, 1.2), 
       bty = "n")
mf_title("st_intersects()", tab = FALSE, pos = "center")
```

-   `sparse = TRUE`

```{r st_intersparse2, class.output="code-out", warning=FALSE, message=FALSE}
inter <- st_intersects(x = grid, y = pt, sparse = TRUE)
inter
```

-   `sparse = FALSE`

```{r st_intersparse3, class.output="code-out", warning=FALSE, message=FALSE}
inter <- st_intersects(x = grid, y = pt, sparse = FALSE)
rownames(inter) <- grid$id
colnames(inter) <- pt$id
inter
```

### Contains / Within

Selection of roads contained in the municipality of *Dangkao*. The function `st_within()` works like the function `st_intersects()`

```{r within, nm = TRUE, class.output="code-out", warning=FALSE, message=FALSE}
road$within_dangkao <- st_within(road, dangkao, sparse = FALSE)
plot(st_geometry(dangkao), col = "lightblue")
plot(st_geometry(road), add = TRUE)
plot(st_geometry(road[road$within_dangkao, ]), col = "tomato",
     lwd = 2, add = TRUE)

```

## Operation of geometries

### Extract centroids

```{r centroid, nm = TRUE, class.output="code-out", warning=FALSE, message=FALSE}
dist_c <- st_centroid(district)
plot(st_geometry(district))
plot(st_geometry(dist_c), add = TRUE, cex = 1.2, col = "red", pch = 20)
```

### Aggregate polygons

```{r aggreg, nm = TRUE, class.output="code-out", warning=FALSE, message=FALSE}
cambodia_dist <- st_union(district)                 
plot(st_geometry(district), col = "lightblue")
plot(st_geometry(cambodia_dist), add = TRUE, lwd = 2, border = "red")
```

### Aggregate polygons based on a variable

```{r aggreg2, nm=TRUE, class.output="code-out", warning=FALSE, message=FALSE}
dist_union  <- aggregate(x = district[,c("T_POP")],
                   by = list(STATUT = district$Status),
                   FUN = "sum")
plot(dist_union)
```

### Create a buffer zone

```{r buffers, nm=TRUE, class.output="code-out", warning=FALSE, message=FALSE}
dangkao_buffer <- st_buffer(x = dangkao, dist = 1000)
plot(st_geometry(dangkao_buffer), col = "#E8DAEF", lwd=2, border = "#6C3483")
plot(st_geometry(dangkao), add = TRUE, lwd = 2)
```

### Making an intersection

By using the function `st_intersection()` we will cut one layer by another.

```{r intersect2, nm=TRUE, class.output="code-out", warning=FALSE, message=FALSE}
library(magrittr)
# creation of a buffer zone around the centroid of the municipality of Dangkao district
# using the pipe
zone <- st_geometry(dangkao) %>%
  st_centroid() %>%
  st_buffer(30000)
plot(st_geometry(district))
plot(zone, border = "#F06292", lwd = 2, add = TRUE)
dist_z <- st_intersection(x = district, y = zone)
plot(st_geometry(district))
plot(st_geometry(dist_z), col="#AF7AC5", border="#F9E79F", add=T)
plot(st_geometry(dist_z))
```
### Create regular grid

The function `st_make_grid()` allows you to create regular grid. The function produce and object `sfc`, you must then use the function `st_sf()` to transform the object `sfc` into and object `sf`. During this transformation we add here a column of unique identifiers.

```{r grid, nm=TRUE, class.output="code-out", warning=FALSE, message=FALSE}
grid <- st_make_grid(x = district, cellsize = 10000)
grid <- st_sf(ID = 1:length(grid), geom = grid)

plot(st_geometry(grid), col = "grey", border = "white")
plot(st_geometry(district), border = "grey50", add = TRUE)
```

### Counting points in a polygon (in a grid tile)

```{r intersect3.1, nm=TRUE, class.output="code-out", warning=FALSE, message=FALSE}

# selection of grid tiles that intersect the district

inter <- st_intersects(grid, cambodia_dist, sparse = FALSE)
grid <- grid[inter, ]

case_cambodia <- st_read("data_cambodia/cambodia.gpkg", layer = "cases" , quiet = TRUE)
plot(st_geometry(grid), col = "grey", border = "white")
plot(st_geometry(case_cambodia), pch = 20, col = "red", add = TRUE, cex = 0.8)

inter <- st_intersects(grid, case_cambodia, sparse = TRUE)
length(inter)

```

Here we use the argument `sparse = TRUE`. The **inter** object is a list the length of the **grid** and each item in the list contain the index of the object items of **cases** and **grid** intersection.

For example **grid tile** 35th intersect with four **cases** 97, 138, 189, 522, 624, 696

```{r intersect3.3, nm=TRUE, class.output="code-out", warning=FALSE, message=FALSE}
inter[35]
plot(st_geometry(grid[35, ]))
plot(st_geometry(case_cambodia), add = T)
plot(st_geometry(case_cambodia[c(97, 138, 189, 522, 624, 696), ]), 
     col = "red", pch = 19, add = TRUE)
```

To count number of **case**, simply go to the list and report length of the elements.

```{r intersect3.4, class.output="code-out", warning=FALSE, message=FALSE}
grid$nb_case <- sapply(X = inter, FUN = length)   # create 'nb_case' column to store number of health centers in each grid tile 
plot(grid["nb_case"])
```

### Aggregate point values into polygons

In this example we import a csv file that contain data from a population grid. Once import we transform it `data.frame` into an object `sf`.

The objective is to aggregate the values id these points (the population contained in the "DENs" field) in the municipalities of the district.

```{r agval, class.output="code-out", warning=FALSE, message=FALSE}

pp_pop_raw <- read.csv("data_cambodia/pp_pop_dens.csv")            # import file
pp_pop_raw$id <- 1:nrow(pp_pop_raw)                                # adding a unique identifier
pp_pop <- st_as_sf(pp_pop_raw, coords = c("X", "Y"), crs = 32648)  # Transform into object sf
pp_pop <- st_transform(pp_pop, st_crs(district))                   # Transform projection
inter <- st_intersection(pp_pop, district)                         # Intersection
inter

```

By using the function `st_intersection()` we add to each point of the grid all the information on the municipality in which it is located.

We can then use the function `aggregate()` to aggregate the population by municipalities.
```{r agval2, class.output="code-out", warning=FALSE, message=FALSE}
resultat <- aggregate(x = list(pop_from_grid = inter$DENs), 
                      by = list(ADM2_EN = inter$ADM2_EN), 
                      FUN = "sum")
head(resultat)
```

We can then create a new object with this result.

```{r agval3, class.output="code-out", warning=FALSE, message=FALSE}
dist_result <- merge(district, resultat, by = "ADM2_EN", all.x = TRUE)
dist_result
```

## Measurements

### Create a distance matrix

If the dataset's projection system is specified, the distance are expressed in the projection measurement unit (most often in meter)

```{r distance, nm=TRUE, class.output="code-out", warning=FALSE, message=FALSE}
mat <- st_distance(x = dist_c, y = dist_c)
mat[1:5,1:5]
```

### Calculate routes

<img src="img/logo_osrm.svg" align="right" width="120"/> The package `osrm` [@R-osrm] acts as an interface R and the [OSRM](http://project-osrm.org/) [@luxen-vetter-2011]. This package allows to calculate time and distance matrices, road routes, isochrones. The package uses the OSRM demo server by default. In case of intensive use [it is strongly recommended to use your own instance of OSRM (with Docker)](https://github.com/Project-OSRM/osrm-backend#using-docker).

#### Calculate a route

The fonction `osrmRoute()` allows you to calculate routes.

```{r osrmroute, fig.width = 3, fig.height = 5, class.output="code-out", warning=FALSE, message=FALSE}
library(sf)
library(osrm)
library(maptiles)
district <- st_read("data_cambodia/cambodia.gpkg",layer = "district", quiet = TRUE)
district <- st_transform(district, 32648)

odongk  <- district[district$ADM2_PCODE == "KH0505", ] # Itinerary between Odongk district and Toul Kouk
takmau <- district[district$ADM2_PCODE == "KH0811",]
route <- osrmRoute(src = odongk, 
                   dst = takmau, 
                   returnclass = "sf")
osm <- get_tiles(route, crop = TRUE)
plot_tiles(osm)
plot(st_geometry(route), col = "#b23a5f", lwd = 6, add = T)
plot(st_geometry(route), col = "#eee0e5", lwd = 1, add = T)
```

#### Calculation of a time matrix

The function `osrmTable()` makes it possible to calculate matrices of distances or times by road.

In this example we calculate a time matrix between 2 addresses and health centers in Phnom Penh on foot.

```{r osrmtable, fig.width = 8, class.output="code-out", warning=FALSE, message=FALSE}
library(sf)
library(tidygeocoder)
hospital <- st_read("data_cambodia/cambodia.gpkg",layer= "hospital", quiet = TRUE)

hospital_pp <- hospital[hospital$PCODE == "12", ]     # Selection of health centers in Phnom Penh

adresses <- data.frame(adr = c("Royal Palace Park, Phnom Penh Phnom, Cambodia",
                              "Wat Phnom Daun Penh, Phnom Penh, Cambodia"))  # Geocoding of 2 addresses in Phnom Penh

places <- tidygeocoder::geocode(.tbl = adresses,address = adr)
places

# Calculation of the distance matrix between the 2 addresses and the health center in Phnom Penh

cal_mat <- osrmTable(src = places[,c(1,3,2)], 
                 dst = hospital_pp, 
                 osrm.profile = "foot")

cal_mat$durations[1:2, 1:5]

# Which address has better accessibility to health center in Phnom Penh?

boxplot(t(cal_mat$durations[,]), cex.axis = 0.7)

```