7  Basic statistics for spatial analysis

7.1 Load and visualize data

In this section, we load data describing the locations of cases of an imaginary disease throughout Cambodia.

library(sf)

#Import Cambodia country border
country = st_read("data_cambodia/cambodia.gpkg", layer = "country", quiet = TRUE)
#Import provincial administrative borders of Cambodia (stored in the "education" layer)
education = st_read("data_cambodia/cambodia.gpkg", layer = "education", quiet = TRUE)
#Import district administrative borders of Cambodia
district = st_read("data_cambodia/cambodia.gpkg", layer = "district", quiet = TRUE)

# Import locations of cases from an imaginary disease
cases = st_read("data_cambodia/cambodia.gpkg", layer = "cases", quiet = TRUE)

The first step of any statistical analysis is to visualize the data, both to check that they were loaded correctly and to observe the general pattern of the cases.

# View the cases object
head(cases)
Simple feature collection with 6 features and 2 fields
Geometry type: MULTIPOINT
Dimension:     XY
Bounding box:  xmin: 255891 ymin: 1179092 xmax: 506647.4 ymax: 1467441
Projected CRS: WGS 84 / UTM zone 48N
  id Disease                           geom
1  0 W fever MULTIPOINT ((280036.2 12841...
2  1 W fever MULTIPOINT ((451859.5 11790...
3  2 W fever  MULTIPOINT ((255891 1467441))
4  5 W fever MULTIPOINT ((506647.4 12322...
5  6 W fever  MULTIPOINT ((440668 1197958))
6  7 W fever MULTIPOINT ((481594.5 12714...
# Map the cases
library(mapsf)

mf_map(x = district, border = "white")
mf_map(x = country, lwd = 2, col = NA, add = TRUE)
mf_map(x = subset(cases, Disease == "W fever"), lwd = .5, col = "#990000", pch = 20, add = TRUE)
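Beyond the visual check, it can help to confirm numerically that the point layer and the polygon layer line up. The sketch below is only an illustration, not part of the original workflow: `count_points_in_polygons()` is a hypothetical helper name, and the toy grid and points are made-up data in the same projected CRS as the chapter's layers (WGS 84 / UTM zone 48N, EPSG:32648). It checks that both layers share a CRS and then counts the points falling in each polygon.

```r
library(sf)

# Hypothetical helper (not part of sf): count the points falling in each polygon
count_points_in_polygons = function(polygons, points) {
  # Both layers must share a CRS for the spatial predicate to be meaningful
  stopifnot(st_crs(polygons) == st_crs(points))
  # st_intersects() lists, for each polygon, the indices of intersecting points
  lengths(st_intersects(polygons, points))
}

# Toy data (made up for illustration): a 2 x 2 grid of squares and three points
square = st_polygon(list(rbind(c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0))))
toy_grid = st_sf(geometry = st_make_grid(st_sfc(square, crs = 32648), n = c(2, 2)))
toy_points = st_sf(geometry = st_sfc(
  st_point(c(0.5, 0.5)),
  st_point(c(0.6, 0.4)),
  st_point(c(1.5, 1.5)),
  crs = 32648
))

# Store the number of points per cell as a new attribute of the grid
toy_grid$n_cases = count_points_in_polygons(toy_grid, toy_points)
toy_grid$n_cases
```

With the objects loaded above, the same helper could presumably be called as `count_points_in_polygons(district, subset(cases, Disease == "W fever"))`. A total well below the number of cases would suggest a projection or import problem.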

7.2 Basic statistics

7.2.1 Autocorrelation

7.2.2 Moran’s test

7.3 Cluster analysis

In epidemiology, the definition of a cluster

7.3.1 Population-based clusters (Kulldorff statistic)

7.3.2 Expectation-based clusters

In many cases, population is not specific enough to