# Species Distribution Modeling applied to _Aedes albopictus_ breeding sites in public spaces (Montpellier) and generating an environmental bias map


## Description

This project contains the scripts needed to perform:
- Species Distribution Modeling (SDM) using the biomod2 package in R, with environmental variables and _Aedes albopictus_ breeding site data. This script computes presence probabilities.
- Exploration of environmental bias in the breeding site data relative to the environmental variables of the study area.

## Installation
Before running the scripts, make sure you have the following R packages installed:

    install.packages(c("biomod2", "raster", "rgdal", "sp", "ggplot2", "FactoMineR", "terra", "dplyr", "sf"))

## Usage

**1. Species Distribution Modeling (SDM)**

Script: biomod2_breeding_sites_public_spaces.R

This script processes breeding site presence data (.csv) and environmental variables (.tif raster files) to model the distribution of _Aedes albopictus_ breeding sites in public spaces. The workflow includes:

1. Loading Data: Import presence data and raster layers of environmental variables.

2. Data Formatting: Prepare the data for biomod2.

3. Running Individual Models: Train different SDM algorithms and evaluate their performance.

4. Assessing Model Performance: Extract evaluation scores and variable importance scores.

5. Plotting Response Curves: Visualize the relationships between environmental variables and the species presence.

6. Ensemble Modeling: Combine multiple individual models to improve prediction accuracy.

7. Projection: Apply the final model to the entire study area to generate probability maps of breeding site presence.
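
As an illustration of step 4, the two evaluation metrics listed in the Output section (AUC and TSS) can be computed by hand in base R. This is only a minimal sketch on made-up values: the helper names `tss` and `auc` and the 0.5 threshold are assumptions for the example, and the project script relies on biomod2 for these scores rather than on this code.

```r
# Hypothetical helpers for illustration; not taken from biomod2 or the project script.

# True Skill Statistic at a given probability threshold.
# obs: 0/1 observations, pred: predicted probabilities in [0, 1].
tss <- function(obs, pred, threshold = 0.5) {
  pred_bin <- as.integer(pred >= threshold)
  sensitivity <- sum(pred_bin == 1 & obs == 1) / sum(obs == 1)
  specificity <- sum(pred_bin == 0 & obs == 0) / sum(obs == 0)
  sensitivity + specificity - 1
}

# AUC via the rank-sum (Mann-Whitney) formulation; ties receive average ranks.
auc <- function(obs, pred) {
  r <- rank(pred)
  n_pos <- sum(obs == 1)
  n_neg <- sum(obs == 0)
  (sum(r[obs == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Made-up observations and predictions
obs  <- c(1, 1, 1, 0, 0, 0)
pred <- c(0.9, 0.8, 0.4, 0.6, 0.3, 0.1)

tss_value <- tss(obs, pred)  # sensitivity 2/3 + specificity 2/3 - 1
auc_value <- auc(obs, pred)  # 8 of the 9 presence/absence pairs are ranked correctly
```

TSS depends on the chosen threshold while AUC does not, which is one reason to report both.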


**2. Sampling Bias Analysis**

Script: sampling_bias_mtp.R

This script analyzes potential sampling biases by comparing the spatial distribution of presence data with the underlying environmental variables. It includes:

1. Loading Data: Import presence data and the environmental raster stack.

2. Extracting Environmental Characteristics: Extract environmental variables at breeding site locations from the environmental raster stack.

3. Performing Principal Component Analysis (PCA): Identify how well the sampled sites represent the environmental variability of the study area.

4. Calculating Environmental Distance to Sampled Sites: Compute the minimum environmental distance between each point in the study area and the closest sampled breeding site.

5. Generating an Environmental Bias Map: Convert the computed environmental distances into a raster layer for visualization and analysis.
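
Steps 3 and 4 can be sketched on synthetic data as follows. This is not the project script: it swaps in base R `prcomp` instead of FactoMineR, keeps only the first two principal components, and uses Euclidean distance in PCA space. The variable names and the number of retained axes are assumptions made for the example.

```r
set.seed(42)

# Synthetic stand-in for the raster cell values: 200 cells x 3 variables
# (hypothetical variable names).
env_all <- data.frame(
  ndvi        = runif(200),
  temperature = rnorm(200, mean = 25, sd = 3),
  built_pct   = runif(200, min = 0, max = 100)
)

# Pretend that 20 of the cells contain a sampled breeding site.
sampled_idx <- sample(nrow(env_all), 20)

# PCA on centred and scaled variables; keep the first two axes (assumption).
pca <- prcomp(env_all, center = TRUE, scale. = TRUE)
scores_all     <- pca$x[, 1:2, drop = FALSE]
scores_sampled <- scores_all[sampled_idx, , drop = FALSE]

# For each cell, Euclidean distance in PCA space to the closest sampled site.
min_dist <- apply(scores_all, 1, function(cell) {
  min(sqrt(rowSums(sweep(scores_sampled, 2, cell)^2)))
})

summary(min_dist)
```

`min_dist` holds one value per cell, so for step 5 it can be written back into a raster with the same grid as the environmental layers. Large values flag environments that are poorly covered by the sampling.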


## Data Requirements

Presence Data: A .csv file with at least two columns, longitude and latitude (WGS84 coordinates, EPSG:4326, recommended).

Environmental Data: Raster layers (.tif format) representing environmental variables (e.g., spectral indices, textural indices, percentage of land cover, temperature, humidity).
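
Before running the scripts, the presence file can be checked against these requirements. The sketch below is illustrative only: `check_presence` is a hypothetical helper, the inline CSV text stands in for a real file, and the coordinate range test assumes WGS84 longitude/latitude in decimal degrees.

```r
# Hypothetical helper: checks that the required columns exist and drops rows
# with missing or out-of-range WGS84 coordinates.
check_presence <- function(df) {
  required <- c("longitude", "latitude")
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  ok <- !is.na(df$longitude) & !is.na(df$latitude) &
    abs(df$longitude) <= 180 & abs(df$latitude) <= 90
  df[ok, , drop = FALSE]
}

# Inline example standing in for a real .csv file; the third row is invalid.
csv_text <- "longitude,latitude\n3.8767,43.6108\n3.9000,43.6000\n200,43.6\n"
presence <- read.csv(text = csv_text)

clean <- check_presence(presence)
nrow(clean)
```

With a real file, `read.csv(text = csv_text)` would be replaced by `read.csv()` on the file path.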


## Output

- Model performance metrics (AUC, TSS)

- Variable importance scores

- Response curves

- Probability maps of breeding site presence

- Uncertainty associated with the probabilities

- Environmental bias map
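
To make the relationship between the probability maps and their uncertainty concrete, here is a small base R sketch of one common approach: a TSS-weighted mean across individual models, with the coefficient of variation across models as the uncertainty measure. The model names, scores, and the 0.5 TSS cut-off are made-up assumptions, and the project script may combine models differently through biomod2.

```r
# Made-up predictions: rows are cells, columns are individual models.
preds <- cbind(
  glm = c(0.2, 0.8, 0.5),
  rf  = c(0.3, 0.9, 0.4),
  gbm = c(0.1, 0.7, 0.6)
)

# Made-up evaluation scores for each model.
tss_scores <- c(glm = 0.6, rf = 0.8, gbm = 0.4)

# Keep only models above a quality cut-off (assumed to be TSS >= 0.5).
keep <- tss_scores >= 0.5
kept_preds <- preds[, keep, drop = FALSE]

# Weights proportional to TSS, normalised to sum to 1.
w <- tss_scores[keep] / sum(tss_scores[keep])

# Ensemble probability per cell: weighted mean of the kept models.
ens_mean <- as.vector(kept_preds %*% w)

# Uncertainty per cell: coefficient of variation across the kept models.
ens_cv <- apply(kept_preds, 1, function(p) sd(p) / mean(p))
```

Cells where the kept models disagree get a higher `ens_cv`, so the two layers are best read together.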