Preparing the NDVI data cube
The Normalized Difference Vegetation Index (NDVI) is a commonly used index for estimating vegetation changes from satellite imagery. It does not correspond to a physical quantity, but it is a good proxy for biomass. It relies on the fact that vegetation has a very low reflectance in the red band (which is photosynthetically active radiation) and a very high reflectance in the near infrared band (a part of the sun’s light spectrum that plants do not use). Its values lie between -1 and 1: strongly negative values correspond to open water bodies, high positive values to densely vegetated areas, and values around 0 to mineral or human-made surfaces.
In the modspa_pixel processing chain, it is calculated as follows:
\[NDVI = \frac{NIR - (RED + corr_{ACORVI})}{NIR + (RED + corr_{ACORVI})}\]
Where:
RED: red band
NIR: near infrared band
\(corr_{ACORVI}\): correction parameter applied to the red band to smooth extreme NDVI values.
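As an illustration, here is a minimal NumPy sketch of an NDVI computation in which a constant is added to the red band, as the \(corr_{ACORVI}\) definition above describes. The function name, the sample reflectance values and the assumption that reflectances are stored as scaled integers are hypothetical; the default of 500 is taken from the acorvi_corr parameter documented further down. This is not the modspa_pixel implementation.

```python
import numpy as np


def ndvi_with_correction(red, nir, acorvi_corr=500):
    """NDVI with a constant added to the red band (assumed form of the correction)."""
    red = np.asarray(red, dtype=np.float64) + acorvi_corr
    nir = np.asarray(nir, dtype=np.float64)
    denom = nir + red
    # Return NaN instead of raising a warning where the denominator is zero.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(denom != 0, (nir - red) / denom, np.nan)


# Hypothetical reflectances: dense vegetation, bare soil, open water.
red = np.array([300, 1500, 800])
nir = np.array([4000, 1800, 200])
ndvi = ndvi_with_correction(red, nir)
print(ndvi)
```

Because the correction only increases the red term, it pulls very high NDVI values down slightly, which is consistent with its stated purpose of smoothing extreme values.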
Download satellite imagery
The Sentinel-2 images can be automatically downloaded with the eodag module by using the following function:
- modspa_pixel.preprocessing.download_S2.download_S2_data(start_date: str, end_date: str, preferred_provider: str, save_path: str, shapefile: str, mode: str = 'pixel', cloud_cover_limit: int = 80) -> List[str]
download_S2_data uses the eodag module to search for all products from a given provider (copernicus or theia) within a specific time window that cover the whole shapefile envelope (several Sentinel-2 tiles might be needed; only one can be chosen in pixel mode). It then downloads the data into the download path set in the config file. The paths to the downloaded data are returned and saved in a csv file.
Arguments
- start_date: str
  beginning of the time window to download (format: YYYY-MM-DD)
- end_date: str
  end of the time window to download (format: YYYY-MM-DD)
- preferred_provider: str
  chosen source of the Sentinel-2 data (copernicus or theia)
- save_path: str
  path where a csv file containing the product paths will be saved
- shapefile: str
  path to the shapefile (.shp) for which the data is downloaded
- mode: str, default = 'pixel'
  run download code in 'pixel' or 'parcel' mode
- cloud_cover_limit: int, default = 80
  maximum percentage to pass the filter before download (between 0 and 100)
Returns
- product_paths: list[str]
  a list of the paths to the downloaded data
This will download all the Sentinel-2 images found during the specified window and over the specified area (in the config file) into the download directory as zip or tar archives. Specific bands can then be extracted from the archives using this function:
- modspa_pixel.preprocessing.download_S2.extract_zip_archives(download_path: str, list_paths: List[str] | str, bands_to_extract: List[str], save_path: str, remove_archive: bool = False) -> List[str]
Extract specific bands from each zip or tar archive in a list.
Arguments
- download_path: str
  path in which the archives will be extracted (usually where the archives are located)
- list_paths: List[str]
  list of paths to the zip archives
- bands_to_extract: List[str]
  list of strings that will be used to match specific bands. For example, if you are looking for bands B3 and B4 in a given archive, bands_to_extract = ['*_B3.TIF', '*_B4.TIF']. This depends on the product architecture.
- save_path: str
  path where a csv file containing the product paths will be saved
- remove_archive: bool, default = False
  boolean to choose whether to remove the archive or not
Returns
- product_list: List[str]
  list of the paths to the extracted products
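To make the role of the bands_to_extract patterns concrete, the following sketch shows one way wildcard patterns such as '*_B3.TIF' can select members of a zip archive, using only the Python standard library. The helper name extract_matching, the archive layout and the file names are hypothetical; this is not the code of extract_zip_archives, only an illustration of the pattern-matching idea.

```python
import fnmatch
import os
import tempfile
import zipfile


def extract_matching(archive_path, patterns, out_dir):
    """Extract only the members whose file name matches one of the patterns."""
    extracted = []
    with zipfile.ZipFile(archive_path) as archive:
        for member in archive.namelist():
            name = os.path.basename(member)
            if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
                # ZipFile.extract returns the path of the extracted file.
                extracted.append(archive.extract(member, out_dir))
    return extracted


# Build a tiny fake product archive with empty files (hypothetical layout).
workdir = tempfile.mkdtemp()
archive_path = os.path.join(workdir, "product.zip")
with zipfile.ZipFile(archive_path, "w") as archive:
    for member in ["scene/T31_B3.TIF", "scene/T31_B4.TIF",
                   "scene/T31_B8.TIF", "scene/meta.xml"]:
        archive.writestr(member, b"")

paths = extract_matching(archive_path, ["*_B3.TIF", "*_B4.TIF"], workdir)
names = sorted(os.path.basename(p) for p in paths)
print(names)
```

Since the patterns are matched against file names, they have to follow the naming scheme of the product being extracted, which is why the right patterns depend on the product architecture.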
The archives can then be deleted, freeing up some disk space.
Note
The scripts to download Landsat data will be added soon.
Calculate NDVI
The NDVI calculation is done using the xarray module. This allows for an easy parallelization of the NDVI calculation (with the integrated dask module). The first step is to calculate the NDVI for the existing images and save them in a data cube (a stack of two-dimensional images along a time dimension). This first data cube is called the NDVI pre_cube. The chosen file format is netCDF4 (.nc) for a more efficient reading and writing process.
To limit the size of the input datasets, the NDVI data is converted to the uint8 data type (one byte per pixel). NDVI values are therefore saved as integers between 0 and 255, which correspond to NDVI values of 0 and 1 respectively. This gives a quantization step of about 0.4 % of the NDVI range, which is smaller than the uncertainty of the satellite measurements, so little actual information is lost.
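The scaling described above can be sketched with NumPy as follows. The function names are hypothetical, and clipping negative NDVI values to 0 is an assumption made here so that the values fit the 0 to 1 range mentioned in the text; the actual handling in modspa_pixel may differ.

```python
import numpy as np


def ndvi_to_uint8(ndvi):
    """Scale NDVI from [0, 1] to integers in [0, 255] (one byte per pixel)."""
    # Assumption: values outside [0, 1] are clipped before scaling.
    clipped = np.clip(ndvi, 0.0, 1.0)
    return np.round(clipped * 255).astype(np.uint8)


def uint8_to_ndvi(stored):
    """Convert stored integers back to NDVI values in [0, 1]."""
    return stored.astype(np.float64) / 255


ndvi = np.array([-0.2, 0.0, 0.25, 0.75, 1.0])
stored = ndvi_to_uint8(ndvi)
restored = uint8_to_ndvi(stored)

# Round-trip error on the values that were already inside [0, 1].
max_error = np.abs(restored[1:] - ndvi[1:]).max()
print(stored, max_error)
```

One stored level corresponds to 1/255, or about 0.0039, so the rounding error on a value inside the range never exceeds half of that step.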
The NDVI pre_cube can be created with the following function:
- modspa_pixel.preprocessing.calculate_ndvi.calculate_ndvi(extracted_paths: List[str] | str, save_dir: str, boundary_shapefile_path: str, resolution: int = 20, chunk_size: dict = {'time': 2, 'x': 4096, 'y': 4096}, acorvi_corr: int = 500) -> str
Calculate ndvi images in an xarray dataset (a data cube) and save it. ndvi values are scaled and saved as uint8 (0 to 255).
Warning
The current version is for Copernicus Sentinel-2 images.
Arguments
- extracted_paths: Union[List[str], str]
  list of paths to extracted sentinel-2 products or path to csv file containing those paths
- save_dir: str
  directory in which to save the ndvi pre-cube
- boundary_shapefile_path: str
  shapefile path to save the geographical bounds of the ndvi tile, used later to download weather products
- resolution: int, default = 20
  resolution in meters
  Warning
  only 10 and 20 meters currently supported
- chunk_size: dict, default = {'x': 4096, 'y': 4096, 'time': 2}
  dictionary containing the chunk size for the xarray dask calculation
- acorvi_corr: int, default = 500
  acorvi correction parameter to add to the red band to adjust ndvi values
Returns
- ndvi_cube_path: str
  path to the saved ndvi pre-cube
Once this data cube is written, it needs to be interpolated along the time dimension. The processing chain requires NDVI data at a daily frequency, and high resolution satellite imagery rarely has a revisit time shorter than 5 days. The daily interpolation is also done with the xarray module. The resulting dataset is also saved with the uint8 data type.
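The idea of filling the gaps between acquisitions can be illustrated with a small NumPy sketch. It is a stand-in for illustration, not the xarray and dask implementation used by the processing chain: the function name, the toy cube and the choice of linear interpolation are all assumptions, since the interpolation method is not stated here.

```python
import numpy as np


def interpolate_daily(acq_days, cube, n_days):
    """Linearly interpolate a (time, y, x) uint8 cube onto days 0 .. n_days - 1."""
    daily_days = np.arange(n_days)
    n_time, ny, nx = cube.shape
    flat = cube.reshape(n_time, -1).astype(np.float64)
    out = np.empty((n_days, flat.shape[1]))
    # Each pixel is interpolated on its own, using its full time series.
    for i in range(flat.shape[1]):
        out[:, i] = np.interp(daily_days, acq_days, flat[:, i])
    return np.round(out).astype(np.uint8).reshape(n_days, ny, nx)


# Toy pre-cube: three acquisitions five days apart, one row of two pixels.
acq_days = np.array([0, 5, 10])
cube = np.array([[[100, 50]],
                 [[150, 50]],
                 [[130, 200]]], dtype=np.uint8)

daily = interpolate_daily(acq_days, cube, n_days=11)
print(daily[:, 0, 0])
```

Each output value depends on the acquisitions before and after it for the same pixel, which is why such a computation needs the complete time series of a pixel to be available at once.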
The final NDVI cube can be created with the following function:
- modspa_pixel.preprocessing.calculate_ndvi.interpolate_ndvi(ndvi_path: str, save_dir: str, config_file: str, chunk_size: dict = {'time': -1, 'x': 512, 'y': 512}) -> str
Interpolate the ndvi cube to a daily frequency between the desired dates defined in the json config file.
Arguments
- ndvi_path: str
  path to ndvi pre-cube
- save_dir: str
  path to save interpolated ndvi cube
- config_file: str
  path to json config file
- chunk_size: dict, default = {'x': 512, 'y': 512, 'time': -1}
  chunk size to use by dask for calculation, 'time' = -1 means the chunk has the whole time dimension in it. The Dataset can’t be divided along the time axis for interpolation.
Returns
None
Warning
Both of the previous functions are resource hungry (CPU and RAM). They can take an hour or more to run, depending on the size of the dataset and the specifications of your machine.
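To get a rough feel for the memory involved, the size of a single chunk can be estimated from the default chunk sizes shown in the function signatures. The helper below is a hypothetical back-of-the-envelope sketch: the numbers of time steps (70 acquisitions, 365 days) are made-up examples, and the figures only cover one uint8 chunk, whereas intermediate results are likely held in larger data types and several chunks are processed at the same time.

```python
def chunk_nbytes(chunk_size, n_time, itemsize=1):
    """Size in bytes of one chunk; 'time' = -1 means the whole time axis."""
    time_len = n_time if chunk_size["time"] == -1 else chunk_size["time"]
    return time_len * chunk_size["x"] * chunk_size["y"] * itemsize


# Default chunking of the pre-cube calculation, hypothetical 70 acquisitions.
pre_cube_chunk = chunk_nbytes({"time": 2, "x": 4096, "y": 4096}, n_time=70)

# Default chunking of the interpolation, hypothetical 365 daily time steps.
daily_chunk = chunk_nbytes({"time": -1, "x": 512, "y": 512}, n_time=365)

print(pre_cube_chunk / 2**20, "MiB per pre-cube chunk")
print(daily_chunk / 2**20, "MiB per interpolation chunk")
```

Reducing the 'x' and 'y' chunk sizes lowers the memory needed per chunk at the cost of more chunks to schedule.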