| This is left | Text is centered | And this is right-aligned |
| More text | Even more text | And even more to the right |
As we can see, using a ViT encoder does not necessarily improve the accuracy of the classification.
Indeed, the backbone used here was trained on RGB images, so applying it to multispectral data seems to be too far a step.
Moreover, the different forest types already have different spectral responses.
By design, ViTs are better than other ML methods at interpreting the structure of an image and the relations between patches.
Here, the information is mostly spectral, which leads to poorer performance once a DL encoder is applied: the spectral information is drowned among irrelevant structural and textural information extracted by the DL backbone.
Here is the resulting map obtained when running inference with the RF model without pre-processing:
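As a rough illustration of the two Random Forest variants compared in the maps that follow, here is a minimal sketch on synthetic data. It assumes the pre-processing is a dimension-reduction step; PCA is used purely as a stand-in, and the band count, class count, sample sizes, and hyperparameters are all hypothetical rather than the values used for the maps.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in for per-pixel multispectral samples (hypothetical sizes).
n_classes, n_bands, n_per_class = 3, 10, 200

# One mean spectrum per class, mimicking forest types with distinct spectral responses.
class_means = rng.normal(0.0, 1.0, size=(n_classes, n_bands))
X = np.vstack(
    [mean + rng.normal(0.0, 0.5, size=(n_per_class, n_bands)) for mean in class_means]
)
y = np.repeat(np.arange(n_classes), n_per_class)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Variant 1: Random Forest directly on the raw spectral bands.
rf_raw = RandomForestClassifier(n_estimators=100, random_state=0)
rf_raw.fit(X_train, y_train)
acc_raw = rf_raw.score(X_test, y_test)

# Variant 2: dimension reduction first (PCA is an assumption), then Random Forest.
rf_pca = make_pipeline(
    PCA(n_components=4, random_state=0),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
rf_pca.fit(X_train, y_train)
acc_pca = rf_pca.score(X_test, y_test)

print(f"accuracy on raw bands: {acc_raw:.3f}")
print(f"accuracy after PCA:    {acc_pca:.3f}")
```

On real imagery, `X` would hold one row per labelled pixel and one column per spectral band. The scores printed here come from synthetic data only and say nothing about the actual results.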
```{image} ./_static/examples/rf_no_red.png
:alt: Random Forest inference without dimension reduction
:align: center
```
And with pre-processing:
```{image} ./_static/examples/rf_red.png
:alt: Random Forest inference with dimension reduction