<!-- Then save the change, reload the plug-in and try again. -->
## Similarity or Random Forest causes QGIS to crash
This crash is caused by a bug that occurs when `geopandas` reads a shapefile after it has already read one. It is probably linked to `fiona` as well. If you have any idea how to solve this, please contribute to the [corresponding issue](https://github.com/umr-amap/iamap/issues/28).
Meanwhile, if QGIS crashes and your files were not saved, you can still find them in the temporary files, for instance in `/tmp/iamap_features`.
## QGIS crashes at the start of encoding
This is probably an issue with `rtree` during the creation of the dataset that will be used by the plugin.
This issue should now be solved, but the solution has not been tested on all OSes. It comes from `rtree` during the creation of the dataset that will be used by the plugin.
Indeed, depending on the installation, `rtree` and QGIS may rely on conflicting versions of `libspatialindex`. Currently, there are several solutions:
### 1. Uninstall `rtree` installed via pip and reinstall via package manager
...
...
@@ -62,8 +70,3 @@ This way, rtree and qgis will automatically share the same libspatialindex.
If you have any idea on how to solve this issue properly, do participate in the [corresponding issue](https://github.com/umr-amap/iamap/issues/13).
## Similarity or Random Forest causes QGIS to crash
This crash is caused by a bug that occurs when `geopandas` reads a shapefile after it has already read one. It is probably linked to `fiona` as well. If you have any idea how to solve this, please contribute to the [corresponding issue](https://github.com/umr-amap/iamap/issues/28).
Meanwhile, if QGIS crashes and your files were not saved, you can still find them in the temporary files, for instance in `/tmp/iamap_features`.
@@ -61,7 +61,7 @@ The features produced by a deep learning encoder are often of high dimensionalit
However, it can be cumbersome to deal with all these features and such a high-dimensional feature space, especially when most of the features are not very informative.
Therefore, it is possible to reduce the dimensions of a raster using a variety of algorithms.
We chose to rely on [scikit-learn](https://scikit-learn.org/) to provide the algorithms.
All algorithms available in the [decomposition](https://scikit-learn.org/stable/api/sklearn.decomposition.html) and the [cluster](https://scikit-learn.org/stable/api/sklearn.cluster.html) modules can be used.
All algorithms available in the [decomposition](https://scikit-learn.org/stable/api/sklearn.decomposition.html) and the [cluster](https://scikit-learn.org/stable/api/sklearn.cluster.html) modules that share a common API can be used.
Different algorithms accept different arguments. You can provide these as a JSON string in the corresponding field.
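As a rough illustration of how a JSON string of arguments maps onto a scikit-learn algorithm, here is a minimal sketch. The JSON content, the array shapes and the variable names are assumptions made for this example; the plugin's own parsing code may differ.

```python
import json

import numpy as np
from sklearn.decomposition import PCA

# Hypothetical content of the arguments field, written as a JSON string.
json_args = '{"n_components": 3, "whiten": true}'

# JSON booleans and numbers become Python values usable as keyword arguments.
kwargs = json.loads(json_args)

# Stand-in for encoder features: 200 pixels with 16 feature bands each.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 16))

# The parsed arguments are forwarded to the chosen algorithm.
reducer = PCA(**kwargs)
reduced = reducer.fit_transform(features)
print(reduced.shape)
```

Note that JSON uses `true`, `false` and `null` rather than Python's `True`, `False` and `None`.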
...
...
@@ -79,7 +79,7 @@ Different algorithms have different arguments that can be passed. You can provid
:align: center
```
Features or reduced features can be clustered (_i.e._ unsupervised classification) using algorithms from the scikit-learn [cluster](https://scikit-learn.org/stable/api/sklearn.cluster.html) module.
Features or reduced features can be clustered (_i.e._ unsupervised classification) using algorithms from the scikit-learn [cluster](https://scikit-learn.org/stable/api/sklearn.cluster.html) module that share a common API.
Different algorithms accept different arguments. You can provide these as a JSON string in the corresponding field.
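To give an idea of what clustering a feature raster involves, the sketch below flattens a synthetic raster into a table of pixels, clusters it with `KMeans`, and reshapes the labels back into a map. The raster size, the JSON arguments and the variable names are assumptions for illustration only, not the plugin's actual implementation.

```python
import json

import numpy as np
from sklearn.cluster import KMeans

# Hypothetical arguments, as they could be typed in the JSON field.
json_args = '{"n_clusters": 4, "n_init": 10, "random_state": 0}'

# Synthetic feature raster: 20 rows, 30 columns, 8 feature bands.
rng = np.random.default_rng(0)
raster = rng.normal(size=(20, 30, 8))
height, width, bands = raster.shape

# scikit-learn expects a 2D table: one row per pixel, one column per band.
pixels = raster.reshape(-1, bands)

# fit_predict returns one cluster label per pixel.
labels = KMeans(**json.loads(json_args)).fit_predict(pixels)

# Put the labels back on the raster grid to obtain a cluster map.
label_map = labels.reshape(height, width)
print(label_map.shape)
```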
...
...
@@ -116,7 +116,7 @@ Here, additionnaly to an input raster, you have to provide a shapefile (or any f
If the features you have seem informative, you can fit a Machine Learning model on them by providing ground truth points.
To do so, you have to provide an input shapefile (or any format that can be read by geopandas) and the column corresponding to the ground truth values.
Based on the algorithm you choose, these values will be interpreted as integers (classification) or floats (regression).
All models provided by the scikit-learn [ensemble](https://scikit-learn.org/stable/api/sklearn.ensemble.html) (_e.g._ Random Forests, Gradient Boosting) and [neighbors](https://scikit-learn.org/stable/api/sklearn.neighbors.html) (_e.g._ KNN) modules are available.
All models provided by the scikit-learn [ensemble](https://scikit-learn.org/stable/api/sklearn.ensemble.html) (_e.g._ Random Forests, Gradient Boosting) and [neighbors](https://scikit-learn.org/stable/api/sklearn.neighbors.html) (_e.g._ KNN) modules that share a common API are available.
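The difference between integer (classification) and float (regression) ground truth values can be sketched as follows. The data is synthetic and the variable names are hypothetical; in practice the features would be sampled from the raster at the ground truth points.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic features sampled at 100 ground truth points, 8 bands each.
X = rng.normal(size=(100, 8))

# Integer ground truth values: three classes, handled by a classifier.
classes = rng.integers(0, 3, size=100)
clf = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, classes)
class_pred = clf.predict(X[:5])

# Float ground truth values: a continuous quantity, handled by a regressor.
values = rng.normal(size=100)
reg = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, values)
value_pred = reg.predict(X[:5])

print(class_pred.dtype, value_pred.dtype)
```

Both estimators expose the same `fit` and `predict` methods, which is what makes it possible to swap one model for another.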
@@ -22,18 +22,15 @@ We recommend tipycally to use a nomber of points between 1 and 10.
## Create a shapefile for Random Forest
To create a shapefile to train a random forest algorithm, you have to go to ``Layer --> create Layer --> new shapefile Layer`` in
the QGIS toolbar. A window should then open. You can name your file as you like; please use "File encoding : UTF-8" and select "Geometry Type : Point".
You can then remove "id" from the field list. Next, add the column you want the random forest to use for training and predicting.
To do so, type the column name under "new field", set "Type : Text string", then click "add to field list". We recommend the name "Type", as this is the default value
of the random forest algorithm. You can use another name, but you will then have to change the "Name of the column you want random forest to work on" in
the random forest interface.
You can then click "Ok" to close the window and create your shapefile.
You can then update your shapefile by clicking "Toggle editing" and "Add point feature" in the QGIS toolbar.
You can then add your points, each with a Type of your choice.
We recommend placing at least 100 points in the training dataset. The more points you use in the test dataset, the more robust the interpretation will be.
If you want to use a single dataset (it will be split into 80% training points and 20% testing points), please place around 150 points.
the QGIS toolbar. A window should then open. You can name your file as you like; we recommend using ``File encoding : UTF-8`` and ``Geometry Type : Point``.
You can then remove ``id`` from the field list. Next, add the column you want the ML model to use for training and predicting.
To do so, type the column name under ``new field``, set ``Type : Text string``, then click ``add to field list``.
You can then click ``Ok`` to close the window and create your shapefile.
You can then update your shapefile by clicking ``Toggle editing`` and ``Add point feature`` in the QGIS toolbar.
You can then add your points, each with a value of your choice.
We recommend placing at least 100 points in the training dataset, although this depends on the task you want to achieve.
If you can, please consider how to set up your train and test datasets (_e.g._ using cross-validation).
Keep in mind that, in general, the more points you place, the better the results will be.
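To make the train/test question more concrete, here is a minimal sketch comparing a single hold-out split with 5-fold cross-validation in scikit-learn. The synthetic points, the 20% test share and the model settings are assumptions chosen for this example, not recommendations from the plugin.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for 150 labelled points with 8 features each.
X = rng.normal(size=(150, 8))
y = (X[:, 0] > 0).astype(int)  # two classes derived from the first feature

# Option 1: a single hold-out split, keeping class proportions in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
model = RandomForestClassifier(n_estimators=50, random_state=0)
holdout_score = model.fit(X_train, y_train).score(X_test, y_test)

# Option 2: 5-fold cross-validation, where every point is used for testing once.
scores = cross_val_score(
    RandomForestClassifier(n_estimators=50, random_state=0), X, y, cv=5
)
print(holdout_score, scores.mean())
```

Cross-validation gives several scores instead of one, so it also shows how much the result varies with the choice of test points.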