# PopSimul
Cécile Triay (IRD)
Alice Boizet (CIRAD)
Mathias Lorieux (IRD)
Mathieu Triay (Atelier Triay)
This tool simulates a VCF file for a bi-parental population. It can be used to produce a VCF under controlled conditions by setting parameters such as average depth, marker density and expected error rate.
It produces 3 VCF files:
- simulated genotypes with no noise and no sequencing error
- simulated genotypes with noise but no sequencing error
- simulated genotypes with noise and sequencing error
The Breakpoints CSV file contains the exact positions of genotype transitions, based on the simulated genotypes with no noise and no sequencing error.
This tool has been used to test NOISYmputer (https://gitlab.cirad.fr/noisymputer/noisymputerstandalone) by comparing the exact breakpoint positions with the breakpoint positions determined on imputed data.
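
To illustrate what a genotype transition (breakpoint) means here, the following is a minimal sketch and not the code used by popsimul. It assumes genotypes are coded `A`, `H` and `B` along one chromosome of one individual; the function name `find_breakpoints` and the tuple layout are hypothetical and do not reflect the actual columns of the Breakpoints CSV file.

```python
def find_breakpoints(positions, genotypes):
    """List the intervals in which the genotype changes between adjacent markers.

    positions: marker positions in bp, sorted in increasing order
    genotypes: genotype codes at those markers (assumed coding: "A", "H", "B")
    Returns tuples (left_pos, right_pos, left_genotype, right_genotype).
    """
    breakpoints = []
    for i in range(1, len(genotypes)):
        if genotypes[i] != genotypes[i - 1]:
            # The true transition lies somewhere between these two markers.
            breakpoints.append(
                (positions[i - 1], positions[i], genotypes[i - 1], genotypes[i])
            )
    return breakpoints


# Toy data: five markers with two genotype transitions.
positions = [100, 300, 500, 700, 900]
genotypes = ["A", "A", "H", "H", "B"]
bps = find_breakpoints(positions, genotypes)
print(bps)
```

On noise-free genotypes, every change between adjacent markers is a real transition, which is why the reference breakpoints are taken from the VCF with no noise and no sequencing error.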
## Requirements
Python 3
## Running popsimul
```
python popsimul.py -o test1511.vcf -ni 10 -s 44000000 -cM 180 -nm 220000 -a 3 -m 10 -eA 0.03 -eB 0.05
```
### Parameters description
You can see all parameters by running this command:
```
python popsimul.py -h
```
| Short | Long | Description | Default value |
|---|---|---|---|
| -o | --output | Base name for the generated files: [output].vcf, [output]_errors.vcf, [output]_noise_errors.vcf, [output]_Breakpoints.csv | popsimul |
| -ni | --nb_individuals | Number of individuals | 10 |
| -bp | --chr_bp | Chromosome size (bp) | 44000000 |
| -cM | --cM_size | Genetic map size (cM) | 180 |
| -nm | --nb_markers | Number of markers | 220000 |
| -a | --average_depth | Average depth | 3 |
| -m | --max_depth | Maximum depth | 10 |
| -eA | --error_rate_A | Error rate for A | 0.05 |
| -eB | --error_rate_B | Error rate for B | 0.05 |
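
As a rough sanity check, the default values can be converted into a marker density and an average recombination rate. The sketch below is plain arithmetic on the defaults and does not call popsimul. The variable names are illustrative, and it assumes that the chromosome size is given in base pairs and that markers are spread evenly along the chromosome.

```python
# Default values taken from the parameter table.
chr_bp = 44_000_000   # chromosome size (assumed to be in bp)
cM_size = 180         # genetic map size in cM
nb_markers = 220_000  # number of markers

# Average spacing between markers, assuming an even spread.
bp_per_marker = chr_bp / nb_markers

# Average recombination rate along the chromosome.
cM_per_Mb = cM_size / (chr_bp / 1_000_000)

# 100 cM corresponds to one expected crossover per meiosis.
expected_crossovers = cM_size / 100

print(f"{bp_per_marker:.0f} bp per marker")
print(f"{cM_per_Mb:.2f} cM/Mb")
print(f"{expected_crossovers:.1f} expected crossovers per meiosis")
```

With the defaults this gives one marker every 200 bp on average, about 4.09 cM/Mb, and 1.8 expected crossovers per meiosis on the chromosome.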