Cécile Triay (IRD)  
Alice Boizet (CIRAD)  
Mathias Lorieux (IRD)  
Mathieu Triay (Atelier Triay)  
## Description
This tool simulates a VCF file for a bi-parental population. It can be used to produce a VCF under controlled conditions by setting parameters such as average depth, marker density and expected error rate.  
It produces 3 VCF files:
- simulated genotypes with no noise and no sequencing error
- simulated genotypes with noise but no sequencing error
- simulated genotypes with noise and sequencing error
The Breakpoints CSV file contains the exact positions of the genotype transitions, based on the simulated genotypes with no noise and no sequencing error.
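
To illustrate what a genotype transition is, here is a minimal sketch that scans the noise-free genotypes of one individual along a chromosome and reports every change between consecutive markers. The function name `find_breakpoints`, the `A`/`H`/`B` genotype coding and the tuple layout are assumptions made for this example; they do not describe popsimul's actual code or the columns of the Breakpoints CSV file.

```python
def find_breakpoints(positions, genotypes):
    """Return one tuple per genotype change between consecutive markers.

    Each tuple is (left_position, right_position, left_genotype, right_genotype).
    Hypothetical helper for illustration only.
    """
    breakpoints = []
    for i in range(1, len(genotypes)):
        if genotypes[i] != genotypes[i - 1]:
            breakpoints.append(
                (positions[i - 1], positions[i], genotypes[i - 1], genotypes[i])
            )
    return breakpoints


# Toy data: five markers, assumed coding A / H / B
positions = [100, 300, 500, 700, 900]
genotypes = ["A", "A", "H", "H", "B"]

print(find_breakpoints(positions, genotypes))
# [(300, 500, 'A', 'H'), (700, 900, 'H', 'B')]
```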

This tool has been used to test NOISYmputer (https://gitlab.cirad.fr/noisymputer/noisymputerstandalone) by comparing the exact breakpoint positions to the breakpoint positions determined on imputed data.

## Requirements
Python 3

## Running popsimul

```
python popsimul.py -o test1511.vcf -ni 10 -bp 44000000 -cM 180 -nm 220000 -a 3 -m 10 -eA 0.03 -eB 0.05
```
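
As a quick sanity check on the values used in this example (a 44,000,000 bp chromosome, a 180 cM genetic map and 220,000 markers), the short calculation below derives the marker density and the average recombination rate they imply. The variable names are chosen for this sketch only, and the calculation assumes markers are spread evenly, which may not match how popsimul places them.

```python
chr_bp = 44_000_000   # chromosome size in bp
cm_size = 180         # genetic map size in cM
nb_markers = 220_000  # number of markers

# Average physical spacing between markers, assuming even spacing
bp_per_marker = chr_bp / nb_markers

# Average recombination rate over the chromosome
cm_per_mb = cm_size / (chr_bp / 1_000_000)

# Average number of markers per centimorgan
markers_per_cm = nb_markers / cm_size

print(f"{bp_per_marker:.0f} bp per marker")      # 200 bp per marker
print(f"{cm_per_mb:.2f} cM/Mb")                  # 4.09 cM/Mb
print(f"{markers_per_cm:.0f} markers per cM")    # 1222 markers per cM
```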

### Parameters description

You can list all parameters by running this command:

```
python popsimul.py -h
```

| Short  | Long | Description | Default value |
|---|---|---|---|  
| -o  | --output | Base name for the generated files: [output].vcf, [output]_errors.vcf, [output]_noise_errors.vcf, [output]_Breakpoints.csv | popsimul |
| -ni | --nb_individuals | Number of individuals | 10 |
| -bp | --chr_bp | Chromosome size (bp) | 44000000 |
| -cM | --cM_size | Size of the genetic map (cM) | 180 |
| -nm | --nb_markers | Number of markers | 220000 |
| -a  | --average_depth | Average depth | 3 |
| -m  | --max_depth | Maximum depth | 10 |
| -eA | --error_rate_A | Error rate for A | 0.05 |
| -eB | --error_rate_B | Error rate for B | 0.05 |
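
For readers who want to drive a similar script from their own code, the table above can be mirrored with Python's standard `argparse` module. The sketch below is a hypothetical reconstruction from the table only: the option names and defaults are copied from it, while the argument types (`int`, `float`) and the function name `build_parser` are assumptions. It is not the actual source of `popsimul.py`.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (types are assumed)."""
    parser = argparse.ArgumentParser(prog="popsimul.py")
    parser.add_argument("-o", "--output", default="popsimul")
    parser.add_argument("-ni", "--nb_individuals", type=int, default=10)
    parser.add_argument("-bp", "--chr_bp", type=int, default=44000000)
    parser.add_argument("-cM", "--cM_size", type=float, default=180)
    parser.add_argument("-nm", "--nb_markers", type=int, default=220000)
    parser.add_argument("-a", "--average_depth", type=float, default=3)
    parser.add_argument("-m", "--max_depth", type=int, default=10)
    parser.add_argument("-eA", "--error_rate_A", type=float, default=0.05)
    parser.add_argument("-eB", "--error_rate_B", type=float, default=0.05)
    return parser


# Options that are not given on the command line keep their default value
args = build_parser().parse_args(["-ni", "50", "-eA", "0.03"])
print(args.nb_individuals, args.error_rate_A, args.chr_bp)
# 50 0.03 44000000
```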