How to run frangiPANe
DATA:
    project_name: "rice"
    out_dir: "/home/christine/Documents/Dev/frangiPANe_snake"
    fastq_dir: "/home/christine/Documents/Dev-test/data_test/fastq"
    group_file: "/home/christine/Documents/Dev-test/data_test/rice_group.txt"
    ref_file: "/home/christine/Documents/Dev-test/data_test/ref.fasta"
    univec_file: "/home/christine/Documents/Dev-test/data_test/bank/UniVec_Core"
    cpu_number: "6"
The table below describes each input needed to launch frangiPANe:
Name | Description |
---|---|
Project Name | The name of the directory that will contain all the results generated by this analysis |
Output Directory | The parent directory that will contain the project_name directory |
Fastq Directory | The directory that contains the .fastq files of all the individuals |
Reference File | The reference genome used to map all reads (fastq files) |
Group File | When building a pangenome, individuals often come from different origins. To exploit this diversity in the results, frangiPANe needs a tab-separated file (no header): the first column holds the names of the individuals (which should match the .fastq file names) and the second column the group each individual belongs to. NB: a single group may be used. |
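The group file format described above can be checked before launching the pipeline. The following is a minimal sketch and not part of frangiPANe: the helper names `sample_name`, `read_groups` and `check_groups` are hypothetical, and it assumes that the individual name is the FASTQ file name without its extension and without a trailing `_R1` (or `_R2`, which is an assumption for paired reads).

```python
import csv
from pathlib import Path

# File extensions taken from the FASTQ naming convention of this document.
FASTQ_SUFFIXES = (".fastq.gz", ".fq.gz", ".fastq", ".fq")


def sample_name(fastq_path):
    """Return NAME from a NAME_R1.fastq.gz-style file name (assumed layout)."""
    name = Path(fastq_path).name
    for suffix in FASTQ_SUFFIXES:
        if name.endswith(suffix):
            name = name[: -len(suffix)]
            break
    # Drop a trailing read tag; handling _R2 is an assumption of this sketch.
    for tag in ("_R1", "_R2"):
        if name.endswith(tag):
            name = name[: -len(tag)]
    return name


def read_groups(group_file):
    """Parse a two-column, tab-separated group file without header."""
    groups = {}
    with open(group_file, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        for line_no, row in enumerate(reader, start=1):
            if not row:
                continue  # skip blank lines
            if len(row) != 2:
                raise ValueError(
                    f"line {line_no}: expected 2 columns, got {len(row)}"
                )
            individual, group = row
            groups[individual] = group
    return groups


def check_groups(group_file, fastq_dir):
    """Return (individuals without FASTQ files, FASTQ samples without a group)."""
    groups = read_groups(group_file)
    samples = {
        sample_name(path)
        for path in Path(fastq_dir).iterdir()
        if path.name.endswith(FASTQ_SUFFIXES)
    }
    return sorted(set(groups) - samples), sorted(samples - set(groups))
```

Calling `check_groups(group_file, fastq_dir)` with the two paths from the configuration returns two lists; both are empty when every individual in the group file has a FASTQ file and every FASTQ sample has a group.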
Warning
The reference genome must be provided in FASTA format and indexed with BWA for subsequent analyses (indexing creates the REF.amb, REF.ann, REF.bwt, REF.pac and REF.sa files). Accepted FASTQ file names are NAME_R1.fastq.gz, NAME_R1.fq.gz, NAME_R1.fastq or NAME_R1.fq. Prefer short names and avoid special characters.
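The FASTQ naming convention above can be expressed as a regular expression and used to screen a directory listing. This is a sketch under stated assumptions, not the check frangiPANe itself performs: the names `FASTQ_NAME` and `check_fastq_names` are hypothetical, `_R2` is accepted alongside `_R1`, and "no special characters" is interpreted as letters, digits, `.`, `_` and `-` only.

```python
import re

# NAME_R1 followed by .fastq.gz, .fq.gz, .fastq or .fq.
# Accepting _R2 and limiting NAME to [A-Za-z0-9._-] are assumptions of this sketch.
FASTQ_NAME = re.compile(
    r"^(?P<name>[A-Za-z0-9._-]+)_R(?P<read>[12])\.(?:fastq|fq)(?:\.gz)?$"
)


def check_fastq_names(file_names):
    """Split file names into (accepted, rejected) according to FASTQ_NAME."""
    accepted, rejected = [], []
    for file_name in file_names:
        if FASTQ_NAME.match(file_name):
            accepted.append(file_name)
        else:
            rejected.append(file_name)
    return accepted, rejected
```

For example, `check_fastq_names(os.listdir(fastq_dir))` (after `import os`) would list the files in the FASTQ directory whose names should be changed before running the pipeline.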
snakemake -r -c1 --configfile=config.yaml