How to run frangiPANe

DATA:
    project_name: "rice"
    out_dir: "/home/christine/Documents/Dev/frangiPANe_snake"
    fastq_dir: "/home/christine/Documents/Dev-test/data_test/fastq"
    group_file: "/home/christine/Documents/Dev-test/data_test/rice_group.txt"
    ref_file: "/home/christine/Documents/Dev-test/data_test/ref.fasta"
    univec_file: "/home/christine/Documents/Dev-test/data_test/bank/UniVec_Core"
    cpu_number: "6"

The table below summarizes the data needed to launch frangiPANe:

Name                             Description
-------------------------------  ------------------------------------------------------------
Project Name (project_name)      The name of the directory that will contain all the results
                                 generated by this analysis.

Output Directory (out_dir)       The parent directory that will contain the project_name
                                 directory.

Fastq Directory (fastq_dir)      The directory that contains the .fastq files of all the
                                 individuals.

Reference File (ref_file)        The reference genome used to map all reads (fastq files).

Group File (group_file)          Individuals used to build a pangenome often have different
                                 origins. To exploit this diversity in the results,
                                 frangiPANe needs a tab-separated file with no header: the
                                 first column is the name of the individual (it must match
                                 the names of the .fastq files) and the second column is the
                                 group the individual belongs to. NB: a single group may be
                                 used.
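
As an illustration of the group file format described above, here is a minimal Python sketch that reads such a file into a dictionary mapping each group to its individuals. It is not part of frangiPANe; the function name `read_group_file` and the individual and group names in the demo are hypothetical.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path


def read_group_file(path):
    """Parse a tab-separated, header-less group file into {group: [individuals]}."""
    groups = defaultdict(list)
    with open(path, newline="") as handle:
        for line_no, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
            if not row or all(not cell.strip() for cell in row):
                continue  # skip blank lines
            if len(row) != 2:
                raise ValueError(
                    f"line {line_no}: expected 2 tab-separated columns, got {len(row)}"
                )
            individual, group = (cell.strip() for cell in row)
            groups[group].append(individual)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical individuals and groups, written to a temporary file for the demo.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "rice_group.txt"
        demo.write_text("ind1\tgroupA\nind2\tgroupA\nind3\tgroupB\n")
        print(read_group_file(demo))
```

Running a check like this before launching the workflow can catch a stray header line or a space used instead of a tabulation.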

Warning

The reference genome must be provided in FASTA format and must be indexed with BWA for subsequent analyses (indexing creates the REF.amb, REF.ann, REF.bwt, REF.pac and REF.sa files). For FASTQ files, the accepted naming conventions are NAME_R1.fastq.gz, NAME_R1.fq.gz, NAME_R1.fastq and NAME_R1.fq. Prefer short names and avoid special characters.
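
The two requirements above can be checked before a run with a short Python sketch such as the one below. It is only an illustration, not frangiPANe code: the helper names are hypothetical, only the NAME_R1 patterns listed above are covered, and it assumes the BWA index files sit next to the reference with the extension appended to the full file name (for example ref.fasta.amb), which is the default behaviour of `bwa index`.

```python
from pathlib import Path

# Suffixes taken from the naming convention and the BWA index file list above.
FASTQ_SUFFIXES = ("_R1.fastq.gz", "_R1.fq.gz", "_R1.fastq", "_R1.fq")
BWA_INDEX_EXTENSIONS = (".amb", ".ann", ".bwt", ".pac", ".sa")


def sample_name(fastq_path):
    """Return NAME for an accepted NAME_R1.* file name, or None otherwise."""
    file_name = Path(fastq_path).name
    for suffix in FASTQ_SUFFIXES:
        if file_name.endswith(suffix) and len(file_name) > len(suffix):
            return file_name[: -len(suffix)]
    return None


def missing_bwa_index_files(ref_file):
    """List the expected BWA index files that are absent next to ref_file.

    Assumption: index files are named <ref_file><extension>.
    """
    expected = [Path(str(ref_file) + ext) for ext in BWA_INDEX_EXTENSIONS]
    return [path for path in expected if not path.exists()]
```

For example, `sample_name("ind1_R1.fastq.gz")` returns `"ind1"`, while a file named `ind1.fastq` returns `None` because it does not follow the convention. An empty list from `missing_bwa_index_files` means all five index files were found.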

Once the configuration file is filled in, launch the workflow with:

    snakemake -r -c1 --configfile=config.yaml