Commit f6cd74ad authored by nicolas.fernandez_ird.fr

Edit README.md : change Git path from transvihmi/ dir to nfernande/ subdir

parent 5c236367
......@@ -19,28 +19,101 @@
## ~ ABOUT ~ ##
RQC is a pipeline used to check read quality from NGS sequencing.
### RQC ###
RQC is a FAIR, open-source, scalable, modulable and traceable Snakemake pipeline for quality control of Illumina Inc. short reads.
RQC is included as the first step of the **[GeVarLi](https://www.afroscreen.org/)** workflow.
### Genomic sequencing, a public health tool ###
The establishment of a surveillance and sequencing network is an essential public health tool for detecting and containing pathogens with epidemic potential. Genomic sequencing makes it possible to identify pathogens, monitor the emergence and impact of variants, and adapt public health policies accordingly.
The Covid-19 epidemic has highlighted the disparities that remain between continents in terms of surveillance and sequencing systems. At the end of October 2021, of the 4,600,000 sequences shared on the public and free GISAID tool worldwide, only 49,000 came from the African continent, i.e. less than 1% of the cases of Covid-19 diagnosed on this continent.
### Features ###
- Control reads quality (_multiQC html report_)
- Reads quality control
- Fastq-Screen
- FastQC
- MultiQC (_html report_)
### Version ###
*V.2022.11*
### Citation ###
_none_
### Rulegraph ###
<img src="./resources/visuals/quality_control_rulegraph.png" width="250" height="150">
### Rulegraph ###
<img src="./resources/visuals/quality_control_rulegraph.png" width="500" height="250">
## ~ SUPPORT ~ ##
1. Read The Fabulous Manual!
2. Read the Awesome Wiki!
3. Create a new issue: Issues > New issue > Describe your issue
4. Send an email to [nicolas.fernandez@ird.fr](url)
## ~ CITATION ~ ##
If you use this pipeline, *please* cite *RQC*, its GitLab IRDForge repository and its authors:
GitLab IRDForge repository: [https://forge.ird.fr/transvihmi/nfernandez/RQC](https://forge.ird.fr/transvihmi/nfernandez/RQC)
RQC, a FAIR, open-source, scalable, modulable and traceable snakemake pipeline,
for Illumina Inc. short reads quality controls.
Nicolas FERNANDEZ NUÑEZ _(1)_
_(1) UMI 233 - Recherches Translationnelles sur le VIH et les Maladies Infectieuses endémiques et émergentes (TransVIHMI), University of Montpellier (UM), French Institute of Health and Medical Research (INSERM), French National Research Institute for Sustainable Development (IRD)_
## ~ AUTHORS & ACKNOWLEDGMENTS ~ ##
- Nicolas Fernandez - IRD _(Developer and Maintainer)_
- Christelle Butel - IRD _(Reporter)_
- DALL•E mini - OpenAI [Git](https://github.com/borisdayma/dalle-mini) _(Repo. avatar)_
## ~ LICENSE ~ ##
Licensed under [GPLv3](https://www.gnu.org/licenses/gpl-3.0.html)
Intellectual property belongs to [IRD](https://www.ird.fr/) and authors.
## ~ ROADMAP ~ ##
- Add MultiQC config template
## ~ PROJECT STATUS ~ ##
This project is **regularly updated** and **actively maintained**
However, you can volunteer to step in as **developer** or **maintainer**
## ~ CONTRIBUTING ~ ##
Open to contributions!
- Asking for update
- Proposing new feature
- Reporting issue
- Fixing issue
- Sharing code
- Citing tool
## ~ INSTALLATIONS ~ ##
# Conda _(mandatory)_ #
RQC _(with Snakemake)_ uses the useful **Conda** environment manager
RQC uses the useful **Conda** environment manager
So, if and only if Conda is not already installed, please install **Conda** first!
Download and install the version of the [Latest Miniconda Installer](https://docs.conda.io/en/latest/miniconda.html#latest-miniconda-installer-links) that matches your OS
......@@ -65,23 +138,23 @@ rm -f ~/Miniconda3-latest-Linux-x86_64.sh && \
exit
```
Update Conda:
```
conda update -n base -c defaults conda
```
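Since Conda only needs to be installed when it is missing, a quick check can come first. The following is a minimal sketch, not part of RQC: the `conda_status` helper is a hypothetical name, and it only reports whether a `conda` command is found on your `PATH`, without installing anything.

```shell
# Report whether the conda command is on PATH (nothing is installed here).
# conda_status is a hypothetical helper name, not provided by RQC or Conda.
conda_status() {
    if command -v conda >/dev/null 2>&1; then
        echo "installed"
    else
        echo "missing"
    fi
}

status="$(conda_status)"
echo "Conda is ${status}"
```

If the result is "missing", follow the Miniconda installation steps above; otherwise only the update step is needed.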
# RQC #
Clone _(with HTTPS)_ the [RQC](https://forge.ird.fr/transvihmi/Reads_Quality_Control) repository on GitLab _(ID: 404)_:
Clone the [RQC](https://forge.ird.fr/transvihmi/nfernandez/Reads_Quality_Control) GitLab IRDForge repository _(ID: 404)_ into your home directory:
```shell
git clone https://forge.ird.fr/transvihmi/RQC.git ~/Reads_Quality_Control/ && \
cd ~/Reads_Quality_Control/
git clone https://forge.ird.fr/transvihmi/nfernandez/RQC.git ~/RQC/
```
Difference between **Download** and **Clone**:
- To create a copy of a remote repository’s files on your computer, you can either **Download** or **Clone** the repository
- If you download it, you **cannot sync** the repository with the remote repository on GitLab
- Cloning a repository is the same as downloading, except it preserves the Git connection with the remote repository
- You can then modify the files locally and upload the changes to the remote repository on GitLab
- You can then **update** the files locally and download the changes from the remote repository on GitLab
Update RQC:
```shell
git reset --hard HEAD && \
git pull --verbose
cd ~/RQC/ && git reset --hard HEAD && git pull --verbose
```
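The choice between cloning and updating depends on whether a Git checkout already exists in the target directory. Here is a small sketch under stated assumptions: `rqc_action` and the `RQC_DIR` variable are hypothetical names used for illustration, and the script only prints the command it would suggest instead of running it.

```shell
# Decide between cloning and updating, based on whether a Git checkout exists.
# rqc_action and RQC_DIR are hypothetical names, not part of RQC.
rqc_action() {
    # $1: target directory (the README uses ~/RQC/)
    if [ -d "$1/.git" ]; then
        echo "update"
    else
        echo "clone"
    fi
}

target="${RQC_DIR:-$HOME/RQC}"
case "$(rqc_action "$target")" in
    clone)  echo "git clone https://forge.ird.fr/transvihmi/nfernandez/RQC.git $target" ;;
    update) echo "cd $target && git pull --verbose" ;;
esac
```

Note that `git reset --hard HEAD`, used in the update command above, discards uncommitted local changes to tracked files, so the sketch deliberately leaves it out of the suggested command.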
## ~ USAGE ~ ##
......@@ -101,9 +174,9 @@ _Option-1: Edit **config.yaml** file in **./config/** directory_
_Option-2: Edit **fastq-screen.conf** file in **./config/** directory_
The first run will automatically create _(only once)_:
- RQC-Base conda environment _(Snakemake, Mamba, Rename, GraphViz)_
- Snakemake-conda environments _(for each tool used by GeVarLi)_
- Indexes for BWA and BOWTIE2 aligners _(for each FASTA genome in resources)_
- Snakemake-Base conda environment _(Snakemake, Mamba, Rename, GraphViz)_
- RQC-conda environments _(for each tool used by RQC)_
- Indexes for BWA aligner _(for each FASTA genome in resources)_
_This may take some time, depending on your internet connection and your computer_
......@@ -153,16 +226,21 @@ Yours results are available in **./results/** directory, as follow:
```
### fastq-screen ###
Screens your libraries against the genomes of the organisms you work on, along with PhiX, vectors,
and other contaminants commonly seen in sequencing experiments.
More about [fastq-screen](https://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/)
### fastqc ###
Modular set of analyses which you can use to give a quick impression of whether
your data has any problems of which you should be aware before doing any further analysis.
More about [fastqc](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)
### multiqc ###
Compiled HTML report. More about [multiqc](https://multiqc.info/)
......@@ -171,66 +249,32 @@ Compiled HTML report. More about [multiqc](https://multiqc.info/)
You can edit the default settings in the **config.yaml** file in the **./config/** directory:
### Resources ###
Edit to match your hardware configuration
- **cpus**: for tools that support multithreading _(e.g. bwa)_, use at most n CPUs in parallel _(default config: '8')_
_**Note**: snakemake (with default Start bash script) will always use all cpus to parallelize jobs_
- **ram**: for tools that support it _(e.g. samtools)_, limit memory usage to at most n Gb _(default config: '16' Gb)_
- **tmpdir**: for tools that support it _(e.g. pangolin)_, specify where temporary files are written _(default config: '$TMPDIR')_
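To pick **cpus** and **ram** values that match your hardware, it can help to look at what the machine reports. This is a Linux-only sketch, assuming `nproc` (GNU coreutils) and `/proc/meminfo` are available; it is not part of RQC and only prints values for you to copy into **config.yaml**.

```shell
# Linux-only sketch: report CPU count and total RAM to help choose 'cpus' and 'ram'.
cpus="$(nproc)"
# MemTotal in /proc/meminfo is given in kB; convert to whole Gb (rounded down).
ram_gb="$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024)}' /proc/meminfo)"

echo "cpus available: ${cpus}"
echo "ram available: ${ram_gb} Gb"
# Fall back to /tmp when TMPDIR is unset (an assumption, adjust to your system).
echo "tmpdir: ${TMPDIR:-/tmp}"
```

Keeping the configured values somewhat below the reported ones leaves room for the operating system and other processes.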
### Environments ###
Edit if you want to change some environments _(e.g. to test a new version)_ in the ./workflow/envs/{tools}_v.{version}.yaml files
### Fastq-Screen ###
- **config**: path to the fastq-screen configuration file _(default config: ./config/fastq-screen.conf)_
- **subset**: do not use the whole sequence file, but create a temporary dataset of the specified number of reads _(default config: '1000')_
- **aligner**: specify the aligner to use for the mapping. Valid arguments are 'bowtie', 'bowtie2' or 'bwa' _(default config: 'bwa')_
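These three settings correspond to the `--conf`, `--subset` and `--aligner` options of the `fastq_screen` command line. The sketch below shows how such a call could be assembled from the default values; how RQC actually builds its own call is not shown in this README, and the sample file name is a placeholder. The command is printed rather than executed, so the sketch runs without fastq-screen installed.

```shell
# Hypothetical values mirroring the defaults listed above.
conf="./config/fastq-screen.conf"
subset=1000
aligner="bwa"
sample="sample_R1.fastq.gz"   # placeholder file name, not shipped with RQC

cmd="fastq_screen --conf ${conf} --subset ${subset} --aligner ${aligner} ${sample}"
echo "${cmd}"   # dry run: print the command instead of executing it
```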
#### fastq-screen.conf ####
- **databases**: enables you to configure multiple genomes databases _(aligner index files)_ to search against
## ~ SUPPORT ~ ##
1. Read The Fabulous Manual!
2. Read the Awesome Wiki! (todo...)
3. Create a new issue: Issues > New issue > Describe your issue
4. Send an email to [nicolas.fernandez@ird.fr](url)
## ~ ROADMAP ~ ##
- Open to suggestions
## ~ AUTHORS & ACKNOWLEDGMENTS ~ ##
- Nicolas Fernandez - IRD _(Developer and Maintainer)_
- Christelle Butel - IRD _(Reporter)_
- Eddy Kinganda-Lusamaki - INRB _(Source)_
- DALL•E mini - OpenAI [Git](https://github.com/borisdayma/dalle-mini) _(Repo. avatar)_
## ~ CONTRIBUTING ~ ##
- Open to contributions!
- Testing code, finding issues, asking for update, proposing new features...
- Use Git tools to share!
## ~ PROJECT STATUS ~ ##
This project is **regularly updated** and **actively maintained**
However, you can volunteer to step in as **developer** or **maintainer**
For information about main git roles:
- **Guests** are _not active contributors_ in private projects, they can only see, and leave comments and issues
- **Reporters** are _read-only contributors_, they can't write to the repository, but can on issues
- **Developers** are _direct contributors_, they have access to everything to go from idea to production
_Unless something has been explicitly restricted_
- **Maintainers** are _super-developers_, they are able to push to master, deploy to production
_This role is often held by maintainers and engineering managers_
- **Owners** are essentially _group-admins_, they can give access to groups and have destructive capabilities
## ~ LICENSE ~ ##
Licensed under [GPLv3](https://www.gnu.org/licenses/gpl-3.0.html)
Intellectual property belongs to [IRD](https://www.ird.fr/) and authors.
- **databases**: enables you to configure multiple genomes databases _(aligner index files)_ to search against
### RQC map ###
```shell
🧩 Reads_Quality_Control/
├── 🖥️ Start_GeVarLi.sh
......@@ -301,6 +345,7 @@ Intellectual property belongs to [IRD](https://www.ird.fr/) and authors.
## ~ REFERENCES ~ ##
**Sustainable data analysis with Snakemake**
Felix Mölder, Kim Philipp Jablonski, Brice Letcher, Michael B. Hall, Christopher H. Tomkins-Tinch, Vanessa Sochat, Jan Forster, Soohyun Lee, Sven O. Twardziok, Alexander Kanitz, Andreas Wilm, Manuel Holtgrewe, Sven Rahmann, Sven Nahnsen, Johannes Köster
_F1000Research (2021)_
......@@ -351,4 +396,5 @@ _Online (2010)_
**Source code**: [https://github.com/s-andrews/FastQC](https://github.com/s-andrews/FastQC)
**Documentation**: [https://www.bioinformatics.babraham.ac.uk/projects/fastqc](https://www.bioinformatics.babraham.ac.uk/projects/fastqc)
###############################################################################