Commit 68d4e918 authored by Nicolas FERNANDEZ NUÑEZ

Working on README.md

parent 4961f780
# RQCP: Reads Quality Control Pipeline #
# RQC: Reads Quality Control #
## Description ##
RQCP checks the quality of NGS (Illumina) reads and cleans them if needed, according to your settings, using:
- Cutadapt to trim NGS sequencing adapters
- Sickle-trim to trim reads on base-calling quality score
- Fastq-join to join mate reads (forward R1 and reverse R2) when possible
- FastQC to check global quality
- FastqScreen to check putative contamination(s)
- MultiQC to generate HTML reports
RQC is a bioinformatics pipeline used to check the quality of reads from NGS sequencing
This is the **macOSX** version (specific conda environments).
## Badges ##
![Maintener](<https://badgen.net/badge/Maintener/Nicolas Fernandez/blue?scale=0.9>)
![MacOS](<https://badgen.net/badge/icon/Hight Sierra (10.13),Catalina (10.15),Big Sure (11)/cyan?icon=apple&label&list=|&scale=0.9>)
![MacOSX](<https://badgen.net/badge/icon/Hight Sierra (10.13) | Catalina (10.15) | Big Sure (11)/E6055C?icon=apple&label&list=|&scale=0.9>)
![Issues closed](<https://badgen.net/badge/Issues closed/2/green?scale=0.9>)
![Issues opened](<https://badgen.net/badge/Issues opened/0/yellow?scale=0.9>)
![Maintened](<https://badgen.net/badge/Maintened/Yes/red?scale=0.9>)
![Wiki](<https://badgen.net/badge/icon/Wiki/pink?icon=wiki&label&scale=0.9>)
![Open Source](<https://badgen.net/badge/icon/Open Source/purple?icon=https://upload.wikimedia.org/wikipedia/commons/4/44/Corazón.svg&label&scale=0.9>)
![GNU AGPL v3](<https://badgen.net/badge/Licence/GNU AGPL v3/grey?scale=0.9>)
![Gitlab](<https://badgen.net/badge/icon/Gitlab/orange?icon=gitlab&label&scale=0.9>)
![Bash](<https://badgen.net/badge/icon/Bash 3.2.57/black?icon=terminal&label&scale=0.9>)
![Python](<https://badgen.net/badge/icon/Python 3.8.7/black?icon=https://upload.wikimedia.org/wikipedia/commons/0/0a/Python.svg&label&scale=0.9>)
![Snakemake](<https://badgen.net/badge/icon/Snakemake 5.11.2/black?icon=https://upload.wikimedia.org/wikipedia/commons/d/d3/Python_icon_%28black_and_white%29.svg&label&scale=0.9>)
![Conda](<https://badgen.net/badge/icon/Conda 4.10.3/black?icon=codacy&label&scale=0.9>)
## Visuals ##
_Good idea to include screenshots or GIFs (see ttygif or Asciinema)_
<img src="./visuals/rulegraph.png" width="150" height="300">
## Installation ##
### Conda _(prior!)_ ###
Download and install **Conda**: [Latest Miniconda Installer](https://docs.conda.io/en/latest/miniconda.html#latest-miniconda-installer-links)
1. Download conda installer _(e.g. for Miniconda3 with Python 3.9 on MacOSX-64-bit)_:
```shell
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
```
2. Install conda using installer bash script:
_Follow the prompts on the installer screens_
Install **Conda** (_i.e. Miniconda3 with Python 3.9 on MacOSX-64-bit_)
[Latest Miniconda Installer](https://docs.conda.io/en/latest/miniconda.html#latest-miniconda-installer-links)
_Follow the screen prompt instructions_
```shell
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
bash Miniconda3-latest-MacOSX-x86_64.sh
```
3. Remove conda installer:
```shell
rm Miniconda3-latest-MacOSX-x86_64.sh
```
_Restart shell (close and reopen new terminal window)_
4. Restart shell, close and reopen new terminal window
### Snakemake _(prior!)_ ###
Install **Snakemake** using Conda package management system
_Follow the prompts on the installer screens_
Install **Snakemake** (_i.e. v.6.12.1_) using Conda
_Follow the screen prompt instructions_
```shell
conda install -c bioconda -c conda-forge snakemake
conda install -c conda-forge mamba --yes
mamba install -c bioconda rename --yes
mamba install -c conda-forge -c bioconda snakemake=6.12.1 --yes
```
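Before going further, it can help to confirm that the tools installed above are reachable from a new terminal. The following is a minimal sketch, not part of the pipeline: `check_tools` is a hypothetical helper name, and it only tests whether each command is found on the `PATH`.

```shell
# Report whether each command given as argument is available on the PATH.
# Returns 0 only if every command is found.
# Note: check_tools is a hypothetical helper, not shipped with the pipeline.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example: check the tools installed above (prints one line per tool)
check_tools conda mamba snakemake || echo "Install the missing tools before running the pipeline"
```

If a tool is reported as missing right after installation, restarting the shell as described in the Conda step is usually the first thing to try.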
### RQCP ###
### RQC ###
**Download** _OR_ clone the **Reads Quality Control Pipeline** project
**Download** _OR_ clone the **RQC pipeline** project
#### Download ####
#### Difference between **Download** and **Clone** ####
To create a copy of a remote repository’s files on your computer, you can either download or clone the repository
If you download it, you cannot sync the repository with the remote repository on GitLab
Cloning a repository is the same as downloading, except it preserves the Git connection with the remote repository
You can then modify the files locally and upload the changes to the remote repository on GitLab
- Download source code archive (_zip_, **tar.gz**, _tar.bz2_, _tar_): [RQCP on GitLab](https://gitlab.com/ird_transvihmi/Reads_Quality_Control_Pipeline)
```shell
wget https://gitlab.com/ird_transvihmi/Reads_Quality_Control_Pipeline/-/archive/main/Reads_Quality_Control_Pipeline-main.tar.gz -P ~/Desktop/
```
#### Download ####
_alternatively_:
- Download source code archive (_zip_, **tar.gz**, _tar.bz2_, _tar_)
[RQCP on GitLab](https://gitlab.com/ird_transvihmi/Reads_Quality_Control_Pipeline)
![Image of download button](./visuals/download_button.png)
- Extract and remove the archive (e.g. tar.gz):
```shell
tar -xzvf path/to/archive/Reads_Quality_Control_Pipeline-main.tar.gz
rm path/to/archive/Reads_Quality_Control_Pipeline-main.tar.gz
mv ~/Desktop/Reads_Quality_Control_Pipeline-main ~/Desktop/Reads_Quality_Control_Pipeline
cd ~/Desktop/Reads_Quality_Control_Pipeline
wget https://gitlab.com/ird_transvihmi/Reads_Quality_Control_Pipeline/-/archive/main/Reads_Quality_Control_Pipeline-main.tar.gz
tar -xzvf Reads_Quality_Control_Pipeline-main.tar.gz
rm -f Reads_Quality_Control_Pipeline-main.tar.gz
mv Reads_Quality_Control_Pipeline-main/ ~/Desktop/RQC_Pipeline/
```
#### Clone ####
- Clone with **SSH** when you want to authenticate only one time
Authenticate with GitLab by following the instructions in the [SSH documentation](https://docs.gitlab.com/ee/ssh/index.html)
- Clone with **HTTPS** (_when you want to authenticate each time you perform an operation between your computer and GitLab_)
Authenticate with GitLab by following the instructions in the [2FA documentation](https://docs.gitlab.com/ee/user/profile/account/two_factor_authentication.html)
```shell
git clone git@gitlab.com:ird_transvihmi/Reads_Quality_Control_Pipeline.git
cd Reads_Quality_Control_Pipeline
git clone https://gitlab.com/ird_transvihmi/Reads_Quality_Control_Pipeline.git
mv Reads_Quality_Control_Pipeline/ ~/Desktop/RQC_Pipeline/
```
Clone with **HTTPS** when you want to authenticate each time you perform an operation between your computer and GitLab
- Clone with **SSH** (_when you want to authenticate only one time_)
Authenticate with GitLab by following the instructions in the [SSH documentation](https://docs.gitlab.com/ee/ssh/index.html)
```shell
git clone https://gitlab.com/ird_transvihmi/Reads_Quality_Control_Pipeline.git
cd Reads_Quality_Control_Pipeline
git clone git@gitlab.com:ird_transvihmi/Reads_Quality_Control_Pipeline.git
mv Reads_Quality_Control_Pipeline/ ~/Desktop/RQC_Pipeline/
```
#### Difference between download and clone ####
To create a copy of a remote repository’s files on your computer, you can either download or clone the repository
If you download it, you cannot sync the repository with the remote repository on GitLab
Cloning a repository is the same as downloading, except it preserves the Git connection with the remote repository
You can then modify the files locally and upload the changes to the remote repository on GitLab
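To tell which kind of copy you have on disk, one simple heuristic is to look for Git metadata at the root of the directory. This is a small sketch under that assumption: `copy_kind` is a hypothetical helper, and the example path is the destination directory used in the commands above.

```shell
# Tell whether a local copy was cloned (has Git metadata) or downloaded.
# Assumption: a cloned copy contains a ".git" entry at its root.
# Note: copy_kind is a hypothetical helper, not shipped with the pipeline.
copy_kind() {
    if [ ! -d "$1" ]; then
        echo "missing"
    elif [ -e "$1/.git" ]; then
        echo "clone"
    else
        echo "download"
    fi
}

# Example with the destination directory used above
copy_kind ~/Desktop/RQC_Pipeline
```

A copy reported as `download` cannot be synced with GitLab; to get updates you would download the archive again or switch to a clone.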
## Usage ##
- Copy your **paired-end** reads in **fastq.gz** format files into: **./resources/reads/** directory
- Edit the **config.yaml** file in the **./config/** directory as needed
- Edit the **fastq-screen.conf** file in the **./config/** directory as needed
- Make sure your bash script is executable; if not, you can run in a Terminal:
- Copy your **paired-end** reads in **fastq.gz** format files into: **./resources/reads/** directory
- (_optional_) Edit the **config.yaml** file in the **./config/** directory as needed
- (_optional_) Edit the **fastq-screen.conf** file in the **./config/** directory as needed
- Be sure your bash script is executable
```shell
sudo chmod +x path/to/Reads_Quality_Control_Pipeline/RQCP.sh
```
- Run **GeVarLi.sh** bash script by double-clicking on it _(a terminal window will open and the analysis will start)_
- Run **RQCP.sh** bash script by double-clicking on it
- Enter project name (optional)
A terminal will open. You can close it at the end.
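Since the pipeline expects paired-end reads, a quick check of the reads directory before launching can save a failed run. The sketch below is only an illustration: `find_unpaired_reads` is a hypothetical helper, and it assumes mates are named with `_R1` and `_R2` in otherwise identical file names, which may not match your sequencing provider's convention.

```shell
# List R1 files that have no matching R2 mate in a reads directory.
# Assumption: mates are named <sample>_R1<suffix>.fastq.gz and <sample>_R2<suffix>.fastq.gz
# Note: find_unpaired_reads is a hypothetical helper, not shipped with the pipeline.
find_unpaired_reads() {
    reads_dir="$1"
    unpaired=0
    for r1 in "$reads_dir"/*_R1*.fastq.gz; do
        [ -e "$r1" ] || continue                    # no match: the glob stays literal
        name="$(basename "$r1")"
        mate="${name%%_R1*}_R2${name#*_R1}"         # swap the first _R1 for _R2
        if [ ! -e "$reads_dir/$mate" ]; then
            echo "no mate for $name (expected $mate)"
            unpaired=1
        fi
    done
    return "$unpaired"
}

# Example: run from the pipeline root directory
find_unpaired_reads ./resources/reads && echo "no unpaired R1 file found"
```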
### Results ###
Your results are available in results\_Date\_Hour\_Project
- ... TODO ...
Your results are available in the results directory as follows:
... TODO ...
### Configuration ###
#### Resources ####
Edit to match your hardware configuration
#### Environments ####
Edit if you change some environments (e.g. a new version) in the ./workflow/envs/tools-version.yaml files
#### Datasets ####
Edit to choose the datasets you want to quality-control with FastQC and Fastq-Screen
#### Cutadapt ####
- **length**: Discard reads shorter than length, after trim (default config: '75')
- **kit**: Sequence of an adapter ligated to the 3' end of the first read (default config: truseq / nextera / small)
#### Sickle-trim ####
- **command**: The pipeline expects paired-end reads (default config: 'pe'); see rule sickletrim in the ./workflow/rules/reads_quality_control_pipeline.smk Snakemake file
- **encoding**: If your data come from a recent Illumina run, leave 'sanger' (default config: 'sanger')
- **quality**: [Q-phred score](https://en.wikipedia.org/wiki/Phred_quality_score) limit (default config: '30')
- **length**: Read length limit, after trim (default config: '75')
#### Fastq-Join ####
- **percent**: Percent maximum difference (default config: 5)
- **overlap**: Minimum overlap (default config: 25)
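
To make the settings above more concrete, the sketch below prints (without running anything) the kind of command lines these defaults would roughly correspond to. It is an illustration only: the file names are placeholders, the adapter is the common Illumina TruSeq adapter prefix used as an example, `show_commands` is a hypothetical helper, and the pipeline's actual Snakemake rules may pass different or additional options.

```shell
# Print (without running) command lines matching the default settings listed above.
# Assumptions: placeholder file names; the real pipeline rules may differ.
min_length=75            # Cutadapt / Sickle-trim "length"
quality=30               # Sickle-trim "quality"
encoding=sanger          # Sickle-trim "encoding"
percent=5                # Fastq-Join "percent"
overlap=25               # Fastq-Join "overlap"
adapter="AGATCGGAAGAGC"  # example adapter sequence (TruSeq prefix), an assumption

# Note: show_commands is a hypothetical helper, not shipped with the pipeline.
show_commands() {
    echo "cutadapt -a $adapter -A $adapter -m $min_length -o cut_R1.fastq -p cut_R2.fastq raw_R1.fastq.gz raw_R2.fastq.gz"
    echo "sickle pe -t $encoding -q $quality -l $min_length -f cut_R1.fastq -r cut_R2.fastq -o trim_R1.fastq -p trim_R2.fastq -s trim_single.fastq"
    echo "fastq-join -p $percent -m $overlap trim_R1.fastq trim_R2.fastq -o joined_%.fastq"
}

show_commands
```

Reading the three lines side by side shows where each config value ends up: `length` feeds both the Cutadapt `-m` and Sickle `-l` options, while `percent` and `overlap` only affect the joining step.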
Edit if you change some environments (e.g. a new version) in the ./workflow/envs/tools-version.yaml files
#### Fastq-Screen ####
- **config**: Path to the fastq-screen configuration file (default config: ./config/fastq-screen.conf)
- **subset**: Don't use the whole sequence file, but create a temporary dataset of this specified number of reads (default config: '10000', set '0' for the whole dataset)
- **aligner**: Specify the aligner to use for the mapping. Valid arguments are 'bowtie', 'bowtie2' or 'bwa' (default config: 'bwa')
##### fastq-screen.conf #####
- **path**: Set this value to tell the program where to find your chosen aligner (default: /usr/local/\<tool\>)
- **bismark**: Same for bismark (for bisulfite sequencing only)
- **threads**: Set this value to the number of cores you want for mapping reads (default: 1, but overridden by Snakemake and the config.yaml file)
- **databases**: This section enables you to configure multiple genomes databases (aligner index files) to search against in your screen
##### databases #####
For each genome you need to provide a database name (which **can't** contain spaces) and the location of the aligner index files
>The path to the index files **should include the basename** of the index _(e.g. ./resources/databases/Human/Homo\_sapiens\_h38)_
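
A quick way to check that a database path really points at an index basename, rather than at a directory, is to test for one of the index files next to it. This is a minimal sketch for the default 'bwa' aligner only: it assumes the index was built with `bwa index`, which writes files such as `<basename>.bwt`, and `bwa_index_exists` is a hypothetical helper name.

```shell
# Check that a BWA index exists for a given basename.
# Assumption: the index was built with "bwa index", which writes <basename>.bwt
# (alongside .amb, .ann, .pac and .sa files).
# Note: bwa_index_exists is a hypothetical helper, not shipped with the pipeline.
bwa_index_exists() {
    [ -e "$1.bwt" ]
}

# Example with the basename quoted above
bwa_index_exists ./resources/databases/Human/Homo_sapiens_h38 || echo "BWA index not found for this basename"
```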
@@ -207,28 +186,34 @@ You can also ask for new databases, for genomes references not yet included, to
5. Call me at `+33.(0)4.67.41.55.xx` (No, please don't _O\_o_!)
## Roadmap ##
Add a wiki !
Finish documentation about "terminal" and "results"
Add new features
## Contributing ##
Open to contributions :)
Testing code, finding issues, asking for updates, proposing new features ...
Use Git tools to share!
## Authors and acknowledgment ##
- Nicolas Fernandez (Developer and Maintainer)
- Christelle Butel (Reporter, User-addict, Features inspiration source)
## License ##
[GPLv3](https://www.gnu.org/licenses/gpl-3.0.html)
## Project status ##
This project is regularly updated and actively maintained
However, you can volunteer to step in as a maintainer
[//]: # (I'm out of time for this project, development has slowed down, close to stopped completely, you can be volunteer to step in as a maintainer, or choose to fork this project allowing this project to keep going!)
For information about main git roles:
- **Guests** are _not active contributors_ in private projects, they can only see, and leave comments and issues.
- **Reporters** are _read-only contributors_, they can't write to the repository, but can on issues.