From 117fa3089e1b482e3adda7867af8bc9e46bf20e9 Mon Sep 17 00:00:00 2001
From: Jacques Dainat <jacques.dainat@ird.fr>
Date: Wed, 5 Mar 2025 19:01:47 +0100
Subject: [PATCH] add flags

---
 docs/pages/bash_manip/bash_manip-7-sed.md | 66 ++++++++++++++---------
 1 file changed, 42 insertions(+), 24 deletions(-)

diff --git a/docs/pages/bash_manip/bash_manip-7-sed.md b/docs/pages/bash_manip/bash_manip-7-sed.md
index 301a84f..8a14aa8 100644
--- a/docs/pages/bash_manip/bash_manip-7-sed.md
+++ b/docs/pages/bash_manip/bash_manip-7-sed.md
@@ -1,28 +1,11 @@
-# Extracting from files
+# SED
+## setup
-```bash
-# or curl -O instead of wget
-wget https://ftp.ensembl.org/pub/release-113/gff3/saccharomyces_cerevisiae/Saccharomyces_cerevisiae.R64-1-1.113.gff3.gz
-gunzip Saccharomyces_cerevisiae.R64-1-1.113.gff3.gz
-mv Saccharomyces_cerevisiae.R64-1-1.113.gff3 yeast.gff
-```
-
-You should now have a file called `yeast.gff` in your working directory.
-
-```
-##gff-version 3
-###
-I sgd gene 335 649 . + . ID=gene:YAL069W;biotype=protein_coding;description=Dubious open reading frame%3B unlikely to encode a functional protein%2C based on available experimental and comparative sequence data [Source:SGD%3BAcc:S000002143];gene_id=YAL069W;logic_name=sgd
-I sgd mRNA 335 649 . + . ID=transcript:YAL069W_mRNA;Parent=gene:YAL069W;biotype=protein_coding;tag=Ensembl_canonical;transcript_id=YAL069W_mRNA
-I sgd exon 335 649 . + . Parent=transcript:YAL069W_mRNA;Name=YAL069W_mRNA-E1;constitutive=1;ensembl_end_phase=0;ensembl_phase=0;exon_id=YAL069W_mRNA-E1;rank=1
-I sgd CDS 335 649 . + 0 ID=CDS:YAL069W;Parent=transcript:YAL069W_mRNA;protein_id=YAL069W
-###
-```
-
-The GFF/GTF format describe genomics features, such as genes, exons, CDS in a standardized format.
-Every line starting with `#` is a comment.
-Each line is a feature and contains 9 fields (tabulation separated).
+??? quote "First click and follow the instructions below only if you start the course at this stage! Otherwise skip this step!"
+    {%
+    include-markdown "pages/bash_manip/bash_manip-0-setup.md"
+    %}
 
 ## Concept
 
@@ -44,7 +27,39 @@ sed [Option(s)] 'Command(s)' [File(s)]
 | -r | Use extended regular expressions in the script
 | -s | Treat files as separate rather than as a single continuous long stream
 
-At our level the options most useful would be `-n` and `-i`
+At our level, the most useful options are `-n`, `-i` and `-e`.
+
+Leaving the options aside, `sed` commands can take several forms:
+<pattern>
+```bash
+# case1 by line number
+sed '<integer>FLAG'
+# case2 by line matching
+sed '/<pattern>/FLAG'
+# case2.2 by line matching
+sed '/<pattern>/FLAG <string>'
+# case3 by match
+sed 'FLAG/<pattern>/<string>/'
+# case3.2 by match
+sed 'FLAG/<pattern>/<string>/FLAG'
+```
+
+??? Note "Available FLAGs"
+    | Command | Description | Comment | Case1 `sed '<integer>FLAG'` | Case2 `sed '/<pattern>/FLAG'`| Case2.2 `sed '/<pattern>/FLAG <string>'` | Case3 `sed 'FLAG/<pattern>/<string>/'` | Case3.2 `sed 'FLAG/<pattern>/<string>/FLAG'`|
+    |----------|----------| ----| ----| ----| ----| ----| ----|
+    | q | Quit after a line (`/<pattern>/q` or `<integer>q`) | | x | x | | | |
+    | d | Delete lines (`/<pattern>/d` or `<integer>d`) | | x | x | | | |
+    | p | Print lines (`-n '/<pattern>/p'` or `-n '<integer>p'`) | | x | x | | | |
+    | a | Append text after a line (`/<pattern>/a Add new text after`) | On macOS (BSD sed) the command requires a backslash (`\`) and a newline. | | | x | | |
+    | i | Insert text before a line (`/<pattern>/i Add new text before`) | On macOS (BSD sed) the command requires a backslash (`\`) and a newline. | | | x | | |
+    | c | Change entire line (`/<pattern>/c This is a new line`) | | | | x | | |
+    | y | Character transliteration (`y/<characters>/<characters>/`) | | | | | x | |
+    | s | Substitute first match on each line (`s/<pattern>/<string>/`) | | | | | x | |
+    | s + g | Global - Substitute all occurrences on each line (`s/<pattern>/<string>/g`) | | | | | x | x |
+    | s + i | Case-insensitive - Substitute first match on each line, ignoring case (`s/<pattern>/<string>/i`) | | | | | x | x |
+    | s + p | Print modified lines (`s/<pattern>/<string>/p`) | | | | | x | x |
+    | s + g + i + p | The flags `g`, `i` and `p` can be combined (`s/<pattern>/<string>/pig`) | | | | | x | x |
+
 
 ## Line selection
 
@@ -58,6 +73,8 @@ sed -n 'line p' file
 |----------|----------|
 | `sed -n '8p' file` | Print line 8 |
 | `sed -n '8p; 16p' file` | Print lines 8 and 16 |
+| `sed -n '8p; 16p' file` | Print lines 8 and 16 |
+| `sed -n -e '8p' -e '16p' file` | Print lines 8 and 16 |
 | `sed -n '8,16 p' file` | Print lines from 8 to 16 |
 | `sed '8,$ p' file` | Print lines from line 8 to the end of the file |
 | `sed -n '1~8 p' file` | Print from line 1, every 8 lines |
@@ -75,6 +92,7 @@ sed 'line d' file
 |----------|----------|
 | `sed '8d' file` | Delete line 8 |
 | `sed '8d; 16d' file` | Delete lines 8 and 16 |
+| `sed -e '8d' -e '16d' file` | Delete lines 8 and 16 |
 | `sed '8,16 d' file` | Delete lines from 8 to 16 |
 | `sed '8,$ d' file` | Delete lines from line 8 to the end of the file |
 | `sed '1~8d' file` | Delete from line 1, every 8 lines |
--
GitLab
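
The five command shapes that this patch documents (case1 to case3.2) can be checked against a small throwaway input. The sketch below is only an illustration, not part of the course material. It assumes GNU sed, since the one-line form of `a` and the `i` substitution flag are GNU extensions. The sample text and the variable names (`sample`, `case1`, ...) are made up for the example.

```bash
# Hypothetical three-line input, not taken from the course data.
sample=$(printf 'alpha one\nbeta two\ngamma one')

# case1: address by line number, here delete line 2
case1=$(printf '%s\n' "$sample" | sed '2d')

# case2: address by pattern, here print only the matching line
case2=$(printf '%s\n' "$sample" | sed -n '/beta/p')

# case2.2: address by pattern, command followed by text (GNU one-line form)
case2_2=$(printf '%s\n' "$sample" | sed '/beta/a added line')

# case3: substitute the first match on each line
case3=$(printf '%s\n' "$sample" | sed 's/a/A/')

# case3.2: substitute with trailing flags (all matches, ignoring case)
case3_2=$(printf '%s\n' "$sample" | sed 's/O/0/gi')

printf '%s\n' "--- case1" "$case1" "--- case2" "$case2" \
  "--- case2.2" "$case2_2" "--- case3" "$case3" "--- case3.2" "$case3_2"
```

With BSD sed (macOS), the case2.2 line would need the backslash-and-newline form mentioned in the table, and the `i` flag in case3.2 may not be available.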