diff --git a/docs/pages/bash_manip/bash_manip-7-sed.md b/docs/pages/bash_manip/bash_manip-7-sed.md
index 8a14aa829b48d3ce61ceabfad4e041438fe13215..7162b16f78a8db4e476183c0f2a75e9edcc2e314 100644
--- a/docs/pages/bash_manip/bash_manip-7-sed.md
+++ b/docs/pages/bash_manip/bash_manip-7-sed.md
@@ -49,7 +49,7 @@ sed 'FLAG/<pattern>/<string>/FLAG'
 |----------|----------| ----| ----| ----| ----| ----| ----|
  q | Quit after a line (`/<pattern>/q` or `<integer>q`) | | x | x | | | |
  d | Delete lines (`/<pattern>/d` or `<integer>d`) | | x | x | | | |
- p | Print matched lines (`-n '/<pattern>/p'`) | | | x | | | |
+ p | Print matched lines (`-n '/<pattern>/p'`) | Only with `-n` option | | x | | | |
  a | Append text after a line (`/<pattern>/a Add new text after`) | On macOS (BSD sed) the command requires a backslash (`\`) and a newline. | | | x | | |
  i | Insert text before a line (`/<pattern>/i Add new text before`) | On macOS (BSD sed) the command requires a backslash (`\`) and a newline | | | x | | |
  c | Change entire line (`/<pattern>/c This is a new line`) | | | | x | | |
@@ -63,7 +63,7 @@ sed 'FLAG/<pattern>/<string>/FLAG'
 
 ## Line selection
 
-**Syntax**
+### Syntax
 
 ```bash
 sed -n 'line p' file
@@ -79,10 +79,13 @@ sed -n 'line p' file
 | `sed '8,$ p' file` | Print lines from line 8 to the end of the file |
 | `sed -n '1~8 p' file` | Print from line 1, every 8 lines |
 
+### Exercise
+
+
 
 ## Line deletion
 
-**Syntax**
+### Syntax
 
 ```bash
 sed 'line d' file
@@ -97,9 +100,13 @@ sed 'line d' file
 | `sed '8,$ d' file` | Delete lines from line 8 to the end of the file |
 | `sed '1~8d' file` | Delete from line 1, every 8 lines |
 
+### Exercise
+
+
+
 ## Use of Regular Expression
 
-**Syntax**
+### Syntax
 
 ```bash
 sed 'RegEx' file
@@ -110,9 +117,13 @@ sed 'RegEx' file
 | `sed '/^#/d' file` | Delete lines starting by # |
 | `sed -n '/\tmRNA\t/p' file` | Print lines matchin mRNA surrounded by tabulation |
 
+### Exercise
+
+
+
 ## Subsitution
 
-**Syntax**
+### Syntax
 
 ```bash
 sed 's/pattern/replacement/' file
@@ -126,20 +137,37 @@ sed 's/pattern/replacement/' file
 | `s/pattern/replacement/i` | Substitute the first occurrence of pattern with replacement, ignoring case |
 | `s/pattern/replacement/gi` | Substitute all occurrences of pattern with replacement, ignoring case |
 
-## Extract value
+### Exercise
+
+
+
+## Capturing
 
 It is possible to extract part of a line. Let's take the example of the extraction of a value from an attribute (`tag=value`) with tag `Name` of the 9th column of a GFF/GTF file.
 
-**Syntax**
+### Syntax
 
 ```bash
-sed 's/.*Name=\([^;]*\);.*/\1/p' file
+sed -n 's/.*START\([^END]*\)END.*/\1/p' file.txt
 ```
 
 * `-n` Suppresses default output (only prints matches).
 * `s/.../.../p` Substitutes text and prints only the matched part.
-* `.*Name=` Matches everything before Name=.
-* `\([^;]*\)` Captures everything after Name= until the first ;.
-* `.*` Matches everything after ; (but doesn’t capture it).
-* `\1` Outputs only the captured group (the Name value).
-* `/p` Prints the result.
\ No newline at end of file
+* `.*` Matches anything before the START marker.
+* `START` The fixed pattern before the part we want.
+* `\(` Start of capture group (tells sed to remember this part).
+* `[^END]*` Captures any run of characters other than `E`, `N`, or `D` (a negated character class, so it stops exactly at the END marker only when that marker is a single character such as `;`).
+* `\)` End of capture group.
+* `END` The fixed text after the part we want.
+* `.*` Matches everything after the END marker.
+* `\1` Replaces the whole match with the first captured group (here only one group is captured).
+* `p` Prints the line after a successful substitution (needed because `-n` suppresses the default output).
+
+### Exercise
+
+!!! question "List all names that are combined with PIERRE (e.g. OLIVIER, as in PIERRE-OLIVIER)"
+
+??? example "Click to show the solution"
+    ```bash
+    sed -n 's/.*;PIERRE-\([^;]*\);.*/\1/p' nat2021.csv | sort -u
+    ```
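
A minimal sketch of the capture pattern documented in the "Capturing" section, run on hypothetical sample strings (the `sample` value and the `START`/`END` inputs are made up for illustration and are not taken from the repository). It also suggests why `[^END]*` should be read as a negated character class rather than "everything up to END": the capture fails as soon as the wanted text contains `E`, `N`, or `D`.

```bash
# Hypothetical attribute string in the "tag=value;" style discussed above.
sample='ID=gene1;Name=BRCA2;biotype=protein_coding'

# Single-character terminator: [^;]* stops at the first ';'.
name=$(printf '%s\n' "$sample" | sed -n 's/.*Name=\([^;]*\);.*/\1/p')
echo "$name"   # BRCA2

# Multi-character marker: [^END] matches any character except E, N or D.
# Works when the wanted text contains none of those letters...
ok=$(printf '%s\n' 'STARTabcEND' | sed -n 's/.*START\([^END]*\)END.*/\1/p')
echo "$ok"     # abc

# ...but nothing is printed when the wanted text contains one of them.
ko=$(printf '%s\n' 'STARTaNbEND' | sed -n 's/.*START\([^END]*\)END.*/\1/p')
echo "[$ko]"   # []
```

When the end marker is a single character (as with `;` in the exercise), the negated class is a reasonable choice; for longer markers a different pattern would be needed.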