# Regular Expression

Regular expressions (regex) are sequences of characters that define a search pattern. They are used for pattern matching within strings.  
They are a powerful tool for text processing and are supported by command-line utilities such as `grep`, `sed`, and `awk` to search, match, and manipulate text.
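
As a rough illustration of the three utilities named above, the sketch below runs one small pattern through each of them. The sample data and variable names are made up for this example, and the `-E` flag (extended regular expressions) is assumed to be available in your `grep` and `sed`.

```shell
# Hypothetical sample input, one record per line
text=$(printf '%s\n' 'apple 12' 'banana 7' 'cherry 130')

# grep: keep only lines that end with a space and exactly two digits
matched=$(printf '%s\n' "$text" | grep -E ' [0-9]{2}$')
echo "$matched"

# sed: replace the first run of digits on each line with the letter N
replaced=$(printf '%s\n' "$text" | sed -E 's/[0-9]+/N/')
echo "$replaced"

# awk: print the first field of lines that start with "b" or "c"
fields=$(printf '%s\n' "$text" | awk '/^[bc]/ { print $1 }')
echo "$fields"
```

Each tool reads the same input; only what happens on a match differs (filtering, substitution, or field extraction).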

## Regular Expression Summary

| Symbol | Description | Example | Matches |
|--------|-------------|---------|---------|
| `.`    | Any single character except newline | `a.b` | `aab`, `acb`, `a1b` |
| `^`    | Start of a line | `^abc` | `abc` at the start of a line |
| `$`    | End of a line | `abc$` | `abc` at the end of a line |
| `*`    | Zero or more of the preceding element | `ab*c` | `ac`, `abc`, `abbc` |
| `+`    | One or more of the preceding element | `ab+c` | `abc`, `abbc` |
| `?`    | Zero or one of the preceding element | `ab?c` | `ac`, `abc` |
| `{n}`  | Exactly n of the preceding element | `a{3}` | `aaa` |
| `{n,}` | n or more of the preceding element | `a{2,}` | `aa`, `aaa`, `aaaa` |
| `{n,m}`| Between n and m of the preceding element | `a{2,3}` | `aa`, `aaa` |
| `[]`   | Any one of the characters within the brackets | `[abc]` | `a`, `b`, `c` |
| `[^]`  | Any one character not within the brackets | `[^abc]` | Any character except `a`, `b`, `c` |
| `|`    | Alternation (OR) | `a|b` | `a`, `b` |
| `()`   | Grouping | `(abc)` | `abc` |
| `\d`   | Any digit (0-9) | `\d` | `0`, `1`, `2`, ..., `9` |
| `\D`   | Any non-digit | `\D` | Any character except `0-9` |
| `\w`   | Any word character (alphanumeric + underscore) | `\w` | `a`, `b`, `1`, `_` |
| `\W`   | Any non-word character | `\W` | Any character except `a-z`, `A-Z`, `0-9`, `_` |
| `\s`   | Any whitespace character | `\s` | Space, tab, newline |
| `\S`   | Any non-whitespace character | `\S` | Any character except space, tab, newline |
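
A few rows of the table can be tried out with `grep -E`, as in the sketch below. Two caveats apply: `+`, `?`, `|`, `()` and `{}` need extended mode (`-E`) to be used unescaped, and the `\d` shorthand is not part of POSIX regular expressions (GNU `grep` only accepts it in Perl-compatible mode, `-P`), so this example uses `[0-9]` instead. The sample lines and variable names are hypothetical.

```shell
# Hypothetical sample lines
lines=$(printf '%s\n' 'ac' 'abc' 'abbc' 'a1b' 'xabc' 'cat' 'dog')

# ab*c : zero or more "b"; ^ and $ force the whole line to match
star=$(printf '%s\n' "$lines" | grep -E '^ab*c$')
echo "star:  $star"

# ab+c : one or more "b"
plus=$(printf '%s\n' "$lines" | grep -E '^ab+c$')
echo "plus:  $plus"

# a.b : any single character between "a" and "b"
dot=$(printf '%s\n' "$lines" | grep -E '^a.b$')
echo "dot:   $dot"

# (cat|dog) : alternation inside a group
alt=$(printf '%s\n' "$lines" | grep -E '^(cat|dog)$')
echo "alt:   $alt"

# a[0-9]b : bracket expression used in place of \d
digit=$(printf '%s\n' "$lines" | grep -E '^a[0-9]b$')
echo "digit: $digit"
```

Without the `^` and `$` anchors, `grep` would also report `xabc` for `ab*c`, because a match anywhere in the line is enough.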

POSIX character classes can also be used. They are only valid inside a bracket expression, for example `[[:digit:]]`:  

| Symbol | Description |
|--------|-------------|
| `[:alnum:]` | Equivalent to `A-Za-z0-9` |
| `[:alpha:]` | Equivalent to `A-Za-z` |
| `[:blank:]` | Equivalent to space or tab |
| `[:digit:]` | Equivalent to `0-9` |
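
The sketch below shows these classes in use; note the double brackets, since the class name sits inside a bracket expression. The sample data and variable names are invented for illustration.

```shell
# Hypothetical sample lines
sample=$(printf '%s\n' 'abc123' '   ' 'hello world' '42')

# Lines made only of digits
digits_only=$(printf '%s\n' "$sample" | grep -E '^[[:digit:]]+$')
echo "$digits_only"

# Lines made only of letters and digits
alnum_only=$(printf '%s\n' "$sample" | grep -E '^[[:alnum:]]+$')
echo "$alnum_only"

# Replace each run of spaces or tabs with a single underscore
squeezed=$(printf '%s\n' 'hello   world' | sed -E 's/[[:blank:]]+/_/g')
echo "$squeezed"
```

Writing `[:digit:]` with single brackets would be read as an ordinary bracket expression containing the characters `:`, `d`, `i`, `g`, `t`, which is rarely what is intended.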


!!! Warning
    Do not confuse regular expressions with **globbing** (pathname expansion), which the shell uses to match filenames!  
    `?`  Any single character  
    `*`  Zero or more characters  
    `[]` Any one character from the listed set or range; a leading `!` inside the brackets (`[!abc]`) matches any character not listed  
    `{term1,term2}`  Brace expansion: expands to each term of the comma-separated list, where a term can be a name or a wildcard  
    `{term1..term2}` Brace expansion over a sequence: expands to every term between term1 and term2 (letters or integers)
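
To make the difference concrete, the sketch below applies `?` once as a glob and once as a regex quantifier. It assumes bash (brace expansion is not available in every `sh`) and works in a throwaway directory created with `mktemp -d`; the file names are hypothetical.

```shell
# Work in a temporary directory so no real files are touched
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch a.txt ab.txt abc.txt b.txt

# Glob: "?" stands for exactly one character, so only ab.txt matches
glob_q=$(echo a?.txt)
echo "glob  a?.txt      -> $glob_q"

# Glob: "*" stands for any run of characters, including none
set -- a*.txt
glob_star_count=$#
echo "glob  a*.txt      -> $glob_star_count files"

# Regex: "?" makes the preceding "b" optional and "\." is a literal dot
regex_q=$(printf '%s\n' a.txt ab.txt abc.txt b.txt | grep -E '^ab?\.txt$')
echo "regex ^ab?\.txt$  -> $regex_q"

# Brace expansion generates words whether or not such files exist
braces=$(echo file{1..3}.txt)
echo "brace file{1..3}  -> $braces"

# Clean up
cd - >/dev/null && rm -rf "$tmpdir"
```

The same character therefore means "one arbitrary character" to the shell and "the previous element is optional" to `grep`, which is why patterns passed to `grep`, `sed`, or `awk` are best quoted so the shell does not expand them first.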