??? quote "First click and follow the instructions below only if you start the course at this stage! Otherwise skip this step!"
{%
include-markdown "pages/bash_manip/bash_manip-0-setup.md"
%}
## Concept
`Grep` stands for "global regular expression print". It searches through the contents of files for lines that match a specified **pattern**.
The basic syntax is `grep [OPTIONS] PATTERN [FILE...]`.
It has many options; the most common ones are:
| Option | Explanation |
|---------|-------------|
| -i | Ignore case distinctions in patterns and data. |
| -v | Invert the match, showing lines that do not match the pattern. |
| -n | Prefix each line of output with the line number. |
| -c | Print only a count of matching lines per file. |
| -o | Print only the matched part of each line, each match on its own output line. |
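As a quick illustration of some of these options, here is a minimal sketch on a tiny made-up file. The file `sample.txt` and its contents are assumptions for this demonstration only and are not part of the course data.

```bash
# Work in a throwaway directory so nothing in the course folder is touched
cd "$(mktemp -d)"

# Hypothetical sample file (not part of the course data)
printf '%s\n' 'Apple pie' 'banana split' 'apple tart' 'cherry cake' > sample.txt

# -i: case-insensitive match, finds both "Apple pie" and "apple tart"
with_i=$(grep -i apple sample.txt)

# -v: lines that do NOT contain "apple" (case-sensitive, so "Apple pie" is kept)
with_v=$(grep -v apple sample.txt)

# -n: prefix each matching line with its line number
with_n=$(grep -n cake sample.txt)

echo "$with_i"
echo "$with_v"
echo "$with_n"
```

Options can also be combined, for example `grep -i -n apple sample.txt` to get case-insensitive matches together with their line numbers.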
!!! question "How many lines contain the digit `2` in the `nat2021.csv` file?"
??? example "Click to show the solution"
```bash
# -c prints the number of matching lines instead of the lines themselves
grep -c 2 nat2021.csv
# 553461
```
!!! question "How many occurrences of the digit `2` are there in the `nat2021.csv` file?"
??? example "Click to show the solution"
```bash
grep -o 2 nat2021.csv | wc -l
# -o makes grep print each match on a new line.
# wc -l counts the number of lines, which equals the total occurrences
# 871258
```
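The difference between counting lines and counting occurrences is easier to see on a small input. The sketch below uses an invented three-line file, `digits.txt`, which is an assumption for illustration and not part of the course data.

```bash
# Work in a throwaway directory so nothing in the course folder is touched
cd "$(mktemp -d)"

# Hypothetical three-line file (not part of the course data)
printf '%s\n' '2;22;2002' '1;11;1001' '2;30;1999' > digits.txt

# -c counts matching LINES: only lines 1 and 3 contain at least one "2"
lines_with_2=$(grep -c 2 digits.txt)

# -o prints every match on its own line, so wc -l counts OCCURRENCES:
# five on line 1 plus one on line 3
occurrences_of_2=$(grep -o 2 digits.txt | wc -l)

echo "lines: $lines_with_2"
echo "occurrences: $occurrences_of_2"
```

A line with several matches counts once with `-c` but once per match with `-o | wc -l`, which is why the two exercises above give different numbers.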
!!! question "Select all lines related to the year 2021 in the `nat2021.csv` file"
Note that the value 2021 can appear in two different columns: `annais` (column 3) and `nombre` (column 4).
??? example "Click to show the solution"
```bash
grep ";2021;" nat2021.csv
```
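To see why the surrounding semicolons matter, here is a minimal sketch on an invented extract. The file `extract.csv`, its rows, and the layout `sex;name;year;count` are assumptions inferred from the exercises, not real data.

```bash
# Work in a throwaway directory so nothing in the course folder is touched
cd "$(mktemp -d)"

# Invented rows in the assumed layout: sex;name;year;count
printf '%s\n' '1;ADAM;2021;30' '2;EMMA;1999;2021' '2;EMMA;2021;45' > extract.csv

# Bare pattern: also matches the row where 2021 is the count (column 4)
loose=$(grep -c '2021' extract.csv)

# Delimited pattern: column 4 ends the line, so it is never followed by ';'
strict=$(grep -c ';2021;' extract.csv)

# Explicit alternative: skip two fields from the start of the line, then require 2021
anchored=$(grep -c -E '^[^;]*;[^;]*;2021;' extract.csv)

echo "loose=$loose strict=$strict anchored=$anchored"
```

The delimited pattern works here only because the count is the last field on each line. The anchored regular expression states the intended column explicitly and would still work if more columns followed.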
!!! question "How many different names were given in 2021 (`_PRENOMS_RARES` counts as 1)?"
??? example "Click to show the solution"
```bash
grep ";2021;" nat2021.csv | wc -l
# result: 13501
```
!!! question "Is there more diversity in male or female names in 2021?"
??? example "Click to show the solution"
```bash
# female: field 1 holds the sex code (cut -f 1), then count the lines containing 2 (grep -c 2)
grep ";2021;" nat2021.csv | cut -d ';' -f 1 | grep -c 2
# male: field 1 holds the sex code (cut -f 1), then count the lines containing 1 (grep -c 1)
grep ";2021;" nat2021.csv | cut -d ';' -f 1 | grep -c 1
```
!!! question "How many people were named PARIS in 2021?"
??? example "Click to show the solution"
```bash
# The leading ";" prevents names that merely end in PARIS from matching
grep ";PARIS;2021;" nat2021.csv
# result: 16 (5 male and 11 female)
```
Rare names ([see here for documentation](https://www.insee.fr/fr/statistiques/2540004?sommaire=4767262#documentation)) are recorded as `_PRENOMS_RARES`.
!!! question "Can you find the number of rare names per year? Do you see any pattern?"
??? example "Click to show the solution"
```bash
grep ";_PRENOMS_RARES;" nat2021.csv
```
People tend to give more and more rare names.
!!! question "Which year was the most prolific for the name ZINEDINE?"