# Extracting from files

To do this exercise, you will need to download the French first-name data from the "Institut national de la statistique
et des études économiques" (INSEE):

```bash
wget https://www.insee.fr/fr/statistiques/fichier/2540004/nat2021_csv.zip
unzip nat2021_csv.zip 
```

You should now have a file called `nat2021.csv` in your working directory.
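
Before searching, it can help to look at how the file is laid out. The sketch below assumes the layout described in the INSEE documentation: four `;`-separated columns, `sexe;preusuel;annais;nombre` (sex, first name, birth year, count). It builds a tiny hypothetical `sample.csv` with made-up counts in that layout so that it runs without the download; on the real data, `head -n 5 nat2021.csv` should show the same structure.

```bash
# Build a tiny, hypothetical sample in the assumed layout (counts are made up).
printf '%s\n' \
  'sexe;preusuel;annais;nombre' \
  '1;ADAM;2020;12' \
  '2;ALICE;2021;7' \
  '1;LEO;2021;9' > sample.csv

# Show the header and the first two records.
head -n 3 sample.csv

# Count the lines in the file (header included).
line_count=$(wc -l < sample.csv)
echo "lines: $line_count"
```

Every search in the exercises below relies on the `;` separators to target one specific column.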



## Searching patterns (grep)

!!! question "Select all the lines for the year 2021 in the `nat2021.csv` file"

??? example "Click to show the solution"  
    ```bash
    grep ";2021;" nat2021.csv
    ```

!!! question "How many names were given in 2021?"

??? example "Click to show the solution"  
    ```bash
    grep ";2021;" nat2021.csv | wc -l
    # result: 13501
    ```
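
    As an alternative sketch, `grep -c` counts the matching lines directly, without piping into `wc -l`. The file `sample.csv` and its counts are hypothetical, made up for illustration.

    ```bash
    # Hypothetical sample data (made-up counts), same layout as nat2021.csv.
    printf '%s\n' \
      '1;ADAM;2020;12' \
      '2;ALICE;2021;7' \
      '1;LEO;2021;9' > sample.csv

    # Two ways to count the lines for 2021.
    with_wc=$(grep ";2021;" sample.csv | wc -l)
    with_c=$(grep -c ";2021;" sample.csv)
    echo "$with_wc $with_c"
    ```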

!!! question "Is there more diversity in male or female names in 2021?"

??? example "Click to show the solution"  
    ```bash
    # female
    grep ";2021;" nat2021.csv | grep "^2" | wc -l
    # result: 7112
    # male
    grep ";2021;" nat2021.csv | grep "^1" | wc -l
    # result: 6389
    ```

!!! question "How many people were named PARIS in 2021?"

??? example "Click to show the solution"  
    ```bash
    # both sexes (1 = male, 2 = female); the leading ";" avoids matching names that merely end in PARIS
    grep ";PARIS;2021;" nat2021.csv
    # result: 16 (5 male and 11 female)
    ```

Rare names ([see the documentation](https://www.insee.fr/fr/statistiques/2540004?sommaire=4767262#documentation)) are grouped under `_PRENOMS_RARES`.

!!! question "Can you find all the rare names? Do you see any pattern?"

??? example "Click to show the solution"  
    ```bash
    grep ";_PRENOMS_RARES;" nat2021.csv
    ```
    People tend to give more and more rare names over the years.


!!! question "In which year was the name ZINEDINE given most often?"

??? example "Click to show the solution"  
    ```bash
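    # sort numerically (-n) on the 4th ";"-separated field (the count);
    # the last line of the output is the year with the highest count
    grep ";ZINEDINE;" nat2021.csv | sort -n -t ';' -k4
    # result: 1998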
    ```
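
    To see what the `sort` options do, here is a minimal sketch on a hypothetical `sample.csv` with a made-up name and made-up counts: `-t ';'` sets the field separator, `-k4` sorts on the fourth field (the count), and `-n` compares numerically rather than alphabetically. Adding `tail -n 1` keeps only the last line, the one with the highest count.

    ```bash
    # Hypothetical sample data: name and counts are made up for illustration.
    printf '%s\n' \
      '1;DEMO;1997;120' \
      '1;DEMO;1998;95' \
      '1;DEMO;1999;1010' > sample.csv

    # Sort by count and keep only the line with the largest one.
    best=$(grep ";DEMO;" sample.csv | sort -n -t ';' -k4 | tail -n 1)
    echo "$best"
    ```

    Without `-n`, the counts would be compared as text, so `95` would sort after `1010`.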



## Redirecting an output (>)

You can redirect the output of a command and store it in a file with the `>` operator:  
`command > filename`
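
As a minimal sketch of what this does: `>` creates the file, or replaces its contents if it already exists. The related `>>` operator (not needed for the exercise below) appends to the file instead. The file name `demo.txt` is hypothetical.

```bash
# '>' creates demo.txt, or overwrites it if it already exists.
echo "first" > demo.txt
echo "second" > demo.txt
overwritten=$(cat demo.txt)
echo "$overwritten"

# '>>' appends to the end of the file instead of replacing its contents.
echo "third" >> demo.txt
appended_lines=$(wc -l < demo.txt)
cat demo.txt
```

Because `>` overwrites silently, it is worth checking the target file name before running the command.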

!!! question "Save all the names from 2005 in a dedicated file"

??? example "Click to show the solution"  
    ```bash
    # command
    grep ";2005;" nat2021.csv > names2005.txt
    ```