For this exercise, you will use data on the first names given to children born in France since 1900, downloaded from the "Institut national de la statistique et des études économiques" (INSEE; see [here](https://www.insee.fr/fr/statistiques/8205621?sommaire=8205628) for details).

```bash
wget https://www.insee.fr/fr/statistiques/fichier/2540004/nat2021_csv.zip
unzip nat2021_csv.zip
```

You should now have a file called `nat2021.csv` in your working directory.
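
If you want to confirm that the download and extraction worked, a quick check along the following lines may help. This is only a sketch: `check_csv` is a hypothetical helper name introduced here, not part of the exercise.

```bash
# Hypothetical helper: report whether a file exists, count its lines,
# and show its first three lines.
check_csv() {
    local file="$1"
    if [ -f "$file" ]; then
        echo "$file found, $(awk 'END {print NR}' "$file") lines"
        head -n 3 "$file"
    else
        echo "$file not found: re-run the wget and unzip steps" >&2
        return 1
    fi
}

# Check the file produced by the unzip step above.
check_csv nat2021.csv || echo "Setup is not complete yet."
```

The function returns a non-zero status when the file is missing, so it can also be used in a condition such as `if check_csv nat2021.csv; then ...; fi`.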

The file is a semicolon-separated text file whose contents look like this:
```
sexe;preusuel;annais;nombre
2;SANDRINE;1973;17605
1;JEAN;1960;17607
1;_PRENOMS_RARES;1904;1430
```

The first line is the header: `sexe` is the sex, `preusuel` stands for `prénom usuel` (usual first name), `annais` stands for `année de naissance` (year of birth), and `nombre` is the number of children given that name.  
The subsequent lines are the data.  
In the `sexe` column, 1 means male and 2 means female.  
`_PRENOMS_RARES` groups rare first names. They are classified as rare according to the criteria described [here](https://www.insee.fr/fr/statistiques/8205621?sommaire=8205628#documentation).
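
To illustrate how these columns can be read from the command line, here is a minimal sketch using `awk` with `;` as the field separator. It assumes a small file named `sample.csv` (a hypothetical name) containing only the rows shown in the excerpt, and `decode_names` is a hypothetical helper introduced for this example.

```bash
# Sample rows copied from the excerpt; sample.csv is a hypothetical file name.
cat > sample.csv <<'EOF'
sexe;preusuel;annais;nombre
2;SANDRINE;1973;17605
1;JEAN;1960;17607
1;_PRENOMS_RARES;1904;1430
EOF

# Hypothetical helper: skip the header (NR > 1), drop the rare-names rows,
# and replace the sexe code with a readable label.
decode_names() {
    awk -F';' 'NR > 1 && $2 != "_PRENOMS_RARES" {
        sex = ($1 == "1") ? "male" : "female"
        print $2, sex, $3, $4
    }' "$1"
}

decode_names sample.csv
```

On the sample above this prints `SANDRINE female 1973 17605` and `JEAN male 1960 17607`. The same function can be pointed at `nat2021.csv`, although the full file may contain values that this sketch does not handle.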