Biodiversity data: From data collection to publication
Welcome
Registration opens each spring, and the training course takes place in the fall. More information is available on the FRB-CESAB website.
Subscribe to the FRB-CESAB newsletter to stay informed.
This five-day training course, co-organized by the FRB-CESAB, the PNDB, and GBIF France, covers the different stages of the data lifecycle, from acquisition to opening, including management, storage, and the drafting of data papers.
We will 1) contextualize the issues surrounding the understanding, sharing and (re)use of biodiversity data and metadata, and 2) enhance the skills of communities involved in one or more stages of the data lifecycle.

Program
General context
- Current challenges in biodiversity data
- The data ecosystem
- Reproducibility concepts
- What is data/metadata?
- Major types of biodiversity data
- Framework and good practices (data lifecycle, FAIR, etc.)
- Data management plan
Data acquisition
- Best practices for data collection
- Major biodiversity databases
- Major environmental databases
- Data acquisition: Web portals, API & web scraping
Data management
- Structure, formats & files
- Processing & cleansing data
- Introduction to OpenRefine
- Virtual Research Environments
DataSHARE projects
- FRB-CESAB DataSHARE projects
- Focus on the Islets & PhenoFish groups
Legal aspects
- Sharing, licensing, etc.
Opening data
- (Meta)data: standards
- Storage and archiving
- Dissemination and sharing
- Data & software paper
Prerequisites
Please follow this tutorial to install your working environment before attending the training course.
For this training course, you only need to install R, RStudio IDE (optional), and Rtools (Windows users only). Do not install git, Quarto, Pandoc, or Docker. Do not forget to configure RStudio IDE as explained here.
Material
All the material used in this training course (slides, data, exercises) is available at: https://github.com/biodiversitydata/
Citation
Casajus N, Coux C, Le Bras Y, Norvez O, Archambeau A-S & Pamerlon S (2025) Biodiversity data: From data collection to publication. An FRB-CESAB, PNDB & GBIF France training course. URL: https://biodiversitydata.github.io/
Contributions
If you see mistakes or want to suggest changes, please create an issue on the source repository.
Reuse
Text and figures are licensed under Creative Commons Attribution (CC BY 4.0), unless otherwise noted.
See also
Discover the other training courses provided by the FRB-CESAB and its partners: https://frbcesab.github.io/training-courses/