

Reminder of the context and the issues
Hot versus cold storage
Carbon impact
Resources



| Aspect | Hot Data | Cold Data |
|---|---|---|
| Access Frequency | Frequently accessed | Infrequently accessed |
| Storage Type | High-performance storage (e.g., SSDs, in-memory databases) | Cost-effective storage (e.g., HDDs, magnetic tapes, cloud archives) |
| Retrieval Speed | Fast retrieval required | Slower retrieval acceptable |
| Use Case | Real-time transactions, active user sessions, real-time analytics | Historical records, backups, archived files |
| Cost | Generally higher due to performance requirements | Generally lower due to slower access speeds |
| Data Management | Requires optimization for speed and efficiency | Focused on long-term storage and cost-efficiency |
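
The distinction in the table can be turned into a simple tiering rule. The sketch below is only an illustration: the `Dataset` fields, the function name `storage_tier`, and the thresholds (30 idle days, 10 reads per month) are assumptions chosen for the example, not values from any storage provider.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Dataset:
    name: str
    last_access: date
    reads_last_30_days: int


def storage_tier(ds: Dataset, today: date,
                 max_idle_days: int = 30, min_reads: int = 10) -> str:
    """Return "hot" or "cold" using illustrative (assumed) thresholds."""
    idle_days = (today - ds.last_access).days
    # Hot data is both recently and frequently accessed; everything else is cold.
    if idle_days <= max_idle_days and ds.reads_last_30_days >= min_reads:
        return "hot"
    return "cold"


today = date(2024, 6, 1)
sessions = Dataset("active_user_sessions", date(2024, 5, 31), 1200)
archive = Dataset("2015_field_survey_backup", date(2022, 1, 15), 0)

print(storage_tier(sessions, today))  # hot
print(storage_tier(archive, today))   # cold
```

In practice the thresholds would be tuned to the actual access logs and to the price gap between the two storage classes.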


The GAIA Data project aims to develop and implement an integrated, distributed platform of services and data for observing, modeling, and understanding the Earth system, biodiversity, and the environment.


Carbon footprint

Ideally: bring the computation closer to the {meta}data, and not the other way around.

| Repository name | Supported by | Type (thematic, institutional, generic) | Disciplinary fields | Accepted data (keywords) | Embargo | Persistent identifier | Volume limit |
|---|---|---|---|---|---|---|---|
| InDoRES | CNRS-Ecology, MNHN | thematic (and institutional) | Ecology, Environment, Bio-archaeology | Environmental, ecological and geographical data | yes | DOI | 2 GB per dataset, planned to increase to 4 or 5 GB soon |
| EaSy Data | Data Terra, BRGM | thematic | Earth and Environmental Sciences | Long-tail data from the Earth system and environment (example: data resulting from projects) | yes (2 years max.) | DOI | 5 GB per file, 100 GB per deposit. Larger volumes possible on request |
| SEANOE | Ifremer | thematic (and institutional) | Oceanography | Georeferenced marine data | yes (2 years max.) | DOI | 100 GB |
| Data SUD | IRD | institutional | all fields covered by IRD agents | ??? | ??? | DOI | ??? |
| GBIF | the international GBIF community | thematic | Life sciences, Biodiversity, Animal biology, Plant biology, Ecology, Environment, Ecosystems | Taxa, occurrence data, sampling data, all standardized according to the Darwin Core or ABCD standards | yes | DOI | no |
| Recherche Data Gouv | Recherche Data Gouv | generic | all fields | all | yes | DOI | ??? |
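
The volume limits in the table can be checked before choosing a repository. The following is a minimal sketch using only the limits listed above. It assumes that 1 GB is 10^9 bytes, that the InDoRES "per dataset" limit and the SEANOE limit apply to a whole deposit, and that the names `LIMITS` and `fits` are purely illustrative. Repositories whose limit is shown as `???` are left out.

```python
GB = 10**9  # assumption: decimal gigabytes

# Limits copied from the table; None means no limit is stated there.
LIMITS = {
    "InDoRES": {"per_file": None, "per_deposit": 2 * GB},
    "EaSy Data": {"per_file": 5 * GB, "per_deposit": 100 * GB},
    "SEANOE": {"per_file": None, "per_deposit": 100 * GB},
    "GBIF": {"per_file": None, "per_deposit": None},
}


def fits(repository: str, file_sizes: list[int]) -> bool:
    """Return True if a deposit (list of file sizes in bytes) respects the limits."""
    limits = LIMITS[repository]
    per_file = limits["per_file"]
    per_deposit = limits["per_deposit"]
    if per_file is not None and any(size > per_file for size in file_sizes):
        return False
    if per_deposit is not None and sum(file_sizes) > per_deposit:
        return False
    return True


deposit = [3 * GB, 4 * GB]  # two files, 7 GB in total
for name in LIMITS:
    print(name, fits(name, deposit))
```

With this example deposit, only InDoRES is ruled out, because the 7 GB total exceeds its current 2 GB limit.
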

“How much energy is used in saving to the cloud? That’s a complicated question. It takes energy to get the data to the data center—miles of fiber optic cables, studded with other fixtures of internet infrastructure that all require power along the way. At the center, your data is stored multiple times on hard disks, and the constant activity of all those disks creates a lot of heat, which necessitates energy-intensive air conditioners to protect the equipment from overheating.”
Asked by Mark Williams of Cambridge, U.K., to Justin Adamson, SAGE (Sound Advice for a Green Earth). Source: Stanford Magazine

“Various studies estimate them to be between 2.3 – 3.7 percent of global CO₂ emissions, which is equivalent to the emissions of the entire aviation industry”
Source: MyClimate.org

“The ecological dynamics we find ourselves in are not entirely a consequence of design limits, but of human practices and choices — among individuals, communities, corporations, and governments — combined with a deficit of will and imagination to bring about a sustainable Cloud. The Cloud is both cultural and technological. Like any aspect of culture, the Cloud’s trajectory — and its ecological impacts — are not predetermined or unchangeable. Like any aspect of culture, they are mutable.”
Steven Gonzalez Monserrate, anthropologist and PhD candidate at MIT. Source: The MIT Press Reader