Overview of GeoLifeCLEF 2023: Species Composition Prediction with High Spatial Resolution at Continental Scale Using Remote Sensing
- URL: http://arxiv.org/abs/2509.25816v1
- Date: Tue, 30 Sep 2025 05:49:16 GMT
- Title: Overview of GeoLifeCLEF 2023: Species Composition Prediction with High Spatial Resolution at Continental Scale Using Remote Sensing
- Authors: Christophe Botella, Benjamin Deneu, Diego Marcos, Maximilien Servajean, Theo Larcher, Cesar Leblanc, Joaquim Estopinan, Pierre Bonnet, Alexis Joly
- Abstract summary: We organized an open machine learning challenge called GeoLifeCLEF 2023. The training dataset comprised 5 million plant species observations distributed across Europe. We evaluated models' ability to predict species in 22 thousand small plots based on standardized surveys.
- Score: 9.66382598562254
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding the spatio-temporal distribution of species is a cornerstone of ecology and conservation. By pairing species observations with geographic and environmental predictors, researchers can model the relationship between an environment and the species which may be found there. To advance the state-of-the-art in this area with deep learning models and remote sensing data, we organized an open machine learning challenge called GeoLifeCLEF 2023. The training dataset comprised 5 million plant species observations (single positive label per sample) distributed across Europe and covering most of its flora, together with high-resolution rasters (remote sensing imagery, land cover, elevation) and coarse-resolution data (climate, soil and human footprint variables). In this multi-label classification task, we evaluated models' ability to predict the species composition in 22 thousand small plots based on standardized surveys. This paper presents an overview of the competition, synthesizes the approaches used by the participating teams, and analyzes the main results. In particular, we highlight the biases faced by methods fitted to single positive labels when it comes to the multi-label evaluation, and the new and effective learning strategy combining single and multi-label data in training.
Related papers
- Overview of PlantCLEF 2025: Multi-Species Plant Identification in Vegetation Quadrat Images [2.526933812879881]
The PlantCLEF 2025 challenge relies on a new test set of 2,105 high-resolution multi-label images annotated by experts and covering around 400 species. The goal is to predict all species present in a quadrat image using single-label training data.
arXiv Detail & Related papers (2025-09-22T11:21:53Z) - BioCLIP 2: Emergent Properties from Scaling Hierarchical Contrastive Learning [60.80381372245902]
We find emergent behaviors in biological vision models via large-scale contrastive vision-language training. We train BioCLIP 2 on TreeOfLife-200M to distinguish different species. We identify emergent properties in the learned embedding space of BioCLIP 2.
arXiv Detail & Related papers (2025-05-29T17:48:20Z) - SSL4Eco: A Global Seasonal Dataset for Geospatial Foundation Models in Ecology [3.743127390843568]
Self-supervised learning has enabled learning representations from unlabeled data. These models are often trained on datasets biased toward areas of high human activity. To better capture vegetation seasonality at a global scale, we propose a simple phenology-informed sampling strategy.
arXiv Detail & Related papers (2025-04-25T10:58:44Z) - Feedforward Few-shot Species Range Estimation [61.60698161072356]
Knowing where a particular species can or cannot be found on Earth is crucial for ecological research and conservation efforts. Accurate range estimates are only available for a relatively small proportion of all known species. We outline a new approach for few-shot species range estimation to address the challenge of accurately estimating the range of a species from limited data.
arXiv Detail & Related papers (2025-02-20T19:13:29Z) - Combining Observational Data and Language for Species Range Estimation [63.65684199946094]
We propose a novel approach combining millions of citizen science species observations with textual descriptions from Wikipedia. Our framework maps locations, species, and text descriptions into a common space, enabling zero-shot range estimation from textual descriptions. Our approach also acts as a strong prior when combined with observational data, resulting in more accurate range estimation with less data.
arXiv Detail & Related papers (2024-10-14T17:22:55Z) - SatBird: Bird Species Distribution Modeling with Remote Sensing and Citizen Science Data [68.2366021016172]
We present SatBird, a satellite dataset of locations in the USA with labels derived from presence-absence observation data from the citizen science database eBird.
We also provide a dataset in Kenya representing low-data regimes.
We benchmark a set of baselines on our dataset, including SOTA models for remote sensing tasks.
arXiv Detail & Related papers (2023-11-02T02:00:27Z) - Spatial Implicit Neural Representations for Global-Scale Species Mapping [72.92028508757281]
Given a set of locations where a species has been observed, the goal is to build a model to predict whether the species is present or absent at any location.
Traditional methods struggle to take advantage of emerging large-scale crowdsourced datasets.
We use Spatial Implicit Neural Representations (SINRs) to jointly estimate the geographical range of 47k species simultaneously.
arXiv Detail & Related papers (2023-06-05T03:36:01Z) - Bird Distribution Modelling using Remote Sensing and Citizen Science data [31.375576105932442]
Climate change is a major driver of biodiversity loss.
There are significant knowledge gaps about the distribution of species.
We propose an approach leveraging computer vision to improve species distribution modelling.
arXiv Detail & Related papers (2023-05-01T20:27:11Z) - Two-View Fine-grained Classification of Plant Species [66.75915278733197]
We propose a novel method based on a two-view leaf image representation and a hierarchical classification strategy for fine-grained recognition of plant species.
A deep metric based on Siamese convolutional neural networks is used to reduce the dependence on a large number of training samples and make the method scalable to new plant species.
arXiv Detail & Related papers (2020-05-18T21:57:47Z) - The GeoLifeCLEF 2020 Dataset [13.274586385114622]
We present the GeoLifeCLEF 2020 dataset, which consists of 1.9 million species observations paired with high-resolution remote sensing imagery, land cover data, and altitude.
We also discuss the GeoLifeCLEF 2020 competition, which aims to use this dataset to advance the state-of-the-art in location-based species recommendation.
arXiv Detail & Related papers (2020-04-08T18:30:00Z)