Overview of PlantCLEF 2021: cross-domain plant identification
- URL: http://arxiv.org/abs/2509.18697v1
- Date: Tue, 23 Sep 2025 06:26:24 GMT
- Title: Overview of PlantCLEF 2021: cross-domain plant identification
- Authors: Herve Goeau, Pierre Bonnet, Alexis Joly
- Abstract summary: The LifeCLEF 2021 plant identification challenge was designed to assess the extent to which automated identification of flora can be improved by using herbarium collections. It is based on a dataset of about 1,000 species mainly focused on the Guiana Shield of South America. The challenge was evaluated as a cross-domain classification task where the training set consisted of several hundred thousand herbarium sheets and a few thousand photos.
- Score: 2.961584451143903
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated plant identification has improved considerably thanks to recent advances in deep learning and the availability of training data with more and more field photos. However, this profusion of data concerns only a few tens of thousands of species, mainly located in North America and Western Europe, much less in the richest regions in terms of biodiversity such as tropical countries. On the other hand, for several centuries, botanists have systematically collected, catalogued and stored plant specimens in herbaria, especially in tropical regions, and recent efforts by the biodiversity informatics community have made it possible to put millions of digitised records online. The LifeCLEF 2021 plant identification challenge (or "PlantCLEF 2021") was designed to assess the extent to which automated identification of flora in data-poor regions can be improved by using herbarium collections. It is based on a dataset of about 1,000 species mainly focused on the Guiana Shield of South America, a region known to have one of the highest plant diversities in the world. The challenge was evaluated as a cross-domain classification task where the training set consisted of several hundred thousand herbarium sheets and a few thousand photos to allow learning a correspondence between the two domains. In addition to the usual metadata (location, date, author, taxonomy), the training data also includes the values of 5 morphological and functional traits for each species. The test set consisted exclusively of photos taken in the field. This article presents the resources and evaluations of the assessment carried out, summarises the approaches and systems used by the participating research groups and provides an analysis of the main results.
Related papers
- Towards Ancient Plant Seed Classification: A Benchmark Dataset and Baseline Model [62.98256440452042]
We construct the first Ancient Plant Seed Image Classification dataset. It contains 8,340 images from 17 genus- or species-level seed categories excavated from 18 archaeological sites across China. In both quantitative and qualitative analyses, our approach surpasses existing state-of-the-art image classification methods, achieving an accuracy of 90.5%.
arXiv Detail & Related papers (2025-12-20T07:18:22Z) - Plant identification based on noisy web data: the amazing performance of deep learning (LifeCLEF 2017) [2.961584451143903]
The LifeCLEF plant identification challenge is an important milestone towards automated plant identification systems. The LifeCLEF 2017 challenge aimed at evaluating to what extent a large noisy training dataset collected through the web and containing a lot of labelling errors can compete with a smaller but trusted training dataset checked by experts. This paper presents more precisely the resources and assessments of the challenge, summarizes the approaches and systems employed by the participating research groups, and provides an analysis of the main outcomes.
arXiv Detail & Related papers (2025-09-25T07:47:43Z) - Overview of LifeCLEF Plant Identification task 2019: diving into data deficient tropical countries [2.961584451143903]
The LifeCLEF 2019 Plant Identification challenge was designed to evaluate automated identification on the flora of data deficient regions. It is based on a dataset of 10K species mainly focused on the Guiana shield and the Northern Amazon rainforest. This paper presents the resources and assessments of the challenge, summarizes the approaches and systems employed by the participating research groups, and provides an analysis of the main outcomes.
arXiv Detail & Related papers (2025-09-23T06:42:30Z) - Overview of LifeCLEF Plant Identification task 2020 [2.961584451143903]
The LifeCLEF 2020 Plant Identification challenge (or "PlantCLEF 2020") was designed to evaluate to what extent automated identification on the flora of data deficient regions can be improved by the use of herbarium collections. It is based on a dataset of about 1,000 species mainly focused on South America's Guiana Shield, an area known to have one of the greatest diversities of plants in the world. The challenge was evaluated as a cross-domain classification task where the training set consisted of several hundred thousand herbarium sheets and a few thousand photos to enable learning a mapping between the two domains.
arXiv Detail & Related papers (2025-09-23T06:35:19Z) - Overview of PlantCLEF 2022: Image-based plant identification at global scale [2.961584451143903]
It is estimated that there are more than 300,000 species of vascular plants in the world. Deep learning techniques now seem mature enough to address the ultimate but realistic problem of global identification of plant biodiversity. The PlantCLEF2022 challenge edition proposes to take a step in this direction by tackling a multi-image (and metadata) classification problem.
arXiv Detail & Related papers (2025-09-22T11:40:21Z) - Overview of PlantCLEF 2025: Multi-Species Plant Identification in Vegetation Quadrat Images [2.526933812879881]
The PlantCLEF 2025 challenge relies on a new test set of 2,105 high-resolution multi-label images annotated by experts and covering around 400 species. The goal is to predict all species present in a quadrat image using single-label training data.
arXiv Detail & Related papers (2025-09-22T11:21:53Z) - Overview of PlantCLEF 2024: multi-species plant identification in vegetation plot images [2.7110107174608173]
The PlantCLEF 2024 challenge leverages a new test set of thousands of multi-label images annotated by experts and covering over 800 species. It provides a large training set of 1.7 million individual plant images as well as state-of-the-art vision transformer models pre-trained on this data. The aim is to predict all the plant species present on a high-resolution plot image.
arXiv Detail & Related papers (2025-09-19T08:51:41Z) - BioCLIP 2: Emergent Properties from Scaling Hierarchical Contrastive Learning [51.341003735575335]
We find emergent behaviors in biological vision models via large-scale contrastive vision-language training. We train BioCLIP 2 on TreeOfLife-200M to distinguish different species. We identify emergent properties in the learned embedding space of BioCLIP 2.
arXiv Detail & Related papers (2025-05-29T17:48:20Z) - Feedforward Few-shot Species Range Estimation [61.60698161072356]
Knowing where a particular species can or cannot be found on Earth is crucial for ecological research and conservation efforts. Accurate range estimates are only available for a relatively small proportion of all known species. We outline a new approach for few-shot species range estimation to address the challenge of accurately estimating the range of a species from limited data.
arXiv Detail & Related papers (2025-02-20T19:13:29Z) - HarvestNet: A Dataset for Detecting Smallholder Farming Activity Using Harvest Piles and Remote Sensing [50.4506590177605]
HarvestNet is a dataset for mapping the presence of farms in the Ethiopian regions of Tigray and Amhara during 2020-2023.
We introduce a new approach based on the detection of harvest piles characteristic of many smallholder systems.
We conclude that remote sensing of harvest piles can contribute to more timely and accurate cropland assessments in food insecure regions.
arXiv Detail & Related papers (2023-08-23T11:03:28Z) - Spatial Implicit Neural Representations for Global-Scale Species Mapping [72.92028508757281]
Given a set of locations where a species has been observed, the goal is to build a model to predict whether the species is present or absent at any location.
Traditional methods struggle to take advantage of emerging large-scale crowdsourced datasets.
We use Spatial Implicit Neural Representations (SINRs) to jointly estimate the geographical range of 47k species simultaneously.
arXiv Detail & Related papers (2023-06-05T03:36:01Z) - The Herbarium 2021 Half-Earth Challenge Dataset [1.1470070927586016]
Herbarium sheets present a unique view of the world's botanical history, evolution, and diversity.
With the increased digitisation of herbaria worldwide and the advances in the fine-grained classification domain, there are a lot of opportunities for supporting research in this field.
Existing datasets are either too small, or not diverse enough, in terms of represented taxa, geographic distribution or host institutions.
We present the Herbarium Half-Earth dataset, the largest and most diverse dataset of herbarium specimens to date for automatic taxon recognition.
arXiv Detail & Related papers (2021-05-28T13:24:12Z) - Two-View Fine-grained Classification of Plant Species [66.75915278733197]
We propose a novel method based on a two-view leaf image representation and a hierarchical classification strategy for fine-grained recognition of plant species.
A deep metric based on Siamese convolutional neural networks is used to reduce the dependence on a large number of training samples and make the method scalable to new plant species.
arXiv Detail & Related papers (2020-05-18T21:57:47Z) - Automatic image-based identification and biomass estimation of invertebrates [70.08255822611812]
Time-consuming sorting and identification of taxa pose strong limitations on how many insect samples can be processed.
We propose to replace the standard manual approach of human expert-based sorting and identification with an automatic image-based technology.
We use state-of-the-art Resnet-50 and InceptionV3 CNNs for the classification task.
arXiv Detail & Related papers (2020-02-05T21:38:57Z)