Plant identification based on noisy web data: the amazing performance of deep learning (LifeCLEF 2017)
- URL: http://arxiv.org/abs/2509.20856v1
- Date: Thu, 25 Sep 2025 07:47:43 GMT
- Title: Plant identification based on noisy web data: the amazing performance of deep learning (LifeCLEF 2017)
- Authors: Herve Goeau, Pierre Bonnet, Alexis Joly
- Abstract summary: The LifeCLEF plant identification challenge is an important milestone towards automated plant identification systems. The LifeCLEF 2017 challenge aimed at evaluating to what extent a large noisy training dataset collected through the web and containing a lot of labelling errors can compete with a smaller but trusted training dataset checked by experts. This paper presents more precisely the resources and assessments of the challenge, summarizes the approaches and systems employed by the participating research groups, and provides an analysis of the main outcomes.
- Score: 2.961584451143903
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The 2017 edition of the LifeCLEF plant identification challenge is an important milestone towards automated plant identification systems working at the scale of continental floras, with 10,000 plant species living mainly in Europe and North America illustrated by a total of 1.1M images. Such ambitious systems are now made possible by the conjunction of the dazzling recent progress in image classification with deep learning and several outstanding international initiatives, such as the Encyclopedia of Life (EOL), aggregating the visual knowledge on plant species coming from the main national botany institutes. However, despite all these efforts, the majority of plant species still remain without pictures or are poorly illustrated. Outside the institutional channels, a much larger number of plant pictures are available and spread across the web through botanist blogs, plant lovers' web pages, image hosting websites and online plant retailers. The LifeCLEF 2017 plant challenge presented in this paper aimed at evaluating to what extent a large noisy training dataset collected through the web and containing many labelling errors can compete with a smaller but trusted training dataset checked by experts. To fairly compare both training strategies, the test dataset was created from a third data source, i.e. the Pl@ntNet mobile application, which collects millions of plant image queries all over the world. This paper presents in more detail the resources and assessments of the challenge, summarizes the approaches and systems employed by the participating research groups, and provides an analysis of the main outcomes.
Related papers
- Towards Ancient Plant Seed Classification: A Benchmark Dataset and Baseline Model [62.98256440452042]
We construct the first Ancient Plant Seed Image Classification dataset. It contains 8,340 images from 17 genus- or species-level seed categories excavated from 18 archaeological sites across China. In both quantitative and qualitative analyses, our approach surpasses existing state-of-the-art image classification methods, achieving an accuracy of 90.5%.
arXiv Detail & Related papers (2025-12-20T07:18:22Z)
- LifeCLEF Plant Identification Task 2014 [2.4049084513913983]
The LifeCLEF plant identification task provides a testbed for a system-oriented evaluation of plant identification covering about 500 species of trees and herbaceous plants. The main originality of this data is that it was specifically built through a citizen science initiative conducted by Tela Botanica, a French social network of amateur and expert botanists. This overview presents more precisely the resources and assessments of the task, summarizes the retrieval approaches employed by the participating groups, and provides an analysis of the main evaluation results.
arXiv Detail & Related papers (2025-09-28T14:16:15Z)
- Overview of LifeCLEF Plant Identification task 2019: diving into data deficient tropical countries [2.961584451143903]
The LifeCLEF 2019 Plant Identification challenge was designed to evaluate automated identification on the flora of data deficient regions. It is based on a dataset of 10K species mainly focused on the Guiana shield and the Northern Amazon rainforest. This paper presents the resources and assessments of the challenge, summarizes the approaches and systems employed by the participating research groups, and provides an analysis of the main outcomes.
arXiv Detail & Related papers (2025-09-23T06:42:30Z)
- Overview of LifeCLEF Plant Identification task 2020 [2.961584451143903]
The LifeCLEF 2020 Plant Identification challenge (or "PlantCLEF 2020") was designed to evaluate to what extent automated identification on the flora of data deficient regions can be improved by the use of herbarium collections. It is based on a dataset of about 1,000 species mainly focused on South America's Guiana Shield, an area known to have one of the greatest diversities of plants in the world. The challenge was evaluated as a cross-domain classification task where the training set consists of several hundred thousand herbarium sheets and a few thousand photos to enable learning a mapping between the two domains.
arXiv Detail & Related papers (2025-09-23T06:35:19Z)
- Overview of PlantCLEF 2021: cross-domain plant identification [2.961584451143903]
The LifeCLEF 2021 plant identification challenge was designed to assess the extent to which automated identification of flora can be improved by using herbarium collections. It is based on a dataset of about 1,000 species mainly focused on the Guiana Shield of South America. The challenge was evaluated as a cross-domain classification task where the training set consisted of several hundred thousand herbarium sheets and a few thousand photos.
arXiv Detail & Related papers (2025-09-23T06:26:24Z)
- Overview of PlantCLEF 2022: Image-based plant identification at global scale [2.961584451143903]
It is estimated that there are more than 300,000 species of vascular plants in the world. Deep learning techniques now seem mature enough to address the ultimate but realistic problem of global identification of plant biodiversity. The PlantCLEF2022 challenge edition proposes to take a step in this direction by tackling a multi-image (and metadata) classification problem.
arXiv Detail & Related papers (2025-09-22T11:40:21Z)
- Overview of PlantCLEF 2024: multi-species plant identification in vegetation plot images [2.7110107174608173]
The PlantCLEF 2024 challenge leverages a new test set of thousands of multi-label images annotated by experts and covering over 800 species. It provides a large training set of 1.7 million individual plant images as well as state-of-the-art vision transformer models pre-trained on this data. The aim is to predict all the plant species present on a high-resolution plot image.
arXiv Detail & Related papers (2025-09-19T08:51:41Z)
- Agave crop segmentation and maturity classification with deep learning data-centric strategies using very high-resolution satellite imagery [101.18253437732933]
We present an Agave tequilana Weber azul crop segmentation and maturity classification using very high resolution satellite imagery.
We solve real-world deep learning problems in the very specific context of agave crop segmentation.
With the resulting accurate models, agave production forecasting can be made available for large regions.
arXiv Detail & Related papers (2023-03-21T03:15:29Z)
- Semantic Image Segmentation with Deep Learning for Vine Leaf Phenotyping [59.0626764544669]
In this study, we use Deep Learning methods to semantically segment grapevine leaf images in order to develop an automated object detection system for leaf phenotyping.
Our work contributes to plant lifecycle monitoring through which dynamic traits such as growth and development can be captured and quantified.
arXiv Detail & Related papers (2022-10-24T14:37:09Z)
- Potato Crop Stress Identification in Aerial Images using Deep Learning-based Object Detection [60.83360138070649]
The paper presents an approach for analyzing aerial images of a potato crop using deep neural networks.
The main objective is to demonstrate automated spatial recognition of a healthy versus stressed crop at a plant level.
Experimental validation demonstrated the ability to distinguish healthy and stressed plants in field images, achieving an average Dice coefficient of 0.74.
arXiv Detail & Related papers (2021-06-14T21:57:40Z)
- Semi-Supervised Semantic Segmentation in Earth Observation: The MiniFrance Suite, Dataset Analysis and Multi-task Network Study [82.02173199363571]
We introduce a novel large-scale dataset for semi-supervised semantic segmentation in Earth Observation, the MiniFrance suite.
MiniFrance has several unprecedented properties: it is large-scale, containing over 2000 very high resolution aerial images, accounting for more than 200 billion samples (pixels).
We present tools for data representativeness analysis in terms of appearance similarity and a thorough study of MiniFrance data, demonstrating that it is suitable for learning and generalizes well in a semi-supervised setting.
arXiv Detail & Related papers (2020-10-15T15:36:58Z)
- Two-View Fine-grained Classification of Plant Species [66.75915278733197]
We propose a novel method based on a two-view leaf image representation and a hierarchical classification strategy for fine-grained recognition of plant species.
A deep metric based on Siamese convolutional neural networks is used to reduce the dependence on a large number of training samples and make the method scalable to new plant species.
arXiv Detail & Related papers (2020-05-18T21:57:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.