Open-Set Recognition of Novel Species in Biodiversity Monitoring
- URL: http://arxiv.org/abs/2503.01691v1
- Date: Mon, 03 Mar 2025 16:04:46 GMT
- Title: Open-Set Recognition of Novel Species in Biodiversity Monitoring
- Authors: Yuyan Chen, Nico Lang, B. Christian Schmidt, Aditya Jain, Yves Basset, Sara Beery, Maxim Larrivée, David Rolnick,
- Abstract summary: We introduce Open-Insects, a fine-grained image recognition benchmark dataset for open-set recognition and out-of-distribution detection.<n>We evaluate a variety of open-set recognition algorithms, including post-hoc methods, training-time regularization, and training with auxiliary data.
- Score: 21.00825480154685
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning is increasingly being applied to facilitate long-term, large-scale biodiversity monitoring. With most species on Earth still undiscovered or poorly documented, species-recognition models are expected to encounter new species during deployment. We introduce Open-Insects, a fine-grained image recognition benchmark dataset for open-set recognition and out-of-distribution detection in biodiversity monitoring. Open-Insects makes it possible to evaluate algorithms for new species detection on several geographical open-set splits with varying difficulty. Furthermore, we present a test set recently collected in the wild with 59 species that are likely new to science. We evaluate a variety of open-set recognition algorithms, including post-hoc methods, training-time regularization, and training with auxiliary data, finding that the simple post-hoc approach of utilizing softmax scores remains a strong baseline. We also demonstrate how to leverage auxiliary data to improve the detection performance when the training dataset is limited. Our results provide timely insights to guide the development of computer vision methods for biodiversity monitoring and species discovery.
Related papers
- Few-shot Species Range Estimation [61.60698161072356]
Knowing where a particular species can or cannot be found on Earth is crucial for ecological research and conservation efforts.<n>We outline a new approach for few-shot species range estimation to address the challenge of accurately estimating the range of a species from limited data.<n>During inference, our model takes a set of spatial locations as input, along with optional metadata such as text or an image, and outputs a species encoding that can be used to predict the range of a previously unseen species in feed-forward manner.
arXiv Detail & Related papers (2025-02-20T19:13:29Z) - Generating Binary Species Range Maps [12.342459602972609]
Species distribution models (SDMs) and, more recently, deep learning-based variants offer a potential automated alternative.
Deep learning-based SDMs generate a continuous probability representing the predicted presence of a species at a given location.
We evaluate different approaches for automatically identifying the best thresholds for binarizing range maps using presence-only data.
arXiv Detail & Related papers (2024-08-28T17:17:20Z) - A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains, implementing fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z) - Seeing Unseen: Discover Novel Biomedical Concepts via
Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z) - Active Learning-Based Species Range Estimation [20.422188189640053]
We propose a new active learning approach for efficiently estimating the geographic range of a species from a limited number of on the ground observations.
We show that it is possible to generate this candidate set of ranges by using models that have been trained on large weakly supervised community collected observation data.
We conduct a detailed evaluation of our approach and compare it to existing active learning methods using an evaluation dataset containing expert-derived ranges for one thousand species.
arXiv Detail & Related papers (2023-11-03T17:45:18Z) - Species196: A One-Million Semi-supervised Dataset for Fine-grained
Species Recognition [30.327642724046903]
Species196 is a large-scale semi-supervised dataset of 196-category invasive species.
It collects over 19K images with expert-level accurate annotations Species196-L, and 1.2M unlabeled images of invasive species Species196-U.
arXiv Detail & Related papers (2023-09-25T14:46:01Z) - Spatial Implicit Neural Representations for Global-Scale Species Mapping [72.92028508757281]
Given a set of locations where a species has been observed, the goal is to build a model to predict whether the species is present or absent at any location.
Traditional methods struggle to take advantage of emerging large-scale crowdsourced datasets.
We use Spatial Implicit Neural Representations (SINRs) to jointly estimate the geographical range of 47k species simultaneously.
arXiv Detail & Related papers (2023-06-05T03:36:01Z) - Unsupervised deep learning techniques for powdery mildew recognition
based on multispectral imaging [63.62764375279861]
This paper presents a deep learning approach to automatically recognize powdery mildew on cucumber leaves.
We focus on unsupervised deep learning techniques applied to multispectral imaging data.
We propose the use of autoencoder architectures to investigate two strategies for disease detection.
arXiv Detail & Related papers (2021-12-20T13:29:13Z) - Fine-Grained Visual Classification of Plant Species In The Wild: Object
Detection as A Reinforced Means of Attention [9.427845067849177]
We explore the idea of using object detection as a form of attention to mitigate the effects of data variability.
We introduce a bottom-up approach based on detecting plant organs and fusing the predictions of a variable number of organ-based species classifiers.
We curate a new dataset with a long-tail distribution for evaluating plant organ detection and organ-based species identification.
arXiv Detail & Related papers (2021-06-03T21:22:18Z) - Dynamic $\beta$-VAEs for quantifying biodiversity by clustering
optically recorded insect signals [0.6091702876917281]
We propose an adaptive variant of the variational autoencoder (VAE) capable of clustering data by phylogenetic groups.
We demonstrate the usefulness of the dynamic $beta$-VAE on optically recorded insect signals from regions of southern Scandinavia.
arXiv Detail & Related papers (2021-02-10T16:14:13Z) - Automatic image-based identification and biomass estimation of
invertebrates [70.08255822611812]
Time-consuming sorting and identification of taxa pose strong limitations on how many insect samples can be processed.
We propose to replace the standard manual approach of human expert-based sorting and identification with an automatic image-based technology.
We use state-of-the-art Resnet-50 and InceptionV3 CNNs for the classification task.
arXiv Detail & Related papers (2020-02-05T21:38:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.