Species196: A One-Million Semi-supervised Dataset for Fine-grained
Species Recognition
- URL: http://arxiv.org/abs/2309.14183v3
- Date: Sat, 28 Oct 2023 07:48:50 GMT
- Authors: Wei He, Kai Han, Ying Nie, Chengcheng Wang, Yunhe Wang
- Abstract summary: Species196 is a large-scale semi-supervised dataset covering 196 categories of invasive species.
It comprises Species196-L, over 19K images with expert-level annotations, and Species196-U, 1.2M unlabeled images of invasive species.
- Score: 30.327642724046903
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The development of foundation vision models has pushed general visual
recognition to a high level, but these models still cannot adequately address
fine-grained recognition in specialized domains such as invasive species classification.
Identifying and managing invasive species has strong social and ecological
value. Currently, most invasive species datasets are limited in scale and cover
a narrow range of species, which restricts the development of deep-learning-based
invasion biometrics systems. To fill this gap, we introduce
Species196, a large-scale semi-supervised dataset covering 196 categories of invasive
species. It comprises Species196-L, over 19K images with expert-level
annotations, and Species196-U, 1.2M unlabeled images of invasive species. The
dataset provides four experimental settings for benchmarking existing
models and algorithms: supervised learning, semi-supervised learning,
self-supervised pretraining, and zero-shot inference with large
multi-modal models. To facilitate future research on these four learning
paradigms, we conduct an empirical study of the representative methods on the
introduced dataset. The dataset is publicly available at
https://species-dataset.github.io/.
Related papers
- Generating Binary Species Range Maps [12.342459602972609]
Species distribution models (SDMs) and, more recently, deep learning-based variants offer a potential automated alternative.
Deep learning-based SDMs generate a continuous probability representing the predicted presence of a species at a given location.
We evaluate different approaches for automatically identifying the best thresholds for binarizing range maps using presence-only data.
arXiv Detail & Related papers (2024-08-28T17:17:20Z)
- Active Learning-Based Species Range Estimation [20.422188189640053]
We propose a new active learning approach for efficiently estimating the geographic range of a species from a limited number of on-the-ground observations.
We show that it is possible to generate this candidate set of ranges by using models that have been trained on large, weakly supervised, community-collected observation data.
We conduct a detailed evaluation of our approach and compare it to existing active learning methods using an evaluation dataset containing expert-derived ranges for one thousand species.
arXiv Detail & Related papers (2023-11-03T17:45:18Z)
- SatBird: Bird Species Distribution Modeling with Remote Sensing and
Citizen Science Data [68.2366021016172]
We present SatBird, a satellite dataset of locations in the USA with labels derived from presence-absence observation data from the citizen science database eBird.
We also provide a dataset in Kenya representing low-data regimes.
We benchmark a set of baselines on our dataset, including SOTA models for remote sensing tasks.
arXiv Detail & Related papers (2023-11-02T02:00:27Z)
- Spatial Implicit Neural Representations for Global-Scale Species Mapping [72.92028508757281]
Given a set of locations where a species has been observed, the goal is to build a model to predict whether the species is present or absent at any location.
Traditional methods struggle to take advantage of emerging large-scale crowdsourced datasets.
We use Spatial Implicit Neural Representations (SINRs) to jointly estimate the geographical ranges of 47k species.
arXiv Detail & Related papers (2023-06-05T03:36:01Z)
- Semi-Supervised Domain Adaptation for Cross-Survey Galaxy Morphology
Classification and Anomaly Detection [57.85347204640585]
We develop a Universal Domain Adaptation method, DeepAstroUDA.
It can be applied to datasets with different types of class overlap.
For the first time, we demonstrate the successful use of domain adaptation on two very different observational datasets.
arXiv Detail & Related papers (2022-11-01T18:07:21Z)
- Diseño y desarrollo de aplicación móvil para la clasificación de
flora nativa chilena utilizando redes neuronales convolucionales [0.0]
This study introduces a dataset of Chilean species and an optimized classification model deployed in a mobile app.
The dataset was built by combining pictures of several species captured in the field with selected pictures from other datasets available online.
The best models were implemented in a mobile app, obtaining a 95% correct prediction rate on the test set.
arXiv Detail & Related papers (2021-06-11T19:43:47Z)
- Streaming Self-Training via Domain-Agnostic Unlabeled Images [62.57647373581592]
We present streaming self-training (SST) that aims to democratize the process of learning visual recognition models.
Key to SST are two crucial observations: (1) domain-agnostic unlabeled images enable us to learn better models with a few labeled examples without any additional knowledge or supervision; and (2) learning is a continuous process and can be done by constructing a schedule of learning updates.
arXiv Detail & Related papers (2021-04-07T17:58:39Z)
- I-Nema: A Biological Image Dataset for Nematode Recognition [3.1918817988202606]
Nematode worms are one of the most abundant metazoan groups on Earth, occupying diverse ecological niches.
Accurate recognition or identification of nematodes is of great importance for pest control, soil ecology, biogeography, habitat conservation, and combating climate change.
Computer vision and image processing have seen a few successes in species recognition of nematodes; however, such methods are still in great demand.
arXiv Detail & Related papers (2021-03-15T12:29:37Z)
- TraND: Transferable Neighborhood Discovery for Unsupervised Cross-domain
Gait Recognition [77.77786072373942]
This paper proposes a Transferable Neighborhood Discovery (TraND) framework to bridge the domain gap for unsupervised cross-domain gait recognition.
We design an end-to-end trainable approach to automatically discover the confident neighborhoods of unlabeled samples in the latent space.
Our method achieves state-of-the-art results on two public datasets, i.e., CASIA-B and OU-LP.
arXiv Detail & Related papers (2021-02-09T03:07:07Z)
- Two-View Fine-grained Classification of Plant Species [66.75915278733197]
We propose a novel method based on a two-view leaf image representation and a hierarchical classification strategy for fine-grained recognition of plant species.
A deep metric based on Siamese convolutional neural networks is used to reduce the dependence on a large number of training samples and make the method scalable to new plant species.
arXiv Detail & Related papers (2020-05-18T21:57:47Z)