Insect Identification in the Wild: The AMI Dataset
- URL: http://arxiv.org/abs/2406.12452v1
- Date: Tue, 18 Jun 2024 09:57:02 GMT
- Title: Insect Identification in the Wild: The AMI Dataset
- Authors: Aditya Jain, Fagner Cunha, Michael James Bunsen, Juan Sebastián Cañas, Léonard Pasi, Nathan Pinoy, Flemming Helsing, JoAnne Russo, Marc Botham, Michael Sabourin, Jonathan Fréchette, Alexandre Anctil, Yacksecari Lopez, Eduardo Navarro, Filonila Perez Pimentel, Ana Cecilia Zamora, José Alejandro Ramirez Silva, Jonathan Gagnon, Tom August, Kim Bjerge, Alba Gomez Segura, Marc Bélisle, Yves Basset, Kent P. McFarland, David Roy, Toke Thomas Høye, Maxim Larrivée, David Rolnick
- Abstract summary: Insects represent half of all global biodiversity, yet many of the world's insects are disappearing.
Despite this crisis, data on insect diversity and abundance remain woefully inadequate.
We provide the first large-scale machine learning benchmarks for fine-grained insect recognition.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Insects represent half of all global biodiversity, yet many of the world's insects are disappearing, with severe implications for ecosystems and agriculture. Despite this crisis, data on insect diversity and abundance remain woefully inadequate, due to the scarcity of human experts and the lack of scalable tools for monitoring. Ecologists have started to adopt camera traps to record and study insects, and have proposed computer vision algorithms as an answer for scalable data processing. However, insect monitoring in the wild poses unique challenges that have not yet been addressed within computer vision, including the combination of long-tailed data, extremely similar classes, and significant distribution shifts. We provide the first large-scale machine learning benchmarks for fine-grained insect recognition, designed to match real-world tasks faced by ecologists. Our contributions include a curated dataset of images from citizen science platforms and museums, and an expert-annotated dataset drawn from automated camera traps across multiple continents, designed to test out-of-distribution generalization under field conditions. We train and evaluate a variety of baseline algorithms and introduce a combination of data augmentation techniques that enhance generalization across geographies and hardware setups. Code and datasets are made publicly available.
Related papers
- Self-supervised transformer-based pre-training method with General Plant Infection dataset (arXiv, 2024-07-20)
  This study proposes an advanced network architecture that combines Contrastive Learning and Masked Image Modeling (MIM). The proposed architecture proves effective for plant pest and disease recognition, achieving notable detection accuracy. The code and dataset will be made publicly available to advance research in this area.
- A machine learning pipeline for automated insect monitoring (arXiv, 2024-06-18)
  Camera traps, conventionally used for monitoring terrestrial vertebrates, are now being adapted for insects, especially moths. The paper describes a complete, open-source machine-learning software pipeline for automated monitoring of moths via camera traps.
- Data Augmentation in Human-Centric Vision (arXiv, 2024-03-13)
  This survey presents a comprehensive analysis of data augmentation techniques in human-centric vision tasks, covering person re-identification, human parsing, human pose estimation, and pedestrian detection. It categorizes augmentation methods into two main types: data generation and data perturbation.
- SatBird: Bird Species Distribution Modeling with Remote Sensing and Citizen Science Data (arXiv, 2023-11-02)
  SatBird is a satellite dataset of locations in the USA with labels derived from presence-absence observation data in the citizen-science database eBird, accompanied by a dataset in Kenya representing low-data regimes. The authors benchmark a set of baselines, including state-of-the-art models for remote-sensing tasks.
- A Step Towards Worldwide Biodiversity Assessment: The BIOSCAN-1M Insect Dataset (arXiv, 2023-07-19)
  This paper presents a curated million-image dataset, primarily intended to train computer-vision models for image-based taxonomic assessment. The dataset also exhibits characteristics of interest to the broader machine-learning community.
- Machine Learning Challenges of Biological Factors in Insect Image Data (arXiv, 2022-11-04)
  The BIOSCAN project studies changes in biodiversity on a global scale; one component focuses on the species interactions and dynamics of all insects. Over 1.5 million images per year will be collected, each requiring taxonomic classification.
- MetaGraspNet: A Large-Scale Benchmark Dataset for Scene-Aware Ambidextrous Bin Picking via Physics-based Metaverse Synthesis (arXiv, 2022-08-08)
  MetaGraspNet is a large-scale photo-realistic bin-picking dataset constructed via physics-based metaverse synthesis. It contains 217k RGBD images across 82 article types, with full annotations for object detection, amodal perception, keypoint detection, manipulation order, and ambidextrous grasp labels for parallel-jaw and vacuum grippers. A companion real dataset of over 2.3k fully annotated high-quality RGBD images is divided into 5 difficulty levels plus an unseen-object set to evaluate different object and layout properties.
- Spatial Monitoring and Insect Behavioural Analysis Using Computer Vision for Precision Pollination (arXiv, 2022-05-10)
  Insects are the most important global pollinators of crops and play a key role in sustaining natural ecosystems, yet current computer-vision-based insect tracking in complex outdoor environments remains restricted in spatial coverage. This article introduces a system for markerless data capture supporting insect counting, motion tracking, behaviour analysis, and pollination prediction.
- REGRAD: A Large-Scale Relational Grasp Dataset for Safe and Object-Specific Robotic Grasping in Clutter (arXiv, 2021-04-29)
  REGRAD is a new dataset supporting the modeling of relationships among objects and grasps, collected as both 2D images and 3D point clouds. Users can import their own object models to generate as much data as they need.
- Automatic image-based identification and biomass estimation of invertebrates (arXiv, 2020-02-05)
  Time-consuming sorting and identification of taxa strongly limit how many insect samples can be processed. The authors propose replacing manual, expert-based sorting and identification with an automatic image-based technology, using state-of-the-art ResNet-50 and InceptionV3 CNNs for the classification task.
This list is automatically generated from the titles and abstracts of the papers on this site.