AnuraSet: A dataset for benchmarking Neotropical anuran calls
identification in passive acoustic monitoring
- URL: http://arxiv.org/abs/2307.06860v1
- Date: Tue, 11 Jul 2023 22:25:21 GMT
- Authors: Juan Sebastián Cañas, Maria Paula Toro-Gómez, Larissa Sayuri
Moreira Sugai, Hernán Darío Benítez Restrepo, Jorge Rudas, Breyner
Posso Bautista, Luís Felipe Toledo, Simone Dena, Adão Henrique Rosa
Domingos, Franco Leandro de Souza, Selvino Neckel-Oliveira, Anderson da Rosa,
Vítor Carvalho-Rocha, José Vinícius Bernardy, José Luiz Massao
Moreira Sugai, Carolina Emília dos Santos, Rogério Pereira Bastos, Diego
Llusia, Juan Sebastián Ulloa
- Abstract summary: This paper introduces a large-scale dataset of anuran calls recorded by passive acoustic monitoring (PAM).
We provide open access to the dataset, including the raw recordings, experimental setup code, and a benchmark with a baseline model of the fine-grained categorization problem.
We highlight the challenges of the dataset to encourage machine learning researchers to solve the problem of anuran call identification towards conservation policy.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Global change is predicted to induce shifts in anuran acoustic behavior,
which can be studied through passive acoustic monitoring (PAM). Understanding
changes in calling behavior requires the identification of anuran species,
which is challenging due to the particular characteristics of neotropical
soundscapes. In this paper, we introduce a large-scale multi-species dataset of
anuran amphibian calls recorded by PAM that comprises 27 hours of expert
annotations for 42 different species from two Brazilian biomes. We provide open
access to the dataset, including the raw recordings, experimental setup code,
and a benchmark with a baseline model of the fine-grained categorization
problem. Additionally, we highlight the challenges of the dataset to encourage
machine learning researchers to solve the problem of anuran call identification
towards conservation policy. All our experiments and resources can be found on
our GitHub repository https://github.com/soundclim/anuraset.
Related papers
- Enhancing Sample Utilization in Noise-Robust Deep Metric Learning With Subgroup-Based Positive-Pair Selection [84.78475642696137]
The existence of noisy labels in real-world data negatively impacts the performance of deep learning models.
We propose a noise-robust DML framework with SubGroup-based Positive-pair Selection (SGPS).
SGPS constructs reliable positive pairs for noisy samples to improve sample utilization.
arXiv Detail & Related papers (2025-01-19T14:41:55Z)
- Noisy Ostracods: A Fine-Grained, Imbalanced Real-World Dataset for Benchmarking Robust Machine Learning and Label Correction Methods [7.00297060532893]
The Noisy Ostracods dataset is a noisy dataset for genus and species classification of crustacean ostracods.
The noise is open-set, including new classes discovered during curation that were not part of the original annotation.
The Noisy Ostracods dataset is highly imbalanced, with an imbalance factor $\rho$ = 22429.
arXiv Detail & Related papers (2024-12-03T09:30:57Z)
- Automated Bioacoustic Monitoring for South African Bird Species on Unlabeled Data [1.3506669466260703]
The framework automatically extracted labeled data from available platforms for selected avian species.
The labeled data were embedded into recordings, including environmental sounds and noise, and were used to train convolutional recurrent neural network (CRNN) models.
The Adapted SED-CRNN model reached an F1 score of 0.73, demonstrating its efficiency under noisy, real-world conditions.
arXiv Detail & Related papers (2024-06-19T14:14:24Z)
- All Thresholds Barred: Direct Estimation of Call Density in Bioacoustic Data [1.7916003204531015]
We propose a validation scheme for estimating call density in a body of data.
We use these distributions to predict site-level densities, which may be subject to distribution shifts.
arXiv Detail & Related papers (2024-02-23T14:52:44Z)
- Domain Adaptive Synapse Detection with Weak Point Annotations [63.97144211520869]
We present AdaSyn, a framework for domain adaptive synapse detection with weak point annotations.
In the WASPSYN challenge at ISBI 2023, our method ranks first.
arXiv Detail & Related papers (2023-08-31T05:05:53Z)
- Spatial Implicit Neural Representations for Global-Scale Species Mapping [72.92028508757281]
Given a set of locations where a species has been observed, the goal is to build a model to predict whether the species is present or absent at any location.
Traditional methods struggle to take advantage of emerging large-scale crowdsourced datasets.
We use Spatial Implicit Neural Representations (SINRs) to jointly estimate the geographical ranges of 47k species.
arXiv Detail & Related papers (2023-06-05T03:36:01Z)
- Robust Meta-learning with Sampling Noise and Label Noise via Eigen-Reptile [78.1212767880785]
The meta-learner is prone to overfitting since there are only a few available samples.
When handling the data with noisy labels, the meta-learner could be extremely sensitive to label noise.
We present Eigen-Reptile (ER), which updates the meta-parameters with the main direction of historical task-specific parameters.
arXiv Detail & Related papers (2022-06-04T08:48:02Z)
- HumBugDB: A Large-scale Acoustic Mosquito Dataset [15.108701811353097]
This paper presents the first large-scale multi-species dataset of acoustic recordings of mosquitoes tracked continuously in free flight.
We present 20 hours of audio recordings that we have expertly labelled and tagged precisely in time.
18 hours of recordings contain annotations from 36 different species.
arXiv Detail & Related papers (2021-10-14T14:18:17Z)
- Discriminative Singular Spectrum Classifier with Applications on Bioacoustic Signal Recognition [67.4171845020675]
We present a bioacoustic signal classifier equipped with a discriminative mechanism to extract useful features for analysis and classification efficiently.
Unlike current bioacoustic recognition methods, which are task-oriented, the proposed model relies on transforming the input signals into vector subspaces.
The validity of the proposed method is verified using three challenging bioacoustic datasets containing anuran, bee, and mosquito species.
arXiv Detail & Related papers (2021-03-18T11:01:21Z)
- Improving Medical Image Classification with Label Noise Using Dual-uncertainty Estimation [72.0276067144762]
We discuss and define the two common types of label noise in medical images.
We propose an uncertainty estimation-based framework to handle these two types of label noise in the medical image classification task.
arXiv Detail & Related papers (2021-02-28T14:56:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.