Unsupervised outlier detection to improve bird audio dataset labels
- URL: http://arxiv.org/abs/2504.18650v1
- Date: Fri, 25 Apr 2025 19:04:40 GMT
- Title: Unsupervised outlier detection to improve bird audio dataset labels
- Authors: Bruce Collins
- Abstract summary: Non-target bird species sounds can result in dataset labeling discrepancies referred to as label noise. We present a cleaning process consisting of audio preprocessing followed by dimensionality reduction and unsupervised outlier detection.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Xeno-Canto bird audio repository is an invaluable resource for those interested in vocalizations and other sounds made by birds around the world. This is particularly the case for machine learning researchers attempting to improve on the bird species recognition accuracy of classification models. However, the task of extracting labeled datasets from the recordings found in this crowd-sourced repository faces several challenges. One challenge of particular significance to machine learning practitioners is that one bird species label is applied to each audio recording, but frequently other sounds are also captured, including other bird species, other animal sounds, and anthropogenic and other ambient sounds. These non-target bird species sounds can result in dataset labeling discrepancies referred to as label noise. In this work, we present a cleaning process consisting of audio preprocessing followed by dimensionality reduction and unsupervised outlier detection (UOD) to reduce the label noise in a dataset derived from Xeno-Canto recordings. We investigate three neural network dimensionality reduction techniques: two flavors of convolutional autoencoders and variational deep embedding (VaDE (Jiang, 2017)). While these methods show some degree of effectiveness at detecting outliers for most bird species datasets, we found significant variation in their performance from one species to the next. We believe that the results of this investigation demonstrate that the application of our cleaning process can meaningfully reduce the label noise of bird species datasets derived from the Xeno-Canto audio repository, but results vary across species.
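The outlier-detection step of the cleaning process described in the abstract can be illustrated with a minimal sketch. It assumes that each clip labeled with one species has already been reduced to a low-dimensional embedding (for example by an autoencoder); the function name `outlier_flags`, the sample `clips`, and the median/MAD scoring rule with its `threshold` are illustrative assumptions, not the method used in the paper.

```python
import math
import statistics


def outlier_flags(embeddings, threshold=3.5):
    """Flag embeddings that lie unusually far from the bulk of one species' clips.

    Assumption: distance to the coordinate-wise median, turned into a robust
    z-score with the median absolute deviation (MAD), stands in for whatever
    unsupervised outlier detector is actually applied.
    """
    dims = range(len(embeddings[0]))
    # Robust center of the embedding cloud.
    center = [statistics.median(e[d] for e in embeddings) for d in dims]
    dists = [math.dist(e, center) for e in embeddings]
    med = statistics.median(dists)
    mad = statistics.median(abs(d - med) for d in dists)
    if mad == 0:
        # No spread in the distances: nothing can be called an outlier.
        return [False] * len(embeddings)
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [0.6745 * (d - med) / mad > threshold for d in dists]


# Hypothetical 2-D embeddings for clips that share one species label.
clips = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.4), (-0.6, 0.0),
         (0.0, -0.8), (1.0, 0.0), (6.0, 8.0)]
flags = outlier_flags(clips)
suspect = [i for i, flagged in enumerate(flags) if flagged]
```

Here only the last clip, which sits far from the others, ends up in `suspect`; such clips would be candidates for removal or manual review rather than being dropped automatically.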
Related papers
- An Automated Pipeline for Few-Shot Bird Call Classification: A Case Study with the Tooth-Billed Pigeon [0.6282171844772422]
This paper presents an automated one-shot bird call classification pipeline designed for rare species absent from large publicly available classifiers like BirdNET and Perch.
We leverage the embedding space of large bird classification networks and develop a classifier using cosine similarity, combined with filtering and denoising preprocessing techniques.
The final model achieved 1.0 recall and 0.95 accuracy in detecting tooth-billed pigeon calls, making it practical for use in the field.
arXiv Detail & Related papers (2025-04-22T21:21:41Z)
- A Bird Song Detector for improving bird identification through Deep Learning: a case study from Doñana [2.7924253850013416]
We develop a pipeline for automatic bird vocalization identification in Doñana National Park (SW Spain). We manually annotated 461 minutes of audio from three habitats across nine locations, yielding 3,749 annotations for 34 classes. Applying the Bird Song Detector before classification improved species identification, as all classification models performed better when analyzing only the segments where birds were detected.
arXiv Detail & Related papers (2025-03-19T13:19:06Z)
- NBM: an Open Dataset for the Acoustic Monitoring of Nocturnal Migratory Birds in Europe [0.0]
This work presents the Nocturnal Bird Migration dataset, a collection of 13,359 annotated vocalizations from 117 species of the Western Palearctic. The dataset includes precise time and frequency annotations, gathered by dozens of bird enthusiasts across France. In particular, we prove the utility of this database by training an original two-stage deep object detection model tailored for the processing of audio data.
arXiv Detail & Related papers (2024-12-04T18:55:45Z)
- Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698]
We develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching.
By setting a manually-specific probability measure, we can reduce the side-effects of noisy and long-tailed data simultaneously.
Our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
arXiv Detail & Related papers (2024-04-10T07:34:37Z)
- Exploring Meta Information for Audio-based Zero-shot Bird Classification [113.17261694996051]
This study investigates how meta-information can improve zero-shot audio classification.
We use bird species as an example case study due to the availability of rich and diverse meta-data.
arXiv Detail & Related papers (2023-09-15T13:50:16Z)
- Unsupervised classification to improve the quality of a bird song recording dataset [0.0]
We introduce a data-centric novel labelling function composed of three successive steps: time-frequency sound unit segmentation, feature computation for each sound unit, and classification of each sound unit as bird song or noise.
Our labelling function was able to significantly reduce the initial label noise present in the dataset by up to a factor of three.
arXiv Detail & Related papers (2023-02-15T10:01:58Z)
- Robust Meta-learning with Sampling Noise and Label Noise via Eigen-Reptile [78.1212767880785]
The meta-learner is prone to overfitting since there are only a few available samples.
When handling the data with noisy labels, the meta-learner could be extremely sensitive to label noise.
We present Eigen-Reptile (ER) that updates the meta-parameters with the main direction of historical task-specific parameters.
arXiv Detail & Related papers (2022-06-04T08:48:02Z)
- Training Classifiers that are Universally Robust to All Label Noise Levels [91.13870793906968]
Deep neural networks are prone to overfitting in the presence of label noise.
We propose a distillation-based framework that incorporates a new subcategory of Positive-Unlabeled learning.
Our framework generally outperforms at medium to high noise levels.
arXiv Detail & Related papers (2021-05-27T13:49:31Z)
- Modelling Animal Biodiversity Using Acoustic Monitoring and Deep Learning [0.0]
This paper outlines an approach for achieving this using state-of-the-art machine learning to automatically extract features from time-series audio signals.
The acquired bird songs are processed using mel-frequency cepstrum (MFC) to extract features, which are later classified using a multilayer perceptron (MLP).
Our proposed method achieved promising results with 0.74 sensitivity, 0.92 specificity and an accuracy of 0.74.
arXiv Detail & Related papers (2021-03-12T13:50:31Z)
- Improving Medical Image Classification with Label Noise Using Dual-uncertainty Estimation [72.0276067144762]
We discuss and define the two common types of label noise in medical images.
We propose an uncertainty estimation-based framework to handle these two types of label noise in the medical image classification task.
arXiv Detail & Related papers (2021-02-28T14:56:45Z)
- A Second-Order Approach to Learning with Instance-Dependent Label Noise [58.555527517928596]
The presence of label noise often misleads the training of deep neural networks.
We show that the errors in human-annotated labels are more likely to be dependent on the difficulty levels of tasks.
arXiv Detail & Related papers (2020-12-22T06:36:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.