Semi-supervised classification of bird vocalizations
- URL: http://arxiv.org/abs/2502.13440v1
- Date: Wed, 19 Feb 2025 05:31:13 GMT
- Title: Semi-supervised classification of bird vocalizations
- Authors: Simen Hexeberg, Mandar Chitre, Matthias Hoffmann-Kuhnt, Bing Wen Low,
- Abstract summary: Changes in bird populations can indicate broader changes in ecosystems.
We propose a semi-supervised acoustic bird detector to allow the detection of time-overlapping calls.
It achieves a mean F0.5 score of 0.701 across 315 classes from 110 bird species on a hold-out test set.
- Score: 0.0
- Abstract: Changes in bird populations can indicate broader changes in ecosystems, making birds one of the most important animal groups to monitor. Combining machine learning and passive acoustics enables continuous monitoring over extended periods without direct human involvement. However, most existing techniques require extensive expert-labeled datasets for training and cannot easily detect time-overlapping calls in busy soundscapes. We propose a semi-supervised acoustic bird detector designed to allow both the detection of time-overlapping calls (when separated in frequency) and the use of few labeled training samples. The classifier is trained and evaluated on a combination of community-recorded open-source data and long-duration soundscape recordings from Singapore. It achieves a mean F0.5 score of 0.701 across 315 classes from 110 bird species on a hold-out test set, with an average of 11 labeled training samples per class. It outperforms the state-of-the-art BirdNET classifier on a test set of 103 bird species despite significantly fewer labeled training samples. The detector is further tested on 144 microphone-hours of continuous soundscape data. The rich soundscape in Singapore makes suppression of false positives a challenge on raw, continuous data streams. Nevertheless, we demonstrate that achieving high precision in such environments with minimal labeled training data is possible.
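The F0.5 score quoted in the abstract is the F-beta measure with beta = 0.5, which weights precision more heavily than recall. The sketch below shows how such a score can be computed from per-class counts and then averaged over classes. The function names `f_beta` and `mean_f_beta` and the example counts are hypothetical, and averaging with equal weight per class is an assumption about how the paper's mean is formed, not something the abstract confirms.

```python
from typing import Iterable, Tuple


def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta from raw counts; beta < 1 favors precision over recall."""
    b2 = beta * beta
    # Equivalent to (1 + b2) * P * R / (b2 * P + R), written in counts
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def mean_f_beta(counts: Iterable[Tuple[int, int, int]], beta: float = 0.5) -> float:
    """Unweighted mean of per-class F-beta (assumed averaging scheme)."""
    scores = [f_beta(tp, fp, fn, beta) for tp, fp, fn in counts]
    return sum(scores) / len(scores)


# Illustrative (made-up) counts per class: (true pos, false pos, false neg)
example_counts = [(8, 2, 8), (5, 0, 5)]
print(round(mean_f_beta(example_counts), 3))
```

For the first example class, precision is 0.8 and recall is 0.5, so F0.5 is about 0.714 while F1 would be about 0.615, which illustrates how the F0.5 measure rewards a detector that suppresses false positives.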
Related papers
- NBM: an Open Dataset for the Acoustic Monitoring of Nocturnal Migratory Birds in Europe [0.0]
This work presents the Nocturnal Bird Migration dataset, a collection of 13,359 annotated vocalizations from 117 species of the Western Palearctic.
The dataset includes precise time and frequency annotations, gathered by dozens of bird enthusiasts across France.
In particular, we prove the utility of this database by training an original two-stage deep object detection model tailored for the processing of audio data.
arXiv Detail & Related papers (2024-12-04T18:55:45Z)
- Automated Bioacoustic Monitoring for South African Bird Species on Unlabeled Data [1.3506669466260703]
The framework automatically extracted labeled data from available platforms for selected avian species.
The labeled data were embedded into recordings, including environmental sounds and noise, and were used to train convolutional recurrent neural network (CRNN) models.
The Adapted SED-CRNN model reached an F1 score of 0.73, demonstrating its efficiency under noisy, real-world conditions.
arXiv Detail & Related papers (2024-06-19T14:14:24Z)
- AudioProtoPNet: An interpretable deep learning model for bird sound classification [1.49199020343864]
This study introduces AudioProtoPNet, an adaptation of the Prototypical Part Network (ProtoPNet) for multi-label bird sound classification.
It is an inherently interpretable model that uses a ConvNeXt backbone to extract embeddings.
The model was trained on the BirdSet training dataset, which consists of 9,734 bird species and over 6,800 hours of recordings.
arXiv Detail & Related papers (2024-04-16T09:37:41Z)
- Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698]
We develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching.
By setting a manually-specific probability measure, we can reduce the side-effects of noisy and long-tailed data simultaneously.
Our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
arXiv Detail & Related papers (2024-04-10T07:34:37Z)
- Exploring Meta Information for Audio-based Zero-shot Bird Classification [113.17261694996051]
This study investigates how meta-information can improve zero-shot audio classification.
We use bird species as an example case study due to the availability of rich and diverse meta-data.
arXiv Detail & Related papers (2023-09-15T13:50:16Z)
- Self-Supervised Pretraining Improves Performance and Inference Efficiency in Multiple Lung Ultrasound Interpretation Tasks [65.23740556896654]
We investigated whether self-supervised pretraining could produce a neural network feature extractor applicable to multiple classification tasks in lung ultrasound analysis.
When fine-tuning on three lung ultrasound tasks, pretrained models resulted in an improvement of the average across-task area under the receiver operating curve (AUC) by 0.032 and 0.061 on local and external test sets respectively.
arXiv Detail & Related papers (2023-09-05T21:36:42Z)
- Deep object detection for waterbird monitoring using aerial imagery [56.1262568293658]
In this work, we present a deep learning pipeline that can be used to precisely detect, count, and monitor waterbirds using aerial imagery collected by a commercial drone.
By utilizing convolutional neural network-based object detectors, we show that we can detect 16 classes of waterbird species that are commonly found in colonial nesting islands along the Texas coast.
arXiv Detail & Related papers (2022-10-10T17:37:56Z)
- Few-shot Long-Tailed Bird Audio Recognition [3.8073142980733]
We propose a sound detection and classification pipeline to analyze soundscape recordings.
Our solution achieved 18th place of 807 teams at the BirdCLEF 2022 Challenge hosted on Kaggle.
arXiv Detail & Related papers (2022-06-22T04:14:25Z)
- Reliable Label Correction is a Good Booster When Learning with Extremely Noisy Labels [65.79898033530408]
We introduce a novel framework, termed LC-Booster, to explicitly tackle learning under extreme noise.
LC-Booster incorporates label correction into the sample selection, so that more purified samples, through the reliable label correction, can be utilized for training.
Experiments show that LC-Booster advances state-of-the-art results on several noisy-label benchmarks.
arXiv Detail & Related papers (2022-04-30T07:19:03Z)
- UNICON: Combating Label Noise Through Uniform Selection and Contrastive Learning [89.56465237941013]
We propose UNICON, a simple yet effective sample selection method which is robust to high label noise.
We obtain an 11.4% improvement over the current state-of-the-art on the CIFAR100 dataset with a 90% noise rate.
arXiv Detail & Related papers (2022-03-28T07:36:36Z)
- Recognizing bird species in diverse soundscapes under weak supervision [0.2148535041822524]
We present a robust classification approach for avian vocalization in complex and diverse soundscapes, achieving second place in the BirdCLEF 2021 challenge.
We illustrate how to make full use of pre-trained convolutional neural networks, by using an efficient modeling and training routine supplemented by novel augmentation methods.
arXiv Detail & Related papers (2021-07-16T06:54:38Z)