Recognizing bird species in diverse soundscapes under weak supervision
- URL: http://arxiv.org/abs/2107.07728v1
- Date: Fri, 16 Jul 2021 06:54:38 GMT
- Title: Recognizing bird species in diverse soundscapes under weak supervision
- Authors: Christof Henkel, Pascal Pfeiffer and Philipp Singer
- Abstract summary: We present a robust classification approach for avian vocalization in complex and diverse soundscapes, achieving second place in the BirdCLEF 2021 challenge.
We illustrate how to make full use of pre-trained convolutional neural networks, by using an efficient modeling and training routine supplemented by novel augmentation methods.
- Score: 0.2148535041822524
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a robust classification approach for avian vocalization in complex
and diverse soundscapes, achieving second place in the BirdCLEF 2021 challenge.
We illustrate how to make full use of pre-trained convolutional neural
networks by using an efficient modeling and training routine supplemented by
novel augmentation methods. Thereby, we improve generalization from weakly
labeled crowd-sourced data to production data collected by autonomous recording
units. As such, we illustrate how to progress towards an accurate automated
assessment of avian populations, which would enable global biodiversity
monitoring at a scale that is impossible with manual annotation.
Related papers
- Generalization in birdsong classification: impact of transfer learning methods and dataset characteristics [2.6740633963478095]
We explore the effectiveness of transfer learning in large-scale bird sound classification.
Our experiments demonstrate that both fine-tuning and knowledge distillation yield strong performance.
We advocate for more comprehensive labeling practices within the animal sound community.
arXiv Detail & Related papers (2024-09-21T11:33:12Z)
- Noisy Self-Training with Synthetic Queries for Dense Retrieval [49.49928764695172]
We introduce a novel noisy self-training framework combined with synthetic queries.
Experimental results show that our method improves consistently over existing methods.
Our method is data efficient and outperforms competitive baselines.
arXiv Detail & Related papers (2023-11-27T06:19:50Z)
- Multimodal Foundation Models for Zero-shot Animal Species Recognition in Camera Trap Images [57.96659470133514]
Motion-activated camera traps constitute an efficient tool for tracking and monitoring wildlife populations across the globe.
Supervised learning techniques have been successfully deployed to analyze such imagery; however, training such techniques requires annotations from experts.
Reducing the reliance on costly labelled data has immense potential in developing large-scale wildlife tracking solutions with markedly less human labor.
arXiv Detail & Related papers (2023-11-02T08:32:00Z)
- Active Bird2Vec: Towards End-to-End Bird Sound Monitoring with Transformers [2.404305970432934]
We propose a shift towards end-to-end learning in bird sound monitoring by combining self-supervised learning (SSL) and deep active learning (DAL).
We aim to bypass traditional spectrogram conversions, enabling direct raw audio processing.
arXiv Detail & Related papers (2023-08-14T13:06:10Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- Classification of animal sounds in a hyperdiverse rainforest using Convolutional Neural Networks [0.0]
Automated species detection from passively recorded soundscapes via machine-learning approaches is a promising technique.
We use soundscapes from a tropical forest in Borneo and a Convolutional Neural Network model (CNN) created with transfer learning.
Our results suggest that transfer learning and data augmentation can make the use of CNNs to classify species' vocalizations feasible even for small soundscape-based projects with many rare species.
arXiv Detail & Related papers (2021-11-29T21:34:57Z)
- Parsing Birdsong with Deep Audio Embeddings [0.5599792629509227]
We present a semi-supervised approach to identify characteristic calls and environmental noise.
We utilize several methods to learn a latent representation of audio samples, including a convolutional autoencoder and two pre-trained networks.
arXiv Detail & Related papers (2021-08-20T14:45:44Z)
- Zoo-Tuning: Adaptive Transfer from a Zoo of Models [82.9120546160422]
Zoo-Tuning learns to adaptively transfer the parameters of pretrained models to the target task.
We evaluate our approach on a variety of tasks, including reinforcement learning, image classification, and facial landmark detection.
arXiv Detail & Related papers (2021-06-29T14:09:45Z)
- Iterative Human and Automated Identification of Wildlife Images [25.579224100175434]
Camera trapping is increasingly used to monitor wildlife, but this technology typically requires extensive data annotation.
Our proposed iterative human and automated identification approach is capable of learning from wildlife imagery data with a long-tailed distribution.
Our approach can achieve 90% accuracy while employing only 20% of the human annotations required by existing approaches.
arXiv Detail & Related papers (2021-05-05T20:51:30Z)
- Semi-supervised Long-tailed Recognition using Alternate Sampling [95.93760490301395]
The main challenges in long-tailed recognition come from the imbalanced data distribution and sample scarcity in its tail classes.
We propose a new recognition setting, namely semi-supervised long-tailed recognition.
We demonstrate significant accuracy improvements over other competitive methods on two datasets.
arXiv Detail & Related papers (2021-05-01T00:43:38Z)
- Rectified Meta-Learning from Noisy Labels for Robust Image-based Plant Disease Diagnosis [64.82680813427054]
Plant diseases are one of the main threats to food security and crop production.
One popular approach is to transform this problem into a leaf image classification task, which can be addressed by powerful convolutional neural networks (CNNs).
We propose a novel framework that incorporates a rectified meta-learning module into the common CNN paradigm to train a noise-robust deep network without using extra supervision information.
arXiv Detail & Related papers (2020-03-17T09:51:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.