Related papers: An empirical investigation into audio pipeline approaches for classifying bird species

URL: http://arxiv.org/abs/2108.04449v1
Date: Tue, 10 Aug 2021 05:02:38 GMT
Title: An empirical investigation into audio pipeline approaches for classifying bird species
Authors: David Behr, Ciira wa Maina, Vukosi Marivate
Abstract summary: This paper is an investigation into aspects of an audio classification pipeline that will be appropriate for the monitoring of bird species on edge devices. Two classification approaches will be taken into consideration, one which explores the effectiveness of a traditional Deep Neural Network (DNN) and another that makes use of Convolutional layers.
Score: 0.9158130615768508
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper is an investigation into aspects of an audio classification pipeline that will be appropriate for the monitoring of bird species on edge devices. These aspects include transfer learning, data augmentation and model optimization. The hope is that the resulting models will be good candidates to deploy on edge devices to monitor bird populations. Two classification approaches will be taken into consideration, one which explores the effectiveness of a traditional Deep Neural Network (DNN) and another that makes use of Convolutional layers. This study aims to contribute empirical evidence of the merits and demerits of each approach.

Related papers

An Automated Pipeline for Few-Shot Bird Call Classification: A Case Study with the Tooth-Billed Pigeon [0.6282171844772422]
This paper presents an automated one-shot bird call classification pipeline designed for rare species absent from large publicly available classifiers like BirdNET and Perch. We leverage the embedding space of large bird classification networks and develop a classifier using cosine similarity, combined with filtering and denoising preprocessing techniques. The final model achieved 1.0 recall and 0.95 accuracy in detecting tooth-billed pigeon calls, making it practical for use in the field.
arXiv Detail & Related papers (2025-04-22T21:21:41Z)
Visual WetlandBirds Dataset: Bird Species Identification and Behavior Recognition in Videos [0.0]
This study introduces the first fine-grained video dataset specifically designed for bird behavior detection and species classification. The proposed dataset comprises 178 videos recorded in Spanish wetlands, capturing 13 different bird species performing 7 distinct behavior classes.
arXiv Detail & Related papers (2025-01-15T16:34:20Z)
AudioProtoPNet: An interpretable deep learning model for bird sound classification [1.49199020343864]
This study introduces AudioProtoPNet, an adaptation of the Prototypical Part Network (ProtoPNet) for multi-label bird sound classification. It is an inherently interpretable model that uses a ConvNeXt backbone to extract embeddings. The model was trained on the BirdSet training dataset, which consists of 9,734 bird species and over 6,800 hours of recordings.
arXiv Detail & Related papers (2024-04-16T09:37:41Z)
Efficient Transferability Assessment for Selection of Pre-trained Detectors [63.21514888618542]
This paper studies the efficient transferability assessment of pre-trained object detectors. We build up a detector transferability benchmark which contains a large and diverse zoo of pre-trained detectors. Experimental results demonstrate that our method outperforms other state-of-the-art approaches in assessing transferability.
arXiv Detail & Related papers (2024-03-14T14:23:23Z)
Exploring Meta Information for Audio-based Zero-shot Bird Classification [113.17261694996051]
This study investigates how meta-information can improve zero-shot audio classification. We use bird species as an example case study due to the availability of rich and diverse meta-data.
arXiv Detail & Related papers (2023-09-15T13:50:16Z)
Active Bird2Vec: Towards End-to-End Bird Sound Monitoring with Transformers [2.404305970432934]
We propose a shift towards end-to-end learning in bird sound monitoring by combining self-supervised learning (SSL) and deep active learning (DAL). We aim to bypass traditional spectrogram conversions, enabling direct raw audio processing.
arXiv Detail & Related papers (2023-08-14T13:06:10Z)
Transfer Learning with Semi-Supervised Dataset Annotation for Birdcall Classification [0.0]
We present working notes on transfer learning with semi-supervised dataset annotation for the BirdCLEF 2023 competition. Our approach utilizes existing off-the-shelf models, BirdNET and MixIT, to address representation and labeling challenges in the competition.
arXiv Detail & Related papers (2023-06-29T07:56:27Z)
Improving Primate Sounds Classification using Binary Presorting for Deep Learning [6.044912425856236]
In this work, we introduce a generalized approach that first relabels subsegments of MEL spectrogram representations. For both the binary pre-sorting and the classification, we make use of convolutional neural networks (CNN) and various data-augmentation techniques. We showcase the results of this approach on the challenging ComparE 2021 dataset, with the task of classifying between different primate species sounds.
arXiv Detail & Related papers (2023-06-28T09:35:09Z)
Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data. The main aim of the identified model is to predict new data from previous observations. We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z)
Guiding Generative Language Models for Data Augmentation in Few-Shot Text Classification [59.698811329287174]
We leverage GPT-2 for generating artificial training instances in order to improve classification performance. Our results show that fine-tuning GPT-2 on a handful of labeled instances leads to consistent classification improvements.
arXiv Detail & Related papers (2021-11-17T12:10:03Z)
Parsing Birdsong with Deep Audio Embeddings [0.5599792629509227]
We present a semi-supervised approach to identify characteristic calls and environmental noise. We utilize several methods to learn a latent representation of audio samples, including a convolutional autoencoder and two pre-trained networks.
arXiv Detail & Related papers (2021-08-20T14:45:44Z)
Intersection Regularization for Extracting Semantic Attributes [72.53481390411173]
We consider the problem of supervised classification, such that the features that the network extracts match an unseen set of semantic attributes. For example, when learning to classify images of birds into species, we would like to observe the emergence of features that zoologists use to classify birds. We propose training a neural network with discrete top-level activations, which is followed by a multi-layered perceptron (MLP) and a parallel decision tree.
arXiv Detail & Related papers (2021-03-22T14:32:44Z)
Classification of Smoking and Calling using Deep Learning [49.10965021800014]
A pipeline is introduced to perform the classification of smoking and calling by modifying the pretrained V3. Brightness enhancement based on deep learning is implemented to improve performance on this classification task, along with other useful training tricks.
arXiv Detail & Related papers (2020-12-15T00:59:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.