One-shot learning for acoustic identification of bird species in
non-stationary environments
- URL: http://arxiv.org/abs/2105.00202v1
- Date: Sat, 1 May 2021 09:43:20 GMT
- Title: One-shot learning for acoustic identification of bird species in
non-stationary environments
- Authors: Michelangelo Acconcjaioco and Stavros Ntalampiras
- Abstract summary: We propose a framework able to detect changes in the class dictionary and incorporate new classes on the fly.
We design a one-shot learning architecture composed of a Siamese Neural Network operating in the logMel spectrogram space.
- Score: 5.177947445379688
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work introduces the one-shot learning paradigm in the computational
bioacoustics domain. Even though most of the related literature assumes
availability of data characterizing the entire class dictionary of the problem
at hand, that is rarely true as a habitat's species composition is only known
up to a certain extent. Thus, the problem needs to be addressed by
methodologies able to cope with non-stationarity. To this end, we propose a
framework able to detect changes in the class dictionary and incorporate new
classes on the fly. We design a one-shot learning architecture composed of a
Siamese Neural Network operating in the logMel spectrogram space. We
extensively examine the proposed approach on two datasets of various bird
species using suitable figures of merit. Interestingly, such a learning scheme
exhibits state-of-the-art performance, while taking into account extreme
non-stationarity cases.
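The decision rule described in the abstract (compare a new recording against a single reference per known class, and open a new class when nothing matches) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the input vectors stand in for embeddings that a trained Siamese network would produce from logMel spectrograms, and the `OneShotDictionary` class, the Euclidean distance, the fixed `threshold`, and the `unknown_N` naming are all hypothetical choices made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class OneShotDictionary:
    """Class dictionary holding one reference embedding per known class.

    Hypothetical helper: the embeddings are assumed to come from a
    trained Siamese network, which is not implemented here.
    """

    def __init__(self, threshold):
        self.threshold = threshold  # assumed novelty threshold on distance
        self.references = {}        # label -> single reference embedding

    def add_class(self, label, embedding):
        """Register a class from one example (the 'one shot')."""
        self.references[label] = list(embedding)

    def classify(self, embedding):
        """Return (label, distance_to_nearest_reference, is_new_class)."""
        best_label, best_dist = None, math.inf
        for label, ref in self.references.items():
            d = euclidean(embedding, ref)
            if d < best_dist:
                best_label, best_dist = label, d
        if best_dist > self.threshold:
            # Nothing in the dictionary is close enough: treat the sample
            # as a new class and incorporate it on the fly.
            new_label = f"unknown_{len(self.references)}"
            self.add_class(new_label, embedding)
            return new_label, best_dist, True
        return best_label, best_dist, False


if __name__ == "__main__":
    d = OneShotDictionary(threshold=1.0)
    d.add_class("robin", [0.0, 0.0])
    d.add_class("wren", [3.0, 0.0])
    print(d.classify([0.2, 0.1]))  # nearest is "robin", not a new class
    print(d.classify([0.0, 5.0]))  # far from both, added as "unknown_2"
```

In practice the threshold would have to be tuned on held-out data, and a learned similarity score could replace the plain distance used here.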
Related papers
- Generalization in birdsong classification: impact of transfer learning methods and dataset characteristics [2.6740633963478095]
We explore the effectiveness of transfer learning in large-scale bird sound classification.
Our experiments demonstrate that both fine-tuning and knowledge distillation yield strong performance.
We advocate for more comprehensive labeling practices within the animal sound community.
arXiv Detail & Related papers (2024-09-21T11:33:12Z)
- Detecting Statements in Text: A Domain-Agnostic Few-Shot Solution [1.3654846342364308]
State-of-the-art approaches usually involve fine-tuning models on large annotated datasets, which are costly to produce.
We propose and release a qualitative and versatile few-shot learning methodology as a common paradigm for any claim-based textual classification task.
We illustrate this methodology in the context of three tasks: climate change contrarianism detection, topic/stance classification and depression-related symptoms detection.
arXiv Detail & Related papers (2024-05-09T12:03:38Z)
- Open-World Semantic Segmentation Including Class Similarity [31.799000996671975]
This paper tackles open-world semantic segmentation, i.e., the variant of interpreting image data in which objects occur that have not been seen during training.
We propose a novel approach that performs accurate closed-world semantic segmentation and can identify new categories without requiring any additional training data.
arXiv Detail & Related papers (2024-03-12T11:11:19Z)
- Improving Primate Sounds Classification using Binary Presorting for Deep Learning [6.044912425856236]
In this work, we introduce a generalized approach that first relabels subsegments of MEL spectrogram representations.
For both the binary pre-sorting and the classification, we make use of convolutional neural networks (CNN) and various data-augmentation techniques.
We showcase the results of this approach on the challenging ComparE 2021 dataset, with the task of classifying between different primate species sounds.
arXiv Detail & Related papers (2023-06-28T09:35:09Z)
- A Closer Look at Few-shot Classification Again [68.44963578735877]
Few-shot classification consists of a training phase and an adaptation phase.
We empirically prove that the training algorithm and the adaptation algorithm can be completely disentangled.
Our meta-analysis for each phase reveals several interesting insights that may help better understand key aspects of few-shot classification.
arXiv Detail & Related papers (2023-01-28T16:42:05Z)
- Few-shot Open-set Recognition Using Background as Unknowns [58.04165813493666]
Few-shot open-set recognition aims to classify both seen and novel images given only limited training data of seen classes.
Our proposed method not only outperforms multiple baselines but also sets new results on three popular benchmarks.
arXiv Detail & Related papers (2022-07-19T04:19:29Z)
- Resolving label uncertainty with implicit posterior models [71.62113762278963]
We propose a method for jointly inferring labels across a collection of data samples.
By implicitly assuming the existence of a generative model for which a differentiable predictor is the posterior, we derive a training objective that allows learning under weak beliefs.
arXiv Detail & Related papers (2022-02-28T18:09:44Z)
- Parsing Birdsong with Deep Audio Embeddings [0.5599792629509227]
We present a semi-supervised approach to identify characteristic calls and environmental noise.
We utilize several methods to learn a latent representation of audio samples, including a convolutional autoencoder and two pre-trained networks.
arXiv Detail & Related papers (2021-08-20T14:45:44Z)
- Intersection Regularization for Extracting Semantic Attributes [72.53481390411173]
We consider the problem of supervised classification, such that the features that the network extracts match an unseen set of semantic attributes.
For example, when learning to classify images of birds into species, we would like to observe the emergence of features that zoologists use to classify birds.
We propose training a neural network with discrete top-level activations, which is followed by a multi-layered perceptron (MLP) and a parallel decision tree.
arXiv Detail & Related papers (2021-03-22T14:32:44Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
- Transferring Dense Pose to Proximal Animal Classes [83.84439508978126]
We show that it is possible to transfer the knowledge existing in dense pose recognition for humans, as well as in more general object detectors and segmenters, to the problem of dense pose recognition in other classes.
We do this by establishing a DensePose model for the new animal which is also geometrically aligned to humans.
We also introduce two benchmark datasets labelled in the manner of DensePose for the class chimpanzee and use them to evaluate our approach.
arXiv Detail & Related papers (2020-02-28T21:43:53Z)