All Thresholds Barred: Direct Estimation of Call Density in Bioacoustic
Data
- URL: http://arxiv.org/abs/2402.15360v1
- Date: Fri, 23 Feb 2024 14:52:44 GMT
- Title: All Thresholds Barred: Direct Estimation of Call Density in Bioacoustic
Data
- Authors: Amanda K. Navine, Tom Denton, Matthew J. Weldy, Patrick J. Hart
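- Abstract summary: We propose a validation scheme for estimating call density in a body of data and obtain, through Bayesian reasoning, probability distributions of confidence scores for the positive and negative classes.
We use these distributions to predict site-level densities, which may be subject to distribution shifts.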
- Score: 1.7916003204531015
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Passive acoustic monitoring (PAM) studies generate thousands of hours of
audio, which may be used to monitor specific animal populations, conduct broad
biodiversity surveys, detect threats such as poachers, and more. Machine
learning classifiers for species identification are increasingly being used to
process the vast amount of audio generated by bioacoustic surveys, expediting
analysis and increasing the utility of PAM as a management tool. In common
practice, a threshold is applied to classifier output scores, and scores above
the threshold are aggregated into a detection count. The choice of threshold
produces biased counts of vocalizations, which are subject to false
positive/negative rates that may vary across subsets of the dataset. In this
work, we advocate for directly estimating call density: the proportion of
detection windows containing the target vocalization, regardless of classifier
score. Our approach targets a desirable ecological estimator and provides a
more rigorous grounding for identifying the core problems caused by
distribution shifts -- when the defining characteristics of the data
distribution change -- and designing strategies to mitigate them. We propose a
validation scheme for estimating call density in a body of data and obtain,
through Bayesian reasoning, probability distributions of confidence scores for
both the positive and negative classes. We use these distributions to predict
site-level densities, which may be subject to distribution shifts. We test our
proposed methods on a real-world study of Hawaiian birds and provide simulation
results leveraging existing fully annotated datasets, demonstrating robustness
to variations in call density and classifier model quality.
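To make the contrast in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It compares a thresholded detection rate with a direct density estimate obtained by binning classifier scores, validating a small random sample from each bin, and weighting each bin's estimated positive rate by that bin's share of windows. The bin edges, the per-bin sample size, the uniform Beta prior, the synthetic Gaussian scores, and the names `threshold_density`, `binned_density`, and `validate` are all assumptions made for illustration; the abstract does not specify these details.

```python
import random


def threshold_density(scores, threshold):
    """Fraction of windows whose classifier score exceeds the threshold."""
    return sum(s > threshold for s in scores) / len(scores)


def binned_density(scores, validate, edges, per_bin, rng):
    """Estimate call density as sum over bins of P(bin) * P(call | bin).

    validate(i) -> bool is a hypothetical stand-in for a human listener
    labelling window i. P(call | bin) is the mean of a Beta(1 + k, 1 + n - k)
    posterior, i.e. a uniform prior updated with k positives out of n
    validated windows. edges must be sorted in ascending order.
    """
    bins = [[] for _ in range(len(edges) + 1)]
    for i, s in enumerate(scores):
        b = sum(s >= e for e in edges)  # index of the bin that holds score s
        bins[b].append(i)

    density = 0.0
    for members in bins:
        if not members:
            continue
        sample = rng.sample(members, min(per_bin, len(members)))
        k = sum(validate(i) for i in sample)
        p_call = (1 + k) / (2 + len(sample))  # posterior mean for this bin
        density += len(members) / len(scores) * p_call
    return density


# Synthetic data (assumption): 10% of windows contain a call, and scores
# for the two classes overlap, so any single threshold misclassifies some.
rng = random.Random(0)
labels = [rng.random() < 0.1 for _ in range(20000)]
scores = [rng.gauss(2.0 if y else -2.0, 1.5) for y in labels]

true_density = sum(labels) / len(labels)
est = binned_density(scores, lambda i: labels[i],
                     edges=[-2.0, 0.0, 2.0], per_bin=200, rng=rng)

print("true density:        ", round(true_density, 4))
print("thresholded at 0.0:  ", round(threshold_density(scores, 0.0), 4))
print("binned validation:   ", round(est, 4))
```

In this sketch the thresholded rate moves with the choice of threshold, while the binned estimate targets the proportion of windows containing a call directly, at the cost of validating up to `per_bin` windows in each score bin.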
Related papers
- Sequential Change Point Detection via Denoising Score Matching [8.22915954499148]
This paper proposes a score-based CUSUM change-point detection, in which the score functions of the data distribution are estimated by injecting noise.
We validate the practical efficacy of our method through numerical experiments on two synthetic datasets and a real-world earthquake precursor detection task.
arXiv Detail & Related papers (2025-01-22T06:04:57Z)
- Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance.
DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator.
Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
- Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698]
We develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching.
By setting a manually-specific probability measure, we can reduce the side-effects of noisy and long-tailed data simultaneously.
Our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
arXiv Detail & Related papers (2024-04-10T07:34:37Z)
- Towards Automated Animal Density Estimation with Acoustic Spatial Capture-Recapture [2.5193666094305938]
Digital recorders allow surveyors to gather large volumes of data at low cost.
But identifying target species vocalisations in these data is non-trivial.
Machine learning (ML) methods are often used to do the identification.
We propose three methods for acoustic spatial capture-recapture inference.
arXiv Detail & Related papers (2023-08-24T15:29:24Z)
- Optimizing the Noise in Self-Supervised Learning: from Importance Sampling to Noise-Contrastive Estimation [80.07065346699005]
It is widely assumed that the optimal noise distribution should be made equal to the data distribution, as in Generative Adversarial Networks (GANs).
We turn to Noise-Contrastive Estimation which grounds this self-supervised task as an estimation problem of an energy-based model of the data.
We soberly conclude that the optimal noise may be hard to sample from, and the gain in efficiency can be modest compared to choosing the noise distribution equal to the data's.
arXiv Detail & Related papers (2023-01-23T19:57:58Z)
- Propagating Variational Model Uncertainty for Bioacoustic Call Label Smoothing [15.929064190849665]
We focus on using the predictive uncertainty signal calculated by Bayesian neural networks to guide learning in the self-same task the model is being trained on.
Not opting for costly Monte Carlo sampling of weights, we propagate the approximate hidden variance in an end-to-end manner.
We show that, through the explicit usage of the uncertainty in the loss calculation, the variational model is led to improved predictive and calibration performance.
arXiv Detail & Related papers (2022-10-19T13:04:26Z)
- Learning to Adapt to Domain Shifts with Few-shot Samples in Anomalous Sound Detection [7.631596468553607]
Anomaly detection has many important applications, such as monitoring industrial equipment.
We propose a framework that adapts to new conditions with few-shot samples.
We evaluate our proposed method on a recently-released dataset of audio measurements from different machine types.
arXiv Detail & Related papers (2022-04-05T00:22:25Z)
- Benchmarking Uncertainty Qualification on Biosignal Classification Tasks under Dataset Shift [16.15816241847314]
We propose a framework to evaluate the capability of the estimated uncertainty in capturing different types of biosignal dataset shifts.
In particular, we use three classification tasks based on respiratory sounds and electrocardiography signals to benchmark five representative uncertainty qualification methods.
arXiv Detail & Related papers (2021-12-16T20:42:17Z)
- Discriminative Singular Spectrum Classifier with Applications on Bioacoustic Signal Recognition [67.4171845020675]
We present a bioacoustic signal classifier equipped with a discriminative mechanism to extract useful features for analysis and classification efficiently.
Unlike current bioacoustic recognition methods, which are task-oriented, the proposed model relies on transforming the input signals into vector subspaces.
The validity of the proposed method is verified using three challenging bioacoustic datasets containing anuran, bee, and mosquito species.
arXiv Detail & Related papers (2021-03-18T11:01:21Z)
- Deep Semi-supervised Knowledge Distillation for Overlapping Cervical Cell Instance Segmentation [54.49894381464853]
We propose to leverage both labeled and unlabeled data for instance segmentation with improved accuracy by knowledge distillation.
We propose a novel Mask-guided Mean Teacher framework with Perturbation-sensitive Sample Mining.
Experiments show that the proposed method improves the performance significantly compared with the supervised method learned from labeled data only.
arXiv Detail & Related papers (2020-07-21T13:27:09Z)
- Unsupervised Domain Adaptation for Acoustic Scene Classification Using Band-Wise Statistics Matching [69.24460241328521]
Machine learning algorithms can be negatively affected by mismatches between training (source) and test (target) data distributions.
We propose an unsupervised domain adaptation method that consists of aligning the first- and second-order sample statistics of each frequency band of target-domain acoustic scenes to the ones of the source-domain training dataset.
We show that the proposed method outperforms the state-of-the-art unsupervised methods found in the literature in terms of both source- and target-domain classification accuracy.
arXiv Detail & Related papers (2020-04-30T23:56:05Z)