Deep Learning-Based Acoustic Mosquito Detection in Noisy Conditions
Using Trainable Kernels and Augmentations
- URL: http://arxiv.org/abs/2207.13843v1
- Date: Thu, 28 Jul 2022 01:05:40 GMT
- Title: Deep Learning-Based Acoustic Mosquito Detection in Noisy Conditions
Using Trainable Kernels and Augmentations
- Authors: Devesh Khandelwal, Sean Campos, Shwetha Nagaraj, Fred Nugen, Alberto
Todeschini
- Abstract summary: We demonstrate a unique recipe to enhance the effectiveness of audio machine learning approaches by fusing pre-processing techniques into a deep learning model.
Our solution accelerates training and inference performance by optimizing hyper-parameters through training instead of costly random searches to build a reliable mosquito detector from audio signals.
- Score: 17.77602155559703
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we demonstrate a unique recipe to enhance the effectiveness of
audio machine learning approaches by fusing pre-processing techniques into a
deep learning model. Our solution accelerates training and inference
performance by optimizing hyper-parameters through training instead of costly
random searches to build a reliable mosquito detector from audio signals. The
experiments and the results presented here are part of the MOS C submission of
the ACM 2022 challenge. Our results outperform the published baseline by 212%
on the unpublished test set. We believe that this is one of the best real-world
examples of building a robust bio-acoustic system that provides reliable
mosquito detection in noisy conditions.
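The abstract's central idea, learning pre-processing parameters by gradient descent rather than searching over them, can be illustrated with a toy example. The sketch below is not the authors' architecture: it assumes a single 8-tap FIR kernel, a synthetic sine tone with Gaussian noise, and a mean-squared-error loss, and every name in it (`make_signals`, `apply_kernel`, `train_kernel`) is hypothetical.

```python
import math
import random


def make_signals(n=400, freq=0.05, noise_std=0.5, seed=0):
    """Return (noisy, clean): a synthetic sine tone with and without Gaussian noise."""
    rng = random.Random(seed)
    clean = [math.sin(2 * math.pi * freq * t) for t in range(n)]
    noisy = [c + rng.gauss(0.0, noise_std) for c in clean]
    return noisy, clean


def apply_kernel(x, w):
    """Causal FIR filter: y[t] = sum_j w[j] * x[t - j], with zero padding at the start."""
    taps = len(w)
    return [
        sum(w[j] * x[t - j] for j in range(taps) if t - j >= 0)
        for t in range(len(x))
    ]


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


def train_kernel(x, target, taps=8, lr=0.01, steps=200):
    """Learn the kernel weights by plain gradient descent on the MSE loss.

    The weights play the role of pre-processing parameters that would
    otherwise be hand-tuned or found by random search (toy assumption).
    """
    w = [0.0] * taps
    n = len(x)
    history = []
    for _ in range(steps):
        pred = apply_kernel(x, w)
        history.append(mse(pred, target))
        err = [p - q for p, q in zip(pred, target)]
        # d(loss)/d(w[j]) = (2 / n) * sum_t err[t] * x[t - j]
        grad = [
            2.0 / n * sum(err[t] * x[t - j] for t in range(j, n))
            for j in range(taps)
        ]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    history.append(mse(apply_kernel(x, w), target))
    return w, history


if __name__ == "__main__":
    noisy, clean = make_signals()
    weights, history = train_kernel(noisy, clean)
    print(f"loss before training: {history[0]:.4f}")
    print(f"loss after training:  {history[-1]:.4f}")
```

In a real detector the same principle would apply to a front-end inside the network, so the classification loss tunes the pre-processing and the classifier together. The specific front-end used in the paper is not described in this abstract.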
Related papers
- Contrastive and Transfer Learning for Effective Audio Fingerprinting through a Real-World Evaluation Protocol [1.8842532732272859]
Recent advances in song identification leverage deep neural networks to learn compact audio fingerprints directly from raw waveforms. While these methods perform well under controlled conditions, their accuracy drops significantly in real-world scenarios where the audio is captured via mobile devices in noisy environments. We generate three recordings of the same audio, each with increasing levels of noise, captured using a mobile device's microphone. Our results reveal a substantial performance drop for two state-of-the-art CNN-based models under this protocol, compared to previously reported benchmarks.
arXiv Detail & Related papers (2025-07-08T15:13:26Z) - Synthetic data enables context-aware bioacoustic sound event detection [18.158806322128527]
We propose a methodology for training foundation models that enhances their in-context learning capabilities.
We generate over 8.8 thousand hours of strongly-labeled audio and train a query-by-example, transformer-based model to perform few-shot bioacoustic sound event detection.
We make our trained model available via an API, to provide ecologists and ethologists with a training-free tool for bioacoustic sound event detection.
arXiv Detail & Related papers (2025-03-01T02:03:22Z) - Noise-Robust Target-Speaker Voice Activity Detection Through Self-Supervised Pretraining [21.26555178371168]
Target-Speaker Voice Activity Detection (TS-VAD) is the task of detecting the presence of speech from a known target-speaker in an audio frame.
Deep neural network-based models have shown good performance in this task.
We propose a causal, Self-Supervised Learning (SSL) pretraining framework to enhance TS-VAD performance in noisy conditions.
arXiv Detail & Related papers (2025-01-06T18:00:14Z) - Reproducible Machine Learning-based Voice Pathology Detection: Introducing the Pitch Difference Feature [1.1455937444848385]
We propose a robust set of features derived from a thorough research of contemporary practices in voice pathology detection.
We combine this feature set, containing data from the publicly available Saarbrücken Voice Database (SVD), with preprocessing using the K-Means Synthetic Minority Over-Sampling Technique algorithm.
Our approach has achieved the state-of-the-art performance, measured by unweighted average recall in voice pathology detection.
arXiv Detail & Related papers (2024-10-14T14:17:52Z) - Describe Where You Are: Improving Noise-Robustness for Speech Emotion Recognition with Text Description of the Environment [21.123477804401116]
Speech emotion recognition (SER) systems often struggle in real-world environments, where ambient noise severely degrades their performance.
This paper explores a novel approach that exploits prior knowledge of testing environments to maximize SER performance under noisy conditions.
arXiv Detail & Related papers (2024-07-25T02:30:40Z) - Noisy Pair Corrector for Dense Retrieval [59.312376423104055]
We propose a novel approach called Noisy Pair Corrector (NPC).
NPC consists of a detection module and a correction module.
We conduct experiments on text-retrieval benchmarks Natural Question and TriviaQA, code-search benchmarks StaQC and SO-DS.
arXiv Detail & Related papers (2023-11-07T08:27:14Z) - An Efficient Membership Inference Attack for the Diffusion Model by
Proximal Initialization [58.88327181933151]
In this paper, we propose an efficient query-based membership inference attack (MIA).
Experimental results indicate that the proposed method can achieve competitive performance with only two queries on both discrete-time and continuous-time diffusion models.
To the best of our knowledge, this work is the first to study the robustness of diffusion models to MIA in the text-to-speech task.
arXiv Detail & Related papers (2023-05-26T16:38:48Z) - Improving the Robustness of Summarization Models by Detecting and
Removing Input Noise [50.27105057899601]
We present a large empirical study quantifying the sometimes severe loss in performance from different types of input noise for a range of datasets and model sizes.
We propose a light-weight method for detecting and removing such noise in the input during model inference without requiring any training, auxiliary models, or even prior knowledge of the type of noise.
arXiv Detail & Related papers (2022-12-20T00:33:11Z) - Inference and Denoise: Causal Inference-based Neural Speech Enhancement [83.4641575757706]
This study addresses the speech enhancement (SE) task within the causal inference paradigm by modeling the noise presence as an intervention.
The proposed causal inference-based speech enhancement (CISE) separates clean and noisy frames in an intervened noisy speech using a noise detector and assigns both sets of frames to two mask-based enhancement modules (EMs) to perform noise-conditional SE.
arXiv Detail & Related papers (2022-11-02T15:03:50Z) - NASTAR: Noise Adaptive Speech Enhancement with Target-Conditional
Resampling [34.565077865854484]
We propose noise adaptive speech enhancement with target-conditional resampling (NASTAR).
NASTAR uses a feedback mechanism to simulate adaptive training data via a noise extractor and a retrieval model.
Experimental results show that NASTAR can effectively use one noisy speech sample to adapt an SE model to a target condition.
arXiv Detail & Related papers (2022-06-18T00:15:48Z) - Hard Sample Aware Noise Robust Learning for Histopathology Image
Classification [4.75542005200538]
We introduce a novel hard sample aware noise robust learning method for histopathology image classification.
To distinguish the informative hard samples from the harmful noisy ones, we build an easy/hard/noisy (EHN) detection model.
We propose a noise suppressing and hard enhancing (NSHE) scheme to train the noise robust model.
arXiv Detail & Related papers (2021-12-05T11:07:55Z) - Improving Noise Robustness of Contrastive Speech Representation Learning
with Speech Reconstruction [109.44933866397123]
Noise robustness is essential for deploying automatic speech recognition systems in real-world environments.
We employ a noise-robust representation learned by a refined self-supervised framework for noisy speech recognition.
We achieve comparable performance to the best supervised approach reported with only 16% of labeled data.
arXiv Detail & Related papers (2021-10-28T20:39:02Z) - Rectified Meta-Learning from Noisy Labels for Robust Image-based Plant
Disease Diagnosis [64.82680813427054]
Plant diseases are one of the main threats to food security and crop production.
One popular approach is to recast this problem as a leaf image classification task, which can be addressed by powerful convolutional neural networks (CNNs).
We propose a novel framework that incorporates rectified meta-learning module into common CNN paradigm to train a noise-robust deep network without using extra supervision information.
arXiv Detail & Related papers (2020-03-17T09:51:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.