MIMII-Gen: Generative Modeling Approach for Simulated Evaluation of Anomalous Sound Detection System
- URL: http://arxiv.org/abs/2409.18542v1
- Date: Fri, 27 Sep 2024 08:21:31 GMT
- Title: MIMII-Gen: Generative Modeling Approach for Simulated Evaluation of Anomalous Sound Detection System
- Authors: Harsh Purohit, Tomoya Nishida, Kota Dohi, Takashi Endo, Yohei Kawaguchi
- Abstract summary: Insufficient recordings and the scarcity of anomalies present significant challenges in developing robust anomaly detection systems.
We propose a novel approach for generating diverse anomalies in machine sound using a latent diffusion-based model that integrates an encoder-decoder framework.
- Score: 5.578413517654703
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Insufficient recordings and the scarcity of anomalies present significant challenges in developing and validating robust anomaly detection systems for machine sounds. To address these limitations, we propose a novel approach for generating diverse anomalies in machine sound using a latent diffusion-based model that integrates an encoder-decoder framework. Our method utilizes the Flan-T5 model to encode captions derived from audio file metadata, enabling conditional generation through a carefully designed U-Net architecture. This approach aids our model in generating audio signals within the EnCodec latent space, ensuring high contextual relevance and quality. We objectively evaluated the quality of our generated sounds using the Fréchet Audio Distance (FAD) score and other metrics, demonstrating that our approach surpasses existing models in generating reliable machine audio that closely resembles actual abnormal conditions. The evaluation of the anomaly detection system using our generated data revealed a strong correlation, with the area under the curve (AUC) score differing by 4.8% from the original, validating the effectiveness of our generated data. These results demonstrate the potential of our approach to enhance the evaluation and robustness of anomaly detection systems across varied and previously unseen conditions. Audio samples can be found at https://hpworkhub.github.io/MIMII-Gen.github.io/.
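The AUC comparison mentioned in the abstract can be illustrated with a minimal sketch. It computes AUC as the probability that a randomly chosen anomalous clip receives a higher anomaly score than a randomly chosen normal clip, once with real anomalies and once with generated ones. The function name `roc_auc`, the variable names, and all scores are hypothetical and made up for illustration; the abstract does not say whether the reported 4.8% is an absolute or a relative difference, so the sketch simply reports the absolute gap.

```python
def roc_auc(normal_scores, anomaly_scores):
    """AUC as the fraction of (anomaly, normal) pairs ranked correctly; ties count half."""
    wins = 0.0
    for a in anomaly_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomaly_scores) * len(normal_scores))


# Toy anomaly scores from a hypothetical detector (not taken from the paper).
normal = [0.10, 0.20, 0.30, 0.40]
real_anomalies = [0.35, 0.50, 0.60, 0.70]
generated_anomalies = [0.25, 0.45, 0.60, 0.70]

auc_real = roc_auc(normal, real_anomalies)
auc_generated = roc_auc(normal, generated_anomalies)

# Absolute AUC gap between evaluating on real and on generated anomalies.
auc_gap = abs(auc_real - auc_generated)
print(f"real={auc_real:.4f} generated={auc_generated:.4f} gap={auc_gap:.4f}")
```

A small gap suggests that the generated anomalies rank against normal sounds much as the real ones do, which is the sense in which generated data could stand in for real recordings when evaluating a detector.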
Related papers
- SONAR: A Synthetic AI-Audio Detection Framework and Benchmark [59.09338266364506]
SONAR is a synthetic AI-Audio Detection Framework and Benchmark.
It aims to provide a comprehensive evaluation for distinguishing cutting-edge AI-synthesized auditory content.
It is the first framework to uniformly benchmark AI-audio detection across both traditional and foundation model-based deepfake detection systems.
arXiv Detail & Related papers (2024-10-06T01:03:42Z)
- Do You Remember? Overcoming Catastrophic Forgetting for Fake Audio Detection [54.20974251478516]
We propose a continual learning algorithm for fake audio detection to overcome catastrophic forgetting.
When fine-tuning a detection network, our approach adaptively computes the direction of weight modification according to the ratio of genuine utterances and fake utterances.
Our method can easily be generalized to related fields, like speech emotion recognition.
arXiv Detail & Related papers (2023-08-07T05:05:49Z)
- From Discrete Tokens to High-Fidelity Audio Using Multi-Band Diffusion [84.138804145918]
Deep generative models can generate high-fidelity audio conditioned on various types of representations.
These models are prone to generate audible artifacts when the conditioning is flawed or imperfect.
We propose a high-fidelity multi-band diffusion-based framework that generates any type of audio modality from low-bitrate discrete representations.
arXiv Detail & Related papers (2023-08-02T22:14:29Z)
- The role of noise in denoising models for anomaly detection in medical images [62.0532151156057]
Pathological brain lesions exhibit diverse appearance in brain images.
Unsupervised anomaly detection approaches have been proposed using only normal data for training.
We show that optimization of the spatial resolution and magnitude of the noise improves the performance of different model training regimes.
arXiv Detail & Related papers (2023-01-19T21:39:38Z)
- Denoising diffusion models for out-of-distribution detection [2.113925122479677]
We exploit the view of denoising probabilistic diffusion models (DDPM) as denoising autoencoders.
We use DDPMs to reconstruct an input that has been noised to a range of noise levels, and use the resulting multi-dimensional reconstruction error to classify out-of-distribution inputs.
arXiv Detail & Related papers (2022-11-14T20:35:11Z)
- Decision Forest Based EMG Signal Classification with Low Volume Dataset Augmented with Random Variance Gaussian Noise [51.76329821186873]
We produce a model that can classify six different hand gestures with a limited number of samples that generalizes well to a wider audience.
We appeal to a set of more elementary methods such as the use of random bounds on a signal, but desire to show the power these methods can carry in an online setting.
arXiv Detail & Related papers (2022-06-29T23:22:18Z)
- Hierarchical Conditional Variational Autoencoder Based Acoustic Anomaly Detection [8.136103644634348]
Existing approaches such as deep autoencoder (DAE), variational autoencoder (VAE), conditional variational autoencoder (CVAE) etc. have limited representation capabilities in the latent space.
We propose a new method, the hierarchical conditional variational autoencoder (HCVAE).
This method utilizes available taxonomic hierarchical knowledge about industrial facility to refine the latent space representation.
arXiv Detail & Related papers (2022-06-11T08:15:01Z)
- Canonical Polyadic Decomposition and Deep Learning for Machine Fault Detection [0.0]
It is impossible to collect enough data to learn all types of faults from a machine.
New algorithms, trained using data from healthy conditions only, were developed to perform unsupervised anomaly detection.
A key issue in the development of these algorithms is the noise in the signals, as it impacts the anomaly detection performance.
arXiv Detail & Related papers (2021-07-20T14:06:50Z)
- Automatic Feature Extraction for Heartbeat Anomaly Detection [7.054093620465401]
We focus on automatic feature extraction for raw audio heartbeat sounds, aimed at anomaly detection applications in healthcare.
We learn features with the help of an autoencoder composed of a 1D non-causal convolutional encoder and a WaveNet decoder.
arXiv Detail & Related papers (2021-02-24T13:55:24Z)
- Unsupervised Anomaly Detection with Adversarial Mirrored AutoEncoders [51.691585766702744]
We propose a variant of Adversarial Autoencoder which uses a mirrored Wasserstein loss in the discriminator to enforce better semantic-level reconstruction.
We put forward an alternative measure of anomaly score to replace the reconstruction-based metric.
Our method outperforms the current state-of-the-art methods for anomaly detection on several OOD detection benchmarks.
arXiv Detail & Related papers (2020-03-24T08:26:58Z)
- Identifying Audio Adversarial Examples via Anomalous Pattern Detection [4.556497931273283]
We show that two recent state-of-the-art adversarial attacks on audio processing systems lead to higher-than-expected activation at some subset of nodes.
We can detect these attacks with up to an AUC of 0.98 with no degradation in performance on benign samples.
arXiv Detail & Related papers (2020-02-13T12:08:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.