Related papers: From Vision to Sound: Advancing Audio Anomaly Detection with Vision-Based Algorithms

URL: http://arxiv.org/abs/2502.18328v1
Date: Tue, 25 Feb 2025 16:22:42 GMT
Title: From Vision to Sound: Advancing Audio Anomaly Detection with Vision-Based Algorithms
Authors: Manuel Barusco, Francesco Borsatti, Davide Dalle Pezze, Francesco Paissan, Elisabetta Farella, Gian Antonio Susto
Abstract summary: We investigate the adaptation of Visual Anomaly Detection (VAD) algorithms to the audio domain to address the problem of Audio Anomaly Detection (AAD). Unlike most existing AAD methods, which primarily classify anomalous samples, our approach introduces fine-grained temporal-frequency localization of anomalies within the spectrogram. We evaluate our approach on industrial and environmental benchmarks, demonstrating the effectiveness of VAD techniques in detecting anomalies in audio signals.
Score: 6.643376250301589
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advances in Visual Anomaly Detection (VAD) have introduced sophisticated algorithms leveraging embeddings generated by pre-trained feature extractors. Inspired by these developments, we investigate the adaptation of such algorithms to the audio domain to address the problem of Audio Anomaly Detection (AAD). Unlike most existing AAD methods, which primarily classify anomalous samples, our approach introduces fine-grained temporal-frequency localization of anomalies within the spectrogram, significantly improving explainability. This capability enables a more precise understanding of where and when anomalies occur, making the results more actionable for end users. We evaluate our approach on industrial and environmental benchmarks, demonstrating the effectiveness of VAD techniques in detecting anomalies in audio signals. Moreover, they improve explainability by enabling localized anomaly identification, making audio anomaly detection systems more interpretable and practical.

Related papers

AVadCLIP: Audio-Visual Collaboration for Robust Video Anomaly Detection [57.649223695021114]
We present a novel weakly supervised framework that leverages audio-visual collaboration for robust video anomaly detection. Our framework demonstrates superior performance across multiple benchmarks, with audio integration significantly boosting anomaly detection accuracy.
arXiv Detail & Related papers (2025-04-06T13:59:16Z)
Unsupervised Anomaly Detection Using Diffusion Trend Analysis [48.19821513256158]
We propose a method that detects anomalies by analyzing how the reconstruction trend changes with the degree of degradation. The proposed method is validated on an open dataset for industrial anomaly detection.
arXiv Detail & Related papers (2024-07-12T01:50:07Z)
ATAC-Net: Zoomed view works better for Anomaly Detection [1.024113475677323]
ATAC-Net is a framework that trains to detect anomalies from a minimal set of known prior anomalies. We substantiate its superiority to some of the current state-of-the-art techniques in a comparable setting.
arXiv Detail & Related papers (2024-06-20T15:18:32Z)
Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation : A Unified Approach [49.995833831087175]
This work proposes a novel method for generating generic spatio-temporal pseudo-anomalies (PAs) by inpainting a masked-out region of an image. In addition, we present a simple unified framework to detect real-world anomalies under the OCC setting. Our method performs on par with other existing state-of-the-art PA-generation and reconstruction-based methods under the OCC setting.
arXiv Detail & Related papers (2023-11-27T13:14:06Z)
Adaptive Fake Audio Detection with Low-Rank Model Squeezing [50.7916414913962]
Traditional approaches, such as finetuning, are computationally intensive and pose a risk of impairing the acquired knowledge of known fake audio types. We introduce the concept of training low-rank adaptation matrices tailored specifically to the newly emerging fake audio types. Our approach offers several advantages, including reduced storage memory requirements and lower equal error rates.
arXiv Detail & Related papers (2023-06-08T06:06:42Z)
The role of noise in denoising models for anomaly detection in medical images [62.0532151156057]
Pathological brain lesions exhibit diverse appearance in brain images. Unsupervised anomaly detection approaches have been proposed using only normal data for training. We show that optimization of the spatial resolution and magnitude of the noise improves the performance of different model training regimes.
arXiv Detail & Related papers (2023-01-19T21:39:38Z)
Framing Algorithmic Recourse for Anomaly Detection [18.347886926848563]
We present an approach -- Context preserving Algorithmic Recourse for Anomalies in Tabular data (CARAT). CARAT uses a transformer-based encoder-decoder model to explain an anomaly by finding features with low likelihood. Semantically coherent counterfactuals are generated by modifying the highlighted features, using the overall context of features in the anomalous instance(s).
arXiv Detail & Related papers (2022-06-29T03:30:51Z)
Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold. We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples. We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z)
Canonical Polyadic Decomposition and Deep Learning for Machine Fault Detection [0.0]
It is impossible to collect enough data to learn all types of faults from a machine. New algorithms, trained using data from healthy conditions only, were developed to perform unsupervised anomaly detection. A key issue in the development of these algorithms is the noise in the signals, as it impacts the anomaly detection performance.
arXiv Detail & Related papers (2021-07-20T14:06:50Z)
Spotting adversarial samples for speaker verification by neural vocoders [102.1486475058963]
We adopt neural vocoders to spot adversarial samples for automatic speaker verification (ASV). We find that the difference between the ASV scores for the original and re-synthesized audio is a good indicator for discriminating between genuine and adversarial samples. Our code will be made open-source so that future work can compare against it.
arXiv Detail & Related papers (2021-07-01T08:58:16Z)
Identifying Audio Adversarial Examples via Anomalous Pattern Detection [4.556497931273283]
We show that two recent, state-of-the-art adversarial attacks on audio processing systems lead to higher-than-expected activation at some subset of nodes. We can detect these attacks with an AUC of up to 0.98, with no degradation in performance on benign samples.
arXiv Detail & Related papers (2020-02-13T12:08:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.