Exploring Green AI for Audio Deepfake Detection
- URL: http://arxiv.org/abs/2403.14290v1
- Date: Thu, 21 Mar 2024 10:54:21 GMT
- Title: Exploring Green AI for Audio Deepfake Detection
- Authors: Subhajit Saha, Md Sahidullah, Swagatam Das
- Abstract summary: State-of-the-art audio deepfake detectors leveraging deep neural networks exhibit impressive recognition performance.
The average deep NLP model produces around 626k lbs of CO2, equivalent to about five times the lifetime emissions of an average US car.
This study presents a novel framework for audio deepfake detection that can be seamlessly trained using standard CPU resources.
- Score: 21.17957700009653
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The state-of-the-art audio deepfake detectors leveraging deep neural networks exhibit impressive recognition performance. Nonetheless, this advantage is accompanied by a significant carbon footprint, mainly due to the use of high-performance computing with accelerators and long training times. Studies show that the average deep NLP model produces around 626k lbs of CO2, equivalent to about five times the lifetime emissions of an average US car. This is certainly a massive threat to the environment. To tackle this challenge, this study presents a novel framework for audio deepfake detection that can be seamlessly trained using standard CPU resources. Our proposed framework utilizes off-the-shelf self-supervised learning (SSL) based models which are pre-trained and available in public repositories. In contrast to existing methods that fine-tune SSL models and employ additional deep neural networks for downstream tasks, we exploit classical machine learning algorithms such as logistic regression and shallow neural networks on the SSL embeddings extracted with the pre-trained model. Our approach shows competitive results compared to the commonly used high-carbon-footprint approaches. In experiments with the ASVspoof 2019 LA dataset, we achieve a 0.90% equal error rate (EER) with fewer than 1k trainable model parameters. To encourage further research in this direction and support reproducible results, the Python code will be made publicly accessible following acceptance. Github: https://github.com/sahasubhajit/Speech-Spoofing-
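To make the described pipeline concrete, below is a minimal sketch of the frozen-SSL-encoder-plus-classical-classifier recipe. It is not the authors' released code: the wav2vec 2.0 BASE backbone, the mean-pooling over frames, and the file names are assumptions for illustration. With a 768-dimensional embedding, the logistic regression has 769 trainable parameters (768 weights plus a bias), in line with the sub-1k parameter count quoted above.

```python
import torch
import torchaudio
from sklearn.linear_model import LogisticRegression

# Off-the-shelf pre-trained SSL model from a public repository.
# wav2vec 2.0 BASE is an assumed choice; the abstract does not name a backbone.
bundle = torchaudio.pipelines.WAV2VEC2_BASE
ssl_model = bundle.get_model().eval()  # frozen: no fine-tuning, CPU suffices

def embed(path):
    """Mean-pool frame-level SSL features into one utterance embedding."""
    wav, sr = torchaudio.load(path)
    if sr != bundle.sample_rate:
        wav = torchaudio.functional.resample(wav, sr, bundle.sample_rate)
    with torch.inference_mode():
        features, _ = ssl_model.extract_features(wav)  # list of per-layer tensors
    return features[-1].mean(dim=1).squeeze(0).numpy()  # shape: (768,)

# Hypothetical file lists and labels (0 = bonafide, 1 = spoof).
train_paths, train_labels = ["bonafide_001.flac", "spoof_001.flac"], [0, 1]
eval_paths = ["unknown_001.flac"]

# Classical downstream classifier: 768 weights + 1 bias = 769 parameters (< 1k).
clf = LogisticRegression(max_iter=1000)
clf.fit([embed(p) for p in train_paths], train_labels)
spoof_scores = clf.predict_proba([embed(p) for p in eval_paths])[:, 1]
```

Because the SSL encoder stays frozen, only the logistic regression is trained, which is what keeps the whole pipeline tractable on standard CPU resources.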
Related papers
- Co-training for Low Resource Scientific Natural Language Inference [65.37685198688538]
We propose a novel co-training method that assigns weights to the distantly supervised labels based on the training dynamics of the classifiers.
By assigning importance weights instead of filtering out examples based on an arbitrary threshold on the predicted confidence, we maximize the usage of automatically labeled data.
The proposed method obtains an improvement of 1.5% in Macro F1 over the distant supervision baseline, and substantial improvements over several other strong SSL baselines.
arXiv Detail & Related papers (2024-06-20T18:35:47Z)
- Legged Robot State Estimation With Invariant Extended Kalman Filter Using Neural Measurement Network [2.0405494347486197]
We develop a state estimation framework that integrates a neural measurement network (NMN) with an invariant extended Kalman filter.
Our approach significantly reduces position drift compared to the existing model-based state estimator.
arXiv Detail & Related papers (2024-02-01T06:06:59Z)
- UNFUSED: UNsupervised Finetuning Using SElf supervised Distillation [53.06337011259031]
We introduce UnFuSeD, a novel approach to leverage self-supervised learning for audio classification.
We use the encoder to generate pseudo-labels for unsupervised fine-tuning before the actual fine-tuning step.
UnFuSeD achieves state-of-the-art results on the LAPE Benchmark, significantly outperforming all our baselines.
arXiv Detail & Related papers (2023-03-10T02:43:36Z)
- LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time [57.52251547365967]
We propose a method for training a "compressible subspace" of neural networks that contains a fine-grained spectrum of models.
We present results for achieving arbitrarily fine-grained accuracy-efficiency trade-offs at inference time for structured and unstructured sparsity.
Our algorithm extends to quantization at variable bit widths, achieving accuracy on par with individually trained networks.
arXiv Detail & Related papers (2021-10-08T17:03:34Z)
- The Devil Is in the Details: An Efficient Convolutional Neural Network for Transport Mode Detection [3.008051369744002]
Transport mode detection is a classification problem aiming to design an algorithm that can infer the transport mode of a user given multimodal signals.
We show that a small, optimized model can perform as well as a current deep model.
arXiv Detail & Related papers (2021-09-16T08:05:47Z)
- A robust approach for deep neural networks in presence of label noise: relabelling and filtering instances during training [14.244244290954084]
We propose a robust training strategy against label noise, called RAFNI, that can be used with any CNN.
RAFNI consists of three mechanisms: two mechanisms that filter instances and one mechanism that relabels instances.
We evaluated our algorithm using different data sets of several sizes and characteristics.
arXiv Detail & Related papers (2021-09-08T16:11:31Z)
- Cascade Bagging for Accuracy Prediction with Few Training Samples [8.373420721376739]
We propose a novel framework to train an accuracy predictor with few training samples.
The framework consists of data augmentation methods and an ensemble learning algorithm.
arXiv Detail & Related papers (2021-08-12T09:10:52Z)
- DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows [52.31831255787147]
We introduce a novel technique, DAAIN, to detect out-of-distribution (OOD) inputs and adversarial attacks (AA).
Our approach monitors the inner workings of a neural network and learns a density estimator of the activation distribution.
Our model can be trained on a single GPU making it compute efficient and deployable without requiring specialized accelerators.
arXiv Detail & Related papers (2021-05-30T22:07:13Z)
- A Deep Unsupervised Feature Learning Spiking Neural Network with Binarized Classification Layers for EMNIST Classification using SpykeFlow [0.0]
The unsupervised learning technique of spike-timing-dependent plasticity (STDP) with binary activations is used to extract features from spiking input data.
The accuracies obtained for the balanced EMNIST data set compare favorably with other approaches.
arXiv Detail & Related papers (2020-02-26T23:47:35Z)
- REST: Robust and Efficient Neural Networks for Sleep Monitoring in the Wild [62.36144064259933]
We propose REST, a new method that simultaneously tackles both issues via adversarial training and controlling the Lipschitz constant of the neural network.
We demonstrate that REST produces highly-robust and efficient models that substantially outperform the original full-sized models in the presence of noise.
By deploying these models to an Android application on a smartphone, we quantitatively observe that REST allows models to achieve up to 17x energy reduction and 9x faster inference.
arXiv Detail & Related papers (2020-01-29T17:23:16Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.