Poisoned Acoustics
- URL: http://arxiv.org/abs/2602.22258v1
- Date: Wed, 25 Feb 2026 01:09:43 GMT
- Title: Poisoned Acoustics
- Authors: Harrison Dahme
- Abstract summary: Training-data poisoning attacks can induce targeted, undetectable failure in deep neural networks by corrupting a vanishingly small fraction of training labels. We demonstrate this on acoustic vehicle classification using the MELAUDIS urban intersection dataset.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training-data poisoning attacks can induce targeted, undetectable failure in deep neural networks by corrupting a vanishingly small fraction of training labels. We demonstrate this on acoustic vehicle classification using the MELAUDIS urban intersection dataset (approx. 9,600 audio clips, 6 classes): a compact 2-D convolutional neural network (CNN) trained on log-mel spectrograms achieves 95.7% Attack Success Rate (ASR) -- the fraction of target-class test samples misclassified under the attack -- on a Truck-to-Car label-flipping attack at just p=0.5% corruption (48 records), with zero detectable change in aggregate accuracy (87.6% baseline; 95% CI: 88-100%, n=3 seeds). We prove this stealth is structural: the maximum accuracy drop from a complete targeted attack is bounded above by the minority class fraction (beta). For real-world class imbalances (Truck approx. 3%), this bound falls below training-run noise, making aggregate accuracy monitoring provably insufficient regardless of architecture or attack method. A companion backdoor trigger attack reveals a novel trigger-dominance collapse: when the target class is a dataset minority, the spectrogram patch trigger becomes functionally redundant--clean ASR equals triggered ASR, and the attack degenerates to pure label flipping. We formalize the ML training pipeline as an attack surface and propose a trust-minimized defense combining content-addressed artifact hashing, Merkle-tree dataset commitment, and post-quantum digital signatures (ML-DSA-65/CRYSTALS-Dilithium3, NIST FIPS 204) for cryptographically verifiable data provenance.
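To make the attack and its metric concrete, here is a minimal sketch of the label-flipping corruption and the Attack Success Rate (ASR) computation described in the abstract. The function names, class indices, and overall structure are illustrative assumptions, not the paper's code; MELAUDIS class indices in particular are hypothetical.

```python
import numpy as np

def flip_labels(labels, attacked_class, flip_to, fraction, seed=0):
    """Flip a small fraction of the dataset's labels from one class to another.

    Mirrors the Truck-to-Car label-flipping attack in the abstract: with
    fraction=0.005 (p = 0.5% of ~9,600 clips) only about 48 records change.
    """
    rng = np.random.default_rng(seed)
    labels = labels.copy()
    candidates = np.where(labels == attacked_class)[0]
    n_flip = int(round(fraction * len(labels)))  # p is a fraction of the whole set
    chosen = rng.choice(candidates, size=min(n_flip, len(candidates)), replace=False)
    labels[chosen] = flip_to
    return labels, chosen

def attack_success_rate(y_true, y_pred, attacked_class):
    """ASR: fraction of attacked-class test samples that are misclassified."""
    mask = y_true == attacked_class
    return float(np.mean(y_pred[mask] != y_true[mask]))

# Illustrative usage with hypothetical class indices (TRUCK=5, CAR=1):
# y_poisoned, flipped = flip_labels(y_train, attacked_class=5, flip_to=1, fraction=0.005)
# ... retrain the log-mel spectrogram CNN on (X_train, y_poisoned) ...
# asr = attack_success_rate(y_test, model.predict(X_test), attacked_class=5)
```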
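The structural stealth claim can also be written out. If a targeted attack changes predictions only on samples of the attacked class, the aggregate accuracy drop is bounded by that class's share of the test set, denoted beta in the abstract; a worked sketch of the bound under that assumption:

```latex
% Sketch of the stealth bound: a fully targeted attack can alter predictions
% only on the N_target samples of the attacked (minority) class, so the
% aggregate accuracy drop is bounded by the class fraction beta.
\Delta_{\mathrm{acc}}
  = \mathrm{Acc}_{\mathrm{clean}} - \mathrm{Acc}_{\mathrm{poisoned}}
  \le \frac{N_{\mathrm{target}}}{N} = \beta
% Example: with Truck at roughly 3% of the data, beta is about 0.03, so even a
% 100%-successful Truck-to-Car attack moves aggregate accuracy by at most about
% 3 points, which can sit below run-to-run training noise across seeds.
```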
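The proposed provenance defense (content-addressed artifact hashing plus a Merkle-tree dataset commitment) can likewise be sketched. The following is a minimal illustration using SHA-256 only; the post-quantum signature step (ML-DSA-65 / CRYSTALS-Dilithium3 per FIPS 204) is omitted and would sign the resulting root. All names here are assumptions for illustration, not the paper's implementation.

```python
import hashlib

def content_address(data: bytes) -> bytes:
    """Content-addressed identifier for one artifact (e.g. an audio clip plus its label)."""
    return hashlib.sha256(data).digest()

def merkle_root(leaf_hashes: list[bytes]) -> bytes:
    """Compute a Merkle root over per-record hashes.

    Any change to a single training record changes its leaf hash and therefore
    the root, so the committed dataset is tamper-evident.
    """
    if not leaf_hashes:
        return hashlib.sha256(b"").digest()
    level = leaf_hashes
    while len(level) > 1:
        if len(level) % 2 == 1:          # duplicate the last node on odd-sized levels
            level = level + [level[-1]]
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

# Illustrative usage: commit to a dataset of (clip_bytes, label) records.
# leaves = [content_address(clip + label.encode()) for clip, label in records]
# root = merkle_root(leaves)
# The root would then be signed (e.g. with an ML-DSA-65 key) and published,
# letting trainers verify every record they load against the commitment.
```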
Related papers
- BadCLIP++: Stealthy and Persistent Backdoors in Multimodal Contrastive Learning [73.46118996284888]
Research on backdoor attacks against multimodal contrastive learning models faces two key challenges: stealthiness and persistence. We propose BadCLIP++, a unified framework that tackles both challenges. For stealthiness, we introduce a semantic-fusion QR micro-trigger that embeds imperceptible patterns near task-relevant regions. For persistence, we stabilize trigger embeddings via radius shrinkage and centroid alignment.
arXiv Detail & Related papers (2026-02-19T08:31:16Z) - Variance-Based Defense Against Blended Backdoor Attacks [0.0]
Backdoor attacks represent a subtle yet effective class of cyberattacks targeting AI models. We propose a novel defense method that trains a model on the given dataset, detects poisoned classes, and extracts the critical part of the attack trigger.
arXiv Detail & Related papers (2025-06-02T09:01:35Z) - Defending Against Repetitive Backdoor Attacks on Semi-supervised Learning through Lens of Rate-Distortion-Perception Trade-off [20.713624299599722]
Semi-supervised learning (SSL) has achieved remarkable performance with a small fraction of labeled data. This large pool of untrusted data is extremely vulnerable to data poisoning, leading to potential backdoor attacks. We propose a novel method, Unlabeled Data Purification (UPure), to disrupt the association between trigger patterns and target classes.
arXiv Detail & Related papers (2024-07-14T12:42:11Z) - Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z) - Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The backdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data.
We propose a novel attack paradigm, the fine-grained attack, where we treat the target label at the object level instead of the image level.
Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
arXiv Detail & Related papers (2021-03-06T05:50:29Z) - Mitigating the Impact of Adversarial Attacks in Very Deep Networks [10.555822166916705]
Deep Neural Network (DNN) models have vulnerabilities related to security concerns.
Data poisoning-enabled perturbation attacks are complex adversarial ones that inject false data into models.
We propose an attack-agnostic-based defense method for mitigating their influence.
arXiv Detail & Related papers (2020-12-08T21:25:44Z) - How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z) - Mitigating Backdoor Attacks in Federated Learning [9.582197388445204]
We propose a new and effective method to mitigate backdoor attacks after the training phase.
Specifically, we design a federated pruning method to remove redundant neurons in the network and then adjust the model's extreme weight values.
Experiments conducted on distributed Fashion-MNIST show that our method can reduce the average attack success rate from 99.7% to 1.9% with a 5.5% loss of test accuracy.
arXiv Detail & Related papers (2020-10-28T18:39:28Z) - Deep Partition Aggregation: Provable Defense against General Poisoning Attacks [136.79415677706612]
Adversarial poisoning attacks distort training data in order to corrupt the test-time behavior of a classifier.
We propose two novel provable defenses against poisoning attacks.
DPA is a certified defense against a general poisoning threat model.
SS-DPA is a certified defense against label-flipping attacks.
arXiv Detail & Related papers (2020-06-26T03:16:31Z) - Evading Deepfake-Image Detectors with White- and Black-Box Attacks [75.13740810603686]
A popular forensic approach trains a neural network to distinguish real from synthetic content.
We develop five attack case studies on a state-of-the-art classifier that achieves an area under the ROC curve (AUC) of 0.95 on almost all existing image generators.
We also develop a black-box attack that, with no access to the target classifier, reduces the AUC to 0.22.
arXiv Detail & Related papers (2020-04-01T17:59:59Z)