ENJ: Optimizing Noise with Genetic Algorithms to Jailbreak LSMs
- URL: http://arxiv.org/abs/2509.11128v1
- Date: Sun, 14 Sep 2025 06:39:38 GMT
- Title: ENJ: Optimizing Noise with Genetic Algorithms to Jailbreak LSMs
- Authors: Yibo Zhang, Liang Lin
- Abstract summary: This paper proposes Evolutionary Noise Jailbreak (ENJ), a genetic algorithm that transforms environmental noise from a passive interference into an actively optimizable attack carrier for jailbreaking LSMs. Experiments on multiple mainstream speech models show that ENJ's attack effectiveness is significantly superior to existing baseline methods.
- Score: 61.09812971042288
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The widespread application of Large Speech Models (LSMs) has made their security risks increasingly prominent. Traditional speech adversarial attack methods face challenges in balancing effectiveness and stealth. This paper proposes Evolutionary Noise Jailbreak (ENJ), which utilizes a genetic algorithm to transform environmental noise from a passive interference into an actively optimizable attack carrier for jailbreaking LSMs. Through operations such as population initialization, crossover fusion, and probabilistic mutation, this method iteratively evolves a series of audio samples that fuse malicious instructions with background noise. These samples sound like harmless noise to humans but can induce the model to parse and execute harmful commands. Extensive experiments on multiple mainstream speech models show that ENJ's attack effectiveness is significantly superior to existing baseline methods. This research reveals the dual role of noise in speech security and provides new critical insights for model security defense in complex acoustic environments.
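The evolutionary loop described in the abstract (population initialization, crossover fusion, probabilistic mutation) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the real fitness function would query the target speech model to score whether the hidden instruction is parsed, whereas the `fitness` stub below is a placeholder objective.

```python
import random
import numpy as np

SAMPLE_LEN = 16000  # one second of 16 kHz audio

def fitness(candidate: np.ndarray) -> float:
    """Stub: in ENJ this would score how reliably the target LSM
    extracts the hidden instruction from the noisy audio."""
    return -float(np.abs(candidate).mean())  # placeholder objective

def crossover(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    # Crossover fusion: mix two parent waveforms sample-wise with a random mask.
    mask = np.random.rand(len(a)) < 0.5
    return np.where(mask, a, b)

def mutate(x: np.ndarray, rate: float = 0.01, scale: float = 0.005) -> np.ndarray:
    # Probabilistic mutation: perturb a small fraction of samples.
    out = x.copy()
    idx = np.random.rand(len(x)) < rate
    out[idx] += np.random.randn(int(idx.sum())) * scale
    return np.clip(out, -1.0, 1.0)

def evolve(pop_size: int = 20, generations: int = 50) -> np.ndarray:
    # Population initialization: random low-amplitude background noise.
    population = [np.random.randn(SAMPLE_LEN) * 0.01 for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]
        children = [mutate(crossover(*random.sample(elite, 2)))
                    for _ in range(pop_size - len(elite))]
        population = elite + children
    return max(population, key=fitness)

best = evolve()
```

With a real model-based fitness function, each generation keeps the audio samples that most reliably trigger the harmful behavior while still sounding like background noise.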
Related papers
- Test-time Adaptive Hierarchical Co-enhanced Denoising Network for Reliable Multimodal Classification [55.56234913868664]
We propose Test-time Adaptive Hierarchical Co-enhanced Denoising Network (TAHCD) for reliable learning on multimodal data. The proposed method achieves superior classification performance, robustness, and generalization compared with state-of-the-art reliable multimodal learning approaches.
arXiv Detail & Related papers (2026-01-12T03:14:12Z)
- SmoothGuard: Defending Multimodal Large Language Models with Noise Perturbation and Clustering Aggregation [23.12897429892901]
Multimodal large language models (MLLMs) have achieved impressive performance across diverse tasks by jointly reasoning over textual and visual inputs. Despite their success, these models remain highly vulnerable to adversarial manipulations, raising concerns about their safety and reliability in deployment. We introduce SmoothGuard, a lightweight and model-agnostic defense framework that enhances the robustness of MLLMs through randomized noise injection and clustering-based prediction aggregation.
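The randomized-smoothing idea behind this defense can be sketched in a few lines. This is an illustrative stand-in, not the paper's code: `model` is a stub classifier in place of a multimodal LLM, and majority voting stands in for the paper's clustering-based aggregation.

```python
import numpy as np

def model(x: np.ndarray) -> int:
    """Stub classifier: the real defense wraps a multimodal LLM."""
    return int(x.mean() > 0)

def smoothed_predict(x: np.ndarray, n_samples: int = 25, sigma: float = 0.1) -> int:
    # Query the model under independent Gaussian perturbations of the input.
    preds = [model(x + np.random.randn(*x.shape) * sigma) for _ in range(n_samples)]
    # Aggregate: majority vote (a simple stand-in for clustering aggregation).
    values, counts = np.unique(preds, return_counts=True)
    return int(values[np.argmax(counts)])

print(smoothed_predict(np.ones(100)))  # prints 1
```

Because an adversarial perturbation is tuned to one exact input, averaging predictions over many noisy copies tends to wash out its effect.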
arXiv Detail & Related papers (2025-10-29T14:56:27Z)
- Mitigating the Noise Shift for Denoising Generative Models via Noise Awareness Guidance [54.88271057438763]
Noise Awareness Guidance (NAG) is a correction method that explicitly steers sampling trajectories to remain consistent with the pre-defined noise schedule. NAG consistently mitigates noise shift and substantially improves the generation quality of mainstream diffusion models.
arXiv Detail & Related papers (2025-10-14T13:31:34Z)
- When Good Sounds Go Adversarial: Jailbreaking Audio-Language Models with Benign Inputs [1.911526481015]
Our research introduces WhisperInject, a two-stage adversarial audio attack framework. It can manipulate state-of-the-art audio language models to generate harmful content. Our method uses imperceptible perturbations in audio inputs that remain benign to human listeners.
arXiv Detail & Related papers (2025-08-05T12:14:01Z)
- Unveiling Hidden Vulnerabilities in Digital Human Generation via Adversarial Attacks [14.356235723912564]
We propose a novel framework designed to generate adversarial examples capable of effectively compromising any digital human generation model. Our approach introduces a Dual Heterogeneous Noise Generator (DHNG), which leverages Variational Autoencoders (VAE) and ControlNet to produce diverse, targeted noise tailored to the original image features. Extensive experiments demonstrate TBA's superiority, achieving a remarkable 41.0% increase in estimation error, with an average improvement of approximately 17.0%.
arXiv Detail & Related papers (2025-04-24T11:42:10Z)
- Effective Noise-aware Data Simulation for Domain-adaptive Speech Enhancement Leveraging Dynamic Stochastic Perturbation [25.410770364140856]
Cross-domain speech enhancement (SE) is often faced with severe challenges due to the scarcity of noise and background information in an unseen target domain.
This study puts forward a novel data simulation method to address this issue, leveraging noise-extractive techniques and generative adversarial networks (GANs).
We introduce the notion of dynamic perturbation, which can inject controlled perturbations into the noise embeddings during inference.
arXiv Detail & Related papers (2024-09-03T02:29:01Z)
- Advancing the Robustness of Large Language Models through Self-Denoised Smoothing [50.54276872204319]
Large language models (LLMs) have achieved significant success, but their vulnerability to adversarial perturbations has raised considerable concerns.
We propose to leverage the multitasking nature of LLMs to first denoise the noisy inputs and then to make predictions based on these denoised versions.
Unlike previous denoised smoothing techniques in computer vision, which require training a separate model to enhance the robustness of LLMs, our method offers significantly better efficiency and flexibility.
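The denoise-then-predict pipeline can be illustrated with a toy sketch. This is hypothetical code, not the paper's: `denoise` and `classify` stand in for two prompts issued to the same LLM, and a simple character substitution plays the role of the model's self-denoising step.

```python
def denoise(text: str) -> str:
    """Stub for an LLM prompt such as 'Correct the typos in: ...'."""
    return text.replace("0", "o")  # toy denoiser for illustration

def classify(text: str) -> str:
    """Stub for the downstream task prompt on the denoised input."""
    return "positive" if "good" in text else "negative"

def self_denoised_predict(noisy_inputs: list[str]) -> str:
    # Denoise each randomized copy with the model itself, then majority-vote.
    preds = [classify(denoise(t)) for t in noisy_inputs]
    return max(set(preds), key=preds.count)

print(self_denoised_predict(["g0od film", "good film", "bad f1lm"]))  # prints positive
```

The key point from the summary is that no separate denoiser needs training: the same model that classifies also cleans its own inputs.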
arXiv Detail & Related papers (2024-04-18T15:47:00Z)
- High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models [56.00939852727501]
Minimally-supervised speech synthesis decouples TTS by combining two types of discrete speech representations.
Non-autoregressive framework enhances controllability, and duration diffusion model enables diversified prosodic expression.
arXiv Detail & Related papers (2023-09-27T09:27:03Z)
- Guided Diffusion Model for Adversarial Purification [103.4596751105955]
Adversarial attacks disturb deep neural networks (DNNs) in various algorithms and frameworks.
We propose a novel purification approach, referred to as guided diffusion model for purification (GDMP).
On our comprehensive experiments across various datasets, the proposed GDMP is shown to reduce the perturbations raised by adversarial attacks to a shallow range.
arXiv Detail & Related papers (2022-05-30T10:11:15Z)
- Learning to Generate Noise for Multi-Attack Robustness [126.23656251512762]
Adversarial learning has emerged as one of the successful techniques to circumvent the susceptibility of existing methods against adversarial perturbations.
In safety-critical applications, this makes these methods extraneous as the attacker can adopt diverse adversaries to deceive the system.
We propose a novel meta-learning framework that explicitly learns to generate noise to improve the model's robustness against multiple types of attacks.
arXiv Detail & Related papers (2020-06-22T10:44:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.