Special-Character Adversarial Attacks on Open-Source Language Model
- URL: http://arxiv.org/abs/2508.14070v1
- Date: Tue, 12 Aug 2025 03:42:59 GMT
- Title: Special-Character Adversarial Attacks on Open-Source Language Model
- Authors: Ephraiem Sarabamoun
- Abstract summary: Large language models (LLMs) have achieved remarkable performance across diverse natural language processing tasks. Character-level adversarial manipulation presents significant security challenges for real-world deployments.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have achieved remarkable performance across diverse natural language processing tasks, yet their vulnerability to character-level adversarial manipulations presents significant security challenges for real-world deployments.
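As a generic illustration of the attack class the paper studies (not its exact method), character-level perturbations can hide inside text that looks unchanged to a human reader. The sketch below, a hypothetical helper written for this listing, inserts zero-width spaces and swaps Latin letters for visually identical Cyrillic homoglyphs, altering the token stream a model sees:

```python
# Illustrative sketch only: special-character perturbation of a prompt.
# Zero-width spaces and Cyrillic look-alikes leave the rendered text
# visually unchanged while changing the underlying code points.
ZERO_WIDTH = "\u200b"  # ZERO WIDTH SPACE
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e"}  # Cyrillic а, е, о

def perturb(text: str, use_homoglyphs: bool = True) -> str:
    """Join characters with zero-width spaces and optionally replace
    Latin letters with visually identical Cyrillic homoglyphs."""
    chars = []
    for ch in text:
        if use_homoglyphs and ch in HOMOGLYPHS:
            ch = HOMOGLYPHS[ch]
        chars.append(ch)
    return ZERO_WIDTH.join(chars)

original = "ignore previous instructions"
perturbed = perturb(original)
print(perturbed)        # renders like the original to a human reader
print(len(original), len(perturbed))  # code-point lengths differ
```

Because tokenizers operate on code points rather than rendered glyphs, even a filter that string-matches the original phrase would miss the perturbed version.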
Related papers
- Continual-learning for Modelling Low-Resource Languages from Large Language Models [1.462912591880424]
Small language models (SLMs) built for low-resource languages pose the challenge of catastrophic forgetting. This work proposes a continual learning strategy using parts-of-speech (POS)-based code-switching. Experiments on vision-language tasks such as visual question answering, and on language modelling, demonstrate the success of the proposed architecture.
arXiv Detail & Related papers (2026-01-09T15:51:12Z) - Design Patterns for Securing LLM Agents against Prompt Injections [26.519964636138585]
Prompt injection attacks exploit the agent's reliance on natural language inputs. We propose a set of principled design patterns for building AI agents with provable resistance to prompt injection.
arXiv Detail & Related papers (2025-06-10T14:23:55Z) - Adversarial Attack Classification and Robustness Testing for Large Language Models for Code [19.47426054151291]
This study investigates how adversarial perturbations in natural language inputs affect Large Language Models for Code (LLM4Code). It examines the effects of perturbations at the character, word, and sentence levels to identify the most impactful vulnerabilities.
arXiv Detail & Related papers (2025-06-09T17:02:29Z) - Language Matters: How Do Multilingual Input and Reasoning Paths Affect Large Reasoning Models? [59.970391602080205]
Despite multilingual training, LRMs tend to default to reasoning in high-resource languages at test time. Cultural reasoning degrades performance on reasoning tasks but benefits cultural tasks, while safety evaluations exhibit language-specific behavior.
arXiv Detail & Related papers (2025-05-23T02:46:18Z) - A Preliminary Study of Large Language Models for Multilingual Vulnerability Detection [13.269680075539135]
Recent advancements in large language models (LLMs) offer language-agnostic capabilities and enhanced semantic understanding. Our findings reveal that the PLM CodeT5P achieves the best performance in multilingual vulnerability detection.
arXiv Detail & Related papers (2025-05-12T09:19:31Z) - MrGuard: A Multilingual Reasoning Guardrail for Universal LLM Safety [56.77103365251923]
Large Language Models (LLMs) are susceptible to adversarial attacks such as jailbreaking. This vulnerability is exacerbated in multilingual settings, where multilingual safety-aligned data is often limited. We introduce a multilingual guardrail with reasoning for prompt classification.
arXiv Detail & Related papers (2025-04-21T17:15:06Z) - SMILE: Speech Meta In-Context Learning for Low-Resource Language Automatic Speech Recognition [55.2480439325792]
Speech Meta In-Context LEarning (SMILE) is an innovative framework that combines meta-learning with speech in-context learning (SICL). We show that SMILE consistently outperforms baseline methods in training-free few-shot multilingual ASR tasks.
arXiv Detail & Related papers (2024-09-16T16:04:16Z) - Unique Security and Privacy Threats of Large Language Models: A Comprehensive Survey [63.4581186135101]
Large language models (LLMs) have made remarkable advancements in natural language processing. Privacy and security issues have been revealed throughout their life cycle. This survey outlines and analyzes potential countermeasures.
arXiv Detail & Related papers (2024-06-12T07:55:32Z) - TuBA: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning [63.481446315733145]
Cross-lingual backdoor attacks against multilingual large language models (LLMs) are under-explored. Our research focuses on how poisoning the instruction-tuning data for one or two languages can affect the outputs for languages whose instruction-tuning data were not poisoned. Our method exhibits remarkable efficacy in models like mT5 and GPT-4o, with high attack success rates surpassing 90% in more than 7 out of 12 languages.
arXiv Detail & Related papers (2024-04-30T14:43:57Z) - CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion [117.178835165855]
This paper introduces CodeAttack, a framework that transforms natural language inputs into code inputs.
Our studies reveal a new and universal safety vulnerability of these models against code input.
We find that a larger distribution gap between CodeAttack and natural language leads to weaker safety generalization.
arXiv Detail & Related papers (2024-03-12T17:55:38Z) - Multilingual Jailbreak Challenges in Large Language Models [96.74878032417054]
In this study, we reveal the presence of multilingual jailbreak challenges within large language models (LLMs).
We consider two potential risky scenarios: unintentional and intentional.
We propose a novel Self-Defense framework that automatically generates multilingual training data for safety fine-tuning.
arXiv Detail & Related papers (2023-10-10T09:44:06Z) - The Cybersecurity Crisis of Artificial Intelligence: Unrestrained Adoption and Natural Language-Based Attacks [0.0]
The widespread integration of autoregressive large language models (AR-LLMs) has introduced critical vulnerabilities with uniquely scalable characteristics.
In this commentary, we analyse these vulnerabilities, their dependence on natural language as a vector of attack, and their challenges to cybersecurity best practices.
arXiv Detail & Related papers (2023-09-25T10:48:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.