Against All Odds: Overcoming Typology, Script, and Language Confusion in Multilingual Embedding Inversion Attacks
- URL: http://arxiv.org/abs/2408.11749v1
- Date: Wed, 21 Aug 2024 16:16:34 GMT
- Title: Against All Odds: Overcoming Typology, Script, and Language Confusion in Multilingual Embedding Inversion Attacks
- Authors: Yiyi Chen, Russa Biswas, Heather Lent, Johannes Bjerva,
- Abstract summary: Large Language Models (LLMs) are susceptible to malicious influence by cyber attackers through intrusions such as adversarial, backdoor, and embedding inversion attacks.
This study explores the security of multilingual LLMs in the context of embedding inversion attacks and investigates cross-lingual and cross-script inversion across 20 languages.
Our findings indicate that languages written in Arabic script and Cyrillic script are particularly vulnerable to embedding inversion, as are languages within the Indo-Aryan language family.
- Score: 3.2297018268473665
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Large Language Models (LLMs) are susceptible to malicious influence by cyber attackers through intrusions such as adversarial, backdoor, and embedding inversion attacks. In response, the burgeoning field of LLM Security aims to study and defend against such threats. Thus far, the majority of works in this area have focused on monolingual English models, however, emerging research suggests that multilingual LLMs may be more vulnerable to various attacks than their monolingual counterparts. While previous work has investigated embedding inversion over a small subset of European languages, it is challenging to extrapolate these findings to languages from different linguistic families and with differing scripts. To this end, we explore the security of multilingual LLMs in the context of embedding inversion attacks and investigate cross-lingual and cross-script inversion across 20 languages, spanning over 8 language families and 12 scripts. Our findings indicate that languages written in Arabic script and Cyrillic script are particularly vulnerable to embedding inversion, as are languages within the Indo-Aryan language family. We further observe that inversion models tend to suffer from language confusion, sometimes greatly reducing the efficacy of an attack. Accordingly, we systematically explore this bottleneck for inversion models, uncovering predictable patterns which could be leveraged by attackers. Ultimately, this study aims to further the field's understanding of the outstanding security vulnerabilities facing multilingual LLMs and raise awareness for the languages most at risk of negative impact from these attacks.
Related papers
- Large Language Models are Easily Confused: A Quantitative Metric, Security Implications and Typological Analysis [5.029635172046762]
Language Confusion is a phenomenon where Large Language Models (LLMs) generate text that is neither in the desired language, nor in a contextually appropriate language.
We introduce a novel metric, Language Confusion Entropy, designed to measure and quantify this confusion.
arXiv Detail & Related papers (2024-10-17T05:43:30Z) - Understanding and Mitigating Language Confusion in LLMs [76.96033035093204]
We evaluate 15 typologically diverse languages with existing and newly-created English and multilingual prompts.
We find that Llama Instruct and Mistral models exhibit high degrees of language confusion.
We find that language confusion can be partially mitigated via few-shot prompting, multilingual SFT and preference tuning.
arXiv Detail & Related papers (2024-06-28T17:03:51Z) - Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora.
But can these models relate corresponding concepts across languages, effectively being crosslingual?
This study evaluates six state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z) - TuBA: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning [63.481446315733145]
Cross-lingual backdoor attacks against multilingual large language models (LLMs) are under-explored.
Our research focuses on how poisoning the instruction-tuning data for one or two languages can affect the outputs for languages whose instruction-tuning data were not poisoned.
Our method exhibits remarkable efficacy in models like mT5 and GPT-4o, with high attack success rates, surpassing 90% in more than 7 out of 12 languages.
arXiv Detail & Related papers (2024-04-30T14:43:57Z) - Backdoor Attack on Multilingual Machine Translation [53.28390057407576]
multilingual machine translation (MNMT) systems have security vulnerabilities.
An attacker injects poisoned data into a low-resource language pair to cause malicious translations in other languages.
This type of attack is of particular concern, given the larger attack surface of languages inherent to low-resource settings.
arXiv Detail & Related papers (2024-04-03T01:32:31Z) - A Cross-Language Investigation into Jailbreak Attacks in Large Language
Models [14.226415550366504]
A particularly underexplored area is the Multilingual Jailbreak attack.
There is a lack of comprehensive empirical studies addressing this specific threat.
This study provides valuable insights into understanding and mitigating Multilingual Jailbreak attacks.
arXiv Detail & Related papers (2024-01-30T06:04:04Z) - The Language Barrier: Dissecting Safety Challenges of LLMs in
Multilingual Contexts [46.089025223336854]
This paper examines the variations in safety challenges faced by large language models across different languages.
We compare how state-of-the-art LLMs respond to the same set of malicious prompts written in higher- vs. lower-resource languages.
arXiv Detail & Related papers (2024-01-23T23:12:09Z) - Text Embedding Inversion Security for Multilingual Language Models [2.790855523145802]
Research shows that text can be reconstructed from embeddings, even without knowledge of the underlying model.
This study is the first to investigate multilingual inversion attacks, shedding light on the differences in attacks and defenses across monolingual and multilingual settings.
arXiv Detail & Related papers (2024-01-22T18:34:42Z) - Multilingual Jailbreak Challenges in Large Language Models [96.74878032417054]
In this study, we reveal the presence of multilingual jailbreak challenges within large language models (LLMs)
We consider two potential risky scenarios: unintentional and intentional.
We propose a novel textscSelf-Defense framework that automatically generates multilingual training data for safety fine-tuning.
arXiv Detail & Related papers (2023-10-10T09:44:06Z) - Inducing Language-Agnostic Multilingual Representations [61.97381112847459]
Cross-lingual representations have the potential to make NLP techniques available to the vast majority of languages in the world.
We examine three approaches for this: (i) re-aligning the vector spaces of target languages to a pivot source language; (ii) removing language-specific means and variances, which yields better discriminativeness of embeddings as a by-product; and (iii) increasing input similarity across languages by removing morphological contractions and sentence reordering.
arXiv Detail & Related papers (2020-08-20T17:58:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.