MPN: Leveraging Multilingual Patch Neuron for Cross-lingual Model Editing
- URL: http://arxiv.org/abs/2401.03190v1
- Date: Sat, 6 Jan 2024 10:40:24 GMT
- Title: MPN: Leveraging Multilingual Patch Neuron for Cross-lingual Model Editing
- Authors: Nianwen Si, Hao Zhang, Weiqiang Zhang
- Abstract summary: We propose a simple yet effective method that trains multilingual patch neurons to store cross-lingual knowledge.
It can be easily adapted to existing approaches to enhance their cross-lingual editing capabilities.
- Score: 10.81072864833299
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models are known for encoding a vast amount of factual knowledge, but this knowledge often becomes outdated due to the ever-changing nature of external information. A promising solution to this challenge is the use of model editing methods to update the knowledge efficiently. However, the majority of existing model editing techniques are limited to monolingual frameworks and thus fail to address the crucial issue of cross-lingual knowledge synchronization for multilingual models. To tackle this problem, we propose a simple yet effective method that trains multilingual patch neurons to store cross-lingual knowledge. It can be easily adapted to existing approaches to enhance their cross-lingual editing capabilities. To evaluate our method, we conduct experiments on both the XNLI dataset and a self-constructed XFEVER dataset. Experimental results demonstrate that our proposed method achieves improved performance on cross-lingual editing tasks without requiring excessive modifications to the original methodology, thereby showcasing its user-friendly characteristics. Code will be released soon.
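As a rough illustration of the patch-neuron idea described in the abstract, the following is a minimal PyTorch sketch, assuming the patch is a small set of extra key-value neurons appended to a frozen feed-forward block and trained on parallel multilingual phrasings of the edited fact. The module, dimensions, and training loop are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class PatchedFFN(nn.Module):
    """Frozen FFN block with a few trainable 'patch' neurons appended.

    Illustrative sketch: the patch keys/values act as extra hidden units
    whose activations add a correction on top of the frozen FFN output.
    """

    def __init__(self, ffn: nn.Module, d_model: int, n_patch: int = 4):
        super().__init__()
        self.ffn = ffn  # original (frozen) feed-forward block, d_model -> d_model
        for p in self.ffn.parameters():
            p.requires_grad = False
        # extra key/value neurons -- the only trainable parameters
        self.patch_key = nn.Parameter(torch.randn(n_patch, d_model) * 0.02)
        self.patch_val = nn.Parameter(torch.zeros(n_patch, d_model))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        base = self.ffn(h)                         # (batch, seq, d_model)
        act = torch.relu(h @ self.patch_key.t())   # (batch, seq, n_patch)
        return base + act @ self.patch_val         # patched output


def train_patch(model, patched_layers, multilingual_edit_batch, steps=100, lr=1e-3):
    """Hypothetical training loop: fit the patch neurons on parallel statements
    of the edited fact in several languages, so the correction fires regardless
    of the language expressing the knowledge. Assumes an HF-style model whose
    forward pass returns an object with a .loss attribute."""
    params = [p for layer in patched_layers for p in (layer.patch_key, layer.patch_val)]
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(steps):
        loss = model(**multilingual_edit_batch).loss
        opt.zero_grad()
        loss.backward()
        opt.step()
```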
Related papers
- Cross-Lingual Multi-Hop Knowledge Editing -- Benchmarks, Analysis and a Simple Contrastive Learning based Approach [53.028586843468915]
We propose the Cross-Lingual Multi-Hop Knowledge Editing paradigm for measuring and analyzing the performance of various SoTA knowledge editing techniques in a cross-lingual setup.
Specifically, we create a parallel cross-lingual benchmark, CROLIN-MQUAKE, for measuring knowledge editing capabilities.
Following this, we propose a significantly improved system for cross-lingual multi-hop knowledge editing, CLEVER-CKE.
arXiv Detail & Related papers (2024-07-14T17:18:16Z) - Multilingual Knowledge Editing with Language-Agnostic Factual Neurons [98.73585104789217]
We investigate how large language models (LLMs) represent multilingual factual knowledge.
We find that the same factual knowledge in different languages generally activates a shared set of neurons, which we call language-agnostic factual neurons.
Inspired by this finding, we propose a new MKE method by locating and modifying Language-Agnostic Factual Neurons (LAFN) to simultaneously edit multilingual knowledge.
arXiv Detail & Related papers (2024-06-24T08:06:56Z) - MEMLA: Enhancing Multilingual Knowledge Editing with Neuron-Masked Low-Rank Adaptation [18.087144677674786]
We focus on multilingual knowledge editing (MKE), which requires propagating updates across multiple languages.
We introduce the Multilingual Knowledge Editing Benchmark (MKEB), a novel dataset comprising 12 languages.
We also propose a method that enhances knowledge Editing with neuron-Masked Low-Rank Adaptation (MEMLA); a hedged sketch of the neuron-masking idea appears after this list.
arXiv Detail & Related papers (2024-06-17T14:03:50Z) - Cross-lingual Editing in Multilingual Language Models [1.3062731746155414]
This paper introduces the cross-lingual model editing (XME) paradigm, wherein a fact is edited in one language and the subsequent update propagation is observed across other languages.
The results reveal notable performance limitations of state-of-the-art METs under the XME setting, particularly when the languages involved belong to two distinct script families.
arXiv Detail & Related papers (2024-01-19T06:54:39Z) - Cross-Lingual Knowledge Editing in Large Language Models [73.12622532088564]
Knowledge editing has been shown to adapt large language models to new knowledge without retraining from scratch.
The effect of editing in a source language on a different target language remains unknown.
We first collect a large-scale cross-lingual synthetic dataset by translating ZsRE from English to Chinese.
arXiv Detail & Related papers (2023-09-16T11:07:52Z) - Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for conditionally encoding instances.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z) - Language Anisotropic Cross-Lingual Model Editing [61.51863835749279]
Existing work studies only the monolingual scenario and thus lacks the cross-lingual transferability needed to perform edits across languages simultaneously.
We propose a framework to naturally adapt monolingual model editing approaches to the cross-lingual scenario using parallel corpus.
We empirically demonstrate the failure of monolingual baselines in propagating the edit to multiple languages and the effectiveness of the proposed language anisotropic model editing.
arXiv Detail & Related papers (2022-05-25T11:38:12Z) - Are Multilingual Models Effective in Code-Switching? [57.78477547424949]
We study multilingual language models to understand their capability and adaptability in the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching.
arXiv Detail & Related papers (2021-03-24T16:20:02Z)
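As referenced in the MEMLA entry above, here is a minimal sketch of the neuron-masked low-rank adaptation idea: a standard LoRA-style update on a frozen linear layer whose rows are gated by a binary mask over previously located knowledge neurons. The class, mask selection, and hyperparameters are generic illustrative assumptions, not the MEMLA implementation.

```python
import torch
import torch.nn as nn

class NeuronMaskedLoRALinear(nn.Module):
    """Frozen linear layer plus a low-rank update restricted to masked neurons.

    Sketch only: `neuron_mask` marks which output neurons (rows of the weight
    matrix) are allowed to change; the low-rank delta is zeroed elsewhere.
    """

    def __init__(self, base: nn.Linear, neuron_mask: torch.Tensor, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False
        out_f, in_f = base.weight.shape
        self.lora_A = nn.Parameter(torch.randn(r, in_f) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_f, r))
        self.scaling = alpha / r
        # 1.0 for editable (located) neurons, 0.0 for frozen ones
        self.register_buffer("mask", neuron_mask.float().view(out_f, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        delta = (self.mask * self.lora_B) @ self.lora_A   # masked low-rank weight update
        return self.base(x) + self.scaling * (x @ delta.t())


# Usage sketch: the mask would come from a neuron-attribution step (faked here).
base = nn.Linear(768, 768)
mask = torch.zeros(768)
mask[torch.randint(0, 768, (32,))] = 1.0  # pretend 32 neurons were located as knowledge neurons
edited = NeuronMaskedLoRALinear(base, mask)
y = edited(torch.randn(2, 10, 768))
```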