Language Anisotropic Cross-Lingual Model Editing
- URL: http://arxiv.org/abs/2205.12677v2
- Date: Mon, 5 Jun 2023 09:13:05 GMT
- Title: Language Anisotropic Cross-Lingual Model Editing
- Authors: Yang Xu, Yutai Hou, Wanxiang Che, Min Zhang
- Abstract summary: Existing work studies only the monolingual scenario and lacks the cross-lingual transferability needed to perform edits across languages simultaneously.
We propose a framework that naturally adapts monolingual model editing approaches to the cross-lingual scenario using a parallel corpus.
We empirically demonstrate that monolingual baselines fail to propagate edits to multiple languages and that the proposed language anisotropic model editing is effective.
- Score: 61.51863835749279
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multilingual pre-trained language models can learn task-specific abilities or
memorize facts across multiple languages but inevitably make undesired
predictions on specific inputs. Motivated by this observation, model editing aims
to post-hoc calibrate a model on targeted inputs while preserving the
model's original behavior elsewhere. However, existing work studies only the
monolingual scenario and lacks the cross-lingual transferability needed to
perform edits across languages simultaneously. In this work, we focus on
cross-lingual model editing. First, we define the cross-lingual model editing
task and corresponding metrics, where an edit in one language propagates to the
others. Next, we propose a framework that naturally adapts monolingual model
editing approaches to the cross-lingual scenario using a parallel corpus.
Further, we propose language anisotropic editing, which improves cross-lingual
editing by amplifying different subsets of parameters for each language. On the
newly defined cross-lingual model editing task, we empirically demonstrate that
monolingual baselines fail to propagate edits to multiple languages, while the
proposed language anisotropic model editing does so effectively. Our
code is publicly available at https://github.com/franklear/LiME.
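To make the setup above concrete, the following is a minimal, illustrative sketch, not the released LiME code: it shows one way to (a) apply a per-language learned gate that amplifies different coordinates of a parameter update, in the spirit of language anisotropic editing, and (b) score whether an edit made in one language propagates to parallel inputs in other languages. All names (`LanguageAnisotropicScaler`, `apply_edit`, `cross_lingual_reliability`), shapes, and the `model_fn` interface are assumptions made for illustration only.

```python
import torch
import torch.nn as nn


class LanguageAnisotropicScaler(nn.Module):
    """One learnable gate vector per language. Each gate amplifies or damps
    individual coordinates of a flat parameter update (an assumed stand-in for
    the paper's per-language amplification of parameter subsets)."""

    def __init__(self, num_languages: int, num_params: int):
        super().__init__()
        # Zero init gives a neutral scale of 1 for every language at the start.
        self.gates = nn.Parameter(torch.zeros(num_languages, num_params))

    def forward(self, raw_delta: torch.Tensor, lang_id: int) -> torch.Tensor:
        scale = 2.0 * torch.sigmoid(self.gates[lang_id])  # values in (0, 2)
        return raw_delta * scale


def apply_edit(weight: torch.Tensor, raw_delta: torch.Tensor,
               scaler: LanguageAnisotropicScaler, lang_id: int) -> torch.Tensor:
    """Return an edited copy of a flattened weight tensor. `raw_delta` would come
    from any monolingual editor, e.g. a gradient- or hypernetwork-based update."""
    return weight + scaler(raw_delta, lang_id)


@torch.no_grad()
def cross_lingual_reliability(model_fn, parallel_inputs, target_label: int) -> float:
    """Fraction of parallel inputs (same meaning, one per language) on which the
    edited model predicts the edit target. `model_fn` is a hypothetical callable
    mapping a batch of encoded inputs to logits for the already-edited model."""
    hits = 0
    for lang, inputs in parallel_inputs.items():
        logits = model_fn(inputs)  # assumed shape: (1, num_classes)
        hits += int(logits.argmax(dim=-1).item() == target_label)
    return hits / max(len(parallel_inputs), 1)
```

In the paper's framework, the raw update would be produced by an existing monolingual editor trained with parallel edit data, and the per-language gates would be learned jointly with it; the sketch above only fixes the shape of that interaction.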
Related papers
- MPN: Leveraging Multilingual Patch Neuron for Cross-lingual Model Editing [10.81072864833299] (2024-01-06)
We propose a simple yet effective method that trains multilingual patch neurons to store cross-lingual knowledge.
It can be easily adapted to existing approaches to enhance their cross-lingual editing capabilities.
- Cross-Lingual Knowledge Editing in Large Language Models [73.12622532088564] (2023-09-16)
Knowledge editing has been shown to adapt large language models to new knowledge without retraining from scratch.
The effect of editing in a source language on a different target language is still unknown.
We first collect a large-scale cross-lingual synthetic dataset by translating ZsRE from English to Chinese.
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463] (2023-06-13)
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
- Parameter-efficient Zero-shot Transfer for Cross-Language Dense Retrieval with Adapters [20.168480824057923] (2022-12-20)
A popular approach to creating a cross-language retrieval model is to substitute the monolingual pretrained language model in the retrieval model with a multilingual one.
We show that models trained with monolingual data are more effective than fine-tuning the entire model when transferring to a cross-language information retrieval setting.
- Lifting the Curse of Multilinguality by Pre-training Modular Transformers [72.46919537293068] (2022-05-12)
Multilingual pre-trained models suffer from the curse of multilinguality, which causes per-language performance to drop as they cover more languages.
We introduce language-specific modules, which allow us to grow the total capacity of the model while keeping the total number of trainable parameters per language constant.
Our approach enables adding languages post-hoc with no measurable drop in performance, no longer limiting model usage to the set of pre-trained languages.
- On the ability of monolingual models to learn language-agnostic representations [2.604227467422371] (2021-09-04)
We show that monolingual models pretrained and finetuned on different languages achieve competitive performance.
For example, models pretrained on distant languages such as German and Portuguese perform similarly on English tasks.
- How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models [96.32118305166412] (2020-12-31)
We study a set of nine typologically diverse languages with readily available pretrained monolingual models on five diverse monolingual downstream tasks.
We find that languages which are adequately represented in the multilingual model's vocabulary exhibit negligible performance decreases over their monolingual counterparts.
- On the Importance of Word Order Information in Cross-lingual Sequence Labeling [80.65425412067464] (2020-01-30)
Cross-lingual models that fit the word order of the source language might fail to handle target languages.
We investigate whether making models insensitive to the word order of the source language can improve adaptation performance in target languages.
This list is automatically generated from the titles and abstracts of the papers on this site.