Related papers: Zero-shot Cross-lingual NER via Mitigating Language Difference: An Entity-aligned Translation Perspective

URL: http://arxiv.org/abs/2509.01147v1
Date: Mon, 01 Sep 2025 05:49:49 GMT
Title: Zero-shot Cross-lingual NER via Mitigating Language Difference: An Entity-aligned Translation Perspective
Authors: Zhihao Zhang, Sophia Yat Mei Lee, Dong Zhang, Shoushan Li, Guodong Zhou
Abstract summary: Cross-lingual Named Entity Recognition aims to transfer knowledge from high-resource languages to low-resource languages. Existing zero-shot CL-NER approaches primarily focus on Latin script language (LSL), where shared linguistic features facilitate effective knowledge transfer. For non-Latin script language (NSL), such as Chinese and Japanese, performance often degrades due to deep structural differences. We propose an entity-aligned translation (EAT) approach. Leveraging large language models (LLMs), EAT employs a dual-translation strategy to align entities between NSL and English.
Score: 29.24475373988723
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Cross-lingual Named Entity Recognition (CL-NER) aims to transfer knowledge from high-resource languages to low-resource languages. However, existing zero-shot CL-NER (ZCL-NER) approaches primarily focus on Latin script language (LSL), where shared linguistic features facilitate effective knowledge transfer. In contrast, for non-Latin script language (NSL), such as Chinese and Japanese, performance often degrades due to deep structural differences. To address these challenges, we propose an entity-aligned translation (EAT) approach. Leveraging large language models (LLMs), EAT employs a dual-translation strategy to align entities between NSL and English. In addition, we fine-tune LLMs using multilingual Wikipedia data to enhance the entity alignment from source to target languages.

Related papers

Language Surgery in Multilingual Large Language Models [32.77326546076424]
Large Language Models (LLMs) have demonstrated remarkable generalization capabilities across tasks and languages. This paper investigates the naturally emerging representation alignment in LLMs, particularly in the middle layers. We propose Inference-Time Language Control (ITLC) to enable precise cross-lingual language control and mitigate language confusion.
arXiv Detail & Related papers (2025-06-14T11:09:50Z)
Lens: Rethinking Multilingual Enhancement for Large Language Models [70.85065197789639]
We propose Lens, a novel approach to enhance multilingual capabilities in large language models (LLMs). Lens operates on two subspaces: the language-agnostic subspace, where it aligns target languages with the central language to inherit strong semantic representations, and the language-specific subspace, where it separates target and central languages to preserve linguistic specificity. Lens significantly improves multilingual performance while maintaining the model's English proficiency, achieving better results with less computational cost compared to existing post-training approaches.
arXiv Detail & Related papers (2024-10-06T08:51:30Z)
Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models [62.91524967852552]
Large language models (LLMs) are typically multilingual due to pretraining on diverse multilingual corpora. But can these models relate corresponding concepts across languages, i.e., be crosslingual? This study evaluates state-of-the-art LLMs on inherently crosslingual tasks.
arXiv Detail & Related papers (2024-06-23T15:15:17Z)
Cross-Lingual Transfer Robustness to Lower-Resource Languages on Adversarial Datasets [4.653113033432781]
Cross-lingual transfer capabilities of Multilingual Language Models (MLLMs) are investigated. Our research provides valuable insights into cross-lingual transfer and its implications for NLP applications.
arXiv Detail & Related papers (2024-03-29T08:47:15Z)
Multi-level Contrastive Learning for Cross-lingual Spoken Language Understanding [90.87454350016121]
We develop novel code-switching schemes to generate hard negative examples for contrastive learning at all levels. We develop a label-aware joint model to leverage label semantics for cross-lingual knowledge transfer.
arXiv Detail & Related papers (2022-05-07T13:44:28Z)
Exposing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders [85.80950708769923]
We probe multilingual language models for the amount of cross-lingual lexical knowledge stored in their parameters, and compare them against the original multilingual LMs. We also devise a novel method to expose this knowledge by additionally fine-tuning multilingual models. We report substantial gains on standard benchmarks.
arXiv Detail & Related papers (2022-04-30T13:23:16Z)
GL-CLeF: A Global-Local Contrastive Learning Framework for Cross-lingual Spoken Language Understanding [74.39024160277809]
We present Global-Local Contrastive Learning Framework (GL-CLeF) to address this shortcoming. Specifically, we employ contrastive learning, leveraging bilingual dictionaries to construct multilingual views of the same utterance. GL-CLeF achieves the best performance and successfully pulls representations of similar sentences across languages closer.
arXiv Detail & Related papers (2022-04-18T13:56:58Z)
Meta-X$_{NLG}$: A Meta-Learning Approach Based on Language Clustering for Zero-Shot Cross-Lingual Transfer and Generation [11.155430893354769]
This paper proposes a novel meta-learning framework to learn shareable structures from typologically diverse languages. We first cluster the languages based on language representations and identify the centroid language of each cluster. A meta-learning algorithm is trained with all centroid languages and evaluated on the other languages in the zero-shot setting.
arXiv Detail & Related papers (2022-03-19T05:22:07Z)
FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning. During inference, the model makes predictions based on the text input in the target language and its translation in the source language. To tackle this issue, we propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language.
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
