CLICKER: Attention-Based Cross-Lingual Commonsense Knowledge Transfer
- URL: http://arxiv.org/abs/2302.13201v1
- Date: Sun, 26 Feb 2023 00:57:29 GMT
- Title: CLICKER: Attention-Based Cross-Lingual Commonsense Knowledge Transfer
- Authors: Ruolin Su, Zhongkai Sun, Sixing Lu, Chengyuan Ma, Chenlei Guo
- Abstract summary: We propose the attention-based Cross-LIngual Commonsense Knowledge transfER framework.
CLICKER minimizes the performance gaps between English and non-English languages in commonsense question-answering tasks.
CLICKER achieves remarkable improvements in the cross-lingual task for languages other than English.
- Score: 5.375217612596619
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in cross-lingual commonsense reasoning (CSR) are facilitated
by the development of multilingual pre-trained models (mPTMs). While mPTMs show
the potential to encode commonsense knowledge for different languages,
transferring commonsense knowledge learned in large-scale English corpus to
other languages is challenging. To address this problem, we propose the
attention-based Cross-LIngual Commonsense Knowledge transfER (CLICKER)
framework, which minimizes the performance gaps between English and non-English
languages in commonsense question-answering tasks. CLICKER effectively improves
commonsense reasoning for non-English languages by differentiating
non-commonsense knowledge from commonsense knowledge. Experimental results on
public benchmarks demonstrate that CLICKER achieves remarkable improvements in
the cross-lingual CSR task for languages other than English.
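To make the abstract's idea more concrete, the following is a minimal PyTorch sketch, not the authors' implementation: it cross-attends target-language token states to their English counterparts and factors the result into a "commonsense" component (intended to be shared across languages) and a residual, language-specific component. All module names, dimensions, and the toy alignment loss are assumptions made purely for illustration.

```python
# Illustrative sketch only (not CLICKER's released code): cross-lingual attention
# over parallel English / target-language encoder states, followed by a split into
# a transferable "commonsense" component and a language-specific residual.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossLingualTransferSketch(nn.Module):
    def __init__(self, hidden: int = 768, heads: int = 8):
        super().__init__()
        # Target-language tokens query English tokens.
        self.cross_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.commonsense_proj = nn.Linear(hidden, hidden)  # shared, transferable knowledge
        self.other_proj = nn.Linear(hidden, hidden)        # non-commonsense / language-specific part

    def forward(self, tgt_states: torch.Tensor, en_states: torch.Tensor):
        # tgt_states, en_states: (batch, seq_len, hidden) from a multilingual encoder (e.g., XLM-R).
        attended, _ = self.cross_attn(tgt_states, en_states, en_states)
        return self.commonsense_proj(attended), self.other_proj(attended)

def commonsense_alignment_loss(cs_tgt: torch.Tensor, cs_en: torch.Tensor) -> torch.Tensor:
    # Pull mean-pooled "commonsense" vectors of parallel sentences together
    # (cosine distance); one plausible objective, not the paper's actual loss.
    a = F.normalize(cs_tgt.mean(dim=1), dim=-1)
    b = F.normalize(cs_en.mean(dim=1), dim=-1)
    return (1.0 - (a * b).sum(dim=-1)).mean()

# Usage sketch with random tensors standing in for encoder outputs:
model = CrossLingualTransferSketch()
tgt = torch.randn(2, 16, 768)   # non-English sentence representations
en = torch.randn(2, 16, 768)    # parallel English sentence representations
cs_tgt, _ = model(tgt, en)
cs_en, _ = model(en, en)        # English branch attends to itself
loss = commonsense_alignment_loss(cs_tgt, cs_en)
```

In a full system, both branches would additionally be fine-tuned on a commonsense question-answering objective so that the shared component carries the knowledge being transferred; only the general shape of the idea is shown here.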
Related papers
- MMATH: A Multilingual Benchmark for Mathematical Reasoning [94.05289799605957]
We introduce MMATH, a benchmark for multilingual complex reasoning spanning 374 high-quality math problems across 10 typologically diverse languages.
We observe that even advanced models like DeepSeek R1 exhibit substantial performance disparities across languages and suffer from a critical off-target issue: generating responses in unintended languages.
Our findings offer new insights and practical strategies for advancing the multilingual reasoning capabilities of large language models.
arXiv Detail & Related papers (2025-05-25T12:47:39Z)
- Multilingual Retrieval-Augmented Generation for Knowledge-Intensive Task [73.35882908048423]
Retrieval-augmented generation (RAG) has become a cornerstone of contemporary NLP.
This paper investigates the effectiveness of RAG across multiple languages by proposing novel approaches for multilingual open-domain question-answering.
arXiv Detail & Related papers (2025-04-04T17:35:43Z)
- Enhancing Code-Switching ASR Leveraging Non-Peaky CTC Loss and Deep Language Posterior Injection [9.696145679371213]
Code-switching, where multilingual speakers alternate between languages during conversations, poses significant challenges to end-to-end (E2E) automatic speech recognition (ASR) systems.
Our main contributions are at least threefold: First, we incorporate language identification information into several intermediate layers of the encoder, aiming to enrich output embeddings with more detailed language information.
Second, through the novel application of a language boundary alignment loss, the subsequent ASR modules are enabled to more effectively utilize the knowledge of internal language posteriors.
arXiv Detail & Related papers (2024-11-26T06:49:05Z)
- Can Code-Switched Texts Activate a Knowledge Switch in LLMs? A Case Study on English-Korean Code-Switching [14.841981996951395]
Code-switching (CS) can convey subtle cultural and linguistic nuances that can be otherwise lost in translation.
Recent state-of-the-art multilingual large language models (LLMs) demonstrate excellent multilingual abilities in various aspects including understanding CS.
arXiv Detail & Related papers (2024-10-24T05:14:03Z)
- Cross-Lingual Multi-Hop Knowledge Editing -- Benchmarks, Analysis and a Simple Contrastive Learning based Approach [53.028586843468915]
We propose the Cross-Lingual Multi-Hop Knowledge Editing paradigm for measuring and analyzing the performance of various SoTA knowledge editing techniques in a cross-lingual setup.
Specifically, we create a parallel cross-lingual benchmark, CROLIN-MQUAKE, for measuring knowledge editing capabilities.
Following this, we propose a significantly improved system for cross-lingual multi-hop knowledge editing, CLEVER-CKE.
arXiv Detail & Related papers (2024-07-14T17:18:16Z)
- xCoT: Cross-lingual Instruction Tuning for Cross-lingual Chain-of-Thought Reasoning [36.34986831526529]
Chain-of-thought (CoT) has emerged as a powerful technique to elicit reasoning in large language models.
We propose a cross-lingual instruction fine-tuning framework (xCOT) to transfer knowledge from high-resource languages to low-resource languages.
arXiv Detail & Related papers (2024-01-13T10:53:53Z)
- CL-MASR: A Continual Learning Benchmark for Multilingual ASR [15.974765568276615]
We propose CL-MASR, a benchmark for studying multilingual automatic speech recognition in a continual learning setting.
CL-MASR provides a diverse set of continual learning methods implemented on top of large-scale pretrained ASR models, along with common metrics.
To the best of our knowledge, CL-MASR is the first continual learning benchmark for the multilingual ASR task.
arXiv Detail & Related papers (2023-10-25T18:55:40Z)
- The Impact of Cross-Lingual Adjustment of Contextual Word Representations on Zero-Shot Transfer [3.300216758849348]
Large multilingual language models such as mBERT or XLM-R enable zero-shot cross-lingual transfer in various IR and NLP tasks.
We propose a data- and compute-efficient method for cross-lingual adjustment of mBERT that uses a small parallel corpus to make embeddings of related words across languages similar to each other.
We experiment with a typologically diverse set of languages (Spanish, Russian, Vietnamese, and Hindi) and extend their original implementations to new tasks.
Our study reproduced gains in NLI for four languages and showed improvements in NER, XSR, and cross-lingual QA.
arXiv Detail & Related papers (2022-04-13T15:28:43Z)
- Leveraging Knowledge in Multilingual Commonsense Reasoning [25.155987513306854]
We propose to utilize English knowledge sources via a translate-retrieve-translate (TRT) strategy.
For multilingual commonsense questions and choices, we collect related knowledge via translation and retrieval from the knowledge sources.
The retrieved knowledge is then translated into the target language and integrated into a pre-trained multilingual language model (see the sketch after this list).
arXiv Detail & Related papers (2021-10-16T03:51:53Z)
- X-METRA-ADA: Cross-lingual Meta-Transfer Learning Adaptation to Natural Language Understanding and Question Answering [55.57776147848929]
We propose X-METRA-ADA, a cross-lingual MEta-TRAnsfer learning ADAptation approach for Natural Language Understanding (NLU) and Question Answering (QA).
Our approach adapts MAML, an optimization-based meta-learning approach, to learn to adapt to new languages.
We show that our approach outperforms naive fine-tuning, reaching competitive performance on both tasks for most languages.
arXiv Detail & Related papers (2021-04-20T00:13:35Z)
- Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval [51.60862829942932]
We present a systematic empirical study focused on the suitability of the state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks.
For sentence-level CLIR, we demonstrate that state-of-the-art performance can be achieved.
However, peak performance is not achieved by using general-purpose multilingual text encoders 'off-the-shelf', but rather by relying on their variants that have been further specialized for sentence understanding tasks.
arXiv Detail & Related papers (2021-01-21T00:15:38Z)
- VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages.
This effectively avoids the degeneration of predicting masked words conditioned only on the context in the same language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
- Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation [105.41167108465085]
Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale datasets in low-resource languages.
We propose a novel augmentation approach named Language Branch Machine Reading Comprehension (LBMRC).
LBMRC trains multiple machine reading comprehension (MRC) models, each proficient in an individual language.
We devise a multilingual distillation approach to amalgamate knowledge from multiple language branch models to a single model for all target languages.
arXiv Detail & Related papers (2020-10-27T13:12:17Z)
- Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension [86.1617182312817]
We propose two auxiliary tasks in the fine-tuning stage to create additional phrase boundary supervision: a mixed machine reading comprehension task, which translates the question or passage into other languages and builds cross-lingual question-passage pairs, and a language-agnostic knowledge masking task that leverages knowledge phrases mined from the web.
arXiv Detail & Related papers (2020-04-29T10:44:00Z)
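As referenced in the entry above, the translate-retrieve-translate (TRT) strategy from "Leveraging Knowledge in Multilingual Commonsense Reasoning" can be sketched briefly. The translate and retrieve helpers below are hypothetical placeholders for a machine-translation system and an English knowledge source; this is an outline of the described pipeline, not that paper's released code.

```python
# Hypothetical sketch of a translate-retrieve-translate (TRT) pipeline for
# multilingual commonsense QA. `translate` and `retrieve` are placeholders,
# not real library calls.
from typing import List

def translate(text: str, src: str, tgt: str) -> str:
    """Placeholder for a machine-translation call."""
    raise NotImplementedError

def retrieve(query: str, top_k: int = 3) -> List[str]:
    """Placeholder for retrieval from an English knowledge source."""
    raise NotImplementedError

def trt_augment(question: str, choices: List[str], lang: str) -> str:
    # 1) Translate the target-language question and choices into English.
    en_question = translate(question, src=lang, tgt="en")
    en_choices = [translate(c, src=lang, tgt="en") for c in choices]
    # 2) Retrieve related English commonsense knowledge.
    facts = retrieve(en_question)
    for c in en_choices:
        facts.extend(retrieve(c, top_k=1))
    # 3) Translate the retrieved knowledge back into the target language and
    #    prepend it to the input of a pre-trained multilingual language model.
    tgt_facts = [translate(f, src="en", tgt=lang) for f in facts]
    return " ".join(tgt_facts) + " [SEP] " + question
```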