Related papers: Large Language Models Are Cross-Lingual Knowledge-Free Reasoners

URL: http://arxiv.org/abs/2406.16655v1
Date: Mon, 24 Jun 2024 14:03:04 GMT
Title: Large Language Models Are Cross-Lingual Knowledge-Free Reasoners
Authors: Peng Hu, Sizhe Liu, Changjiang Gao, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Shujian Huang
Abstract summary: We decompose reasoning tasks into two separated parts: knowledge retrieval and knowledge-free reasoning. With adapted and constructed knowledge-free reasoning datasets, we show that the knowledge-free reasoning capability can be nearly perfectly transferred across various source-target language directions.
Score: 43.99097308487008
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models have demonstrated impressive reasoning capabilities across multiple languages. However, the relationship between capabilities in different languages is less explored. In this work, we decompose the process of reasoning tasks into two separated parts: knowledge retrieval and knowledge-free reasoning, and analyze the cross-lingual transferability of them. With adapted and constructed knowledge-free reasoning datasets, we show that the knowledge-free reasoning capability can be nearly perfectly transferred across various source-target language directions despite the secondary impact of resource in some specific target languages, while cross-lingual knowledge retrieval significantly hinders the transfer. Moreover, by analyzing the hidden states and feed-forward network neuron activation during the reasoning tasks, we show that higher similarity of hidden representations and larger overlap of activated neurons could explain the better cross-lingual transferability of knowledge-free reasoning than knowledge retrieval. Thus, we hypothesize that knowledge-free reasoning embeds in some language-shared mechanism, while knowledge is stored separately in different languages.
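The abstract attributes better transfer to higher similarity of hidden representations and larger overlap of activated neurons. A minimal sketch of how two such quantities could be computed follows. The abstract does not specify the metrics, so cosine similarity, Jaccard overlap, the activation threshold, and all function names and toy values here are assumptions for illustration, not the authors' method.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two hidden-state vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def activated_neurons(activations, threshold=0.0):
    """Indices of neurons whose activation exceeds the (hypothetical) threshold."""
    return {i for i, a in enumerate(activations) if a > threshold}


def neuron_overlap(acts_a, acts_b, threshold=0.0):
    """Jaccard overlap of the activated-neuron sets for two inputs (assumed metric)."""
    set_a = activated_neurons(acts_a, threshold)
    set_b = activated_neurons(acts_b, threshold)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


# Toy values standing in for one layer's outputs on the same question
# posed in two languages; real values would come from a model.
hidden_lang_a = [1.0, 0.0, 2.0]
hidden_lang_b = [2.0, 0.0, 4.0]
ffn_lang_a = [0.5, -0.1, 0.3, 0.0]
ffn_lang_b = [0.2, 0.4, 0.1, -0.3]

similarity = cosine_similarity(hidden_lang_a, hidden_lang_b)
overlap = neuron_overlap(ffn_lang_a, ffn_lang_b)
print(f"hidden-state similarity: {similarity:.3f}")
print(f"activated-neuron overlap: {overlap:.3f}")
```

Under the abstract's hypothesis, both numbers would tend to be higher for knowledge-free reasoning inputs than for knowledge retrieval inputs when the same item is compared across languages.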

Related papers

Multilingual Information Retrieval with a Monolingual Knowledge Base [2.419638771866955]
We propose a novel strategy to fine-tune multilingual embedding models with weighted sampling for contrastive learning. We demonstrate that the weighted sampling strategy produces performance gains compared to standard ones by up to 31.03% in MRR and up to 33.98% in Recall@3.
arXiv Detail & Related papers (2025-06-03T07:05:49Z)
How does Alignment Enhance LLMs' Multilingual Capabilities? A Language Neurons Perspective [64.79894853375478]
We propose a new finer-grained neuron identification algorithm, which detects language neurons (including language-specific neurons and language-related neurons) and language-agnostic neurons. Based on the distributional characteristics of different types of neurons, we divide the LLMs' internal process for multilingual inference into four parts. We systematically analyze the models before and after alignment with a focus on different types of neurons.
arXiv Detail & Related papers (2025-05-27T17:59:52Z)
When Less Language is More: Language-Reasoning Disentanglement Makes LLMs Better Multilingual Reasoners [111.50503126693444]
We show that language-specific ablation consistently boosts multilingual reasoning performance. Compared to post-training, our training-free ablation achieves comparable or superior results with minimal computational overhead.
arXiv Detail & Related papers (2025-05-21T08:35:05Z)
One Mind, Many Tongues: A Deep Dive into Language-Agnostic Knowledge Neurons in Large Language Models [19.58983929459173]
Large language models (LLMs) have learned vast amounts of factual knowledge through self-supervised pre-training on large-scale corpora. LLMs have also demonstrated excellent multilingual capabilities, which can express the learned knowledge in multiple languages.
arXiv Detail & Related papers (2024-11-26T13:03:49Z)
Multilingual Knowledge Editing with Language-Agnostic Factual Neurons [98.73585104789217]
We investigate how large language models (LLMs) represent multilingual factual knowledge. We find that the same factual knowledge in different languages generally activates a shared set of neurons, which we call language-agnostic factual neurons. Inspired by this finding, we propose a new MKE method by locating and modifying Language-Agnostic Factual Neurons (LAFN) to simultaneously edit multilingual knowledge.
arXiv Detail & Related papers (2024-06-24T08:06:56Z)
Measuring Cross-lingual Transfer in Bytes [9.011910726620538]
We show that models from diverse languages perform similarly to a target language in a cross-lingual setting. We also found evidence that this transfer is not related to language contamination or language proximity. Our experiments have opened up new possibilities for measuring how much data represents the language-agnostic representations learned during pretraining.
arXiv Detail & Related papers (2024-04-12T01:44:46Z)
Language Representation Projection: Can We Transfer Factual Knowledge across Languages in Multilingual Language Models? [48.88328580373103]
We propose two parameter-free Language Representation Projection modules (LRP2). The first module converts non-English representations into English-like equivalents, while the second module reverts English-like representations back into representations of the corresponding non-English language. Experimental results on the mLAMA dataset demonstrate that LRP2 significantly improves factual knowledge retrieval accuracy and facilitates knowledge transferability across diverse non-English languages.
arXiv Detail & Related papers (2023-11-07T08:16:16Z)
Breaking the Language Barrier: Improving Cross-Lingual Reasoning with Structured Self-Attention [18.439771003766026]
We study whether multilingual language models (MultiLMs) can transfer logical reasoning abilities to other languages when they are fine-tuned for reasoning in a different language. We demonstrate that although MultiLMs can transfer reasoning ability across languages in a monolingual setting, they struggle to transfer reasoning abilities in a code-switched setting. Following this observation, we propose a novel attention mechanism that uses a dedicated set of parameters to encourage cross-lingual attention in code-switched sequences.
arXiv Detail & Related papers (2023-10-23T18:06:38Z)
Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models [84.86942006830772]
We conjecture that multilingual pre-trained models can derive language-universal abstractions about grammar. We conduct the first large-scale empirical study over 43 languages and 14 morphosyntactic categories with a state-of-the-art neuron-level probe.
arXiv Detail & Related papers (2022-05-04T12:22:31Z)
Cross-Lingual Ability of Multilingual Masked Language Models: A Study of Language Structure [54.01613740115601]
We study three language properties: constituent order, composition and word co-occurrence. Our main conclusion is that the contribution of constituent order and word co-occurrence is limited, while the composition is more crucial to the success of cross-linguistic transfer.
arXiv Detail & Related papers (2022-03-16T07:09:35Z)
Does External Knowledge Help Explainable Natural Language Inference? Automatic Evaluation vs. Human Ratings [35.2513653224183]
Natural language inference (NLI) requires models to learn and apply commonsense knowledge. We investigate whether external knowledge can also improve their explanation capabilities. We conduct the largest and most fine-grained explainable NLI crowdsourcing study to date.
arXiv Detail & Related papers (2021-09-16T09:56:20Z)
Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source. We observe that our representations embed typology and strengthen correlations with language relationships. We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z)
Understanding Cross-Lingual Syntactic Transfer in Multilingual Recurrent Neural Networks [3.9342247746757435]
It is now established that modern neural language models can be successfully trained on multiple languages simultaneously. But what kind of knowledge is really shared among languages within these models? In this paper we dissect different forms of cross-lingual transfer and look for its most determining factors. We find that exposing our LMs to a related language does not always increase grammatical knowledge in the target language, and that optimal conditions for lexical-semantic transfer may not be optimal for syntactic transfer.
arXiv Detail & Related papers (2020-03-31T09:48:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.