Large Language Models Are Cross-Lingual Knowledge-Free Reasoners
- URL: http://arxiv.org/abs/2406.16655v2
- Date: Tue, 15 Oct 2024 13:08:01 GMT
- Title: Large Language Models Are Cross-Lingual Knowledge-Free Reasoners
- Authors: Peng Hu, Sizhe Liu, Changjiang Gao, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Shujian Huang
- Abstract summary: We decompose reasoning tasks into two separate components: knowledge retrieval and knowledge-free reasoning.
We show that the knowledge-free reasoning capability can be nearly perfectly transferred across various source-target language directions.
We hypothesize that knowledge-free reasoning shares similar neurons across languages, while knowledge is stored separately in different languages.
- Score: 43.99097308487008
- Abstract: Large Language Models have demonstrated impressive reasoning capabilities across multiple languages. However, the relationship between these capabilities in different languages is less explored. In this work, we decompose the process of reasoning tasks into two separate components: knowledge retrieval and knowledge-free reasoning, and analyze the relationship between cross-lingual transferability and these two components. With adapted commonsense reasoning datasets and constructed knowledge-free reasoning datasets, we show that the knowledge-free reasoning capability can be nearly perfectly transferred across various source-target language directions, despite a secondary effect of resource availability in some target languages, while cross-lingual knowledge retrieval significantly hinders the transfer. Moreover, by analyzing the hidden states and feed-forward network neuron activations during reasoning, we show that higher similarity of hidden representations and larger overlap of activated neurons could explain the better cross-lingual transferability of knowledge-free reasoning compared to knowledge retrieval. Thus, we hypothesize that knowledge-free reasoning shares similar neurons across languages, while knowledge is stored separately in different languages. Our code and data are available at: https://github.com/NJUNLP/Knowledge-Free-Reasoning.
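The neuron-level analysis described in the abstract rests on two measurements: how similar the hidden states for parallel prompts are across languages, and how much the sets of strongly activated feed-forward neurons overlap. The sketch below is not the code from the linked repository; it is a minimal illustration of how such measurements are commonly computed, with random arrays standing in for activations that would be extracted from a real model (e.g. via output_hidden_states=True and forward hooks). The mean-pooling choice and the value of k are assumptions.

```python
# Minimal sketch: comparing representations of the same question asked in two
# languages.  `hidden` arrays are assumed to come from a multilingual LLM
# (e.g. output_hidden_states=True); `ffn` arrays are per-layer FFN activations
# collected with forward hooks.  Shapes and k are illustrative.
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def layerwise_hidden_similarity(hidden_a: np.ndarray, hidden_b: np.ndarray) -> np.ndarray:
    """hidden_*: (num_layers, seq_len, d_model) for parallel prompts.
    Mean-pool over tokens, then compare layer by layer."""
    pooled_a = hidden_a.mean(axis=1)          # (num_layers, d_model)
    pooled_b = hidden_b.mean(axis=1)
    return np.array([cosine(a, b) for a, b in zip(pooled_a, pooled_b)])

def activated_neuron_overlap(ffn_a: np.ndarray, ffn_b: np.ndarray, k: int = 100) -> float:
    """ffn_*: (num_neurons,) mean activation magnitude of one FFN layer.
    Jaccard overlap of the top-k most activated neurons in each language."""
    top_a = set(np.argsort(-np.abs(ffn_a))[:k])
    top_b = set(np.argsort(-np.abs(ffn_b))[:k])
    return len(top_a & top_b) / len(top_a | top_b)

# Toy usage with random stand-ins for real activations.
rng = np.random.default_rng(0)
h_en, h_zh = rng.normal(size=(2, 32, 16, 512))   # 32 layers, 16 tokens, d=512
f_en, f_zh = rng.normal(size=(2, 2048))          # 2048 FFN neurons in one layer
print(layerwise_hidden_similarity(h_en, h_zh)[:4])
print(activated_neuron_overlap(f_en, f_zh, k=100))
```

Under the paper's hypothesis, higher layer-wise cosine similarity and larger top-k neuron overlap between two languages on knowledge-free reasoning prompts would indicate shared reasoning circuitry, whereas knowledge-retrieval prompts would show lower values.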
Related papers
- One Mind, Many Tongues: A Deep Dive into Language-Agnostic Knowledge Neurons in Large Language Models [19.58983929459173]
Large language models (LLMs) have learned vast amounts of factual knowledge through self-supervised pre-training on large-scale corpora.
LLMs have also demonstrated excellent multilingual capabilities, which can express the learned knowledge in multiple languages.
arXiv Detail & Related papers (2024-11-26T13:03:49Z)
- Multilingual Knowledge Editing with Language-Agnostic Factual Neurons [98.73585104789217]
We investigate how large language models (LLMs) represent multilingual factual knowledge.
We find that the same factual knowledge in different languages generally activates a shared set of neurons, which we call language-agnostic factual neurons.
Inspired by this finding, we propose a new MKE method by locating and modifying Language-Agnostic Factual Neurons (LAFN) to simultaneously edit multilingual knowledge.
arXiv Detail & Related papers (2024-06-24T08:06:56Z)
- Measuring Cross-lingual Transfer in Bytes [9.011910726620538]
We show that models from diverse languages perform similarly to a target language in a cross-lingual setting.
We also found evidence that this transfer is not related to language contamination or language proximity.
Our experiments have opened up new possibilities for measuring how much data represents the language-agnostic representations learned during pretraining.
arXiv Detail & Related papers (2024-04-12T01:44:46Z)
- Language Representation Projection: Can We Transfer Factual Knowledge across Languages in Multilingual Language Models? [48.88328580373103]
We propose two parameter-free Language Representation Projection modules (LRP2).
The first module converts non-English representations into English-like equivalents, while the second module reverts English-like representations back into representations of the corresponding non-English language.
Experimental results on the mLAMA dataset demonstrate that LRP2 significantly improves factual knowledge retrieval accuracy and facilitates knowledge transferability across diverse non-English languages (a sketch of this kind of parameter-free projection appears after this list).
arXiv Detail & Related papers (2023-11-07T08:16:16Z)
- Breaking the Language Barrier: Improving Cross-Lingual Reasoning with Structured Self-Attention [18.439771003766026]
We study whether multilingual language models (MultiLMs) can transfer logical reasoning abilities to other languages when they are fine-tuned for reasoning in a different language.
We demonstrate that although MultiLMs can transfer reasoning ability across languages in a monolingual setting, they struggle to transfer reasoning abilities in a code-switched setting.
Following this observation, we propose a novel attention mechanism that uses a dedicated set of parameters to encourage cross-lingual attention in code-switched sequences.
arXiv Detail & Related papers (2023-10-23T18:06:38Z)
- Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models [84.86942006830772]
We conjecture that multilingual pre-trained models can derive language-universal abstractions about grammar.
We conduct the first large-scale empirical study over 43 languages and 14 morphosyntactic categories with a state-of-the-art neuron-level probe (a generic neuron-probing sketch appears after this list).
arXiv Detail & Related papers (2022-05-04T12:22:31Z)
- Cross-Lingual Ability of Multilingual Masked Language Models: A Study of Language Structure [54.01613740115601]
We study three language properties: constituent order, composition and word co-occurrence.
Our main conclusion is that the contribution of constituent order and word co-occurrence is limited, while composition is more crucial to the success of cross-lingual transfer.
arXiv Detail & Related papers (2022-03-16T07:09:35Z)
- Does External Knowledge Help Explainable Natural Language Inference? Automatic Evaluation vs. Human Ratings [35.2513653224183]
Natural language inference (NLI) requires models to learn and apply commonsense knowledge.
We investigate whether external knowledge can also improve their explanation capabilities.
We conduct the largest and most fine-grained explainable NLI crowdsourcing study to date.
arXiv Detail & Related papers (2021-09-16T09:56:20Z)
- Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source (a generic SVCCA sketch appears after this list).
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z)
- Understanding Cross-Lingual Syntactic Transfer in Multilingual Recurrent Neural Networks [3.9342247746757435]
It is now established that modern neural language models can be successfully trained on multiple languages simultaneously.
But what kind of knowledge is really shared among languages within these models?
In this paper we dissect different forms of cross-lingual transfer and look for its most determining factors.
We find that exposing our LMs to a related language does not always increase grammatical knowledge in the target language, and that optimal conditions for lexical-semantic transfer may not be optimal for syntactic transfer.
arXiv Detail & Related papers (2020-03-31T09:48:25Z)
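For the Language Representation Projection (LRP2) entry above: the summary describes two parameter-free modules, one mapping non-English hidden representations toward English-like ones and one mapping them back. A common parameter-free way to realize such a mapping is to shift representations by the difference of per-language mean hidden states; whether this matches the exact LRP2 formulation is an assumption, and all names and shapes below are illustrative.

```python
# Hedged sketch of a parameter-free representation projection between languages.
import numpy as np

def language_mean(hidden_states: np.ndarray) -> np.ndarray:
    """hidden_states: (num_examples, d_model) hidden vectors of one language at
    a given layer; the mean acts as that language's anchor in representation space."""
    return hidden_states.mean(axis=0)

def to_english_like(h: np.ndarray, mean_src: np.ndarray, mean_en: np.ndarray) -> np.ndarray:
    """First module: shift a source-language representation toward English."""
    return h - mean_src + mean_en

def back_to_source(h_en_like: np.ndarray, mean_src: np.ndarray, mean_en: np.ndarray) -> np.ndarray:
    """Second module: undo the shift so later layers see source-language space again."""
    return h_en_like - mean_en + mean_src

# Toy usage: anchors estimated from a handful of hidden vectors per language.
rng = np.random.default_rng(1)
zh_states, en_states = rng.normal(size=(2, 64, 512))
mean_zh, mean_en = language_mean(zh_states), language_mean(en_states)
h = zh_states[0]
h_en_like = to_english_like(h, mean_zh, mean_en)
assert np.allclose(back_to_source(h_en_like, mean_zh, mean_en), h)
```

Because both modules only add constant vectors, the round trip is exact and no parameters need to be trained, which is the sense in which such a projection is parameter-free.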
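For the "Same Neurons, Different Languages" entry: a neuron-level probe scores individual hidden units by how well they predict a morphosyntactic category, and overlap between the top-scoring neurons of different languages suggests shared grammatical abstractions. The paper's probe is more sophisticated than the simple per-neuron correlation used below, so treat this as a generic sketch over made-up activations.

```python
# Hedged sketch of a per-neuron probe for a binary morphosyntactic category.
import numpy as np

def neuron_scores(acts: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """acts: (num_tokens, num_neurons); labels: (num_tokens,) binary category
    (e.g. singular=0 / plural=1).  Returns |Pearson correlation| per neuron."""
    a = acts - acts.mean(axis=0)
    l = labels - labels.mean()
    cov = a.T @ l / len(labels)
    denom = acts.std(axis=0) * labels.std() + 1e-8
    return np.abs(cov / denom)

def top_neuron_overlap(scores_a: np.ndarray, scores_b: np.ndarray, k: int = 50) -> float:
    """Fraction of the top-k neurons shared between two languages."""
    top_a = set(np.argsort(-scores_a)[:k])
    top_b = set(np.argsort(-scores_b)[:k])
    return len(top_a & top_b) / k

# Toy usage with random activations and labels for two languages.
rng = np.random.default_rng(2)
acts_en, acts_de = rng.normal(size=(2, 1000, 768))
labels_en, labels_de = rng.integers(0, 2, size=(2, 1000))
print(top_neuron_overlap(neuron_scores(acts_en, labels_en),
                         neuron_scores(acts_de, labels_de), k=50))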
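For the multi-view language representation entry: singular vector canonical correlation analysis (SVCCA) reduces each view with an SVD and then measures the canonical correlations between the reduced views. The sketch below is a generic SVCCA implementation applied to made-up typology and embedding matrices, not the authors' code.

```python
# Hedged, generic SVCCA sketch on two views of per-language representations.
import numpy as np

def svcca(X: np.ndarray, Y: np.ndarray, keep: float = 0.99) -> np.ndarray:
    """X, Y: (num_languages, dim_*), rows aligned by language.
    Returns the canonical correlations between the SVD-reduced views."""
    def svd_reduce(M):
        M = M - M.mean(axis=0)
        U, S, _ = np.linalg.svd(M, full_matrices=False)
        # keep enough singular directions to explain `keep` of the variance
        r = int(np.searchsorted(np.cumsum(S**2) / np.sum(S**2), keep)) + 1
        return U[:, :r] * S[:r]
    def orthobasis(M):
        # orthonormal basis of M's column space (M is already centered)
        U, _, _ = np.linalg.svd(M, full_matrices=False)
        return U
    Xr, Yr = svd_reduce(X), svd_reduce(Y)
    # canonical correlations = singular values of the product of orthonormal bases
    corr = np.linalg.svd(orthobasis(Xr).T @ orthobasis(Yr), compute_uv=False)
    return np.clip(corr, 0.0, 1.0)

# Toy usage: typological features vs. learned language embeddings.
rng = np.random.default_rng(3)
typology = rng.normal(size=(40, 20))      # 40 languages, 20 typological features
embeddings = typology @ rng.normal(size=(20, 64)) + 0.1 * rng.normal(size=(40, 64))
print(svcca(typology, embeddings)[:5])    # correlations near 1 indicate shared information
```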