Rule Extrapolation in Language Models: A Study of Compositional Generalization on OOD Prompts
- URL: http://arxiv.org/abs/2409.13728v2
- Date: Thu, 24 Oct 2024 11:30:33 GMT
- Title: Rule Extrapolation in Language Models: A Study of Compositional Generalization on OOD Prompts
- Authors: Anna Mészáros, Szilvia Ujváry, Wieland Brendel, Patrik Reizinger, Ferenc Huszár
- Abstract summary: Rule extrapolation describes OOD scenarios in which the prompt violates at least one rule.
We focus on formal languages, which are defined by the intersection of rules.
We lay the first stones of a normative theory of rule extrapolation, inspired by the Solomonoff prior in algorithmic information theory.
- Score: 14.76420070558434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: LLMs show remarkable emergent abilities, such as inferring concepts from presumably out-of-distribution prompts, known as in-context learning. Though this success is often attributed to the Transformer architecture, our systematic understanding is limited. In complex real-world data sets, even defining what is out-of-distribution is not obvious. To better understand the OOD behaviour of autoregressive LLMs, we focus on formal languages, which are defined by the intersection of rules. We define a new scenario of OOD compositional generalization, termed rule extrapolation. Rule extrapolation describes OOD scenarios in which the prompt violates at least one rule. We evaluate rule extrapolation in formal languages of varying complexity in linear and recurrent architectures, the Transformer, and state space models to understand the architectures' influence on rule extrapolation. We also lay the first stones of a normative theory of rule extrapolation, inspired by the Solomonoff prior in algorithmic information theory.
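To make the rule-extrapolation setup concrete, here is a minimal sketch, assuming the language a^n b^n as a running example of a formal language defined by the intersection of two rules (the concrete languages, rules, and prompts studied in the paper may differ). A rule-extrapolation prompt already violates one rule, and the question is whether the model's completion still respects the remaining one.

```python
# Hypothetical illustration of rule extrapolation on a^n b^n,
# viewed as the intersection of two rules R1 and R2.

def rule_equal_counts(s: str) -> bool:
    """R1: the string contains equally many a's and b's."""
    return s.count("a") == s.count("b")

def rule_as_before_bs(s: str) -> bool:
    """R2: no 'a' appears after a 'b'."""
    return "ba" not in s

def in_language(s: str) -> bool:
    """A string belongs to a^n b^n iff it satisfies both rules."""
    return rule_equal_counts(s) and rule_as_before_bs(s)

def rule_extrapolation_ok(prompt: str, completion: str) -> bool:
    """The prompt is OOD because it violates R2; the model is credited
    if the completed sequence still satisfies the remaining rule R1."""
    assert not rule_as_before_bs(prompt), "prompt should violate R2"
    return rule_equal_counts(prompt + completion)

print(in_language("aabb"))                  # True: both rules hold
print(in_language("abab"))                  # False: an 'a' follows a 'b'
print(rule_extrapolation_ok("baa", "b"))    # True: 'baab' has 2 a's and 2 b's
print(rule_extrapolation_ok("baa", "bb"))   # False: 'baabb' has 2 a's and 3 b's
```

For reference, the Solomonoff prior mentioned in the abstract is the standard construction from algorithmic information theory: a string x is weighted by the total probability that a universal prefix machine U, run on uniformly random bits, outputs something beginning with x (how the paper's normative theory builds on it is detailed in the paper itself).

```latex
% x* denotes any output string with prefix x; |p| is the program length
M(x) = \sum_{p \,:\, U(p) = x\ast} 2^{-|p|}
```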
Related papers
- Linguistic Generalizations are not Rules: Impacts on Evaluation of LMs [13.918775015238863]
Linguistic evaluations of how well LMs generalize often implicitly take for granted that natural languages are generated by symbolic rules.
Here we suggest that LMs' failures to obey symbolic rules may be a feature rather than a bug, because natural languages are not based on rules.
arXiv Detail & Related papers (2025-02-18T17:40:20Z)
- RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios [58.90106984375913]
RuleArena is a novel and challenging benchmark designed to evaluate the ability of large language models (LLMs) to follow complex, real-world rules in reasoning.
Covering three practical domains -- airline baggage fees, NBA transactions, and tax regulations -- RuleArena assesses LLMs' proficiency in handling intricate natural language instructions.
arXiv Detail & Related papers (2024-12-12T06:08:46Z)
- Out-of-distribution generalization via composition: a lens through induction heads in Transformers [0.46085106405479537]
Large language models (LLMs) such as GPT-4 sometimes appear to be creative, solving novel tasks often with a few demonstrations in the prompt.
These tasks require the models to generalize to distributions different from those of the training data, which is known as out-of-distribution (OOD) generalization.
We examine OOD generalization in settings where instances are generated according to hidden rules, including in-context learning with symbolic reasoning.
arXiv Detail & Related papers (2024-08-18T14:52:25Z)
- Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs [87.34281749422756]
Large language models (LLMs) have achieved impressive human-like performance across various reasoning tasks.
However, their mastery of underlying inferential rules still falls short of human capabilities.
We propose a logic scaffolding inferential rule generation framework to construct an inferential rule base, ULogic.
arXiv Detail & Related papers (2024-02-18T03:38:51Z)
- Distilling Rule-based Knowledge into Large Language Models [90.7765003679106]
We are inspired by the observation that humans can also learn new tasks or knowledge in another way, namely by learning from rules.
We propose rule distillation, which first uses the strong in-context abilities of LLMs to extract the knowledge from the textual rules.
Our experiments show that making LLMs learn from rules with our method is much more efficient than example-based learning in terms of both sample size and generalization ability.
arXiv Detail & Related papers (2023-11-15T11:42:41Z)
- ChatRule: Mining Logical Rules with Large Language Models for Knowledge Graph Reasoning [107.61997887260056]
We propose a novel framework, ChatRule, unleashing the power of large language models for mining logical rules over knowledge graphs.
Specifically, the framework is initiated with an LLM-based rule generator, leveraging both the semantic and structural information of KGs.
To refine the generated rules, a rule ranking module estimates the rule quality by incorporating facts from existing KGs.
arXiv Detail & Related papers (2023-09-04T11:38:02Z)
- Learning Locally Interpretable Rule Ensemble [2.512827436728378]
A rule ensemble is an interpretable model based on the linear combination of weighted rules.
This paper proposes a new framework for learning a rule ensemble model that is both accurate and interpretable.
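As a generic illustration of that model class (a hypothetical sketch, not the specific learning framework proposed in the paper; the feature names, weights, and rules below are invented), a rule ensemble predicts with a weighted sum of the rules that fire:

```python
# Generic rule-ensemble prediction: an intercept plus the weights of
# all rules whose conditions are satisfied by the input.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class WeightedRule:
    condition: Callable[[Dict[str, float]], bool]  # binary rule on the features
    weight: float
    description: str

def rule_ensemble_predict(x: Dict[str, float],
                          rules: List[WeightedRule],
                          intercept: float = 0.0) -> float:
    """Prediction = intercept + sum of weights of the rules that fire on x."""
    return intercept + sum(r.weight for r in rules if r.condition(x))

# Invented example rules and features, purely for illustration.
rules = [
    WeightedRule(lambda x: x["age"] > 40, weight=0.75, description="age > 40"),
    WeightedRule(lambda x: x["bmi"] < 25, weight=-0.25, description="bmi < 25"),
]
print(rule_ensemble_predict({"age": 55, "bmi": 22}, rules))  # 0.75 - 0.25 = 0.5
```

Interpretability comes from being able to read off, for any prediction, exactly which rules fired and with what weight.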
arXiv Detail & Related papers (2023-06-20T12:06:56Z)
- Differentiable Rule Induction with Learned Relational Features [9.193818627108572]
The Rule Network (RRN) is a neural architecture that learns predicates representing linear relationships among attributes, along with the rules that use them.
On benchmark tasks, we show that these predicates are simple enough to retain interpretability, yet they improve prediction accuracy and yield rule sets that are more concise than those of state-of-the-art rule induction algorithms.
arXiv Detail & Related papers (2022-01-17T16:46:50Z)
- Learning Symbolic Rules for Reasoning in Quasi-Natural Language [74.96601852906328]
We build a rule-based system that can reason with natural language input but without the manual construction of rules.
We propose MetaQNL, a "Quasi-Natural" language that can express both formal logic and natural language sentences.
Our approach achieves state-of-the-art accuracy on multiple reasoning benchmarks.
arXiv Detail & Related papers (2021-11-23T17:49:00Z)
- Open Rule Induction [2.1248439796866228]
Language model (LM)-based rule generation methods have been proposed to enhance the expressive power of the rules.
We argue that, while KB-based methods induce rules by discovering commonalities in the data, current LM-based methods are "learning rules from rules".
In this paper, we propose the open rule induction problem, which aims to induce open rules utilizing the knowledge in LMs.
arXiv Detail & Related papers (2021-10-26T11:20:24Z)
- A Benchmark for Systematic Generalization in Grounded Language Understanding [61.432407738682635]
Humans easily interpret expressions that describe unfamiliar situations composed from familiar parts.
Modern neural networks, by contrast, struggle to interpret novel compositions.
We introduce a new benchmark, gSCAN, for evaluating compositional generalization in situated language understanding.
arXiv Detail & Related papers (2020-03-11T08:40:15Z)