Rule Extrapolation in Language Models: A Study of Compositional Generalization on OOD Prompts
- URL: http://arxiv.org/abs/2409.13728v2
- Date: Thu, 24 Oct 2024 11:30:33 GMT
- Title: Rule Extrapolation in Language Models: A Study of Compositional Generalization on OOD Prompts
- Authors: Anna Mészáros, Szilvia Ujváry, Wieland Brendel, Patrik Reizinger, Ferenc Huszár
- Abstract summary: Rule extrapolation describes OOD scenarios in which the prompt violates at least one rule.
We focus on formal languages, which are defined by the intersection of rules.
We lay the first stones of a normative theory of rule extrapolation, inspired by the Solomonoff prior in algorithmic information theory.
- Score: 14.76420070558434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: LLMs show remarkable emergent abilities, such as inferring concepts from presumably out-of-distribution prompts, known as in-context learning. Though this success is often attributed to the Transformer architecture, our systematic understanding is limited. In complex real-world data sets, even defining what is out-of-distribution is not obvious. To better understand the OOD behaviour of autoregressive LLMs, we focus on formal languages, which are defined by the intersection of rules. We define a new scenario of OOD compositional generalization, termed rule extrapolation. Rule extrapolation describes OOD scenarios in which the prompt violates at least one rule. We evaluate rule extrapolation in formal languages of varying complexity in linear and recurrent architectures, the Transformer, and state space models to understand the architectures' influence on rule extrapolation. We also lay the first stones of a normative theory of rule extrapolation, inspired by the Solomonoff prior in algorithmic information theory.
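To make the rule-extrapolation setup concrete, here is a minimal sketch, assuming the language a^n b^n as a running example of a formal language defined by the intersection of two rules (the concrete languages, rules, and prompts studied in the paper may differ). A rule-extrapolation prompt already violates one rule, and the question is whether the model's completion still respects the remaining one.

```python
# Hypothetical illustration of rule extrapolation on a^n b^n,
# viewed as the intersection of two rules R1 and R2.

def rule_equal_counts(s: str) -> bool:
    """R1: the string contains equally many a's and b's."""
    return s.count("a") == s.count("b")

def rule_as_before_bs(s: str) -> bool:
    """R2: no 'a' appears after a 'b'."""
    return "ba" not in s

def in_language(s: str) -> bool:
    """A string belongs to a^n b^n iff it satisfies both rules."""
    return rule_equal_counts(s) and rule_as_before_bs(s)

def rule_extrapolation_ok(prompt: str, completion: str) -> bool:
    """The prompt is OOD because it violates R2; the model is credited
    if the completed sequence still satisfies the remaining rule R1."""
    assert not rule_as_before_bs(prompt), "prompt should violate R2"
    return rule_equal_counts(prompt + completion)

print(in_language("aabb"))                  # True: both rules hold
print(in_language("abab"))                  # False: an 'a' follows a 'b'
print(rule_extrapolation_ok("baa", "b"))    # True: 'baab' has 2 a's and 2 b's
print(rule_extrapolation_ok("baa", "bb"))   # False: 'baabb' has 2 a's and 3 b's
```

For reference, the Solomonoff prior mentioned in the abstract is the standard construction from algorithmic information theory: a string x is weighted by the total probability that a universal prefix machine U, run on uniformly random bits, outputs something beginning with x (how the paper's normative theory builds on it is detailed in the paper itself).

```latex
% x* denotes any output string with prefix x; |p| is the program length
M(x) = \sum_{p \,:\, U(p) = x\ast} 2^{-|p|}
```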
Related papers
- Linguistic Generalizations are not Rules: Impacts on Evaluation of LMs [13.918775015238863]
Linguistic evaluations of how well LMs generalize often implicitly take for granted that natural languages are generated by symbolic rules.
Here we suggest that LMs' failures to obey symbolic rules may be a feature rather than a bug, because natural languages are not based on rules.
arXiv Detail & Related papers (2025-02-18T17:40:20Z)
- RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios [58.90106984375913]
RuleArena is a novel and challenging benchmark designed to evaluate the ability of large language models (LLMs) to follow complex, real-world rules in reasoning.
Covering three practical domains -- airline baggage fees, NBA transactions, and tax regulations -- RuleArena assesses LLMs' proficiency in handling intricate natural language instructions.
arXiv Detail & Related papers (2024-12-12T06:08:46Z)
- Out-of-distribution generalization via composition: a lens through induction heads in Transformers [0.46085106405479537]
Large language models (LLMs) such as GPT-4 sometimes appear to be creative, solving novel tasks often with a few demonstrations in the prompt.
These tasks require the models to generalize to distributions different from those of the training data, which is known as out-of-distribution (OOD) generalization.
We examine OOD generalization in settings where instances are generated according to hidden rules, including in-context learning with symbolic reasoning.
arXiv Detail & Related papers (2024-08-18T14:52:25Z)
- Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs [87.34281749422756]
Large language models (LLMs) have achieved impressive human-like performance across various reasoning tasks.
However, their mastery of underlying inferential rules still falls short of human capabilities.
We propose a logic scaffolding inferential rule generation framework to construct an inferential rule base, ULogic.
arXiv Detail & Related papers (2024-02-18T03:38:51Z)
- Distilling Rule-based Knowledge into Large Language Models [90.7765003679106]
We are inspired by the observation that humans can also learn new tasks or knowledge in another way, namely by learning from rules.
We propose rule distillation, which first uses the strong in-context abilities of LLMs to extract the knowledge from the textual rules.
Our experiments show that making LLMs learn from rules with our method is much more efficient than example-based learning in terms of both sample size and generalization ability.
arXiv Detail & Related papers (2023-11-15T11:42:41Z)
- ChatRule: Mining Logical Rules with Large Language Models for Knowledge Graph Reasoning [107.61997887260056]
We propose a novel framework, ChatRule, unleashing the power of large language models for mining logical rules over knowledge graphs.
Specifically, the framework is initiated with an LLM-based rule generator, leveraging both the semantic and structural information of KGs.
To refine the generated rules, a rule ranking module estimates the rule quality by incorporating facts from existing KGs.
arXiv Detail & Related papers (2023-09-04T11:38:02Z)
- Learning Locally Interpretable Rule Ensemble [2.512827436728378]
A rule ensemble is an interpretable model based on the linear combination of weighted rules.
This paper proposes a new framework for learning a rule ensemble model that is both accurate and interpretable.
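As a generic illustration of that model class (a hypothetical sketch, not the specific learning framework proposed in the paper; the feature names, weights, and rules below are invented), a rule ensemble predicts with a weighted sum of the rules that fire:

```python
# Generic rule-ensemble prediction: an intercept plus the weights of
# all rules whose conditions are satisfied by the input.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class WeightedRule:
    condition: Callable[[Dict[str, float]], bool]  # binary rule on the features
    weight: float
    description: str

def rule_ensemble_predict(x: Dict[str, float],
                          rules: List[WeightedRule],
                          intercept: float = 0.0) -> float:
    """Prediction = intercept + sum of weights of the rules that fire on x."""
    return intercept + sum(r.weight for r in rules if r.condition(x))

# Invented example rules and features, purely for illustration.
rules = [
    WeightedRule(lambda x: x["age"] > 40, weight=0.75, description="age > 40"),
    WeightedRule(lambda x: x["bmi"] < 25, weight=-0.25, description="bmi < 25"),
]
print(rule_ensemble_predict({"age": 55, "bmi": 22}, rules))  # 0.75 - 0.25 = 0.5
```

Interpretability comes from being able to read off, for any prediction, exactly which rules fired and with what weight.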
arXiv Detail & Related papers (2023-06-20T12:06:56Z)
- Differentiable Rule Induction with Learned Relational Features [9.193818627108572]
The Rule Network (RRN) is a neural architecture that learns predicates representing linear relationships among attributes, along with the rules that use them.
On benchmark tasks, we show that these predicates are simple enough to retain interpretability, yet they improve prediction accuracy and yield rule sets that are more concise than those of state-of-the-art rule induction algorithms.
arXiv Detail & Related papers (2022-01-17T16:46:50Z)
- Learning Symbolic Rules for Reasoning in Quasi-Natural Language [74.96601852906328]
We build a rule-based system that can reason with natural language input but without the manual construction of rules.
We propose MetaQNL, a "Quasi-Natural" language that can express both formal logic and natural language sentences.
Our approach achieves state-of-the-art accuracy on multiple reasoning benchmarks.
arXiv Detail & Related papers (2021-11-23T17:49:00Z)
- Open Rule Induction [2.1248439796866228]
Language model (LM)-based rule generation methods have been proposed to enhance the expressive power of the rules.
We argue that, while KB-based methods induce rules by discovering commonalities in the data, current LM-based methods are "learning rules from rules".
In this paper, we propose the open rule induction problem, which aims to induce open rules utilizing the knowledge in LMs.
arXiv Detail & Related papers (2021-10-26T11:20:24Z)
- A Benchmark for Systematic Generalization in Grounded Language Understanding [61.432407738682635]
Humans easily interpret expressions that describe unfamiliar situations composed from familiar parts.
Modern neural networks, by contrast, struggle to interpret novel compositions.
We introduce a new benchmark, gSCAN, for evaluating compositional generalization in situated language understanding.
arXiv Detail & Related papers (2020-03-11T08:40:15Z)