Linguistic Generalizations are not Rules: Impacts on Evaluation of LMs
- URL: http://arxiv.org/abs/2502.13195v1
- Date: Tue, 18 Feb 2025 17:40:20 GMT
- Title: Linguistic Generalizations are not Rules: Impacts on Evaluation of LMs
- Authors: Leonie Weissweiler, Kyle Mahowald, Adele Goldberg
- Abstract summary: Linguistic evaluations of how well LMs generalize often implicitly take for granted that natural languages are generated by symbolic rules.
Here we suggest that LMs' failures to obey symbolic rules may be a feature rather than a bug, because natural languages are not based on rules.
- Abstract: Linguistic evaluations of how well LMs generalize to produce or understand novel text often implicitly take for granted that natural languages are generated by symbolic rules. Grammaticality is thought to be determined by whether or not sentences obey such rules. Interpretation is believed to be compositionally generated by syntactic rules operating on meaningful words. Semantic parsing is intended to map sentences into formal logic. Failures of LMs to obey strict rules have been taken to reveal that LMs do not produce or understand language like humans. Here we suggest that LMs' failures to obey symbolic rules may be a feature rather than a bug, because natural languages are not based on rules. New utterances are produced and understood by a combination of flexible interrelated and context-dependent schemata or constructions. We encourage researchers to reimagine appropriate benchmarks and analyses that acknowledge the rich flexible generalizations that comprise natural languages.
Related papers
- Can Language Models Learn Typologically Implausible Languages?
Grammatical features across human languages show intriguing correlations often attributed to learning biases in humans.
We discuss how language models (LMs) allow us to better determine the role of domain-general learning biases in language universals.
We test LMs on an array of highly naturalistic but counterfactual versions of the English (head-initial) and Japanese (head-final) languages.
arXiv Detail & Related papers (2025-02-17T20:40:01Z)
- Rule Extrapolation in Language Models: A Study of Compositional Generalization on OOD Prompts
Rule extrapolation describes OOD scenarios, where the prompt violates at least one rule.
We focus on formal languages, which are defined by the intersection of rules.
We lay the first stones of a normative theory of rule extrapolation, inspired by the Solomonoff prior in algorithmic information theory.
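The notion of a formal language defined by an intersection of rules, and of an OOD prompt as one violating at least one rule, can be made concrete with a small sketch. This illustration is our own, not from the paper: the rule names and the example language (a^n b^n) are chosen for clarity.

```python
# Illustrative sketch: a formal language as the intersection of two rules.
# A "rule extrapolation" prompt is one that violates at least one rule.

def rule_sorted(s: str) -> bool:
    """Rule 1: every 'a' precedes every 'b' (i.e., the string matches a*b*)."""
    return "ba" not in s

def rule_balanced(s: str) -> bool:
    """Rule 2: equal numbers of 'a' and 'b'."""
    return s.count("a") == s.count("b")

def in_language(s: str) -> bool:
    """The language a^n b^n is exactly the intersection of the two rules."""
    return rule_sorted(s) and rule_balanced(s)

def is_ood_prompt(s: str) -> bool:
    """OOD in the paper's sense: the prompt violates at least one rule."""
    return not in_language(s)

print(in_language("aabb"))    # True: satisfies both rules
print(is_ood_prompt("abab"))  # True: balanced, but violates rule 1
```

A model trained only on strings satisfying both rules must then decide, when given a prompt like "abab", which rule (if any) to preserve when continuing.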
arXiv Detail & Related papers (2024-09-09T22:36:35Z)
- Do Pre-Trained Language Models Detect and Understand Semantic Underspecification? Ask the DUST!
We study whether pre-trained language models (LMs) correctly identify and interpret underspecified sentences.
Our experiments show that when interpreting underspecified sentences, LMs exhibit little uncertainty, contrary to what theoretical accounts of underspecification would predict.
arXiv Detail & Related papers (2024-02-19T19:49:29Z)
- Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs
Large language models (LLMs) have achieved impressive human-like performance across various reasoning tasks.
However, their mastery of underlying inferential rules still falls short of human capabilities.
We propose a logic scaffolding inferential rule generation framework, to construct an inferential rule base, ULogic.
arXiv Detail & Related papers (2024-02-18T03:38:51Z)
- Transparency Helps Reveal When Language Models Learn Meaning
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- Learning Symbolic Rules for Reasoning in Quasi-Natural Language
We build a rule-based system that can reason with natural language input but without the manual construction of rules.
We propose MetaQNL, a "Quasi-Natural" language that can express both formal logic and natural language sentences.
Our approach achieves state-of-the-art accuracy on multiple reasoning benchmarks.
arXiv Detail & Related papers (2021-11-23T17:49:00Z)
- Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand?
We investigate the abilities of ungrounded systems to acquire meaning.
We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence.
We find that assertions enable semantic emulation if all expressions in the language are referentially transparent.
However, if the language uses non-transparent patterns like variable binding, we show that emulation can become an uncomputable problem.
arXiv Detail & Related papers (2021-04-22T01:00:17Z)
- Discourse structure interacts with reference but not syntax in neural language models
We study the ability of language models (LMs) to learn interactions between different linguistic representations.
We find that, contrary to humans, implicit causality only influences LM behavior for reference, not syntax.
Our results suggest that LM behavior can contradict not only learned representations of discourse but also syntactic agreement.
arXiv Detail & Related papers (2020-10-10T03:14:00Z)
- A Benchmark for Systematic Generalization in Grounded Language Understanding
Humans easily interpret expressions that describe unfamiliar situations composed from familiar parts.
Modern neural networks, by contrast, struggle to interpret novel compositions.
We introduce a new benchmark, gSCAN, for evaluating compositional generalization in situated language understanding.
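The kind of compositional split used by benchmarks like gSCAN can be sketched in a few lines. The vocabulary and held-out pair below are our own illustration, not gSCAN's actual data: the point is only that every word in the test phrase appears in training, but never in that combination.

```python
# Minimal sketch of a compositional generalization split: the test set pairs
# familiar words in a novel combination never seen during training.
from itertools import product

adjectives = ["red", "big", "small"]
nouns = ["circle", "square"]
all_phrases = [f"{a} {n}" for a, n in product(adjectives, nouns)]

held_out = {"red square"}  # a novel composition of familiar parts
train = [p for p in all_phrases if p not in held_out]
test = sorted(held_out)

print(train)  # contains "red circle" and "big square", so both words are familiar
print(test)   # ['red square']
```

A model that has induced a compositional mapping should interpret "red square" correctly; one that has memorized surface co-occurrences need not.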
arXiv Detail & Related papers (2020-03-11T08:40:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.