Constructions are Revealed in Word Distributions
- URL: http://arxiv.org/abs/2503.06048v1
- Date: Sat, 08 Mar 2025 04:22:28 GMT
- Title: Constructions are Revealed in Word Distributions
- Authors: Joshua Rozner, Leonie Weissweiler, Kyle Mahowald, Cory Shain
- Abstract summary: Construction grammar posits that constructions are acquired through experience with language. How much information about constructions does the distribution over strings actually contain? We use a RoBERTa model as a proxy for this distribution and hypothesize that constructions will be revealed within it as patterns of statistical affinity.
- Score: 18.215932573792255
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Construction grammar posits that constructions (form-meaning pairings) are acquired through experience with language (the distributional learning hypothesis). But how much information about constructions does this distribution actually contain? Corpus-based analyses provide some answers, but text alone cannot answer counterfactual questions about what caused a particular word to occur. For that, we need computable models of the distribution over strings -- namely, pretrained language models (PLMs). Here we treat a RoBERTa model as a proxy for this distribution and hypothesize that constructions will be revealed within it as patterns of statistical affinity. We support this hypothesis experimentally: many constructions are robustly distinguished, including (i) hard cases where semantically distinct constructions are superficially similar, as well as (ii) schematic constructions, whose "slots" can be filled by abstract word classes. Despite this success, we also provide qualitative evidence that statistical affinity alone may be insufficient to identify all constructions from text. Thus, statistical affinity is likely an important, but partial, signal available to learners.
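As a rough illustration of the probing setup (a sketch in the paper's spirit, not its exact protocol), a masked LM can be queried for the affinity between a construction's frame and its slot fillers; the model choice and example sentence here are assumptions:

```python
# Minimal sketch: query roberta-base for fillers of a construction slot.
# If the comparative-correlative construction ("the X-er, the Y-er") is
# encoded in the distribution, comparative forms should dominate.
# Requires: pip install transformers torch
from transformers import pipeline

fill = pipeline("fill-mask", model="roberta-base")

for candidate in fill("The bigger the house, the <mask> the mortgage.", top_k=5):
    print(f"{candidate['token_str']!r}: {candidate['score']:.3f}")
```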
Related papers
- QUITE: Quantifying Uncertainty in Natural Language Text in Bayesian Reasoning Scenarios [15.193544498311603]
We present QUITE, a dataset of real-world Bayesian reasoning scenarios with categorical random variables and complex relationships.
We conduct an extensive set of experiments, finding that logic-based models outperform out-of-the-box large language models on all reasoning types.
Our results provide evidence that neuro-symbolic models are a promising direction for improving complex reasoning.
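For readers unfamiliar with the task family, a toy instance (invented here, not drawn from QUITE) of Bayesian reasoning over categorical variables:

```python
# Toy Bayesian reasoning problem: premises given as probabilities, query
# answered with Bayes' rule over a binary hypothesis.
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """P(H | E) by Bayes' rule."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    return p_e_given_h * prior / p_e

# Premises: 1% base rate, 90% sensitivity, 5% false-positive rate.
# Query: P(disease | positive test) -- about 0.154, not 0.9.
print(posterior(prior=0.01, p_e_given_h=0.90, p_e_given_not_h=0.05))
```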
arXiv Detail & Related papers (2024-10-14T12:44:59Z)
- "You Are An Expert Linguistic Annotator": Limits of LLMs as Analyzers of Abstract Meaning Representation [60.863629647985526]
We examine the successes and limitations of the GPT-3, ChatGPT, and GPT-4 models in analysis of sentence meaning structure.
We find that models can reliably reproduce the basic format of AMR, and can often capture core event, argument, and modifier structure.
Overall, our findings indicate that these models out-of-the-box can capture aspects of semantic structure, but there remain key limitations in their ability to support fully accurate semantic analyses or parses.
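For context, AMR encodes predicate-argument structure as a rooted graph in PENMAN notation; a standard example from the AMR literature (independent of the models above):

```python
# "The boy wants to go": want-01 is the predicate, :ARG0 its wanter, :ARG1
# the wanted event; variable b is reused (a reentrancy) because the goer
# is the same boy.
amr = "(w / want-01 :ARG0 (b / boy) :ARG1 (g / go-02 :ARG0 b))"
print(amr)
```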
arXiv Detail & Related papers (2023-10-26T21:47:59Z)
- Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval [139.21955930418815]
Cross-modal Retrieval methods build similarity relations between vision and language modalities by jointly learning a common representation space.
However, the predictions are often unreliable due to aleatoric uncertainty, which is induced by low-quality data, e.g., corrupt images, fast-paced videos, and non-detailed texts.
We propose a novel Prototype-based Aleatoric Uncertainty Quantification (PAU) framework to provide trustworthy predictions by quantifying the uncertainty arising from inherent data ambiguity.
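A minimal sketch of the prototype idea (assumed shapes and scoring, not the PAU paper's formulation): score an embedding against learned prototypes and read uncertainty off the entropy of the match.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def prototype_uncertainty(embedding, prototypes):
    """Entropy of softmax similarities: high when no prototype clearly matches."""
    sims = [sum(a * b for a, b in zip(embedding, p)) for p in prototypes]
    return -sum(p * math.log(p) for p in softmax(sims) if p > 0)

prototypes = [[1.0, 0.0], [0.0, 1.0]]
print(prototype_uncertainty([5.0, 0.1], prototypes))  # low: confident match
print(prototype_uncertainty([0.5, 0.5], prototypes))  # ~0.69: ambiguous sample
```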
arXiv Detail & Related papers (2023-09-29T09:41:19Z)
- False perspectives on human language: why statistics needs linguistics [0.8699677835130408]
We show that statistical measures can be defined on the basis of either structural or non-structural models.
Only models of surprisal that reflect syntactic structure are able to account for language regularities.
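Surprisal itself is a simple quantity; the dispute is over how p(word | context) should be estimated (structural vs. non-structural models). For reference:

```python
import math

def surprisal(p_word_given_context: float) -> float:
    """Surprisal in bits: -log2 p(word | context)."""
    return -math.log2(p_word_given_context)

print(surprisal(0.5))    # 1 bit: a highly predictable word
print(surprisal(0.001))  # ~9.97 bits: a surprising word
```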
arXiv Detail & Related papers (2023-02-17T11:40:32Z)
- A Measure-Theoretic Characterization of Tight Language Models [105.16477132329416]
In some pathological cases, probability mass can "leak" onto the set of infinite sequences.
This paper offers a measure-theoretic treatment of language modeling.
We prove that many popular language model families are in fact tight, meaning that they will not leak in this sense.
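A toy illustration (constructed for this summary, not taken from the paper) of how mass can leak: if the end-of-string probability shrinks too fast, the total probability of all finite strings sums to less than 1.

```python
def mass_on_finite_strings(p_eos, steps=10_000):
    """Sum of P(string ends at step t) for t = 0..steps-1."""
    total, p_alive = 0.0, 1.0
    for t in range(steps):
        total += p_alive * p_eos(t)
        p_alive *= 1.0 - p_eos(t)
    return total

print(mass_on_finite_strings(lambda t: 0.5))             # ~1.0: tight
print(mass_on_finite_strings(lambda t: 0.5 ** (t + 2)))  # ~0.42: leaks
```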
arXiv Detail & Related papers (2022-12-20T18:17:11Z)
- Model Criticism for Long-Form Text Generation [113.13900836015122]
We apply a statistical tool, model criticism in latent space, to evaluate the high-level structure of generated text.
We perform experiments on three representative aspects of high-level discourse -- coherence, coreference, and topicality.
We find that transformer-based language models are able to capture topical structures but have a harder time maintaining structural coherence or modeling coreference.
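A minimal sketch of criticism in latent space, with an assumed one-dimensional latent standing in for a document embedding: fit a reference distribution to real texts' latents and flag generated texts whose latents are improbable under it.

```python
import statistics

def z_scores(real_latents, generated_latents):
    """Std devs separating each generated latent from the real-text mean."""
    mu = statistics.mean(real_latents)
    sd = statistics.stdev(real_latents)
    return [(g - mu) / sd for g in generated_latents]

real = [0.8, 0.9, 0.85, 0.95, 0.9]   # e.g., per-document coherence latents
generated = [0.88, 0.4]              # the second text looks structurally off
print([round(z, 2) for z in z_scores(real, generated)])  # [0.0, -8.42]
```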
arXiv Detail & Related papers (2022-10-16T04:35:58Z)
- When Hearst Is not Enough: Improving Hypernymy Detection from Corpus with Distributional Models [59.46552488974247]
This paper addresses whether an is-a relationship exists between words (x, y) with the help of large textual corpora.
Recent studies suggest that pattern-based methods are superior when large-scale Hearst pairs are extracted and used, provided the sparsity of unseen (x, y) pairs can be relieved.
This paper is the first to quantify how often such unseen cases occur, showing that they are non-negligible. We also demonstrate that distributional methods are well suited to compensate for pattern-based ones in exactly these cases, as sketched below.
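A minimal sketch of the Hearst-pattern side of the comparison (pattern and sentence are illustrative only): lexical templates surface candidate pairs, but unseen (x, y) pairs get no signal, which is where distributional similarity can step in.

```python
import re

# One classic Hearst pattern: "X such as Y" suggests (Y, X) is a
# (hyponym, hypernym) pair.
HEARST = re.compile(r"(\w+) such as (\w+)")

text = "They keep animals such as goats, and play instruments such as guitars."
for hypernym, hyponym in HEARST.findall(text):
    print(f"({hyponym}, {hypernym})")  # (goats, animals), (guitars, instruments)
```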
arXiv Detail & Related papers (2020-10-10T08:34:19Z)
- Language Modeling with Reduced Densities [0.0]
We show that sequences of symbols from a finite alphabet, such as those found in a corpus of text, form a category enriched over probabilities.
We then address a further fundamental question: how can this information be stored and modeled in a way that preserves the categorical structure?
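In concrete terms (a toy reading, far simpler than the paper's category-theoretic machinery), the structure being preserved is the family of conditional distributions over continuations that a corpus assigns to each prefix:

```python
from collections import Counter

corpus, prefix, k = "ababab", "a", 1  # look one symbol past each prefix match

continuations = Counter(
    corpus[i + len(prefix): i + len(prefix) + k]
    for i in range(len(corpus) - len(prefix) - k + 1)
    if corpus[i:].startswith(prefix)
)
total = sum(continuations.values())
print({c: n / total for c, n in continuations.items()})  # {'b': 1.0}
```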
arXiv Detail & Related papers (2020-07-08T00:41:53Z)
- Learning Probabilistic Sentence Representations from Paraphrases [47.528336088976744]
We define probabilistic models that produce distributions for sentences.
We train our models on paraphrases and demonstrate that they naturally capture sentence specificity.
Our model captures sentential entailment and provides ways to analyze the specificity and preciseness of individual words.
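A minimal sketch with assumed one-dimensional Gaussians (the paper's model is richer): specificity shows up as low variance, and the asymmetry of KL divergence hints at the direction of entailment.

```python
import math

def kl_gaussian(mu1, var1, mu2, var2):
    """KL(N(mu1, var1) || N(mu2, var2))."""
    return 0.5 * (math.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1)

specific = (0.0, 0.1)  # "A man plays a red Fender guitar." (low variance)
general = (0.0, 1.0)   # "A person plays an instrument."    (high variance)

print(kl_gaussian(*specific, *general))  # ~0.70: specific -> general is cheap
print(kl_gaussian(*general, *specific))  # ~3.35: general -> specific is costly
```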
arXiv Detail & Related papers (2020-05-16T21:10:28Z)
- INFOTABS: Inference on Tables as Semi-structured Data [39.84930221015755]
We introduce a new dataset called INFOTABS, comprising human-written textual hypotheses based on premises that are tables extracted from Wikipedia info-boxes.
Our analysis shows that the semi-structured, multi-domain and heterogeneous nature of the premises admits complex, multi-faceted reasoning.
Experiments reveal that, while human annotators agree on the relationships between a table-hypothesis pair, several standard modeling strategies are unsuccessful at the task.
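An invented premise-hypothesis pair in the INFOTABS style (not an actual dataset item), showing the reasoning a table premise demands:

```python
premise = {"Born": "1879", "Died": "1955", "Fields": "Physics, Philosophy"}
hypothesis = "The person lived for more than 70 years."

# Verifying the hypothesis requires arithmetic over two table fields.
years_lived = int(premise["Died"]) - int(premise["Born"])  # 76
print("entailment" if years_lived > 70 else "not entailment")
```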
arXiv Detail & Related papers (2020-05-13T02:07:54Z)
- A Complete Characterization of Projectivity for Statistical Relational Models [20.833623839057097]
We introduce a class of directed latent graphical variable models that precisely correspond to the class of projective relational models.
We also obtain a characterization for when a given distribution over size-$k$ structures is the statistical frequency distribution of size-$k$ sub-structures in much larger size-$n$ structures.
arXiv Detail & Related papers (2020-04-23T05:58:27Z)
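A minimal sketch (a constructed example, not the paper's formalism) of the frequency question for k = 2: what distribution over size-2 substructures does a larger structure induce?

```python
from itertools import combinations

def substructure_frequencies(edges, n, k=2):
    """Frequency of each size-k induced substructure (for k=2: edge vs. non-edge)."""
    edge_set = set(edges)
    counts = {"edge": 0, "non-edge": 0}
    for pair in combinations(range(n), k):
        counts["edge" if frozenset(pair) in edge_set else "non-edge"] += 1
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}

# A 4-node path graph induces (0.5 edge, 0.5 non-edge) over node pairs.
path = [frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3})]
print(substructure_frequencies(path, n=4))
```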
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the information and is not responsible for any consequences of its use.