Entailment Semantics Can Be Extracted from an Ideal Language Model
- URL: http://arxiv.org/abs/2209.12407v3
- Date: Mon, 8 Jan 2024 22:01:26 GMT
- Title: Entailment Semantics Can Be Extracted from an Ideal Language Model
- Authors: William Merrill and Alex Warstadt and Tal Linzen
- Abstract summary: We prove that entailment judgments between sentences can be extracted from an ideal language model.
We also show entailment judgments can be decoded from the predictions of a language model trained on such Gricean data.
- Score: 32.5500309433108
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language models are often trained on text alone, without additional
grounding. There is debate as to how much of natural language semantics can be
inferred from such a procedure. We prove that entailment judgments between
sentences can be extracted from an ideal language model that has perfectly
learned its target distribution, assuming the training sentences are generated
by Gricean agents, i.e., agents who follow fundamental principles of
communication from the linguistic theory of pragmatics. We also show entailment
judgments can be decoded from the predictions of a language model trained on
such Gricean data. Our results reveal a pathway for understanding the semantic
information encoded in unlabeled linguistic data and a potential framework for
extracting semantics from language models.
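To make the claim concrete, here is a minimal sketch of what decoding entailment judgments from a language model's probabilities could look like. The `lm_logprob` interface, the concatenation of the two sentences, and the threshold rule are illustrative assumptions; the paper derives its actual decision procedure from the ideal model's distribution under the Gricean assumptions.

```python
def lm_logprob(text: str) -> float:
    """Hypothetical interface: log-probability an autoregressive language
    model assigns to `text` as a complete utterance (plug in a real model)."""
    raise NotImplementedError

def entailment_score(premise: str, hypothesis: str) -> float:
    # Pointwise-mutual-information style signal: how much more expected the
    # hypothesis becomes once the premise has been uttered. Under Gricean
    # production, a premise should not make its entailments less expected.
    return (lm_logprob(premise + " " + hypothesis)
            - lm_logprob(premise)
            - lm_logprob(hypothesis))

def entails(premise: str, hypothesis: str, threshold: float = 0.0) -> bool:
    # Illustrative threshold rule, not the paper's proved test.
    return entailment_score(premise, hypothesis) >= threshold
```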
Related papers
- Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning [84.94709351266557]
We focus on the trustworthiness of language models with respect to retrieval augmentation.
We argue that retrieval-augmented language models have the inherent ability to supply responses according to both contextual and parametric knowledge.
Inspired by aligning language models with human preferences, we take the first step towards aligning retrieval-augmented language models to a state where they respond relying solely on external evidence.
arXiv Detail & Related papers (2024-10-22T09:25:21Z)
- Collapsed Language Models Promote Fairness [88.48232731113306]
We find that debiased language models exhibit collapsed alignment between token representations and word embeddings.
We design a principled fine-tuning method that effectively improves fairness across a wide range of debiasing methods.
arXiv Detail & Related papers (2024-10-06T13:09:48Z)
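One concrete way to inspect the "collapsed alignment between token representations and word embeddings" that the summary mentions is to compare each token's final hidden state with that token's row of the output embedding matrix. The sketch below does this with Hugging Face transformers; the choice of GPT-2 and of mean cosine similarity as the alignment statistic are illustrative assumptions, not the paper's exact metric.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # illustrative model choice
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def token_embedding_alignment(text: str) -> float:
    """Mean cosine similarity between each token's final hidden state and
    that token's row in the (tied) output embedding matrix."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        hidden = model(ids, output_hidden_states=True).hidden_states[-1][0]
    emb = model.get_output_embeddings().weight[ids[0]]  # rows for these tokens
    cos = torch.nn.functional.cosine_similarity(hidden, emb, dim=-1)
    return cos.mean().item()

print(token_embedding_alignment("Language models encode social biases."))
```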
- Language Models as Models of Language [0.0]
This chapter critically examines the potential contributions of modern language models to theoretical linguistics.
I review a growing body of empirical evidence suggesting that language models can learn hierarchical syntactic structure and exhibit sensitivity to various linguistic phenomena.
I conclude that closer collaboration between theoretical linguists and computational researchers could yield valuable insights.
arXiv Detail & Related papers (2024-08-13T18:26:04Z)
- Learning Phonotactics from Linguistic Informants [54.086544221761486]
Our model iteratively selects or synthesizes a data point according to one of a range of information-theoretic policies.
We find that the information-theoretic policies our model uses to select items to query the informant achieve sample efficiency comparable to, or greater than, fully supervised approaches.
arXiv Detail & Related papers (2024-05-08T00:18:56Z)
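As an illustration of an information-theoretic selection policy (an assumption here, not necessarily the exact policy from the paper), the sketch below queries the candidate item whose judgment the current posterior over grammar hypotheses is most uncertain about, i.e., the greedy expected-information-gain choice.

```python
import math

def entropy(p: float) -> float:
    """Entropy of a Bernoulli judgment (informant accepts the item or not)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def expected_information_gain(item, hypotheses, posterior):
    """Uncertainty of the informant's answer on `item`: `hypotheses` are
    predicates mapping an item to a grammaticality judgment, `posterior`
    their normalized probabilities."""
    p_accept = sum(w for h, w in zip(hypotheses, posterior) if h(item))
    return entropy(p_accept)

def select_query(candidates, hypotheses, posterior):
    # Greedy policy: query the item whose judgment is most informative.
    return max(candidates,
               key=lambda c: expected_information_gain(c, hypotheses, posterior))
```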
- Towards Understanding What Code Language Models Learned [10.989953856458996]
Pre-trained language models are effective in a variety of natural language tasks.
It has been argued that their capabilities fall short of fully learning meaning or understanding language.
We investigate their ability to capture semantics of code beyond superficial frequency and co-occurrence.
arXiv Detail & Related papers (2023-06-20T23:42:14Z)
- LaMPP: Language Models as Probabilistic Priors for Perception and Action [38.07277869107474]
We show how to leverage language models for non-linguistic perception and control tasks.
Our approach casts labeling and decision-making as inference in probabilistic graphical models.
arXiv Detail & Related papers (2023-02-03T15:14:04Z)
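A minimal sketch of what "labeling as inference in a probabilistic graphical model" can mean, assuming a simple factorization in which a language model supplies the prior over label configurations and a task model supplies the observation likelihood (the names and the enumeration strategy here are illustrative, not the paper's exact model):

```python
def posterior_label_score(labels, observation, lm_prior_logprob, obs_loglik):
    """Unnormalized log-posterior: log p(labels | observation) =
    log p_LM(labels) + log p(observation | labels) + const."""
    return lm_prior_logprob(labels) + obs_loglik(observation, labels)

def map_labeling(candidates, observation, lm_prior_logprob, obs_loglik):
    # MAP inference by enumeration over a small candidate set; real graphical
    # models would use dynamic programming or approximate inference instead.
    return max(candidates,
               key=lambda labels: posterior_label_score(
                   labels, observation, lm_prior_logprob, obs_loglik))
```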
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- Probing Linguistic Information For Logical Inference In Pre-trained Language Models [2.4366811507669124]
We propose a methodology for probing linguistic information for logical inference in pre-trained language model representations.
We find that pre-trained language models encode several types of linguistic information relevant to inference, though some types of information are only weakly encoded.
We demonstrate language models' potential as semantic and background knowledge bases for supporting symbolic inference methods.
arXiv Detail & Related papers (2021-12-03T07:19:42Z)
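A standard way to realize such probing (shown as an assumption, not necessarily this paper's exact protocol) is to fit a lightweight linear classifier on frozen pre-trained representations and read off how decodable the target linguistic feature is from held-out accuracy:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def probe_accuracy(reprs: np.ndarray, labels: np.ndarray) -> float:
    """Fit a linear probe on frozen model representations and return held-out
    accuracy; high accuracy suggests the probed feature is linearly encoded."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        reprs, labels, test_size=0.2, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return probe.score(X_te, y_te)
```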
- Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand? [87.20342701232869]
We investigate the abilities of ungrounded systems to acquire meaning.
We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence.
We find that assertions enable semantic emulation if all expressions in the language are referentially transparent.
However, if the language uses non-transparent patterns like variable binding, we show that emulation can become an uncomputable problem.
arXiv Detail & Related papers (2021-04-22T01:00:17Z)
- Constrained Language Models Yield Few-Shot Semantic Parsers [73.50960967598654]
We explore the use of large pretrained language models as few-shot semantic parsers.
The goal in semantic parsing is to generate a structured meaning representation given a natural language input.
We use language models to paraphrase inputs into a controlled sublanguage resembling English that can be automatically mapped to a target meaning representation.
arXiv Detail & Related papers (2021-04-18T08:13:06Z)
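A toy sketch of the two-stage recipe this last summary describes: accept only model outputs that fall within a controlled sublanguage, then map the canonical paraphrase to a structured meaning representation with deterministic rules. The grammar, candidate outputs, and target representation below are hypothetical.

```python
import re

# Toy controlled sublanguage: "list events on <day>" | "create event on <day>"
CANONICAL = re.compile(r"^(list|create) events? on (monday|tuesday|wednesday)$")

def constrained_paraphrase(lm_candidates):
    """Keep only model outputs that parse under the controlled sublanguage;
    real systems enforce the grammar during decoding rather than filtering."""
    for cand in lm_candidates:
        normalized = cand.strip().lower()
        if CANONICAL.match(normalized):
            return normalized
    return None

def to_meaning_representation(canonical: str):
    # Deterministic mapping from the sublanguage to a structured form.
    m = CANONICAL.match(canonical)
    action, day = m.group(1), m.group(2)
    return {"op": {"list": "ListEvents", "create": "CreateEvent"}[action],
            "day": day.capitalize()}

# Usage: paraphrases proposed by a hypothetical few-shot prompted LM
print(to_meaning_representation(
    constrained_paraphrase(["List events on Monday", "whatever"])))
```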
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.