Entailment Semantics Can Be Extracted from an Ideal Language Model
- URL: http://arxiv.org/abs/2209.12407v3
- Date: Mon, 8 Jan 2024 22:01:26 GMT
- Title: Entailment Semantics Can Be Extracted from an Ideal Language Model
- Authors: William Merrill and Alex Warstadt and Tal Linzen
- Abstract summary: We prove that entailment judgments between sentences can be extracted from an ideal language model.
We also show entailment judgments can be decoded from the predictions of a language model trained on such Gricean data.
- Score: 32.5500309433108
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language models are often trained on text alone, without additional
grounding. There is debate as to how much of natural language semantics can be
inferred from such a procedure. We prove that entailment judgments between
sentences can be extracted from an ideal language model that has perfectly
learned its target distribution, assuming the training sentences are generated
by Gricean agents, i.e., agents who follow fundamental principles of
communication from the linguistic theory of pragmatics. We also show entailment
judgments can be decoded from the predictions of a language model trained on
such Gricean data. Our results reveal a pathway for understanding the semantic
information encoded in unlabeled linguistic data and a potential framework for
extracting semantics from language models.
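The abstract's core claim can be illustrated with a toy construction (this is a hypothetical sketch, not the paper's formal proof or exact decoding procedure): if a Gricean speaker never asserts a sentence already entailed by what they have said (it would be uninformative), then an ideal language model fit to that speaker assigns zero probability to entailed continuations, so entailment can be read off the model's conditional probabilities.

```python
from itertools import product

# Worlds are truth assignments to two atoms p, q.
WORLDS = list(product([False, True], repeat=2))

# Each sentence denotes the set of worlds where it is true.
SENTENCES = {
    "p":       {w for w in WORLDS if w[0]},
    "q":       {w for w in WORLDS if w[1]},
    "p and q": {w for w in WORLDS if w[0] and w[1]},
    "p or q":  {w for w in WORLDS if w[0] or w[1]},
}

def entails(x, y):
    """Semantic entailment: x's worlds are a subset of y's."""
    return SENTENCES[x] <= SENTENCES[y]

def gricean_continuations(x):
    """A cooperative speaker who just asserted x only follows up
    with sentences that are informative (not entailed by x)."""
    return [y for y in SENTENCES if not entails(x, y)]

def p_next(y, x):
    """Ideal LM probability of y given x, fit to a speaker who
    picks uniformly among informative continuations."""
    conts = gricean_continuations(x)
    if not conts:
        return 0.0  # nothing informative left to say
    return (1 / len(conts)) if y in conts else 0.0

def decoded_entailment(x, y):
    """Decode entailment from the LM: x entails y iff y never follows x."""
    return p_next(y, x) == 0.0

# In this toy setting the decoded judgments match true entailment exactly.
for x in SENTENCES:
    for y in SENTENCES:
        assert decoded_entailment(x, y) == entails(x, y)
print("decoded entailment matches semantic entailment")
```

The design choice to make Gricean informativity an absolute constraint (probability exactly zero for redundant continuations) is what makes the decoding trivial here; the paper's setting is more general, but this captures the direction of the argument.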
Related papers
- Learning Phonotactics from Linguistic Informants [54.086544221761486]
Our model iteratively selects or synthesizes a data-point according to one of a range of information-theoretic policies.
We find that the information-theoretic policies that our model uses to select items to query the informant achieve sample efficiency comparable to, or greater than, fully supervised approaches.
arXiv Detail & Related papers (2024-05-08T00:18:56Z)
- Towards Understanding What Code Language Models Learned [10.989953856458996]
Pre-trained language models are effective in a variety of natural language tasks.
It has been argued that their capabilities fall short of fully learning meaning or understanding language.
We investigate their ability to capture semantics of code beyond superficial frequency and co-occurrence.
arXiv Detail & Related papers (2023-06-20T23:42:14Z)
- Evidence of Meaning in Language Models Trained on Programs [5.892876463573452]
We present evidence that language models can learn meaning despite being trained only to perform next token prediction on text.
We first train a Transformer model on the corpus of programs, then probe the trained model's hidden states as it completes a program given a specification.
There is a strong, statistically significant correlation between the accuracy of the probe and the model's ability to generate a program that implements the specification.
arXiv Detail & Related papers (2023-05-18T17:58:08Z)
- LaMPP: Language Models as Probabilistic Priors for Perception and Action [38.07277869107474]
We show how to leverage language models for non-linguistic perception and control tasks.
Our approach casts labeling and decision-making as inference in probabilistic graphical models.
arXiv Detail & Related papers (2023-02-03T15:14:04Z)
- Language Models as Inductive Reasoners [125.99461874008703]
We propose a new paradigm (task) for inductive reasoning, which is to induce natural language rules from natural language facts.
We create a dataset termed DEER containing 1.2k rule-fact pairs for the task, where rules and facts are written in natural language.
We provide the first and comprehensive analysis of how well pretrained language models can induce natural language rules from natural language facts.
arXiv Detail & Related papers (2022-12-21T11:12:14Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- Probing Linguistic Information For Logical Inference In Pre-trained Language Models [2.4366811507669124]
We propose a methodology for probing linguistic information for logical inference in pre-trained language model representations.
We find that pre-trained language models do encode several types of linguistic information for inference, but some types of information are only weakly encoded.
We demonstrate language models' potential as semantic and background knowledge bases for supporting symbolic inference methods.
arXiv Detail & Related papers (2021-12-03T07:19:42Z)
- Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand? [87.20342701232869]
We investigate the abilities of ungrounded systems to acquire meaning.
We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence.
We find that assertions enable semantic emulation if all expressions in the language are referentially transparent.
However, if the language uses non-transparent patterns like variable binding, we show that emulation can become an uncomputable problem.
arXiv Detail & Related papers (2021-04-22T01:00:17Z)
- Constrained Language Models Yield Few-Shot Semantic Parsers [73.50960967598654]
We explore the use of large pretrained language models as few-shot semantic parsers.
The goal in semantic parsing is to generate a structured meaning representation given a natural language input.
We use language models to paraphrase inputs into a controlled sublanguage resembling English that can be automatically mapped to a target meaning representation.
arXiv Detail & Related papers (2021-04-18T08:13:06Z)
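The two-stage pipeline described in the last entry above (LM paraphrases input into a controlled sublanguage, then a deterministic mapper produces the meaning representation) can be sketched as follows. This is a minimal illustration, not the paper's system: the sublanguage pattern, the meaning-representation format, and the stubbed paraphrase step are all invented for the example.

```python
import re

# Hypothetical controlled sublanguage:
#   "show <target> of <entity> where <field> is <value>"
# mapped deterministically to a toy structured meaning representation.
PATTERN = re.compile(
    r"show (?P<target>\w+) of (?P<entity>\w+) where (?P<field>\w+) is (?P<value>\w+)"
)

def sublanguage_to_mr(canonical):
    """Map a canonical (controlled-English) utterance to a structured MR."""
    m = PATTERN.fullmatch(canonical.strip().lower())
    if m is None:
        raise ValueError(f"not in the controlled sublanguage: {canonical!r}")
    return {
        "select": m["target"],
        "from": m["entity"],
        "where": {m["field"]: m["value"]},
    }

# In the real pipeline a few-shot-prompted LM would paraphrase a free-form
# question into the sublanguage; here that step is stubbed out.
def mock_lm_paraphrase(question):
    return "show price of laptops where brand is apple"

mr = sublanguage_to_mr(mock_lm_paraphrase("How much do Apple laptops cost?"))
print(mr)  # {'select': 'price', 'from': 'laptops', 'where': {'brand': 'apple'}}
```

The point of the design is that only the paraphrase step requires the language model; everything after the controlled sublanguage is an ordinary, fully verifiable program.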
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.