Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand?
- URL: http://arxiv.org/abs/2104.10809v1
- Date: Thu, 22 Apr 2021 01:00:17 GMT
- Title: Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand?
- Authors: William Merrill, Yoav Goldberg, Roy Schwartz, Noah A. Smith
- Abstract summary: We investigate the abilities of ungrounded systems to acquire meaning.
We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence.
We find that assertions enable semantic emulation if all expressions in the language are referentially transparent.
However, if the language uses non-transparent patterns like variable binding, we show that emulation can become an uncomputable problem.
- Score: 87.20342701232869
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language models trained on billions of tokens have recently led to
unprecedented results on many NLP tasks. This success raises the question of
whether, in principle, a system can ever "understand" raw text without access
to some form of grounding. We formally investigate the abilities of ungrounded
systems to acquire meaning. Our analysis focuses on the role of "assertions":
contexts within raw text that provide indirect clues about underlying
semantics. We study whether assertions enable a system to emulate
representations preserving semantic relations like equivalence. We find that
assertions enable semantic emulation if all expressions in the language are
referentially transparent. However, if the language uses non-transparent
patterns like variable binding, we show that emulation can become an
uncomputable problem. Finally, we discuss differences between our formal model
and natural language, exploring how our results generalize to a modal setting
and other semantic relations. Together, our results suggest that assertions in
code or language do not provide sufficient signal to fully emulate semantic
representations. We formalize ways in which ungrounded language models appear
to be fundamentally limited in their ability to "understand".
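To make the setting concrete, here is a minimal sketch (an illustration under our own assumptions, not the paper's formal construction): assertions in raw text are modeled as an oracle that reports whether two expressions denote the same value, over a toy referentially transparent arithmetic language. The helper names `denote` and `assertion_oracle` are hypothetical.

```python
# Toy illustration: a tiny, referentially transparent arithmetic language,
# an "assertion oracle" standing in for the assertions observed in raw text,
# and a check that equivalence propagates through contexts by substitution.
import ast
import operator

OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def denote(expr: str) -> int:
    """Ground-truth denotation of an arithmetic expression (hidden from the learner)."""
    def ev(node):
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported syntax")
    return ev(ast.parse(expr, mode="eval").body)

def assertion_oracle(e1: str, e2: str) -> bool:
    """Stand-in for an assertion context in text: does 'e1 == e2' hold?"""
    return denote(e1) == denote(e2)

# Referential transparency: substituting observed-equal expressions inside any
# context leaves the denotation unchanged, so assertion queries are enough to
# emulate the equivalence relation on larger expressions.
a, b = "2+3", "5"
context = "(({}) * 7) - 1"
assert assertion_oracle(a, b)
assert assertion_oracle(context.format(a), context.format(b))
```

With non-transparent constructs such as variable binding, this substitution step is no longer meaning-preserving, which is the regime in which the paper shows emulation can become uncomputable.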
Related papers
- Pixel Sentence Representation Learning [67.4775296225521]
In this work, we conceptualize the learning of sentence-level textual semantics as a visual representation learning process.
We employ visually grounded text perturbations, such as typos and word-order shuffling, that resonate with human cognitive patterns and allow perturbations to be perceived as continuous.
Our approach is further bolstered by large-scale unsupervised topical alignment training and natural language inference supervision.
arXiv Detail & Related papers (2024-02-13T02:46:45Z)
- We're Afraid Language Models Aren't Modeling Ambiguity [136.8068419824318]
Managing ambiguity is a key part of human language understanding.
We characterize ambiguity in a sentence by its effect on entailment relations with another sentence.
We show that a multilabel NLI model can flag political claims in the wild that are misleading due to ambiguity.
arXiv Detail & Related papers (2023-04-27T17:57:58Z)
- Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not represent natural language semantics well.
arXiv Detail & Related papers (2022-10-14T02:35:19Z)
- Entailment Semantics Can Be Extracted from an Ideal Language Model [32.5500309433108]
We prove that entailment judgments between sentences can be extracted from an ideal language model.
We also show that entailment judgments can be decoded from the predictions of a language model trained on such Gricean data, i.e., text produced by speakers who communicate cooperatively in the Gricean sense.
arXiv Detail & Related papers (2022-09-26T04:16:02Z)
- Norm Participation Grounds Language [16.726800816202033]
I propose a different and more wide-ranging conception of how grounding should be understood: what grounds language is its normative nature.
There are standards for doing things right; these standards are public and authoritative, while at the same time acceptance of their authority can be disputed and negotiated.
What grounds language, then, is the determined use that language users make of it, and what it is grounded in is the community of language users.
arXiv Detail & Related papers (2022-06-06T20:21:59Z)
- Learning Symbolic Rules for Reasoning in Quasi-Natural Language [74.96601852906328]
We build a rule-based system that can reason with natural language input but without the manual construction of rules.
We propose MetaQNL, a "Quasi-Natural" language that can express both formal logic and natural language sentences.
Our approach achieves state-of-the-art accuracy on multiple reasoning benchmarks.
arXiv Detail & Related papers (2021-11-23T17:49:00Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning; a rough sketch of this idea appears below.
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
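As a rough illustration of the graph-encoder idea in the entry above (a sketch under our own assumptions, not the paper's actual model), the snippet below mixes token features along the arcs of a semantic dependency parse using a single graph-convolution layer; the feature sizes, normalization scheme, and edge list are hypothetical.

```python
# Sketch: one GCN-style layer, H' = ReLU(A_hat @ H @ W), where A_hat is the
# degree-normalized adjacency matrix of an (undirected) semantic parse graph.
import numpy as np

def graph_conv(token_feats: np.ndarray, edges: list, weight: np.ndarray) -> np.ndarray:
    """Mix token features along semantic dependency arcs."""
    n = token_feats.shape[0]
    adj = np.eye(n)                      # self-loops keep each token's own features
    for head, dep in edges:              # semantic dependency arcs (head, dependent)
        adj[head, dep] = adj[dep, head] = 1.0
    a_hat = adj / adj.sum(axis=1, keepdims=True)   # simple row normalization
    return np.maximum(a_hat @ token_feats @ weight, 0.0)

# Example: 4 tokens with 8-dim features and two semantic arcs, 0->2 and 2->3.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8))
w = rng.normal(size=(8, 8))
enriched = graph_conv(feats, edges=[(0, 2), (2, 3)], weight=w)
print(enriched.shape)  # (4, 8): token features enriched with parse structure
```

In finetuning, such a layer would sit on top of a pretrained encoder's token representations so that the task head sees the semantic parse explicitly.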