The Language Model Understood the Prompt was Ambiguous: Probing Syntactic Uncertainty Through Generation
- URL: http://arxiv.org/abs/2109.07848v1
- Date: Thu, 16 Sep 2021 10:27:05 GMT
- Title: The Language Model Understood the Prompt was Ambiguous: Probing Syntactic Uncertainty Through Generation
- Authors: Laura Aina, Tal Linzen
- Abstract summary: We inspect to what extent neural language models (LMs) exhibit uncertainty over competing syntactic analyses of temporarily ambiguous inputs.
We find that LMs can track multiple analyses simultaneously.
In response to disambiguating cues, the LMs often select the correct interpretation, but occasional errors point to potential areas of improvement.
- Score: 23.711953448400514
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Temporary syntactic ambiguities arise when the beginning of a sentence is
compatible with multiple syntactic analyses. We inspect to what extent neural
language models (LMs) exhibit uncertainty over such analyses when processing
temporarily ambiguous inputs, and how that uncertainty is modulated by
disambiguating cues. We probe the LM's expectations by generating from it: we
use stochastic decoding to derive a set of sentence completions, and estimate
the probability that the LM assigns to each interpretation based on the
distribution of parses across completions. Unlike scoring-based methods for
targeted syntactic evaluation, this technique makes it possible to explore
completions that are not hypothesized in advance by the researcher. We apply
this method to study the behavior of two LMs (GPT2 and an LSTM) on three types
of temporary ambiguity, using materials from human sentence processing
experiments. We find that LMs can track multiple analyses simultaneously; the
degree of uncertainty varies across constructions and contexts. In response
to disambiguating cues, the LMs often select the correct interpretation, but
occasional errors point to potential areas of improvement.
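To make the probing procedure concrete, here is a minimal sketch (not the authors' released code) of generation-based probing with GPT-2 via the transformers library: it samples a set of completions of a temporarily ambiguous prefix with stochastic (nucleus) decoding and estimates the probability of each interpretation from the relative frequency of the corresponding parse among the completions. The `classify_interpretation` heuristic and the example prefix are hypothetical stand-ins for the parser-based labeling and the psycholinguistic materials used in the paper.

```python
# Minimal sketch of generation-based probing of syntactic uncertainty
# (not the authors' code): sample completions of a temporarily ambiguous
# prefix and estimate interpretation probabilities from the distribution
# of parses across those completions.
from collections import Counter

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# NP/Z-style temporary ambiguity: "the deer" may be the object of "hunted"
# or the subject of an upcoming clause.
prefix = "While the men hunted the deer"


def classify_interpretation(completion: str) -> str:
    """Hypothetical stand-in for the parser-based labeling in the paper:
    guess whether the completion treats 'the deer' as object or subject."""
    first_word = completion.strip().split()[0].lower() if completion.strip() else ""
    if first_word in {"ran", "was", "were", "escaped", "jumped"}:
        return "deer_as_subject"
    return "deer_as_object"


inputs = tokenizer(prefix, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        do_sample=True,            # stochastic decoding
        top_p=0.9,                 # nucleus sampling (one reasonable choice)
        max_new_tokens=10,
        num_return_sequences=100,  # the set of sampled completions
        pad_token_id=tokenizer.eos_token_id,
    )

prefix_len = inputs["input_ids"].shape[1]
labels = []
for seq in outputs:
    completion = tokenizer.decode(seq[prefix_len:], skip_special_tokens=True)
    labels.append(classify_interpretation(completion))

counts = Counter(labels)
total = sum(counts.values())
for interpretation, count in counts.items():
    print(f"{interpretation}: {count / total:.2f}")
```

In the paper itself, completions are labeled with a syntactic parser and the prompts come from human sentence-processing experiments covering three ambiguity types; the keyword heuristic above is only meant to illustrate how interpretation probabilities can be read off the completion distribution.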
Related papers
- Revisiting subword tokenization: A case study on affixal negation in large language models [57.75279238091522]
We measure the impact of affixal negation on modern English large language models (LLMs).
We conduct experiments using LLMs with different subword tokenization methods.
We show that models can, on the whole, reliably recognize the meaning of affixal negation.
arXiv Detail & Related papers (2024-04-03T03:14:27Z)
- Can You Learn Semantics Through Next-Word Prediction? The Case of Entailment [36.82878715850013]
Merrill et al. argue that, in theory, sentence co-occurrence probabilities predicted by an optimal LM should reflect the entailment relationship of the constituent sentences.
We investigate whether their theory can be used to decode entailment relations from neural LMs.
We find that a test similar to theirs can decode entailment relations between natural sentences, well above random chance, though not perfectly.
arXiv Detail & Related papers (2024-02-21T17:36:07Z)
- Do Pre-Trained Language Models Detect and Understand Semantic Underspecification? Ask the DUST! [4.1970767174840455]
We study whether pre-trained language models (LMs) correctly identify and interpret underspecified sentences.
Our experiments show that when interpreting underspecified sentences, LMs exhibit little uncertainty, contrary to what theoretical accounts of underspecification would predict.
arXiv Detail & Related papers (2024-02-19T19:49:29Z)
- Uncertainty Quantification for In-Context Learning of Large Language Models [52.891205009620364]
In-context learning has emerged as a groundbreaking ability of Large Language Models (LLMs).
We propose a novel formulation and corresponding estimation method to quantify both types of uncertainties.
The proposed method offers an unsupervised way to understand the prediction of in-context learning in a plug-and-play fashion.
arXiv Detail & Related papers (2024-02-15T18:46:24Z)
- Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability.
In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling.
Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2023-11-15T05:58:35Z)
- We're Afraid Language Models Aren't Modeling Ambiguity [136.8068419824318]
Managing ambiguity is a key part of human language understanding.
We characterize ambiguity in a sentence by its effect on entailment relations with another sentence.
We show that a multilabel NLI model can flag political claims in the wild that are misleading due to ambiguity.
arXiv Detail & Related papers (2023-04-27T17:57:58Z)
- Are Representations Built from the Ground Up? An Empirical Examination of Local Composition in Language Models [91.3755431537592]
Representing compositional and non-compositional phrases is critical for language understanding.
We first formulate a problem of predicting the LM-internal representations of longer phrases given those of their constituents.
While we would expect the predictive accuracy to correlate with human judgments of semantic compositionality, we find this is largely not the case.
arXiv Detail & Related papers (2022-10-07T14:21:30Z)
- A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes.
We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)
- An Investigation of Language Model Interpretability via Sentence Editing [5.492504126672887]
We re-purpose a sentence editing dataset as a testbed for the interpretability of pre-trained language models (PLMs).
This enables us to conduct a systematic investigation on an array of questions regarding PLMs' interpretability.
The investigation yields new insights; for example, contrary to common understanding, we find that attention weights correlate well with human rationales.
arXiv Detail & Related papers (2020-11-28T00:46:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
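The "Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling" entry above describes generating clarifications of an ambiguous input, querying the LLM on each, and ensembling the predictions. The sketch below illustrates that idea only in outline, under assumed interfaces: `generate_clarifications` and `predict_distribution` are hypothetical placeholders rather than that paper's API, and the entropy split shown is one simple way to separate uncertainty explained by input ambiguity from the model's remaining uncertainty.

```python
# Minimal sketch of input clarification ensembling (not the paper's code):
# generate clarified rewrites of an ambiguous input, query the model on each,
# ensemble the predictive distributions, and split total uncertainty into a
# part explained by the clarifications and a remainder.
import math
from typing import Callable, Dict, List


def entropy(dist: Dict[str, float]) -> float:
    """Shannon entropy (in nats) of a discrete distribution over labels."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)


def clarification_ensemble(
    ambiguous_input: str,
    generate_clarifications: Callable[[str], List[str]],      # hypothetical helper
    predict_distribution: Callable[[str], Dict[str, float]],  # hypothetical LLM wrapper
) -> Dict[str, float]:
    clarifications = generate_clarifications(ambiguous_input)
    per_clarification = [predict_distribution(c) for c in clarifications]

    # Ensemble: average the predictive distributions over clarifications.
    labels = {label for dist in per_clarification for label in dist}
    ensembled = {
        label: sum(dist.get(label, 0.0) for dist in per_clarification) / len(per_clarification)
        for label in labels
    }

    total = entropy(ensembled)  # total uncertainty of the ensembled prediction
    conditional = sum(entropy(d) for d in per_clarification) / len(per_clarification)
    return {
        # Disagreement across clarifications: attributed to input ambiguity.
        "uncertainty_from_input_ambiguity": total - conditional,
        # Uncertainty that persists even after the input is clarified.
        "model_uncertainty": conditional,
        **{f"p({label})": p for label, p in ensembled.items()},
    }
```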