Learning Disentangled Representations of Negation and Uncertainty
- URL: http://arxiv.org/abs/2204.00511v1
- Date: Fri, 1 Apr 2022 15:12:05 GMT
- Title: Learning Disentangled Representations of Negation and Uncertainty
- Authors: Jake Vasilakes, Chrysoula Zerva, Makoto Miwa, Sophia Ananiadou
- Abstract summary: Linguistic theory postulates that expressions of negation and uncertainty are semantically independent from each other and the content they modify.
We attempt to disentangle the representations of negation, uncertainty, and content using a Variational Autoencoder.
- Score: 25.11863604063283
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Negation and uncertainty modeling are long-standing tasks in natural language
processing. Linguistic theory postulates that expressions of negation and
uncertainty are semantically independent from each other and the content they
modify. However, previous works on representation learning do not explicitly
model this independence. We therefore attempt to disentangle the
representations of negation, uncertainty, and content using a Variational
Autoencoder. We find that simply supervising the latent representations results
in good disentanglement, but auxiliary objectives based on adversarial learning
and mutual information minimization can provide additional disentanglement
gains.
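The abstract specifies the modeling idea (separate latent codes for negation, uncertainty, and content in a Variational Autoencoder, supervised directly and optionally regularized with adversarial and mutual-information objectives) but not the concrete architecture. The following is therefore only a minimal PyTorch sketch of that general setup; all module names, dimensions, and the encoder/decoder choices are illustrative assumptions rather than the authors' implementation, and the adversarial and mutual-information auxiliary objectives would enter as additional loss terms.

```python
# Minimal sketch (assumed, not the authors' code) of a VAE whose latent code is
# split into negation, uncertainty, and content sub-spaces, with the first two
# supervised by label classifiers as described in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisentangledVAE(nn.Module):
    def __init__(self, vocab_size=10000, hidden=256, z_neg=2, z_unc=2, z_content=60):
        super().__init__()
        self.splits = (z_neg, z_unc, z_content)
        total_z = sum(self.splits)
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.to_mu = nn.Linear(hidden, total_z)
        self.to_logvar = nn.Linear(hidden, total_z)
        # Supervision heads see *only* their own latent sub-space.
        self.neg_clf = nn.Linear(z_neg, 2)
        self.unc_clf = nn.Linear(z_unc, 2)
        self.decoder = nn.GRU(hidden + total_z, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, tokens):
        emb = self.embed(tokens)                       # (batch, seq, hidden)
        _, h = self.encoder(emb)
        h = h.squeeze(0)                               # (batch, hidden)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        z_neg, z_unc, _ = torch.split(z, self.splits, dim=-1)
        # Decode conditioned on the full latent code at every step.
        z_rep = z.unsqueeze(1).expand(-1, emb.size(1), -1)
        dec_out, _ = self.decoder(torch.cat([emb, z_rep], dim=-1))
        return self.out(dec_out), mu, logvar, self.neg_clf(z_neg), self.unc_clf(z_unc)

def vae_loss(logits, tokens, mu, logvar, neg_logits, unc_logits, neg_y, unc_y, beta=1.0):
    rec = F.cross_entropy(logits.transpose(1, 2), tokens)          # reconstruction
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # KL to N(0, I)
    sup = F.cross_entropy(neg_logits, neg_y) + F.cross_entropy(unc_logits, unc_y)
    # Adversarial / mutual-information terms from the abstract would be added here.
    return rec + beta * kl + sup
```

The point of the sketch is the factorization: the negation and uncertainty classifiers see only their own slices of the latent code, so the supervision signal concentrates that information in the corresponding sub-spaces, while reconstruction and the KL term act on the full code.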
Related papers
- Explaining Sources of Uncertainty in Automated Fact-Checking [41.236833314783134]
CLUE (Conflict-and-Agreement-aware Language-model Uncertainty Explanations) is a framework to generate natural language explanations of model uncertainty.
It identifies relationships between spans of text that expose claim-evidence or inter-evidence conflicts and agreements that drive the model's predictive uncertainty.
CLUE produces explanations that are more faithful to the model's uncertainty and more consistent with fact-checking decisions.
arXiv Detail & Related papers (2025-05-23T13:06:43Z) - Revisiting subword tokenization: A case study on affixal negation in large language models [57.75279238091522]
We measure the impact of affixal negation on modern English large language models (LLMs).
We conduct experiments using LLMs with different subword tokenization methods.
We show that models can, on the whole, reliably recognize the meaning of affixal negation.
arXiv Detail & Related papers (2024-04-03T03:14:27Z) - Strong hallucinations from negation and how to fix them [2.1178416840822027]
We call such responses "strong hallucinations" and prove that they follow from an LM's computation of its internal representations for logical operators and outputs from those representations.
We show that our approach improves model performance in cloze prompting and natural language inference tasks with negation without requiring training on sparse negative data.
arXiv Detail & Related papers (2024-02-16T10:11:20Z) - Uncertainty Quantification for In-Context Learning of Large Language Models [52.891205009620364]
In-context learning has emerged as a groundbreaking ability of Large Language Models (LLMs).
We propose a novel formulation and corresponding estimation method to quantify both types of uncertainties.
The proposed method offers an unsupervised way to understand the prediction of in-context learning in a plug-and-play fashion.
arXiv Detail & Related papers (2024-02-15T18:46:24Z) - Improving the Reliability of Large Language Models by Leveraging Uncertainty-Aware In-Context Learning [76.98542249776257]
Large-scale language models often face the challenge of "hallucination".
We introduce an uncertainty-aware in-context learning framework to empower the model to enhance or reject its output in response to uncertainty.
arXiv Detail & Related papers (2023-10-07T12:06:53Z) - Simple Linguistic Inferences of Large Language Models (LLMs): Blind Spots and Blinds [59.71218039095155]
We evaluate language understanding capacities on simple inference tasks that most humans find trivial.
We target (i) grammatically-specified entailments, (ii) premises with evidential adverbs of uncertainty, and (iii) monotonicity entailments.
The models exhibit moderate to low performance on these evaluation sets.
arXiv Detail & Related papers (2023-05-24T06:41:09Z) - Learning Fair Representation via Distributional Contrastive Disentanglement [9.577369164287813]
Learning fair representation is crucial for achieving fairness or debiasing sensitive information.
We propose a new approach, learning FAir Representation via distributional CONtrastive Variational AutoEncoder (FarconVAE).
We show superior performance on fairness, pretrained model debiasing, and domain generalization tasks from various modalities.
arXiv Detail & Related papers (2022-06-17T12:58:58Z) - On the Faithfulness Measurements for Model Interpretations [100.2730234575114]
Post-hoc interpretations aim to uncover how natural language processing (NLP) models make predictions.
To tackle these issues, we start with three criteria: the removal-based criterion, the sensitivity of interpretations, and the stability of interpretations.
Motivated by the desideratum of these faithfulness notions, we introduce a new class of interpretation methods that adopt techniques from the adversarial domain.
arXiv Detail & Related papers (2021-04-18T09:19:44Z) - Where and What? Examining Interpretable Disentangled Representations [96.32813624341833]
Capturing interpretable variations has long been one of the goals in disentanglement learning.
Unlike the independence assumption, interpretability has rarely been exploited to encourage disentanglement in the unsupervised setting.
In this paper, we examine the interpretability of disentangled representations by investigating two questions: where to be interpreted and what to be interpreted.
arXiv Detail & Related papers (2021-04-07T11:22:02Z) - Show or Suppress? Managing Input Uncertainty in Machine Learning Model Explanations [5.695163312473304]
Feature attribution is widely used in interpretable machine learning to explain how influential each measured input feature value is for an output inference.
It is unclear how the awareness of input uncertainty can affect the trust in explanations.
We propose two approaches to help users to manage their perception of uncertainty in a model explanation.
arXiv Detail & Related papers (2021-01-23T13:10:48Z) - An Experimental Study of Semantic Continuity for Deep Learning Models [11.883949320223078]
We argue that semantic discontinuity results from inappropriate training targets and contributes to notorious issues such as adversarial robustness, interpretability, etc.
We first conduct data analysis to provide evidence of semantic discontinuity in existing deep learning models, and then design a simple semantic continuity constraint which theoretically enables models to obtain smooth gradients and learn semantic-oriented features (a generic sketch of such a constraint appears after this list).
arXiv Detail & Related papers (2020-11-19T12:23:28Z)
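The last entry above describes its semantic continuity constraint only at a high level, so the snippet below is purely an illustration of one generic way such a constraint is often instantiated, not the cited paper's exact objective: penalize how much the model's output distribution changes under a small perturbation of a continuous input (e.g., an embedding), which encourages locally smooth, semantics-preserving behavior.

```python
# Illustration only (assumed form, not the cited paper's exact objective):
# a generic output-smoothness penalty in the spirit of a semantic continuity
# constraint. `x` is assumed to be a continuous input such as an embedding.
import torch
import torch.nn.functional as F

def semantic_continuity_penalty(model, x, eps=1e-2):
    """KL divergence between predictions on x and on a slightly perturbed x."""
    x_noisy = x + eps * torch.randn_like(x)
    log_p_clean = F.log_softmax(model(x), dim=-1)
    p_noisy = F.softmax(model(x_noisy), dim=-1)
    # Small input changes should barely move the output distribution.
    return F.kl_div(log_p_clean, p_noisy, reduction="batchmean")

# Typical use: loss = F.cross_entropy(model(x), y) + lam * semantic_continuity_penalty(model, x)
```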
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.