Distribution Matching for Rationalization
- URL: http://arxiv.org/abs/2106.00320v1
- Date: Tue, 1 Jun 2021 08:49:32 GMT
- Title: Distribution Matching for Rationalization
- Authors: Yongfeng Huang, Yujun Chen, Yulun Du, Zhilin Yang
- Abstract summary: Rationalization aims to extract pieces of input text as rationales to justify neural network predictions on text classification tasks.
We propose a novel rationalization method that matches the distributions of rationales and input text in both the feature space and output space.
- Score: 30.54889533406428
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The task of rationalization aims to extract pieces of input text as
rationales to justify neural network predictions on text classification tasks.
By definition, rationales represent key text pieces used for prediction and
thus should have similar classification feature distribution compared to the
original input text. However, previous methods mainly focused on maximizing the
mutual information between rationales and labels while neglecting the
relationship between rationales and input text. To address this issue, we
propose a novel rationalization method that matches the distributions of
rationales and input text in both the feature space and output space.
Empirically, the proposed distribution matching approach consistently
outperforms previous methods by a large margin. Our data and code are
available.
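The abstract describes matching the rationale and the full input in two places: the feature space and the output space. The following is a minimal sketch of that idea, not the authors' implementation. The abstract does not name the distance measures, so this sketch assumes a first-moment (batch-mean) gap for features and a KL divergence between predicted class distributions for outputs. All function names and the weights `lambda_feat` and `lambda_out` are hypothetical.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a batch of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def feature_gap(full_feats, rationale_feats):
    """Euclidean distance between batch-mean features (assumed measure)."""
    mu_x = mean_vector(full_feats)
    mu_z = mean_vector(rationale_feats)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mu_x, mu_z)))


def softmax(logits):
    """Numerically stable softmax over one logit vector."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of the same length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def output_gap(full_logits, rationale_logits):
    """Mean KL between full-input and rationale predictions (assumed measure)."""
    total = 0.0
    for lx, lz in zip(full_logits, rationale_logits):
        total += kl_divergence(softmax(lx), softmax(lz))
    return total / len(full_logits)


def matching_loss(full_feats, rationale_feats, full_logits, rationale_logits,
                  lambda_feat=1.0, lambda_out=1.0):
    """Weighted sum of the two gaps; the weights are illustrative only."""
    return (lambda_feat * feature_gap(full_feats, rationale_feats)
            + lambda_out * output_gap(full_logits, rationale_logits))


if __name__ == "__main__":
    # Toy batch: two examples, two features, two classes.
    full_feats = [[1.0, 2.0], [3.0, 4.0]]
    rationale_feats = [[0.8, 2.1], [2.9, 3.7]]
    full_logits = [[2.0, 0.5], [0.1, 1.2]]
    rationale_logits = [[1.5, 0.7], [0.3, 1.0]]
    print(matching_loss(full_feats, rationale_feats,
                        full_logits, rationale_logits))
```

In a training setup, a term like this would typically be added to the usual classification loss, so the rationale selector is pushed toward text pieces whose features and predictions resemble those of the full input. Both gaps are zero when the rationale and the full input yield identical features and predictions.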
Related papers
- Enhancing the Rationale-Input Alignment for Self-explaining
Rationalization [22.74436500022893]
We introduce a novel approach called DAR (Discriminatively Aligned Rationalization) to align the selected rationale and the original input.
Experiments on two widely used real-world benchmarks show that the proposed method significantly improves the explanation quality.
arXiv Detail & Related papers (2023-12-07T07:37:15Z) - Language Model Decoding as Direct Metrics Optimization [87.68281625776282]
Current decoding methods struggle to generate texts that align with human texts across different aspects.
In this work, we frame decoding from a language model as an optimization problem with the goal of strictly matching the expected performance with human texts.
We prove that this induced distribution is guaranteed to improve the perplexity on human texts, which suggests a better approximation to the underlying distribution of human texts.
arXiv Detail & Related papers (2023-10-02T09:35:27Z) - Textual Entailment Recognition with Semantic Features from Empirical
Text Representation [60.31047947815282]
A text entails a hypothesis if and only if the truth value of the hypothesis follows from the text.
In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis.
We employ an element-wise Manhattan distance vector-based feature that can identify the semantic entailment relationship between the text-hypothesis pair.
arXiv Detail & Related papers (2022-10-18T10:03:51Z) - Rationale-Augmented Ensembles in Language Models [53.45015291520658]
We reconsider rationale-augmented prompting for few-shot in-context learning.
We identify rationale sampling in the output space as the key component to robustly improve performance.
We demonstrate that rationale-augmented ensembles achieve more accurate and interpretable results than existing prompting approaches.
arXiv Detail & Related papers (2022-07-02T06:20:57Z) - Interlock-Free Multi-Aspect Rationalization for Text Classification [33.33452117387646]
We address the interlocking problem in the multi-aspect setting.
We propose a multi-stage training method incorporating an additional self-supervised contrastive loss.
Empirical results on the beer review dataset show that our method significantly improves rationalization performance.
arXiv Detail & Related papers (2022-05-13T16:38:38Z) - Contextualized Semantic Distance between Highly Overlapped Texts [85.1541170468617]
Overlapping frequently occurs in paired texts in natural language processing tasks like text editing and semantic similarity evaluation.
This paper aims to address the issue with a mask-and-predict strategy.
We take the words in the longest common sequence as neighboring words and use masked language modeling (MLM) to predict the distributions on their positions.
Experiments on Semantic Textual Similarity show NDD to be more sensitive to various semantic differences, especially on highly overlapped paired texts.
arXiv Detail & Related papers (2021-10-04T03:59:15Z) - SPECTRA: Sparse Structured Text Rationalization [0.0]
We present a unified framework for deterministic extraction of structured explanations via constrained inference on a factor graph.
Our approach greatly eases training and rationale regularization, generally outperforming previous work on the plausibility of the extracted explanations.
arXiv Detail & Related papers (2021-09-09T20:39:56Z) - Variable Instance-Level Explainability for Text Classification [9.147707153504117]
We propose a method for extracting variable-length explanations using a set of different feature scoring methods at instance-level.
Our method consistently provides more faithful explanations compared to previous fixed-length and fixed-feature scoring methods for rationale extraction.
arXiv Detail & Related papers (2021-04-16T16:53:48Z) - Weakly-Supervised Aspect-Based Sentiment Analysis via Joint
Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z) - Invariant Rationalization [84.1861516092232]
A typical rationalization criterion, i.e. maximum mutual information (MMI), finds the rationale that maximizes the prediction performance based only on the rationale.
We introduce a game-theoretic invariant rationalization criterion where the rationales are constrained to enable the same predictor to be optimal across different environments.
We show both theoretically and empirically that the proposed rationales can rule out spurious correlations, generalize better to different test scenarios, and align better with human judgments.
arXiv Detail & Related papers (2020-03-22T00:50:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.