D-Separation for Causal Self-Explanation
- URL: http://arxiv.org/abs/2309.13391v2
- Date: Tue, 31 Oct 2023 08:43:05 GMT
- Title: D-Separation for Causal Self-Explanation
- Authors: Wei Liu, Jun Wang, Haozhao Wang, Ruixuan Li, Zhiying Deng, YuanKai
Zhang, Yang Qiu
- Abstract summary: We propose a novel criterion to uncover the causal rationale, termed the Minimum Conditional Dependence (MCD) criterion.
We demonstrate that MCD improves the F1 score by up to $13.7\%$ compared to previous state-of-the-art MMI-based methods.
- Score: 19.68235036397476
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Rationalization is a self-explaining framework for NLP models. Conventional
work typically uses the maximum mutual information (MMI) criterion to find the
rationale that is most indicative of the target label. However, this criterion
can be influenced by spurious features that correlate with the causal rationale
or the target label. Instead of attempting to rectify the issues of the MMI
criterion, we propose a novel criterion to uncover the causal rationale, termed
the Minimum Conditional Dependence (MCD) criterion, which is grounded on our
finding that the non-causal features and the target label are
d-separated by the causal rationale. By minimizing the dependence
between the unselected parts of the input and the target label conditioned on
the selected rationale candidate, all the causes of the label are compelled to
be selected. In this study, we employ a simple and practical measure of
dependence, specifically the KL-divergence, to validate our proposed MCD
criterion. Empirically, we demonstrate that MCD improves the F1 score by up to
$13.7\%$ compared to previous state-of-the-art MMI-based methods. Our code is
available at https://github.com/jugechengzi/Rationalization-MCD.
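The abstract's central claim, that the non-causal features and the label are d-separated by the causal rationale, can be checked on a toy distribution. The sketch below is an illustration under stated assumptions, not the authors' implementation (which trains neural networks on text): the three binary variables C (causal feature), S (spurious feature) and Y (label), the probabilities 0.9 and 0.8, and the function names joint_table and conditional_dependence are all hypothetical. It scores each candidate rationale Z by the expected KL-divergence between P(Y | C, S) and P(Y | Z), which is one way to measure the dependence between the unselected features and the label given Z.

```python
import itertools
import numpy as np


def joint_table(p_s_eq_c=0.9, p_y_eq_c=0.8):
    """Toy joint P(C, S, Y): S and Y each copy C with some probability.

    Y depends on C only, so S is spurious: correlated with Y, but
    d-separated from Y once C is given. All numbers are assumptions.
    """
    table = np.zeros((2, 2, 2))
    for c, s, y in itertools.product((0, 1), repeat=3):
        p_s = p_s_eq_c if s == c else 1 - p_s_eq_c
        p_y = p_y_eq_c if y == c else 1 - p_y_eq_c
        table[c, s, y] = 0.5 * p_s * p_y
    return table


def conditional_dependence(table, keep):
    """Expected KL( P(Y | C, S) || P(Y | Z) ), with Z = the input axes in `keep`.

    Axis 0 is C, axis 1 is S, axis 2 is Y.
    """
    p_y_given_x = table / table.sum(axis=2, keepdims=True)
    drop = tuple(ax for ax in (0, 1) if ax not in keep)
    # Marginalize out the unselected input features, if any.
    marg = table.sum(axis=drop, keepdims=True) if drop else table
    p_y_given_z = marg / marg.sum(axis=2, keepdims=True)
    # Weight the pointwise log-ratio by the joint to get the expectation.
    return float((table * np.log(p_y_given_x / p_y_given_z)).sum())


if __name__ == "__main__":
    table = joint_table()
    candidates = {"causal C": (0,), "spurious S": (1,), "nothing": ()}
    for name, keep in candidates.items():
        print(name, conditional_dependence(table, keep))
```

In this toy setting, selecting C drives the score to zero (up to floating-point error), while selecting only S leaves a strictly positive score, so minimizing the score favors the causal feature. Selecting both C and S would also score zero, so a separate sparsity constraint on the rationale is assumed when picking among candidates.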
Related papers
- Is the MMI Criterion Necessary for Interpretability? Degenerating Non-causal Features to Plain Noise for Self-Rationalization [17.26418974819275]
This paper develops a new criterion that treats spurious features as plain noise.
Experiments show that our MRD criterion improves rationale quality (measured by the overlap with human-annotated rationales) by up to $10.4\%$ compared to several recent competitive MMI variants.
arXiv Detail & Related papers (2024-10-08T13:04:02Z) - Covariate Assisted Entity Ranking with Sparse Intrinsic Scores [3.2839905453386162]
We introduce novel model identification conditions and examine the regularized penalized Maximum Likelihood Estimator statistical rates.
We also apply our method to the goodness-of-fit test for models with no latent intrinsic scores.
arXiv Detail & Related papers (2024-07-09T19:58:54Z) - RORA: Robust Free-Text Rationale Evaluation [52.98000150242775]
We propose RORA, a Robust free-text Rationale evaluation against label leakage.
RORA consistently outperforms existing approaches in evaluating human-written, synthetic, or model-generated rationales.
We also show that RORA aligns well with human judgment, providing a more reliable and accurate measurement across diverse free-text rationales.
arXiv Detail & Related papers (2024-02-28T19:46:21Z) - Distinguishing Cause from Effect on Categorical Data: The Uniform Channel Model [0.0]
Distinguishing cause from effect using observations of a pair of random variables is a core problem in causal discovery.
We propose a criterion to address the cause-effect problem with categorical variables.
We select as the most likely causal direction the one in which the conditional probability mass function is closer to a uniform channel (UC).
arXiv Detail & Related papers (2023-03-14T13:54:11Z) - Optimizing Partial Area Under the Top-k Curve: Theory and Practice [151.5072746015253]
We develop a novel metric named partial Area Under the top-k Curve (AUTKC).
AUTKC has a better discrimination ability, and its Bayes optimal score function could give a correct top-K ranking with respect to the conditional probability.
We present an empirical surrogate risk minimization framework to optimize the proposed metric.
arXiv Detail & Related papers (2022-09-03T11:09:13Z) - A Unified Joint Maximum Mean Discrepancy for Domain Adaptation [73.44809425486767]
This paper theoretically derives a unified form of JMMD that is easy to optimize.
From the revealed unified JMMD, we illustrate that JMMD degrades the feature-label dependence that benefits classification.
We propose a novel MMD matrix to promote the dependence, and devise a novel label kernel that is robust to label distribution shift.
arXiv Detail & Related papers (2021-01-25T09:46:14Z) - Coreference Reasoning in Machine Reading Comprehension [100.75624364257429]
We show that coreference reasoning in machine reading comprehension is a greater challenge than was earlier thought.
We propose a methodology for creating reading comprehension datasets that better reflect the challenges of coreference reasoning.
This allows us to show an improvement in the reasoning abilities of state-of-the-art models across various MRC datasets.
arXiv Detail & Related papers (2020-12-31T12:18:41Z) - Rethink Maximum Mean Discrepancy for Domain Adaptation [77.2560592127872]
This paper theoretically proves two essential facts: 1) minimizing the Maximum Mean Discrepancy is equivalent to maximizing the source and target intra-class distances respectively while jointly minimizing their variance with some implicit weights, so that the feature discriminability degrades.
Experiments on several benchmark datasets not only confirm the theoretical results but also demonstrate that our approach substantially outperforms the comparative state-of-the-art methods.
arXiv Detail & Related papers (2020-07-01T18:25:10Z) - Invariant Rationalization [84.1861516092232]
A typical rationalization criterion, i.e. maximum mutual information (MMI), finds the rationale that maximizes the prediction performance based only on the rationale.
We introduce a game-theoretic invariant rationalization criterion where the rationales are constrained to enable the same predictor to be optimal across different environments.
We show both theoretically and empirically that the proposed rationales can rule out spurious correlations, generalize better to different test scenarios, and align better with human judgments.
arXiv Detail & Related papers (2020-03-22T00:50:27Z)