Less is More: Mitigate Spurious Correlations for Open-Domain Dialogue
Response Generation Models by Causal Discovery
- URL: http://arxiv.org/abs/2303.01962v1
- Date: Thu, 2 Mar 2023 06:33:48 GMT
- Title: Less is More: Mitigate Spurious Correlations for Open-Domain Dialogue
Response Generation Models by Causal Discovery
- Authors: Tao Feng, Lizhen Qu, Gholamreza Haffari
- Abstract summary: We conduct the first study on spurious correlations for open-domain response generation models based on a corpus CGDIALOG curated in our work.
Inspired by causal discovery algorithms, we propose a novel model-agnostic method for training and inference of response generation models.
- Score: 52.95935278819512
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we conduct the first study on spurious correlations for
open-domain response generation models based on a corpus CGDIALOG curated in
our work. The current models indeed suffer from spurious correlations and tend
to generate irrelevant and generic responses. Inspired by causal discovery
algorithms, we propose a novel model-agnostic method for training and inference
of response generation models using a conditional independence
classifier. The classifier is trained by a constrained self-training method,
coined CONSTRAIN, to overcome data scarcity. The experimental results based on
both human and automatic evaluation show that our method significantly
outperforms the competitive baselines in terms of relevance, informativeness,
and fluency.
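To make the proposed pipeline concrete, here is a minimal, hedged sketch of the inference-time idea described above: a conditional independence classifier (assumed to be already trained, e.g. with the CONSTRAIN self-training procedure) scores each utterance in the dialogue history, and only the utterances judged causally relevant to the current query are passed to the response generator. The function names, threshold, and separator token are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only, not the authors' code. `ci_classifier` stands in for a
# trained conditional independence classifier; `generator` for any seq2seq response model.
from typing import Callable, List

def prune_history(history: List[str],
                  query: str,
                  ci_classifier: Callable[[str, str], float],
                  threshold: float = 0.5) -> List[str]:
    """Keep only the utterances the classifier scores as causally relevant to the query."""
    relevant = [u for u in history if ci_classifier(u, query) >= threshold]
    return relevant + [query]          # the query itself is always kept as the last turn

def generate_response(generator: Callable[[str], str],
                      history: List[str],
                      query: str,
                      ci_classifier: Callable[[str, str], float]) -> str:
    """'Less is more': condition the generator only on the pruned context."""
    context = " </s> ".join(prune_history(history, query, ci_classifier))
    return generator(context)
```

The classifier itself is treated as a black box here; training it from scarce conditional-independence labels is precisely what the constrained self-training step in the paper addresses.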
Related papers
- Causal Customer Churn Analysis with Low-rank Tensor Block Hazard Model [4.694536172504849]
This study introduces an innovative method for analyzing the impact of various interventions on customer churn, using the potential outcomes framework.
We present a new causal model, the tensorized latent factor block hazard model, which incorporates tensor completion methods for a principled causal analysis of customer churn.
arXiv Detail & Related papers (2024-05-18T19:54:14Z)
- Sample Complexity Bounds for Score-Matching: Causal Discovery and Generative Modeling [82.36856860383291]
We demonstrate that accurate estimation of the score function is achievable by training a standard deep ReLU neural network.
We establish bounds on the error rate of recovering causal relationships using the score-matching-based causal discovery method.
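As a rough illustration of what "estimating the score function with a standard deep ReLU network" looks like in practice, below is a hedged denoising score matching sketch in PyTorch; the architecture, single fixed noise level, and hyperparameters are assumptions for illustration, not the setup analysed in that paper.

```python
# Hedged sketch, not the paper's code: denoising score matching at one fixed
# noise level, with a small ReLU MLP as the score estimator.
import torch
import torch.nn as nn

class ScoreNet(nn.Module):
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, dim),   # estimate of the score grad_x log p(x)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def dsm_loss(model: ScoreNet, x: torch.Tensor, sigma: float = 0.1) -> torch.Tensor:
    """Denoising score matching: the score of the Gaussian-smoothed data at the
    noisy point equals -(x_noisy - x) / sigma^2, which the network regresses onto."""
    noise = torch.randn_like(x) * sigma
    target = -noise / sigma**2
    return ((model(x + noise) - target) ** 2).sum(dim=1).mean()

# Toy usage:
# x = torch.randn(256, 5)                      # 256 samples, 5 variables
# model = ScoreNet(dim=5)
# opt = torch.optim.Adam(model.parameters(), lr=1e-3)
# opt.zero_grad(); dsm_loss(model, x).backward(); opt.step()
```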
arXiv Detail & Related papers (2023-10-27T13:09:56Z)
- Differentiable Retrieval Augmentation via Generative Language Modeling for E-commerce Query Intent Classification [8.59563091603226]
We propose Differentiable Retrieval Augmentation via Generative lANguage modeling (Dragan) to address this problem through a novel differentiable reformulation.
We demonstrate the effectiveness of our proposed method on a challenging NLP task in e-commerce search, namely query intent classification.
arXiv Detail & Related papers (2023-08-18T05:05:35Z)
- Learning Data Representations with Joint Diffusion Models [20.25147743706431]
Joint machine learning models that allow synthesizing and classifying data often offer uneven performance between those tasks or are unstable to train.
We extend the vanilla diffusion model with a classifier that allows for stable joint end-to-end training with shared parameterization between those objectives.
The resulting joint diffusion model outperforms recent state-of-the-art hybrid methods in terms of both classification and generation quality on all evaluated benchmarks.
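A hedged toy sketch of the "shared parameterization" idea: one encoder feeds both a denoising head and a classification head, and the two losses are summed. Real diffusion training uses a full noise schedule and timestep conditioning; the single noise level, MLP layers, and weighting factor below are simplifying assumptions, not the paper's architecture.

```python
# Toy sketch under simplifying assumptions: one shared encoder serves both the
# denoising (generative) objective and the classification objective.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointDiffusionToy(nn.Module):
    def __init__(self, dim: int, n_classes: int, hidden: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())  # shared parameters
        self.denoiser = nn.Linear(hidden, dim)          # predicts the injected noise
        self.classifier = nn.Linear(hidden, n_classes)  # predicts the class label

    def forward(self, x_noisy: torch.Tensor):
        h = self.encoder(x_noisy)
        return self.denoiser(h), self.classifier(h)

def joint_loss(model: JointDiffusionToy, x: torch.Tensor, y: torch.Tensor,
               lam: float = 1.0) -> torch.Tensor:
    """Sum of a denoising loss and a classification loss, trained end to end."""
    noise = torch.randn_like(x)
    eps_hat, logits = model(x + noise)
    return F.mse_loss(eps_hat, noise) + lam * F.cross_entropy(logits, y)
```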
arXiv Detail & Related papers (2023-01-31T13:29:19Z)
- Characterization and Greedy Learning of Gaussian Structural Causal Models under Unknown Interventions [3.441021278275805]
We consider the problem of recovering the causal structure underlying observations when the targets of the interventions in each experiment are unknown.
We derive a greedy algorithm called GnIES to recover the equivalence class of the data-generating model without knowledge of the intervention targets.
We leverage this procedure and evaluate the performance of GnIES on synthetic, real, and semi-synthetic data sets.
arXiv Detail & Related papers (2022-11-27T17:37:21Z)
- Causal Inference via Nonlinear Variable Decorrelation for Healthcare Applications [60.26261850082012]
We introduce a novel method with a variable decorrelation regularizer to handle both linear and nonlinear confounding.
We employ association rules as new representations using association rule mining based on the original features to increase model interpretability.
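For intuition, here is one simple (linear) form a variable decorrelation regularizer can take: penalize the off-diagonal entries of the feature correlation matrix and add the penalty to the task loss. This is a generic sketch under that assumption; the paper's regularizer additionally handles nonlinear dependence and is combined with association-rule features.

```python
# Generic sketch of a linear decorrelation regularizer; not the paper's exact method.
import torch

def decorrelation_penalty(features: torch.Tensor) -> torch.Tensor:
    """Sum of squared off-diagonal entries of the feature correlation matrix."""
    x = features - features.mean(dim=0, keepdim=True)
    x = x / (x.std(dim=0, keepdim=True) + 1e-8)
    corr = (x.T @ x) / (x.shape[0] - 1)
    off_diag = corr - torch.diag(torch.diag(corr))
    return (off_diag ** 2).sum()

# Typical use: total_loss = task_loss + reg_weight * decorrelation_penalty(hidden_features)
```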
arXiv Detail & Related papers (2022-09-29T17:44:14Z)
- Estimation of Bivariate Structural Causal Models by Variational Gaussian Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
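For readers new to the terminology, a bivariate structural causal model of the common additive-noise form (a generic textbook instance, not necessarily the exact likelihood family of that paper) can be written as:

```latex
X := N_X, \qquad Y := f(X) + N_Y, \qquad N_Y \perp\!\!\!\perp N_X
```

where, per the title, the mechanism f is modelled by variational Gaussian process regression and the noise likelihood is parametrised by normalising flows.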
arXiv Detail & Related papers (2021-09-06T14:52:58Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
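As a concrete example of a penalty-based first stage (an assumption for illustration; the paper may use other variants), the beta-VAE objective up-weights the KL term to push the posterior toward factorised latents:

```latex
\mathcal{L}_{\beta}(x) \;=\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
\;-\; \beta \, D_{\mathrm{KL}}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right), \qquad \beta > 1 .
```

The second stage described above then fits a separate generative model to recover the correlations this penalty discards, improving reconstruction quality.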
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
- Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
- Resolving Spurious Correlations in Causal Models of Environments via Interventions [2.836066255205732]
We consider the problem of inferring a causal model of a reinforcement learning environment.
Our method designs a reward function that incentivizes an agent to do an intervention to find errors in the causal model.
The experimental results in a grid-world environment show that our approach leads to better causal models compared to baselines.
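A hedged reading of that reward design, in a minimal form with an invented interface: the agent is rewarded in proportion to how badly the current causal model predicts the outcome of its action, so interventions that expose modelling errors are encouraged.

```python
# Illustrative sketch, not the paper's implementation: reward = prediction error
# of the learner's current causal model on the transition the agent just caused.
import numpy as np

def intervention_reward(predicted_next_state: np.ndarray,
                        observed_next_state: np.ndarray) -> float:
    """Higher reward when the causal model's prediction is further from reality,
    i.e. when the agent's intervention revealed an error in the model."""
    return float(np.mean((predicted_next_state - observed_next_state) ** 2))
```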
arXiv Detail & Related papers (2020-02-12T20:20:47Z)