Causal Document-Grounded Dialogue Pre-training
- URL: http://arxiv.org/abs/2305.10927v3
- Date: Sun, 5 Nov 2023 15:26:49 GMT
- Title: Causal Document-Grounded Dialogue Pre-training
- Authors: Yingxiu Zhao, Bowen Yu, Haiyang Yu, Bowen Li, Jinyang Li, Chao Wang,
Fei Huang, Yongbin Li, Nevin L. Zhang
- Abstract summary: We present a causally-complete dataset construction strategy for building million-level DocGD pre-training corpora.
Experiments on three benchmark datasets demonstrate that our causal pre-training achieves considerable and consistent improvements under fully-supervised, low-resource, few-shot, and zero-shot settings.
- Score: 81.16429056652483
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The goal of document-grounded dialogue (DocGD) is to generate a response by
grounding the evidence in a supporting document in accordance with the dialogue
context. This process involves four variables that are causally connected.
Recently, task-specific pre-training has greatly boosted performance on many
downstream tasks. Existing DocGD methods, however, continue to rely on general
pre-trained language models without a specifically tailored pre-training
approach that explicitly captures the causal relationships. To tackle this
issue, we are the first to present a causally-complete dataset construction
strategy for building million-level DocGD pre-training corpora. To better
capture causality, we further propose a causally-perturbed pre-training
strategy, which introduces causal perturbations on the variables and optimizes
the overall causal effect. Experiments on three benchmark datasets demonstrate
that our causal pre-training achieves considerable and consistent improvements
under fully-supervised, low-resource, few-shot, and zero-shot settings.
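To make the idea of optimizing an overall causal effect concrete, below is a minimal sketch of what a causally-perturbed pre-training step could look like. It assumes a HuggingFace-style seq2seq model whose forward pass returns a `.loss`, evidence spans marked by a token-level mask, and a perturbation implemented by masking the evidence tokens; the variable names (`doc_ids`, `ctx_ids`, `evid_mask`, `resp_ids`), the hinge margin, and the exact form of the objective are illustrative assumptions, not the formulation used in the paper.

```python
# Minimal sketch of a causally-perturbed pre-training step (assumptions:
# a HuggingFace-style seq2seq LM, evidence spans marked by a 0/1 mask over
# the document tokens, labels already padded with the ignore index, and the
# perturbation implemented as masking evidence; the paper's exact objective
# may differ).
import torch
import torch.nn.functional as F

def causal_effect_loss(model, doc_ids, ctx_ids, evid_mask, resp_ids, pad_id=0):
    """Standard generation loss plus a term encouraging the response
    to depend on the (unperturbed) grounding evidence."""
    # Factual input: document + dialogue context, evidence left intact.
    factual_in = torch.cat([doc_ids, ctx_ids], dim=1)

    # Counterfactual input: the same sequence with evidence tokens masked out.
    perturbed_doc = doc_ids.masked_fill(evid_mask.bool(), pad_id)
    counterfactual_in = torch.cat([perturbed_doc, ctx_ids], dim=1)

    nll_factual = model(input_ids=factual_in, labels=resp_ids).loss
    nll_counterfactual = model(input_ids=counterfactual_in, labels=resp_ids).loss

    # Causal effect of the evidence on the response, approximated as the
    # negative-log-likelihood gap; a hinge with margin 1.0 keeps the term bounded.
    effect = nll_counterfactual - nll_factual
    causal_term = F.relu(1.0 - effect)

    return nll_factual + causal_term
```

The causal term rewards the model when removing the evidence makes the gold response noticeably harder to predict, i.e. when the generated response is actually grounded in the evidence rather than in the dialogue context alone.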
Related papers
- Analysing The Impact of Sequence Composition on Language Model Pre-Training [20.929800523719187]
We study the influence of the pre-training sequence composition strategy on the generalisation properties of the model.
Applying causal masking can lead to the inclusion of distracting information from previous documents during pre-training.
In intra-document causal masking, the likelihood of each token is conditioned only on the previous tokens in the same document (see the sketch after this list).
arXiv Detail & Related papers (2024-02-21T18:23:16Z)
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
- Improving Weakly Supervised Sound Event Detection with Causal Intervention [46.229038054764956]
Existing weakly supervised sound event detection work has not explored both types of co-occurrences simultaneously.
We first establish a structural causal model (SCM) to reveal that the context is the main cause of co-occurrence confounders.
Based on the causal analysis, we propose a causal intervention (CI) method for WSSED to remove the negative impact of co-occurrence confounders.
arXiv Detail & Related papers (2023-03-10T03:13:36Z)
- CausalDialogue: Modeling Utterance-level Causality in Conversations [83.03604651485327]
We compiled and expanded a new dataset called CausalDialogue through crowd-sourcing.
This dataset includes multiple cause-effect pairs within a directed acyclic graph (DAG) structure.
We propose a causality-enhanced method called Exponential Average Treatment Effect (ExMATE) to enhance the impact of causality at the utterance level in training neural conversation models.
arXiv Detail & Related papers (2022-12-20T18:31:50Z)
- Socratic Pretraining: Question-Driven Pretraining for Controllable Summarization [89.04537372465612]
Socratic pretraining is a question-driven, unsupervised pretraining objective designed to improve controllability in summarization tasks.
Our results show that Socratic pretraining cuts task-specific labeled data requirements in half.
arXiv Detail & Related papers (2022-12-20T17:27:10Z)
- OPAL: Ontology-Aware Pretrained Language Model for End-to-End Task-Oriented Dialogue [40.62090743056549]
This paper presents an ontology-aware pretrained language model (OPAL) for end-to-end task-oriented dialogue (TOD).
Unlike chit-chat dialogue models, task-oriented dialogue models rely on at least two task-specific modules: a dialogue state tracker (DST) and a response generator (RG).
arXiv Detail & Related papers (2022-09-10T04:38:27Z)
- CUP: Curriculum Learning based Prompt Tuning for Implicit Event Argument Extraction [22.746071199667146]
Implicit event argument extraction (EAE) aims to identify arguments that may be scattered across the document.
We propose a Curriculum learning based Prompt tuning (CUP) approach, which resolves implicit EAE by four learning stages.
In addition, we integrate a prompt-based encoder-decoder model to elicit related knowledge from pre-trained language models.
arXiv Detail & Related papers (2022-05-01T16:03:54Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
- Guided Generation of Cause and Effect [52.44584102429394]
We present a conditional text generation framework that posits sentential expressions of possible causes and effects.
This framework depends on two novel resources: CausalBank, a large-scale collection of English sentences expressing causal patterns, and Cause Effect Graph, a refinement of previous work on constructing large lexical causal knowledge graphs.
arXiv Detail & Related papers (2021-07-21T02:32:47Z)
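The intra-document causal masking mentioned in the first related paper above is concrete enough to illustrate. The sketch below is an assumption-level illustration of that idea, not code from the cited paper: given the document id of every token in a packed pre-training sequence, it builds a boolean attention mask that is both causal and restricted to tokens of the same document, so no distracting information from previous documents leaks in.

```python
# Minimal sketch of intra-document causal masking for packed pre-training
# sequences; `doc_ids` (the per-token document id) is a hypothetical input,
# not an identifier from the cited paper.
import torch

def intra_document_causal_mask(doc_ids: torch.Tensor) -> torch.Tensor:
    """doc_ids: (seq_len,) tensor giving the document id of each token.
    Returns a (seq_len, seq_len) boolean mask where entry [i, j] is True
    iff query position i may attend to key position j."""
    seq_len = doc_ids.size(0)
    causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))  # j <= i
    same_doc = doc_ids.unsqueeze(1) == doc_ids.unsqueeze(0)              # doc[i] == doc[j]
    return causal & same_doc

# Two documents packed into one sequence: tokens 0-2 belong to document 0,
# tokens 3-4 to document 1. Token 3 attends only to itself, never to the
# tokens of document 0 that precede it.
mask = intra_document_causal_mask(torch.tensor([0, 0, 0, 1, 1]))
```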
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.