Causal Discovery and Counterfactual Reasoning to Optimize Persuasive Dialogue Policies
- URL: http://arxiv.org/abs/2503.16544v1
- Date: Wed, 19 Mar 2025 06:06:10 GMT
- Title: Causal Discovery and Counterfactual Reasoning to Optimize Persuasive Dialogue Policies
- Authors: Donghuo Zeng, Roberto Legaspi, Yuewen Sun, Xinshuai Dong, Kazushi Ikeda, Peter Spirtes, Kun Zhang
- Abstract summary: We use causal discovery and counterfactual reasoning to optimize system persuasion capability and outcomes. Our experiments with the PersuasionForGood dataset show measurable improvements in persuasion outcomes.
- Score: 14.324214906731923
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tailoring persuasive conversations to users leads to more effective persuasion. However, existing dialogue systems often struggle to adapt to dynamically evolving user states. This paper presents a novel method that leverages causal discovery and counterfactual reasoning for optimizing system persuasion capability and outcomes. We employ the Greedy Relaxation of the Sparsest Permutation (GRaSP) algorithm to identify causal relationships between user and system utterance strategies, treating user strategies as states and system strategies as actions. GRaSP identifies user strategies as causal factors influencing system responses, which inform Bidirectional Conditional Generative Adversarial Networks (BiCoGAN) in generating counterfactual utterances for the system. Subsequently, we use the Dueling Double Deep Q-Network (D3QN) model to utilize counterfactual data to determine the best policy for selecting system utterances. Our experiments with the PersuasionForGood dataset show measurable improvements in persuasion outcomes using our approach over baseline methods. The observed increase in cumulative rewards and Q-values highlights the effectiveness of causal discovery in enhancing counterfactual reasoning and optimizing reinforcement learning policies for online dialogue systems.
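The final stage of the pipeline described above, D3QN, combines a dueling value/advantage decomposition with double Q-learning. As a rough illustration only (not the authors' implementation), the two core computations can be sketched in plain numpy; the state and action dimensions, the variable names, and the toy numbers are all invented for the example, with states standing in for user strategies and actions for system utterance strategies:

```python
import numpy as np

def dueling_q(value, advantage):
    """Dueling aggregation: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)."""
    return value[:, None] + advantage - advantage.mean(axis=1, keepdims=True)

def double_dqn_target(reward, q_online_next, q_target_next, gamma=0.99):
    """Double DQN: the online net picks the argmax action, the target net evaluates it."""
    best = np.argmax(q_online_next, axis=1)
    return reward + gamma * q_target_next[np.arange(len(reward)), best]

# Toy batch: 2 states (user strategies), 3 actions (system strategies).
V = np.array([1.0, 0.5])
A = np.array([[0.0, 1.0, -1.0],
              [2.0, 0.0, 1.0]])
Q = dueling_q(V, A)                                # shape (2, 3)
y = double_dqn_target(np.array([1.0, 0.0]), Q, Q)  # bootstrapped targets
```

Subtracting the mean advantage keeps the V/A decomposition identifiable, while the decoupled argmax/evaluation step reduces the overestimation bias of vanilla Q-learning.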
Related papers
- Exploring the Impact of Personality Traits on Conversational Recommender Systems: A Simulation with Large Language Models [70.180385882195]
This paper introduces a personality-aware user simulation for Conversational Recommender Systems (CRSs).
The user agent is endowed with customizable personality traits and preferences, while the system agent possesses persuasion capability to simulate realistic interactions in CRSs.
Experimental results demonstrate that state-of-the-art LLMs can effectively generate diverse user responses aligned with specified personality traits.
arXiv Detail & Related papers (2025-04-09T13:21:17Z) - Generative Framework for Personalized Persuasion: Inferring Causal, Counterfactual, and Latent Knowledge [14.324214906731923]
We create hypothetical scenarios to examine the effects of alternative system responses.
We employ causal discovery to identify strategy-level causal relationships among user and system utterances.
We optimize policies for selecting system responses based on counterfactual data.
arXiv Detail & Related papers (2025-04-08T15:33:54Z) - Revisiting Reciprocal Recommender Systems: Metrics, Formulation, and Method [60.364834418531366]
We propose five new evaluation metrics that comprehensively and accurately assess the performance of RRS.
We formulate the RRS from a causal perspective, modeling recommendations as bilateral interventions.
We introduce a reranking strategy to maximize matching outcomes, as measured by the proposed metrics.
arXiv Detail & Related papers (2024-08-19T07:21:02Z) - Counterfactual Reasoning Using Predicted Latent Personality Dimensions for Optimizing Persuasion Outcome [13.731895847081953]
We present a novel approach that tracks a user's latent personality dimensions (LPDs) during an ongoing persuasive conversation.
We generate tailored counterfactual utterances based on these LPDs to optimize the overall persuasion outcome.
arXiv Detail & Related papers (2024-04-21T23:03:47Z) - Enhancing End-to-End Multi-Task Dialogue Systems: A Study on Intrinsic Motivation Reinforcement Learning Algorithms for Improved Training and Adaptability [1.0985060632689174]
This study investigates intrinsic motivation reinforcement learning algorithms.
We adapt techniques for random network distillation and curiosity-driven reinforcement learning to measure the frequency of state visits.
Experimental results on MultiWOZ, a heterogeneous dataset, show that intrinsic motivation-based dialogue systems outperform policies that depend on extrinsic incentives.
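Visit-frequency bonuses of the kind mentioned above are often realized as a count-based intrinsic reward; a minimal sketch (illustrative only, not tied to the paper's random-network-distillation setup, with the class name and example state invented):

```python
from collections import defaultdict
import math

class CountBonus:
    """Intrinsic reward that decays as a state is visited more often."""

    def __init__(self, scale=1.0):
        self.counts = defaultdict(int)
        self.scale = scale

    def reward(self, state):
        self.counts[state] += 1
        # Bonus ~ 1/sqrt(N(s)): rarely visited states earn larger rewards.
        return self.scale / math.sqrt(self.counts[state])

bonus = CountBonus()
r1 = bonus.reward("greet")  # first visit: full bonus
r2 = bonus.reward("greet")  # second visit: decayed bonus
```

The bonus is added to the environment reward during training, nudging the policy toward under-explored dialogue states.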
arXiv Detail & Related papers (2024-01-31T18:03:39Z) - From Heuristic to Analytic: Cognitively Motivated Strategies for Coherent Physical Commonsense Reasoning [66.98861219674039]
Heuristic-Analytic Reasoning (HAR) strategies drastically improve the coherence of rationalizations for model decisions.
Our findings suggest that human-like reasoning strategies can effectively improve the coherence and reliability of PLM reasoning.
arXiv Detail & Related papers (2023-10-24T19:46:04Z) - Improving Factual Consistency for Knowledge-Grounded Dialogue Systems via Knowledge Enhancement and Alignment [77.56326872997407]
Pretrained language models (PLMs) based knowledge-grounded dialogue systems are prone to generate responses that are factually inconsistent with the provided knowledge source.
Inspired by previous work which identified that feed-forward networks (FFNs) within Transformers are responsible for factual knowledge expressions, we investigate two methods to efficiently improve the factual expression capability.
arXiv Detail & Related papers (2023-10-12T14:44:05Z) - PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems [59.1250765143521]
Current knowledge-grounded dialogue systems often fail to align the generated responses with human-preferred qualities.
We propose Polished & Informed Candidate Scoring (PICK), a generation re-scoring framework.
We demonstrate the effectiveness of PICK in generating responses that are more faithful while keeping them relevant to the dialogue history.
arXiv Detail & Related papers (2023-09-19T08:27:09Z) - Large-Scale Sequential Learning for Recommender and Engineering Systems [91.3755431537592]
In this thesis, we focus on the design of automatic algorithms that provide personalized ranking by adapting to the current conditions.
For the former, we propose a novel algorithm called SAROS that takes both kinds of feedback into account when learning over the sequence of interactions.
The proposed idea of taking neighbouring lines into account shows statistically significant results compared with the initial approach for fault detection in power grids.
arXiv Detail & Related papers (2022-05-13T21:09:41Z) - What Does The User Want? Information Gain for Hierarchical Dialogue Policy Optimisation [3.1433893853959605]
Policy optimisation via reinforcement learning (RL) is susceptible to sample inefficiency and instability.
We propose the usage of an intrinsic reward based on information gain to address this issue.
Our algorithm, which we call FeudalGain, achieves state-of-the-art results in most environments of the PyDial framework.
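The information-gain reward described above can be pictured as the drop in entropy of the system's belief over the user goal from one turn to the next. A hedged toy sketch (not the FeudalGain implementation; the two-slot belief distributions are invented):

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def information_gain(belief_before, belief_after):
    """Intrinsic reward: how much a turn reduced uncertainty about the user goal."""
    return entropy(belief_before) - entropy(belief_after)

# A turn that sharpens the belief earns a positive intrinsic reward.
gain = information_gain([0.5, 0.5], [0.9, 0.1])
```

Turns that leave the belief unchanged earn zero intrinsic reward, so the policy is steered toward questions that actually narrow down what the user wants.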
arXiv Detail & Related papers (2021-09-15T07:21:26Z) - Persuasive Dialogue Understanding: the Baselines and Negative Results [27.162062321321805]
We demonstrate the limitations of a Transformer-based approach coupled with Conditional Random Field (CRF) for the task of persuasive strategy recognition.
We leverage inter- and intra-speaker contextual semantic features, as well as label dependencies to improve the recognition.
arXiv Detail & Related papers (2020-11-19T16:52:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.