Dialogue Chain-of-Thought Distillation for Commonsense-aware
Conversational Agents
- URL: http://arxiv.org/abs/2310.09343v2
- Date: Sun, 22 Oct 2023 09:16:52 GMT
- Title: Dialogue Chain-of-Thought Distillation for Commonsense-aware
Conversational Agents
- Authors: Hyungjoo Chae, Yongho Song, Kai Tzu-iunn Ong, Taeyoon Kwon, Minjin
Kim, Youngjae Yu, Dongha Lee, Dongyeop Kang, Jinyoung Yeo
- Abstract summary: We propose a framework for dialogue chain-of-thought (CoT) reasoning.
We present DOCTOR, a DialOgue Chain-of-ThOught Reasoner.
We conduct experiments to show that enhancing dialogue agents with high-quality rationales from DOCTOR significantly improves the quality of their responses.
- Score: 35.6393824052347
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Human-like chatbots necessitate the use of commonsense reasoning in order to
effectively comprehend and respond to implicit information present within
conversations. Achieving such coherence and informativeness in responses,
however, is a non-trivial task. Even for large language models (LLMs), the task
of identifying and aggregating key evidence within a single hop presents a
substantial challenge. This complexity arises because such evidence is
scattered across multiple turns in a conversation, thus necessitating
integration over multiple hops. Hence, our focus is to facilitate such
multi-hop reasoning over a dialogue context, namely dialogue chain-of-thought
(CoT) reasoning. To this end, we propose a knowledge distillation framework
that leverages LLMs as unreliable teachers and selectively distills consistent
and helpful rationales via alignment filters. We further present DOCTOR, a
DialOgue Chain-of-ThOught Reasoner that provides reliable CoT rationales for
response generation. We conduct extensive experiments to show that enhancing
dialogue agents with high-quality rationales from DOCTOR significantly improves
the quality of their responses.
Related papers
- DialogueReason: Rule-Based RL Sparks Dialogue Reasoning in LLMs [54.4857963044859]
We propose DialogueReason, a reasoning paradigm that uncovers the lost roles in monologue-style reasoning models.<n>Our work consists of an analysis of monologue reasoning patterns and the development of a dialogue-based reasoning approach.
arXiv Detail & Related papers (2025-05-11T16:39:58Z) - Reasoning Court: Combining Reasoning, Action, and Judgment for Multi-Hop Reasoning [17.829990749622496]
Reasoning Court (RC) is a novel framework that extends iterative reasoning-and-retrieval methods, such as ReAct, with a dedicated LLM judge.
RC consistently outperforms state-of-the-art few-shot prompting methods without task-specific fine-tuning.
arXiv Detail & Related papers (2025-04-14T00:56:08Z) - Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations [58.65755268815283]
Many real dialogues are interactive, meaning an agent's utterances will influence their conversational partner, elicit information, or change their opinion.
We use this fact to rewrite and augment existing suboptimal data, and train via offline reinforcement learning (RL) an agent that outperforms both prompting and learning from unaltered human demonstrations.
Our results in a user study with real humans show that our approach greatly outperforms existing state-of-the-art dialogue agents.
arXiv Detail & Related papers (2024-11-07T21:37:51Z) - Cohesive Conversations: Enhancing Authenticity in Multi-Agent Simulated Dialogues [17.38671584773247]
This paper investigates the quality of multi-agent dialogues in simulations powered by Large Language Models (LLMs)
We propose a novel Screening, Diagnosis, and Regeneration (SDR) framework that detects and corrects utterance errors.
arXiv Detail & Related papers (2024-07-13T14:24:45Z) - Reasoning in Conversation: Solving Subjective Tasks through Dialogue
Simulation for Large Language Models [56.93074140619464]
We propose RiC (Reasoning in Conversation), a method that focuses on solving subjective tasks through dialogue simulation.
The motivation of RiC is to mine useful contextual information by simulating dialogues instead of supplying chain-of-thought style rationales.
We evaluate both API-based and open-source LLMs including GPT-4, ChatGPT, and OpenChat across twelve tasks.
arXiv Detail & Related papers (2024-02-27T05:37:10Z) - PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded
Dialogue Systems [59.1250765143521]
Current knowledge-grounded dialogue systems often fail to align the generated responses with human-preferred qualities.
We propose Polished & Informed Candidate Scoring (PICK), a generation re-scoring framework.
We demonstrate the effectiveness of PICK in generating responses that are more faithful while keeping them relevant to the dialogue history.
arXiv Detail & Related papers (2023-09-19T08:27:09Z) - HOP, UNION, GENERATE: Explainable Multi-hop Reasoning without Rationale
Supervision [118.0818807474809]
This work proposes a principled, probabilistic approach for training explainable multi-hop QA systems without rationale supervision.
Our approach performs multi-hop reasoning by explicitly modeling rationales as sets, enabling the model to capture interactions between documents and sentences within a document.
arXiv Detail & Related papers (2023-05-23T16:53:49Z) - Reason first, then respond: Modular Generation for Knowledge-infused
Dialogue [43.64093692715295]
Large language models can produce fluent dialogue but often hallucinate factual inaccuracies.
We propose a modular model, Knowledge to Response, for incorporating knowledge into conversational agents.
In detailed experiments, we find that such a model hallucinates less in knowledge-grounded dialogue tasks.
arXiv Detail & Related papers (2021-11-09T15:29:43Z) - Self-supervised Dialogue Learning for Spoken Conversational Question
Answering [29.545937716796082]
In spoken conversational question answering (SCQA), the answer to the corresponding question is generated by retrieving and then analyzing a fixed spoken document, including multi-part conversations.
We introduce a self-supervised learning approach, including incoherence discrimination, insertion detection, and question prediction, to explicitly capture the coreference resolution and dialogue coherence.
Our proposed method provides more coherent, meaningful, and appropriate responses, yielding superior performance gains compared to the original pre-trained language models.
arXiv Detail & Related papers (2021-06-04T00:09:38Z) - Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term
Importance Estimation and Neural Query Rewriting [56.268862325167575]
We tackle conversational passage retrieval (ConvPR) with query reformulation integrated into a multi-stage ad-hoc IR system.
We propose two conversational query reformulation (CQR) methods: (1) term importance estimation and (2) neural query rewriting.
For the former, we expand conversational queries using important terms extracted from the conversational context with frequency-based signals.
For the latter, we reformulate conversational queries into natural, standalone, human-understandable queries with a pretrained sequence-tosequence model.
arXiv Detail & Related papers (2020-05-05T14:30:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.