Reasoning in Conversation: Solving Subjective Tasks through Dialogue
Simulation for Large Language Models
- URL: http://arxiv.org/abs/2402.17226v1
- Date: Tue, 27 Feb 2024 05:37:10 GMT
- Title: Reasoning in Conversation: Solving Subjective Tasks through Dialogue
Simulation for Large Language Models
- Authors: Xiaolong Wang, Yile Wang, Yuanchi Zhang, Fuwen Luo, Peng Li, Maosong
Sun, Yang Liu
- Abstract summary: We propose RiC (Reasoning in Conversation), a method that focuses on solving subjective tasks through dialogue simulation.
The motivation of RiC is to mine useful contextual information by simulating dialogues instead of supplying chain-of-thought style rationales.
We evaluate both API-based and open-source LLMs including GPT-4, ChatGPT, and OpenChat across twelve tasks.
- Score: 56.93074140619464
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have achieved remarkable performance in
objective tasks such as open-domain question answering and mathematical
reasoning, which can often be solved through recalling learned factual
knowledge or chain-of-thought style reasoning. However, we find that the
performance of LLMs on subjective tasks, such as metaphor recognition and
dark humor detection, remains unsatisfactory. Compared to objective tasks,
subjective tasks focus more on interpretation or emotional response rather than
a universally accepted reasoning pathway. Based on the characteristics of the
tasks and the strong dialogue-generation capabilities of LLMs, we propose RiC
(Reasoning in Conversation), a method that focuses on solving subjective tasks
through dialogue simulation. The motivation of RiC is to mine useful contextual
information by simulating dialogues instead of supplying chain-of-thought style
rationales, thereby surfacing potentially useful knowledge behind the dialogues
for producing the final answers. We evaluate both API-based and open-source LLMs
including GPT-4, ChatGPT, and OpenChat across twelve tasks. Experimental
results show that RiC can yield significant improvement compared with various
baselines.
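
The abstract describes a two-stage pipeline: first simulate a dialogue about the input to surface contextual knowledge, then answer the subjective question with that simulated dialogue as added context. Below is a minimal sketch of that idea in Python, assuming a generic `call_llm` chat backend; the prompt wording, turn count, and helper names are illustrative assumptions rather than the paper's actual prompts or implementation.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for any chat-completion backend (API-based or open-source model)."""
    raise NotImplementedError


def reasoning_in_conversation(question: str, num_turns: int = 4) -> str:
    # Stage 1: simulate a short dialogue discussing the input, instead of
    # asking the model for a chain-of-thought style rationale.
    simulate_prompt = (
        "Simulate a short conversation between two people discussing the "
        f"following input in about {num_turns} turns. Focus on interpretation, "
        "tone, and implied meaning.\n\n"
        f"Input: {question}\n\nConversation:"
    )
    dialogue = call_llm(simulate_prompt)

    # Stage 2: answer the subjective question using the simulated dialogue
    # as additional context.
    answer_prompt = (
        "Here is a simulated conversation about the input:\n"
        f"{dialogue}\n\n"
        "Using the conversation above as context, answer the question.\n"
        f"Question: {question}\nAnswer:"
    )
    return call_llm(answer_prompt)
```

In this sketch the simulated dialogue plays the role that a chain-of-thought rationale would play in standard prompting: it is generated first and then prepended as context for the final answer.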
Related papers
- RAD-Bench: Evaluating Large Language Models Capabilities in Retrieval Augmented Dialogues [8.036117602566074]
RAD-Bench is a benchmark designed to evaluate Large Language Models' capabilities in multi-turn dialogues following retrievals.
Our evaluation results on commonly used LLMs reveal that model performance deteriorates as additional layers of conditions or constraints are applied.
arXiv Detail & Related papers (2024-09-19T08:26:45Z)
- Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training [33.57497419019826]
Action-Based Contrastive Self-Training allows for sample-efficient dialogue policy learning in multi-turn conversation.
ACT demonstrates substantial conversation modeling improvements over standard approaches to supervised fine-tuning and DPO.
arXiv Detail & Related papers (2024-05-31T22:44:48Z)
- Exploring the Factual Consistency in Dialogue Comprehension of Large Language Models [51.75805497456226]
This work focuses on the factual consistency issue with the help of the dialogue summarization task.
Our evaluation shows that, on average, 26.8% of the summaries generated by LLMs contain factual inconsistency.
To stimulate and enhance the dialogue comprehension ability of LLMs, we propose a fine-tuning paradigm with auto-constructed multi-task data.
arXiv Detail & Related papers (2023-11-13T09:32:12Z)
- Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations [70.7884839812069]
Large language models (LLMs) have emerged as powerful and general solutions to many natural language tasks.
However, many of the most important applications of language generation are interactive, where an agent has to talk to a person to reach a desired outcome.
In this work, we explore a new method for adapting LLMs with RL for such goal-directed dialogue.
arXiv Detail & Related papers (2023-11-09T18:45:16Z)
- Self-Explanation Prompting Improves Dialogue Understanding in Large Language Models [52.24756457516834]
We propose a novel "Self-Explanation" prompting strategy to enhance the comprehension abilities of Large Language Models (LLMs).
This task-agnostic approach requires the model to analyze each dialogue utterance before task execution, thereby improving performance across various dialogue-centric tasks.
Experimental results from six benchmark datasets confirm that our method consistently outperforms other zero-shot prompts and matches or exceeds the efficacy of few-shot prompts.
arXiv Detail & Related papers (2023-09-22T15:41:34Z)
- DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning [89.92601337474954]
Pragmatic reasoning plays a pivotal role in deciphering implicit meanings that frequently arise in real-life conversations.
We introduce a novel challenge, DiPlomat, aiming at benchmarking machines' capabilities on pragmatic reasoning and situated conversational understanding.
arXiv Detail & Related papers (2023-06-15T10:41:23Z)
- Frugal Prompting for Dialog Models [17.048111072193933]
This study examines different approaches for building dialog systems using large language models (LLMs).
As part of prompt tuning, we experiment with various ways of providing instructions, exemplars, current query and additional context.
The research also analyzes the representations of dialog history that have the optimal usable-information density.
arXiv Detail & Related papers (2023-05-24T09:06:49Z)
- Prompting and Evaluating Large Language Models for Proactive Dialogues: Clarification, Target-guided, and Non-collaboration [72.04629217161656]
This work focuses on three aspects of proactive dialogue systems: clarification, target-guided, and non-collaborative dialogues.
To trigger the proactivity of LLMs, we propose the Proactive Chain-of-Thought prompting scheme.
arXiv Detail & Related papers (2023-05-23T02:49:35Z)