Related papers: PlatoLM: Teaching LLMs in Multi-Round Dialogue via a User Simulator

PlatoLM: Teaching LLMs in Multi-Round Dialogue via a User Simulator

URL: http://arxiv.org/abs/2308.11534v5
Date: Mon, 27 May 2024 16:32:16 GMT
Title: PlatoLM: Teaching LLMs in Multi-Round Dialogue via a User Simulator
Authors: Chuyi Kong, Yaxin Fan, Xiang Wan, Feng Jiang, Benyou Wang,
Abstract summary: We propose a paradigm to simulate human behavior better and explore the benefits of incorporating more human-like questions in multi-turn conversations. Specifically, we target human questions extracted from genuine human-machine conversations as a learning goal and provide a novel user simulator called Socratic' Our results show our response model, PlatoLM', achieves SoTA performance among LLaMA-based 7B models in MT-Bench.
Score: 39.40718009289621
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The unparalleled performance of closed-sourced ChatGPT has sparked efforts towards its democratization, with notable strides made by leveraging real user and ChatGPT dialogues, as evidenced by Vicuna. However, due to challenges in gathering dialogues involving human participation, current endeavors like Baize and UltraChat rely on ChatGPT conducting roleplay to simulate humans based on instructions, resulting in overdependence on seeds, diminished human-likeness, limited topic diversity, and an absence of genuine multi-round conversational dynamics. To address the above issues, we propose a paradigm to simulate human behavior better and explore the benefits of incorporating more human-like questions in multi-turn conversations. Specifically, we directly target human questions extracted from genuine human-machine conversations as a learning goal and provide a novel user simulator called `Socratic'. The experimental results show our response model, `PlatoLM', achieves SoTA performance among LLaMA-based 7B models in MT-Bench. Our findings further demonstrate that our method introduces highly human-like questioning patterns and rich topic structures, which can teach the response model better than previous works in multi-round conversations.

Related papers

DialogueForge: LLM Simulation of Human-Chatbot Dialogue [7.038493120049631]
We propose DialogueForge as a framework for generating AI-simulated conversations in human-chatbot style.<n>To each generated conversation, DialogueForge uses seed prompts extracted from real human-chatbot interactions.<n>We evaluate the quality of the simulated conversations and compare different models using the UniEval and GTEval evaluation protocols.
arXiv Detail & Related papers (2025-07-21T16:08:19Z)
Enabling Chatbots with Eyes and Ears: An Immersive Multimodal Conversation System for Dynamic Interactions [13.341099059080936]
This study aims to equip chatbots with "eyes and ears" capable of more immersive interactions with humans.<n>We introduce a new multimodal conversation dataset, Multimodal Multi-Session Multi-Party Conversation.<n>Our model, trained on the $M3C$, demonstrates the ability to seamlessly engage in long-term conversations with multiple speakers.
arXiv Detail & Related papers (2025-05-31T06:50:51Z)
REALTALK: A 21-Day Real-World Dataset for Long-Term Conversation [51.97224538045096]
We introduce REALTALK, a 21-day corpus of authentic messaging app dialogues. We compare EI attributes and persona consistency to understand the challenges posed by real-world dialogues. Our findings reveal that models struggle to simulate a user solely from dialogue history, while fine-tuning on specific user chats improves persona emulation.
arXiv Detail & Related papers (2025-02-18T20:29:01Z)
DiverseDialogue: A Methodology for Designing Chatbots with Human-Like Diversity [5.388338680646657]
We show that GPT-4o mini, when used as simulated human participants, systematically differ from those between actual humans across multiple linguistic features. We propose an approach that automatically generates prompts for user simulations by incorporating features derived from real human interactions. Our method of prompt optimization, tailored to target specific linguistic features, shows significant improvements.
arXiv Detail & Related papers (2024-08-30T21:33:58Z)
Self-Directed Turing Test for Large Language Models [56.64615470513102]
The Turing test examines whether AIs can exhibit human-like behaviour in natural language conversations. Traditional Turing tests adopt a rigid dialogue format where each participant sends only one message each time. This paper proposes the Self-Directed Turing Test, which extends the original test with a burst dialogue format.
arXiv Detail & Related papers (2024-08-19T09:57:28Z)
LLM Roleplay: Simulating Human-Chatbot Interaction [52.03241266241294]
We propose a goal-oriented, persona-based method to automatically generate diverse multi-turn dialogues simulating human-chatbot interaction. Our method can simulate human-chatbot dialogues with a high indistinguishability rate.
arXiv Detail & Related papers (2024-07-04T14:49:46Z)
Designing and Evaluating Multi-Chatbot Interface for Human-AI Communication: Preliminary Findings from a Persuasion Task [1.360607903399872]
This study examines the impact of multi-chatbot communication in a specific persuasion setting: promoting charitable donations. We developed an online environment that enables multi-chatbot communication and conducted a pilot experiment. We present our development process of the multi-chatbot interface and present preliminary findings from a pilot experiment.
arXiv Detail & Related papers (2024-06-28T04:33:41Z)
ChatGPT Role-play Dataset: Analysis of User Motives and Model Naturalness [4.564433526993029]
We study how ChatGPT behaves during conversations in different settings by analyzing its interactions in both a normal way and a role-play setting. Our study highlights the diversity of user motives when interacting with ChatGPT and variable AI naturalness, showing not only the nuanced dynamics of natural conversations between humans and AI, but also providing new avenues for improving the effectiveness of human-AI communication.
arXiv Detail & Related papers (2024-03-26T22:01:13Z)
BotChat: Evaluating LLMs' Capabilities of Having Multi-Turn Dialogues [72.65163468440434]
This report provides a preliminary evaluation of existing large language models for human-style multi-turn chatting. We prompt large language models (LLMs) to generate a full multi-turn dialogue based on the ChatSEED, utterance by utterance. We find GPT-4 can generate human-style multi-turn dialogues with impressive quality, significantly outperforms its counterparts.
arXiv Detail & Related papers (2023-10-20T16:53:51Z)
TikTalk: A Video-Based Dialogue Dataset for Multi-Modal Chitchat in Real World [97.58623810402563]
We introduce a new video-based multi-modal dialogue dataset, called TikTalk. We collect 38K videos from a popular video-sharing platform, along with 367K conversations posted by users beneath them. Users engage in spontaneous conversations based on their multi-modal experiences from watching videos, which helps recreate real-world chitchat context.
arXiv Detail & Related papers (2023-01-14T10:18:22Z)
Put Chatbot into Its Interlocutor's Shoes: New Framework to Learn Chatbot Responding with Intention [55.77218465471519]
This paper proposes an innovative framework to train chatbots to possess human-like intentions. Our framework included a guiding robot and an interlocutor model that plays the role of humans. We examined our framework using three experimental setups and evaluate the guiding robot with four different metrics to demonstrated flexibility and performance advantages.
arXiv Detail & Related papers (2021-03-30T15:24:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.