Related papers: Think Before You Speak: Cultivating Communication Skills of Large Language Models via Inner Monologue

URL: http://arxiv.org/abs/2311.07445v2
Date: Fri, 15 Mar 2024 08:30:30 GMT
Title: Think Before You Speak: Cultivating Communication Skills of Large Language Models via Inner Monologue
Authors: Junkai Zhou, Liang Pang, Huawei Shen, Xueqi Cheng
Abstract summary: Large language models (LLMs) can generate fluent, coherent, and diverse responses. However, they lack a crucial ability: communication skills. This article aims to empower LLMs with communication skills through inner monologues. Experimental results show that the proposed CSIM strategy improves the backbone models and outperforms the baselines.
Score: 73.69510478736483
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The emergence of large language models (LLMs) has further improved the capabilities of open-domain dialogue systems, which can now generate fluent, coherent, and diverse responses. However, LLMs still lack a crucial ability: communication skills. This limitation renders them more like information-seeking tools than anthropomorphic chatbots. Communication skills, such as topic transition, proactively asking questions, concept guidance, empathy, and summarising often, should be taken into consideration to make LLMs more anthropomorphic and proactive during the conversation, thereby increasing the interest of users and attracting them to chat for longer. However, enabling these communication skills in black-box LLMs remains a key challenge because they do not have the same utterance formation mode as real people: think before speaking. Inspired by linguistics and cognitive science, we empower LLMs with communication skills through inner monologues. To evaluate various communication skills, we construct a benchmark named Cskills, which can also more comprehensively evaluate the dialogue generation ability of the model. Experimental results show that the proposed CSIM strategy improves the backbone models and outperforms the baselines.

Related papers

Towards Anthropomorphic Conversational AI Part I: A Practical Framework [49.62013440962072]
We introduce a multi-module framework designed to replicate the key aspects of human intelligence involved in conversations. In the second stage of our approach, these conversational data, after filtering and labeling, can serve as training and testing data for reinforcement learning.
arXiv Detail & Related papers (2025-02-28T03:18:39Z)
Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations [58.65755268815283]
Many real dialogues are interactive, meaning an agent's utterances will influence their conversational partner, elicit information, or change their opinion. We use this fact to rewrite and augment existing suboptimal data, and train via offline reinforcement learning (RL) an agent that outperforms both prompting and learning from unaltered human demonstrations. Our results in a user study with real humans show that our approach greatly outperforms existing state-of-the-art dialogue agents.
arXiv Detail & Related papers (2024-11-07T21:37:51Z)
Talk Less, Interact Better: Evaluating In-context Conversational Adaptation in Multimodal LLMs [14.997971970162743]
Humans spontaneously use increasingly efficient language as interactions progress, by adapting and forming ad-hoc conventions. It remains unexplored whether multimodal large language models (MLLMs) similarly increase communication efficiency during interactions. We introduce ICCA, an automated framework to evaluate such conversational adaptation as an in-context behavior in MLLMs.
arXiv Detail & Related papers (2024-08-02T17:51:57Z)
Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models [56.93074140619464]
We propose RiC (Reasoning in Conversation), a method that focuses on solving subjective tasks through dialogue simulation. The motivation of RiC is to mine useful contextual information by simulating dialogues instead of supplying chain-of-thought style rationales. We evaluate both API-based and open-source LLMs including GPT-4, ChatGPT, and OpenChat across twelve tasks.
arXiv Detail & Related papers (2024-02-27T05:37:10Z)
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models [56.25156596019168]
This paper introduces the LMRL-Gym benchmark for evaluating multi-turn RL for large language models (LLMs). Our benchmark consists of 8 different language tasks, which require multiple rounds of language interaction and cover a range of tasks in open-ended dialogue and text games.
arXiv Detail & Related papers (2023-11-30T03:59:31Z)
Sibyl: Empowering Empathetic Dialogue Generation in Large Language Models via Sensible and Visionary Commonsense Inference [40.96005200292604]
We present an innovative framework named Sensible and Visionary Commonsense Knowledge (Sibyl). It is designed to concentrate on the immediately succeeding dialogue, aiming to elicit more empathetic responses. Experimental results demonstrate that incorporating our paradigm for acquiring commonsense knowledge into LLMs comprehensively enhances the quality of their responses.
arXiv Detail & Related papers (2023-11-26T14:35:23Z)
BotChat: Evaluating LLMs' Capabilities of Having Multi-Turn Dialogues [72.65163468440434]
This report provides a preliminary evaluation of existing large language models for human-style multi-turn chatting. We prompt large language models (LLMs) to generate a full multi-turn dialogue based on the ChatSEED, utterance by utterance. We find GPT-4 can generate human-style multi-turn dialogues with impressive quality, significantly outperforming its counterparts.
arXiv Detail & Related papers (2023-10-20T16:53:51Z)
Tackling Vision Language Tasks Through Learning Inner Monologues [10.795616787372625]
We propose a novel approach, Inner Monologue Multi-Modal Optimization (IMMO), to solve complex vision language problems. IMMO simulates inner monologue processes, a cognitive process in which an individual engages in silent verbal communication with themselves. The results suggest IMMO can enhance reasoning and explanation abilities, contributing to the more effective fusion of vision and language models.
arXiv Detail & Related papers (2023-08-19T10:10:49Z)
Few-shot Language Coordination by Modeling Theory of Mind [95.54446989205117]
We study the task of few-shot language coordination. We require the lead agent to coordinate with a population of agents with different linguistic abilities. This requires the ability to model the partner's beliefs, a vital component of human communication.
arXiv Detail & Related papers (2021-07-12T19:26:11Z)

This list is automatically generated from the titles and abstracts of the papers on this site.