Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model
- URL: http://arxiv.org/abs/2411.04496v1
- Date: Thu, 07 Nov 2024 07:46:06 GMT
- Title: Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model
- Authors: Young-Jun Lee, Dokyong Lee, Junyoung Youn, Kyeongjin Oh, Ho-Jin Choi
- Abstract summary: We present a skill-of-mind-annotated conversation dataset grounded in diverse social contexts.
We introduce a new family of skill-of-mind-infused LLMs, named Thanos, with model sizes of 1B, 3B, and 8B parameters.
With extensive experiments, these models successfully demonstrate the skill-of-mind process and exhibit strong generalizability.
- Score: 5.505013339790826
- License:
- Abstract: To increase social bonding with interlocutors, humans naturally acquire the ability to respond appropriately in a given situation by considering which conversational skill is most suitable for the response - a process we call skill-of-mind. For large language model (LLM)-based conversational agents, planning appropriate conversational skills, as humans do, is challenging due to the complexity of social dialogue, especially in interactive scenarios. To address this, we propose a skill-of-mind-annotated conversation dataset, named Multifaceted Skill-of-Mind, which includes multi-turn and multifaceted conversational skills across various interactive scenarios (e.g., long-term, counseling, task-oriented), grounded in diverse social contexts (e.g., demographics, persona, rules of thumb). This dataset consists of roughly 100K conversations. Using this dataset, we introduce a new family of skill-of-mind-infused LLMs, named Thanos, with model sizes of 1B, 3B, and 8B parameters. With extensive experiments, these models successfully demonstrate the skill-of-mind process and exhibit strong generalizability in inferring multifaceted skills across a variety of domains. Moreover, we show that Thanos significantly enhances the quality of responses generated by LLM-based conversational agents and promotes prosocial behavior in human evaluations.
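The skill-of-mind process described in the abstract is a two-step pipeline: the agent first infers which conversational skill best fits the current dialogue and social context, and only then generates a response conditioned on that skill. Below is a minimal sketch of that flow, assuming the Thanos checkpoints are exposed as ordinary Hugging Face causal language models; the repository id, prompt templates, and skill labels are illustrative assumptions, not the paper's released interface.

```python
# Minimal two-stage skill-of-mind sketch: (1) infer a suitable conversational
# skill, (2) generate a response conditioned on that skill. Checkpoint name and
# prompt format are hypothetical placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "your-org/thanos-8b"  # hypothetical repository id

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Greedy decoding helper; returns only the newly generated text."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


def respond_with_skill_of_mind(dialogue: list[str], social_context: str) -> tuple[str, str]:
    history = "\n".join(dialogue)

    # Stage 1: infer the most suitable conversational skill for the next turn
    # (e.g., empathy, self-disclosure, clarifying question).
    skill_prompt = (
        f"Social context: {social_context}\n"
        f"Dialogue:\n{history}\n"
        "Which conversational skill is most suitable for the next response?\n"
        "Skill:"
    )
    skill = generate(skill_prompt, max_new_tokens=64)

    # Stage 2: condition the actual response on the inferred skill.
    response_prompt = (
        f"Social context: {social_context}\n"
        f"Dialogue:\n{history}\n"
        f"Planned skill: {skill}\n"
        "Next response:"
    )
    return skill, generate(response_prompt)


skill, reply = respond_with_skill_of_mind(
    ["A: I bombed my presentation today.", "B:"],
    "B is a supportive long-term friend of A.",
)
print(skill, reply, sep="\n")
```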
Related papers
- DialSim: A Real-Time Simulator for Evaluating Long-Term Multi-Party Dialogue Understanding of Conversational Agents [13.915753261117901]
We introduce DialSim, a real-time dialogue simulator.
In this simulator, an agent is assigned the role of a character from popular TV shows.
Key features of DialSim include evaluating the agent's ability to respond within a reasonable time limit.
arXiv Detail & Related papers (2024-06-19T01:37:10Z)
- CloChat: Understanding How People Customize, Interact, and Experience Personas in Large Language Models [15.915071948354466]
CloChat is an interface supporting easy and accurate customization of agent personas in large language models.
Results indicate that participants formed emotional bonds with the customized agents, engaged in more dynamic dialogues, and showed interest in sustaining interactions.
arXiv Detail & Related papers (2024-02-23T11:25:17Z)
- Think Before You Speak: Cultivating Communication Skills of Large Language Models via Inner Monologue [73.69510478736483]
Large language models (LLMs) can generate fluent, coherent, and diverse responses.
However, they lack a crucial ability: communication skills.
This article aims to empower LLMs with communication skills through inner monologues.
Experimental results show that the proposed CSIM strategy improves the backbone models and outperforms the baselines.
arXiv Detail & Related papers (2023-11-13T16:19:42Z)
- Affect Recognition in Conversations Using Large Language Models [9.689990547610664]
Affect recognition plays a pivotal role in human communication.
This study investigates the capacity of large language models (LLMs) to recognise human affect in conversations.
arXiv Detail & Related papers (2023-09-22T14:11:23Z)
- Multi-Party Chat: Conversational Agents in Group Settings with Humans and Models [39.80729604768669]
We evaluate the ability of language models to act as one or more characters in multi-party conversations.
We find that our new dataset, MultiLIGHT, can help bring significant improvements in the group setting.
arXiv Detail & Related papers (2023-04-26T21:41:17Z)
- PLACES: Prompting Language Models for Social Conversation Synthesis [103.94325597273316]
We use a small set of expert-written conversations as in-context examples to synthesize a social conversation dataset using prompting.
We perform several thorough evaluations of our synthetic conversations compared to human-collected conversations.
arXiv Detail & Related papers (2023-02-07T05:48:16Z)
- TikTalk: A Video-Based Dialogue Dataset for Multi-Modal Chitchat in Real World [97.58623810402563]
We introduce a new video-based multi-modal dialogue dataset, called TikTalk.
We collect 38K videos from a popular video-sharing platform, along with 367K conversations posted by users beneath them.
Users engage in spontaneous conversations based on their multi-modal experiences from watching videos, which helps recreate real-world chitchat context.
arXiv Detail & Related papers (2023-01-14T10:18:22Z)
- Face-to-Face Contrastive Learning for Social Intelligence Question-Answering [55.90243361923828]
Multimodal methods have set the state of the art on many tasks, but have difficulty modeling the complex face-to-face conversational dynamics.
We propose Face-to-Face Contrastive Learning (F2F-CL), a graph neural network designed to model social interactions.
We experimentally evaluate on the challenging Social-IQ dataset and show state-of-the-art results.
arXiv Detail & Related papers (2022-07-29T20:39:44Z)
- Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning [35.67318830455459]
We develop a real-time, open-ended dialogue system that uses reinforcement learning (RL) to power a bot's conversational skill at scale.
Our work pairs the succinct embedding of the conversation state generated using SOTA (supervised) language models with RL techniques that are particularly suited to a dynamic action space.
arXiv Detail & Related papers (2022-07-25T16:12:33Z)
- Training Conversational Agents with Generative Conversational Networks [74.9941330874663]
We use Generative Conversational Networks to automatically generate data and train social conversational agents.
We evaluate our approach on TopicalChat with automatic metrics and human evaluators, showing that with 10% of seed data it performs close to the baseline that uses 100% of the data.
arXiv Detail & Related papers (2021-10-15T21:46:39Z)
- Retrieval Augmentation Reduces Hallucination in Conversation [49.35235945543833]
We explore the use of neural-retrieval-in-the-loop architectures for knowledge-grounded dialogue; a schematic retrieve-then-generate sketch follows this list.
We show that our best models obtain state-of-the-art performance on two knowledge-grounded conversational tasks.
arXiv Detail & Related papers (2021-04-15T16:24:43Z)
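The retrieval-augmentation entry above follows a retrieval-in-the-loop shape: fetch knowledge relevant to the dialogue history, then let the generator condition on it so factual claims come from retrieved passages rather than the model's parametric memory. The sketch below illustrates only that generic loop, with a TF-IDF retriever and a stubbed generator as stand-ins; it is not the neural retrievers or dialogue models evaluated in that paper.

```python
# Schematic retrieve-then-generate loop for knowledge-grounded dialogue.
# The knowledge store, retriever, and generator are illustrative placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

KNOWLEDGE = [
    "The Eiffel Tower was completed in 1889 and is 330 metres tall.",
    "Marie Curie won Nobel Prizes in both physics and chemistry.",
    "The Great Barrier Reef is the world's largest coral reef system.",
]

vectorizer = TfidfVectorizer().fit(KNOWLEDGE)
doc_vectors = vectorizer.transform(KNOWLEDGE)


def retrieve(dialogue_history: str, k: int = 2) -> list[str]:
    """Return the k knowledge passages most similar to the dialogue so far."""
    query_vec = vectorizer.transform([dialogue_history])
    scores = cosine_similarity(query_vec, doc_vectors)[0]
    top = scores.argsort()[::-1][:k]
    return [KNOWLEDGE[i] for i in top]


def call_dialogue_model(prompt: str) -> str:
    # Stub: swap in a real generator (e.g., a seq2seq or causal LM) here.
    return f"[model reply conditioned on]\n{prompt}"


def generate_grounded_reply(dialogue_history: str) -> str:
    """Condition the generator on retrieved evidence so the reply can cite
    stored facts instead of hallucinating them."""
    evidence = retrieve(dialogue_history)
    prompt = "Knowledge:\n" + "\n".join(evidence) + f"\nDialogue:\n{dialogue_history}\nReply:"
    return call_dialogue_model(prompt)


print(generate_grounded_reply("A: How tall is the Eiffel Tower?"))
```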
This list is automatically generated from the titles and abstracts of the papers on this site.