Continuous Learning Conversational AI: A Personalized Agent Framework via A2C Reinforcement Learning
- URL: http://arxiv.org/abs/2502.12876v1
- Date: Tue, 18 Feb 2025 14:05:59 GMT
- Title: Continuous Learning Conversational AI: A Personalized Agent Framework via A2C Reinforcement Learning
- Authors: Nandakishor M, Anjali M
- Abstract summary: This paper introduces a Continuous Learning Conversational AI (CLCA) approach, implemented using A2C reinforcement learning. We use simulated sales dialogues, generated by Large Language Models (LLMs), to train an A2C agent. This agent learns to optimize conversation strategies for personalization, focusing on engagement and delivering value.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Creating personalized and adaptable conversational AI remains a key challenge. This paper introduces a Continuous Learning Conversational AI (CLCA) approach, implemented using A2C reinforcement learning, to move beyond static Large Language Models (LLMs). We use simulated sales dialogues, generated by LLMs, to train an A2C agent. This agent learns to optimize conversation strategies for personalization, focusing on engagement and delivering value. Our system architecture integrates reinforcement learning with LLMs for both data creation and response selection. This method offers a practical way to build personalized AI companions that evolve through continuous learning, advancing beyond traditional static LLM techniques.
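The abstract describes the training loop but no implementation. As a rough orientation only, a one-step A2C update over a discrete set of conversation strategies might look like the sketch below; the dialogue-state encoder, the strategy set, and the engagement reward are assumptions, not the paper's actual components.

```python
# Minimal A2C sketch for conversation-strategy selection (illustrative only).
# STATE_DIM, N_STRATEGIES, and the reward signal are hypothetical stand-ins.
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, N_STRATEGIES = 64, 8  # e.g. "ask question", "offer value", ...

class ActorCritic(nn.Module):
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.Tanh())
        self.policy = nn.Linear(128, N_STRATEGIES)  # actor head
        self.value = nn.Linear(128, 1)              # critic head

    def forward(self, state):
        h = self.trunk(state)
        return self.policy(h), self.value(h).squeeze(-1)

model = ActorCritic()
opt = torch.optim.Adam(model.parameters(), lr=3e-4)

def select_strategy(state):
    """Sample a conversation strategy for the current dialogue state."""
    logits, _ = model(state)
    return torch.distributions.Categorical(logits=logits).sample()

def a2c_update(state, action, reward, next_state, gamma=0.99):
    """One-step advantage actor-critic update after observing the reward."""
    logits, value = model(state)
    dist = torch.distributions.Categorical(logits=logits)
    with torch.no_grad():
        _, next_value = model(next_state)
        target = reward + gamma * next_value      # bootstrapped return
    advantage = target - value
    loss = (-dist.log_prob(action) * advantage.detach()  # policy gradient
            + F.mse_loss(value, target)                  # critic regression
            - 0.01 * dist.entropy())                     # exploration bonus
    opt.zero_grad(); loss.backward(); opt.step()
```

In the paper's architecture, the reward would presumably come from engagement and value-delivery signals on the LLM-simulated sales dialogues, and the sampled strategy would steer the LLM's response selection.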
Related papers
- Chain-of-Thought Training for Open E2E Spoken Dialogue Systems [57.77235760292348]
End-to-end (E2E) spoken dialogue systems preserve full differentiability and capture non-phonemic information. We propose a chain-of-thought (CoT) formulation to ensure that training on conversational data remains closely aligned with the multimodal language model. Our method achieves over 1.5 ROUGE-1 improvement over the baseline, successfully training spoken dialogue systems on publicly available human-human conversation datasets.
arXiv Detail & Related papers (2025-05-31T21:43:37Z)
- Towards Anthropomorphic Conversational AI Part I: A Practical Framework [49.62013440962072]
We introduce a multi-module framework designed to replicate the key aspects of human intelligence involved in conversations.
In the second stage of our approach, these conversational data, after filtering and labeling, can serve as training and testing data for reinforcement learning.
arXiv Detail & Related papers (2025-02-28T03:18:39Z)
- Dynamic Skill Adaptation for Large Language Models [78.31322532135272]
We present Dynamic Skill Adaptation (DSA), an adaptive and dynamic framework to adapt novel and complex skills to Large Language Models (LLMs). For every skill, we utilize LLMs to generate both textbook-like data, which contains detailed descriptions of skills for pre-training, and exercise-like data, which targets explicitly applying the skills to solve problems for instruction-tuning. Experiments on large language models such as LLAMA and Mistral demonstrate the effectiveness of our proposed methods in adapting math reasoning skills and social study skills.
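As a hypothetical illustration of this two-track generation, the sketch below asks a chat-completion endpoint for both data types per skill. The prompts, client, and model name are placeholders, not the paper's.

```python
# Hypothetical DSA-style data generation: per skill, an LLM produces
# "textbook-like" text (pre-training) and "exercise-like" problems
# (instruction-tuning). Prompts and model name are illustrative.
from openai import OpenAI

client = OpenAI()  # any chat-completion endpoint would work here

PROMPTS = {
    "textbook": "Write a detailed textbook-style explanation of the skill: {skill}.",
    "exercise": "Write a problem requiring the skill '{skill}', then solve it step by step.",
}

def generate_skill_data(skill: str, kind: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": PROMPTS[kind].format(skill=skill)}],
    )
    return resp.choices[0].message.content

corpus = {kind: generate_skill_data("fraction addition", kind) for kind in PROMPTS}
```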
arXiv Detail & Related papers (2024-12-26T22:04:23Z)
- MaestroMotif: Skill Design from Artificial Intelligence Feedback [67.17724089381056]
We present MaestroMotif, a method for AI-assisted skill design, which yields high-performing and adaptable agents.
arXiv Detail & Related papers (2024-12-11T16:59:31Z)
- Boosting Code-Switching ASR with Mixture of Experts Enhanced Speech-Conditioned LLM [1.3089936156875277]
We introduce a speech-conditioned Large Language Model (LLM) integrated with a Mixture of Experts (MoE) based connector.
We propose an Insertion and Deletion of Interruption Token (IDIT) mechanism to better transfer the text generation ability of LLMs to the speech recognition task.
We also present a connector with an MoE architecture that manages multiple languages efficiently.
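For orientation, such a connector could be a routed set of projections from the speech encoder's feature space into the LLM's embedding space. The sketch below uses dense soft routing and made-up dimensions, not the paper's exact design (sparse top-k routing is the more common MoE choice).

```python
# Sketch of an MoE connector projecting speech features into an LLM's
# embedding space. Dimensions and dense soft routing are assumptions.
import torch
import torch.nn as nn

class MoEConnector(nn.Module):
    def __init__(self, speech_dim=1024, llm_dim=4096, n_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Linear(speech_dim, llm_dim) for _ in range(n_experts)
        )
        self.router = nn.Linear(speech_dim, n_experts)  # per-frame gating

    def forward(self, speech_feats):                    # (B, T, speech_dim)
        gates = self.router(speech_feats).softmax(-1)   # (B, T, E)
        expert_out = torch.stack(
            [e(speech_feats) for e in self.experts], dim=-2)   # (B, T, E, D)
        # weight each expert's projection by its gate and sum over experts
        return (gates.unsqueeze(-1) * expert_out).sum(dim=-2)  # (B, T, llm_dim)
```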
arXiv Detail & Related papers (2024-09-24T09:20:22Z)
- Automating Knowledge Discovery from Scientific Literature via LLMs: A Dual-Agent Approach with Progressive Ontology Prompting [59.97247234955861]
We introduce a novel framework based on large language models (LLMs) that combines a progressive prompting algorithm with a dual-agent system, named LLM-Duo.
Our method identifies 2,421 interventions from 64,177 research articles in the speech-language therapy domain.
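One hypothetical reading of the dual-agent loop: a proposer agent extracts candidate interventions from an article, a verifier agent checks them, and verified items feed back into later prompts (the "progressive ontology"). All names and prompts below are illustrative.

```python
# Illustrative dual-agent sketch in the spirit of LLM-Duo; prompts,
# model name, and the verification heuristic are all assumptions.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def discover(article: str, ontology: list[str]) -> list[str]:
    known = "; ".join(ontology)
    candidates = ask(
        f"Known intervention types: {known}.\n"
        f"List any speech-language interventions mentioned here:\n{article}"
    ).splitlines()
    verified = [
        c for c in candidates
        if "yes" in ask(f"Is '{c}' truly an intervention in:\n{article}\n"
                        "Answer yes/no.").lower()
    ]
    ontology.extend(verified)  # progressive prompting: later articles see more
    return verified
```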
arXiv Detail & Related papers (2024-08-20T16:42:23Z)
- Using Advanced LLMs to Enhance Smaller LLMs: An Interpretable Knowledge Distillation Approach [6.154304269581415]
Advanced large language models (LLMs) provide superior performance in complex human-like interactions.
However, they are costly, too large for edge devices such as smartphones, and harder to self-host, leading to security and privacy concerns.
This paper introduces a novel interpretable knowledge distillation approach to enhance the performance of smaller, more economical LLMs.
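The abstract does not spell out the interpretable variant. For reference only, classic logit distillation (not necessarily the paper's exact objective) can be written as the sketch below; temperature and mixing weight are conventional placeholders.

```python
# Classic logit distillation (Hinton et al.) for orientation only --
# the paper's interpretable variant is not detailed in the abstract.
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend soft teacher targets with the ordinary hard-label loss."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients match the hard-label term
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```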
arXiv Detail & Related papers (2024-08-13T23:59:36Z)
- Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk [11.706292228586332]
Large language models (LLMs) are powerful dialogue agents, but specializing them towards fulfilling a specific function can be challenging.
We propose a more effective method for data collection through LLMs engaging in a conversation in various roles.
This approach generates training data via LLM "self-talk" that can be refined and utilized for supervised fine-tuning.
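A minimal sketch of such self-talk, assuming a generic chat-completion client and a hotel-booking scenario (both illustrative):

```python
# Two role-prompted LLM sessions converse; the transcript becomes
# fine-tuning data after filtering. Client and prompts are placeholders.
from openai import OpenAI

client = OpenAI()

def chat(system, history):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder
        messages=[{"role": "system", "content": system}] + history,
    )
    return resp.choices[0].message.content

def flip(history):
    """Swap roles so the other speaker sees the dialogue from its side."""
    return [{"role": "user" if m["role"] == "assistant" else "assistant",
             "content": m["content"]} for m in history]

def self_talk(n_turns=4):
    agent_sys = "You are a hotel-booking assistant following a fixed workflow."
    cust_sys = "You are a customer who wants to book a hotel room."
    history = [{"role": "user", "content": "Hi, I'd like to book a room."}]
    for _ in range(n_turns):
        history.append({"role": "assistant", "content": chat(agent_sys, history)})
        history.append({"role": "user", "content": chat(cust_sys, flip(history))})
    return history  # filter/label before using for supervised fine-tuning
```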
arXiv Detail & Related papers (2024-01-10T09:49:10Z)
- Building Open-Ended Embodied Agent via Language-Policy Bidirectional Adaptation [40.82919989450566]
Building embodied agents by integrating Large Language Models (LLMs) and Reinforcement Learning (RL) has revolutionized human-AI interaction.
Existing research faces challenges in meeting the requirement of open-endedness.
We present OpenPAL, a co-training framework comprising two stages: fine-tuning a pre-trained LLM to translate human instructions into goals for planning, and training a goal-conditioned policy for decision-making.
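The second stage implies a policy conditioned on an LLM-produced goal representation. A bare-bones version of that interface might look like this sketch, where dimensions and the discrete action space are placeholders, not the paper's implementation:

```python
# Goal-conditioned policy interface sketch; the goal embedding is assumed
# to come from the fine-tuned LLM of stage one.
import torch
import torch.nn as nn

class GoalConditionedPolicy(nn.Module):
    def __init__(self, obs_dim=128, goal_dim=32, n_actions=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + goal_dim, 256), nn.ReLU(),
            nn.Linear(256, n_actions),
        )

    def forward(self, obs, goal):
        # condition the action distribution on the LLM-produced goal
        return self.net(torch.cat([obs, goal], dim=-1))

def act(policy, obs, goal_embedding):
    logits = policy(obs, goal_embedding)
    return torch.distributions.Categorical(logits=logits).sample()
```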
arXiv Detail & Related papers (2023-12-12T11:06:07Z)
- LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models [56.25156596019168]
This paper introduces the LMRL-Gym benchmark for evaluating multi-turn RL for large language models (LLMs).
Our benchmark consists of 8 different language tasks, which require multiple rounds of language interaction and cover a range of tasks in open-ended dialogue and text games.
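The interaction pattern such a benchmark evaluates reduces to a text-in/text-out episode loop. The interface below is a hypothetical sketch of that pattern, not LMRL-Gym's actual API.

```python
# Generic multi-turn language RL loop; names and signatures are illustrative.
from typing import Protocol, Tuple

class TextEnv(Protocol):
    def reset(self) -> str: ...                            # opening observation
    def step(self, utterance: str) -> Tuple[str, float, bool]: ...

def run_episode(env: TextEnv, agent, max_turns=20) -> float:
    obs, total = env.reset(), 0.0
    for _ in range(max_turns):
        utterance = agent(obs)            # agent maps dialogue so far to text
        obs, reward, done = env.step(utterance)
        total += reward
        if done:
            break
    return total                          # episode return the RL agent optimizes
```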
arXiv Detail & Related papers (2023-11-30T03:59:31Z)
- Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations [70.7884839812069]
Large language models (LLMs) have emerged as powerful and general solutions to many natural language tasks.
However, many of the most important applications of language generation are interactive, where an agent has to talk to a person to reach a desired outcome.
In this work, we explore a new method for adapting LLMs with RL for such goal-directed dialogue.
arXiv Detail & Related papers (2023-11-09T18:45:16Z)
- Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems [58.724629408229205]
We demonstrate how traditional supervised learning and a simulator-free adversarial learning method can be used to achieve performance comparable to state-of-the-art RL-based methods.
Our main goal is not to beat reinforcement learning with supervised learning, but to demonstrate the value of rethinking the role of reinforcement learning and supervised learning in optimizing task-oriented dialogue systems.
arXiv Detail & Related papers (2020-09-21T12:04:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.