Toward Self-Learning End-to-End Dialog Systems
- URL: http://arxiv.org/abs/2201.06849v1
- Date: Tue, 18 Jan 2022 09:56:35 GMT
- Title: Toward Self-Learning End-to-End Dialog Systems
- Authors: Xiaoying Zhang, Baolin Peng, Jianfeng Gao, Helen Meng
- Abstract summary: We propose SL-Agent, a self-learning framework for building end-to-end dialog systems in changing environments.
SL-Agent consists of a dialog model and a pre-trained reward model to judge the quality of a system response.
Experiments show that SL-Agent can effectively adapt to new tasks using limited human corrections.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: End-to-end task-oriented dialog systems often suffer from out-of-distribution
(OOD) inputs after being deployed in dynamic, changing, and open environments.
In this work, we propose SL-Agent, a self-learning framework that combines
supervised learning, reinforcement learning, and machine teaching for building
end-to-end dialog systems in a more realistic changing environment setting.
SL-Agent consists of a dialog model and a pre-trained reward model to judge the
quality of a system response. SL-Agent enables dialog agents to automatically
adapt to environments with user behavior changes by learning from human-bot
interactions via reinforcement learning, with the incorporated pre-trained
reward model. We validate SL-Agent in four different dialog domains.
Experimental results show the effectiveness of SL-Agent for automatically
adapting to changing environments using both automatic and human evaluations.
Furthermore, experiments on a challenging domain extension setting demonstrate
that SL-Agent can effectively adapt to new tasks using limited human
corrections provided via machine teaching. We will release code, data, and
pre-trained models for further research.
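The abstract describes a loop in which a frozen, pre-trained reward model judges each system response and the dialog policy is updated from human-bot interactions via reinforcement learning. The toy sketch below illustrates that idea only in spirit: the candidate actions, the stub reward model, and all function names are invented for illustration and are not the paper's actual models or API.

```python
import random

# Illustrative sketch of a self-learning loop in the style described by the
# abstract: a dialog "policy" samples a candidate response, a frozen reward
# model judges it, and a REINFORCE-style reward-weighted update nudges the
# policy. All names and data here are hypothetical toy stand-ins.

CANDIDATES = ["book_table", "ask_time", "confirm_booking"]

def reward_model(user_turn: str, response: str) -> float:
    """Stub for the pre-trained reward model: returns 1.0 if the response
    is appropriate for the (toy) user turn, else 0.0."""
    appropriate = {"I need a table": "book_table",
                   "When?": "ask_time",
                   "Yes, confirm": "confirm_booking"}
    return 1.0 if appropriate.get(user_turn) == response else 0.0

def sample_response(weights: dict) -> str:
    """Sample a candidate response with probability proportional to its weight."""
    total = sum(weights.values())
    r = random.uniform(0, total)
    acc = 0.0
    for cand, w in weights.items():
        acc += w
        if r <= acc:
            return cand
    return cand  # floating-point fallback

def self_learn(turns, episodes=500, lr=0.5, seed=0):
    """Run the interaction loop: only responses the reward model scores
    positively have their sampling weight reinforced."""
    random.seed(seed)
    weights = {t: {c: 1.0 for c in CANDIDATES} for t in turns}
    for _ in range(episodes):
        for turn in turns:
            resp = sample_response(weights[turn])
            weights[turn][resp] += lr * reward_model(turn, resp)
    return weights

if __name__ == "__main__":
    turns = ["I need a table", "When?", "Yes, confirm"]
    learned = self_learn(turns)
    for turn in turns:
        print(turn, "->", max(learned[turn], key=learned[turn].get))
```

Because the reward model is frozen, only responses it approves of ever gain weight, so the policy drifts toward reward-approved behavior without fresh supervised labels; this is the general shape of learning from interactions, not the paper's concrete training procedure.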
Related papers
- Hello Again! LLM-powered Personalized Agent for Long-term Dialogue [63.65128176360345]
We introduce a model-agnostic framework, the Long-term Dialogue Agent (LD-Agent)
It incorporates three independently tunable modules dedicated to event perception, persona extraction, and response generation.
The effectiveness, generality, and cross-domain capabilities of LD-Agent are empirically demonstrated.
arXiv Detail & Related papers (2024-06-09T21:58:32Z)
- CMAT: A Multi-Agent Collaboration Tuning Framework for Enhancing Small Language Models [8.123272461141815]
We introduce the TinyAgent model, trained on a meticulously curated high-quality dataset.
We also present the Collaborative Multi-Agent Tuning (CMAT) framework, an innovative system designed to augment language agent capabilities.
In this research, we propose a new communication agent framework that integrates multi-agent systems with environmental feedback mechanisms.
arXiv Detail & Related papers (2024-04-02T06:07:35Z)
- Modeling Resilience of Collaborative AI Systems [1.869472599236422]
A Collaborative Artificial Intelligence System (CAIS) performs actions in collaboration with humans to achieve a common goal.
CAISs can use a trained AI model to control human-system interaction, or they can use human interaction to dynamically learn from humans in an online fashion.
In online learning with human feedback, the AI model evolves by monitoring human interaction through the system sensors in the learning state.
Any disruptive event affecting these sensors may affect the AI model's ability to make accurate decisions and degrade the CAIS performance.
arXiv Detail & Related papers (2024-01-23T10:28:33Z)
- AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems [112.76941157194544]
We propose AgentCF for simulating user-item interactions in recommender systems through agent-based collaborative filtering.
We treat not only users but also items as agents, and develop a collaborative learning approach that optimizes both kinds of agents together.
Overall, the optimized agents exhibit diverse interaction behaviors within our framework, including user-item, user-user, item-item, and collective interactions.
arXiv Detail & Related papers (2023-10-13T16:37:14Z)
- Replicating Complex Dialogue Policy of Humans via Offline Imitation Learning with Supervised Regularization [7.151589223349882]
Policy learning (PL) is a module of a task-oriented dialogue system that trains an agent to make actions in each dialogue turn.
Neither supervised learning (SL) nor reinforcement learning (RL) frameworks can imitate human dialogue policies well.
This study proposes an offline imitation learning model that learns policy from real dialogue datasets.
arXiv Detail & Related papers (2023-05-06T09:27:58Z)
- GODEL: Large-Scale Pre-Training for Goal-Directed Dialog [119.1397031992088]
We introduce GODEL, a large pre-trained language model for dialog.
We show that GODEL outperforms state-of-the-art pre-trained dialog models in few-shot fine-tuning setups.
A novel feature of our evaluation methodology is the introduction of a notion of utility that assesses the usefulness of responses.
arXiv Detail & Related papers (2022-06-22T18:19:32Z)
- DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation [80.45816053153722]
DialogVED introduces continuous latent variables into the enhanced encoder-decoder pre-training framework to increase the relevance and diversity of responses.
We conduct experiments on PersonaChat, DailyDialog, and DSTC7-AVSD benchmarks for response generation.
arXiv Detail & Related papers (2022-04-27T16:18:15Z)
- Transferable Dialogue Systems and User Simulators [17.106518400787156]
One of the difficulties in training dialogue systems is the lack of training data.
We explore the possibility of creating dialogue data through the interaction between a dialogue system and a user simulator.
We develop a modelling framework that can incorporate new dialogue scenarios through self-play between the two agents.
arXiv Detail & Related papers (2021-07-25T22:59:09Z)
- SOLOIST: Building Task Bots at Scale with Transfer Learning and Machine Teaching [81.45928589522032]
We parameterize modular task-oriented dialog systems using a Transformer-based auto-regressive language model.
We pre-train, on heterogeneous dialog corpora, a task-grounded response generation model.
Experiments show that SOLOIST creates new state-of-the-art on well-studied task-oriented dialog benchmarks.
arXiv Detail & Related papers (2020-05-11T17:58:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.