KAUCUS: Knowledge Augmented User Simulators for Training Language Model
Assistants
- URL: http://arxiv.org/abs/2401.16454v1
- Date: Mon, 29 Jan 2024 06:57:02 GMT
- Title: KAUCUS: Knowledge Augmented User Simulators for Training Language Model
Assistants
- Authors: Kaustubh D. Dhole
- Abstract summary: An effective instruction-following assistant can be developed by creating a simulator that can generate useful interaction data.
Previous user simulators generally lacked diversity, were mostly closed domain, and necessitated rigid schema.
We introduce Kaucus, a Knowledge-Augmented User Simulator framework, to outline a process of creating diverse user simulators.
- Score: 3.724713116252253
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: An effective multi-turn instruction-following assistant can be developed by
creating a simulator that can generate useful interaction data. Apart from
relying on its intrinsic weights, an ideal user simulator should also be able
to bootstrap external knowledge rapidly in its raw form to simulate the
multifarious diversity of text available over the internet. Previous user
simulators generally lacked diversity, were mostly closed domain, and
necessitated rigid schema making them inefficient to rapidly scale to
incorporate external knowledge. In this regard, we introduce, Kaucus, a
Knowledge-Augmented User Simulator framework, to outline a process of creating
diverse user simulators, that can seamlessly exploit external knowledge as well
as benefit downstream assistant model training. Through two GPT-J based
simulators viz., a Retrieval Augmented Simulator and a Summary Controlled
Simulator we generate diverse simulator-assistant interactions. Through reward
and preference model-based evaluations, we find that these interactions serve
as useful training data and create more helpful downstream assistants. We also
find that incorporating knowledge through retrieval augmentation or summary
control helps create better assistants.
Related papers
- Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z) - Gaussian Splatting to Real World Flight Navigation Transfer with Liquid Networks [93.38375271826202]
We present a method to improve generalization and robustness to distribution shifts in sim-to-real visual quadrotor navigation tasks.
We first build a simulator by integrating Gaussian splatting with quadrotor flight dynamics, and then, train robust navigation policies using Liquid neural networks.
In this way, we obtain a full-stack imitation learning protocol that combines advances in 3D Gaussian splatting radiance field rendering, programming of expert demonstration training data, and the task understanding capabilities of Liquid networks.
arXiv Detail & Related papers (2024-06-21T13:48:37Z) - PeerFL: A Simulator for Peer-to-Peer Federated Learning at Scale [3.7338201977027885]
This work integrates peer-to-peer federated learning tools with NS3, a widely used network simulator.
Our experiments showcase the simulator's efficiency in computational resource utilization at scale.
The framework is open source and available for use and extension to the community.
arXiv Detail & Related papers (2024-05-28T05:30:18Z) - A LLM-based Controllable, Scalable, Human-Involved User Simulator Framework for Conversational Recommender Systems [14.646529557978512]
Conversational Recommender System (CRS) leverages real-time feedback from users to dynamically model their preferences.
Large Language Models (LLMs) has marked the onset of a new epoch in computational capabilities.
We introduce a Controllable, scalable, and human-Involved (CSHI) simulator framework that manages the behavior of user simulators.
arXiv Detail & Related papers (2024-05-13T03:02:56Z) - How Reliable is Your Simulator? Analysis on the Limitations of Current LLM-based User Simulators for Conversational Recommendation [14.646529557978512]
We analyze the limitations of using Large Language Models in constructing user simulators for Conversational Recommender System.
Data leakage, which occurs in conversational history and the user simulator's replies, results in inflated evaluation results.
We propose SimpleUserSim, employing a straightforward strategy to guide the topic toward the target items.
arXiv Detail & Related papers (2024-03-25T04:21:06Z) - Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous
Driving Research [76.93956925360638]
Waymax is a new data-driven simulator for autonomous driving in multi-agent scenes.
It runs entirely on hardware accelerators such as TPUs/GPUs and supports in-graph simulation for training.
We benchmark a suite of popular imitation and reinforcement learning algorithms with ablation studies on different design decisions.
arXiv Detail & Related papers (2023-10-12T20:49:15Z) - Learning Interactive Real-World Simulators [96.5991333400566]
We explore the possibility of learning a universal simulator of real-world interaction through generative modeling.
We use the simulator to train both high-level vision-language policies and low-level reinforcement learning policies.
Video captioning models can benefit from training with simulated experience, opening up even wider applications.
arXiv Detail & Related papers (2023-10-09T19:42:22Z) - Metaphorical User Simulators for Evaluating Task-oriented Dialogue
Systems [80.77917437785773]
Task-oriented dialogue systems ( TDSs) are assessed mainly in an offline setting or through human evaluation.
We propose a metaphorical user simulator for end-to-end TDS evaluation, where we define a simulator to be metaphorical if it simulates user's analogical thinking in interactions with systems.
We also propose a tester-based evaluation framework to generate variants, i.e., dialogue systems with different capabilities.
arXiv Detail & Related papers (2022-04-02T05:11:03Z) - A Discrete-event-based Simulator for Deep Learning at Edge [7.096287095663305]
We propose a discrete-event-based edge learning simulator.
It includes a deep learning module and a network simulation module.
Our framework is generic and can be used in various deep learning problems before the deep learning model is deployed.
arXiv Detail & Related papers (2021-12-02T03:13:53Z) - Multi-Agent Task-Oriented Dialog Policy Learning with Role-Aware Reward
Decomposition [64.06167416127386]
We propose Multi-Agent Dialog Policy Learning, which regards both the system and the user as the dialog agents.
Two agents interact with each other and are jointly learned simultaneously.
Results show that our method can successfully build a system policy and a user policy simultaneously.
arXiv Detail & Related papers (2020-04-08T04:51:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.