Related papers: Ask an Expert: Leveraging Language Models to Improve Strategic Reasoning in Goal-Oriented Dialogue Models

URL: http://arxiv.org/abs/2305.17878v1
Date: Mon, 29 May 2023 04:19:35 GMT
Title: Ask an Expert: Leveraging Language Models to Improve Strategic Reasoning in Goal-Oriented Dialogue Models
Authors: Qiang Zhang, Jason Naradowsky, Yusuke Miyao
Abstract summary: We propose the "Ask an Expert" framework in which the model is trained with access to an "expert" which it can consult at each turn. Advice is solicited via a structured dialogue with the expert, and the model is optimized to selectively utilize (or ignore) it given the context and dialogue history. We evaluate this framework in a mental health support domain, where the structure of the expert conversation is outlined by pre-specified prompts which reflect a reasoning strategy taught to practitioners in the field.
Score: 15.476899850339395
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Existing dialogue models may encounter scenarios which are not well-represented in the training data, and as a result generate responses that are unnatural, inappropriate, or unhelpful. We propose the "Ask an Expert" framework in which the model is trained with access to an "expert" which it can consult at each turn. Advice is solicited via a structured dialogue with the expert, and the model is optimized to selectively utilize (or ignore) it given the context and dialogue history. In this work the expert takes the form of an LLM. We evaluate this framework in a mental health support domain, where the structure of the expert conversation is outlined by pre-specified prompts which reflect a reasoning strategy taught to practitioners in the field. Blenderbot models utilizing "Ask an Expert" show quality improvements across all expert sizes, including those with fewer parameters than the dialogue model itself. Our best model provides a $\sim 10\%$ improvement over baselines, approaching human-level scores on "engagingness" and "helpfulness" metrics.
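Illustrative sketch (not from the paper): the turn-level loop described in the abstract could look roughly like the Python below. The expert is consulted through a fixed sequence of prompts, and its answers are appended to the dialogue model's input. The prompt wording, the `[ADVICE]` marker, and the names `ask_expert` and `respond` are hypothetical placeholders, and both models are stood in for by plain callables. Learning when to use or ignore the advice happens during training and is not shown.

```python
from typing import Callable, List

# Hypothetical placeholder prompts; the paper's actual prompts are not given here.
EXPERT_PROMPTS = [
    "What is the user feeling?",
    "Why might the user feel this way?",
    "How should the supporter respond?",
]


def ask_expert(expert: Callable[[str], str], history: List[str]) -> List[str]:
    """Hold a structured dialogue with the expert: one query per fixed prompt."""
    context = "\n".join(history)
    advice: List[str] = []
    for prompt in EXPERT_PROMPTS:
        # Each query contains the dialogue history, the expert's earlier
        # answers, and the next pre-specified prompt on the last line.
        query = "\n".join([context, *advice, prompt])
        advice.append(expert(query))
    return advice


def respond(
    dialogue_model: Callable[[str], str],
    expert: Callable[[str], str],
    history: List[str],
) -> str:
    """Generate one turn: consult the expert, then condition on its advice."""
    advice = ask_expert(expert, history)
    # The "[ADVICE]" marker is an assumed input format, not the paper's.
    model_input = "\n".join(history) + "\n[ADVICE] " + " ".join(advice)
    return dialogue_model(model_input)
```

In use, `expert` would wrap a call to an LLM and `dialogue_model` a fine-tuned dialogue model such as Blenderbot; any callable that maps a string to a string fits this sketch.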

Related papers

MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors [76.1634959528817]
We present MathTutorBench, an open-source benchmark for holistic tutoring model evaluation. MathTutorBench contains datasets and metrics that broadly cover tutor abilities as defined by learning sciences research in dialog-based teaching. We evaluate a wide set of closed- and open-weight models and find that subject expertise, indicated by solving ability, does not immediately translate to good teaching.
arXiv Detail & Related papers (2025-02-26T08:43:47Z)
MoIN: Mixture of Introvert Experts to Upcycle an LLM [15.182215869841789]
This paper aims to improve an existing large language model without continued pre-training of the full model. The idea is to split the pre-training data into semantically relevant groups and train an expert on each subset. During inference, an incoming query is first routed to the most relevant expert, which is then loaded onto the base model for the forward pass.
arXiv Detail & Related papers (2024-10-13T01:11:04Z)
Debating with More Persuasive LLMs Leads to More Truthful Answers [45.0343254517401]
We find that debate consistently helps both non-expert models and humans answer questions, achieving 76% and 88% accuracy respectively. Our results provide encouraging empirical evidence for the viability of aligning models with debate in the absence of ground truth.
arXiv Detail & Related papers (2024-02-09T21:05:01Z)
Don't Copy the Teacher: Data and Model Challenges in Embodied Dialogue [92.01165203498299]
Embodied dialogue instruction following requires an agent to complete a complex sequence of tasks from a natural language exchange. This paper argues that imitation learning (IL) and related low-level metrics are actually misleading and do not align with the goals of embodied dialogue research.
arXiv Detail & Related papers (2022-10-10T05:51:40Z)
GODEL: Large-Scale Pre-Training for Goal-Directed Dialog [119.1397031992088]
We introduce GODEL, a large pre-trained language model for dialog. We show that GODEL outperforms state-of-the-art pre-trained dialog models in few-shot fine-tuning setups. A novel feature of our evaluation methodology is the introduction of a notion of utility that assesses the usefulness of responses.
arXiv Detail & Related papers (2022-06-22T18:19:32Z)
DialogZoo: Large-Scale Dialog-Oriented Task Learning [52.18193690394549]
We aim to build a unified foundation model which can solve massive diverse dialogue tasks. To achieve this goal, we first collect a large-scale well-labeled dialogue dataset from 73 publicly available datasets.
arXiv Detail & Related papers (2022-05-25T11:17:16Z)
Are Metrics Enough? Guidelines for Communicating and Visualizing Predictive Models to Subject Matter Experts [7.768301998812552]
We describe an iterative study conducted with both subject matter experts and data scientists to understand the gaps in communication. We derive a set of communication guidelines that use visualization as a common medium for communicating the strengths and weaknesses of a model.
arXiv Detail & Related papers (2022-05-11T19:40:24Z)
Response Generation with Context-Aware Prompt Learning [19.340498579331555]
We present a novel approach for pre-trained dialogue modeling that casts the dialogue generation problem as a prompt-learning task. Instead of fine-tuning on limited dialogue data, our approach, DialogPrompt, learns continuous prompt embeddings optimized for dialogue contexts. Our approach significantly outperforms the fine-tuning baseline and the generic prompt-learning methods.
arXiv Detail & Related papers (2021-11-04T05:40:13Z)
Enhancing Dialogue Generation via Multi-Level Contrastive Learning [57.005432249952406]
We propose a multi-level contrastive learning paradigm to model the fine-grained quality of the responses with respect to the query. A Rank-aware (RC) network is designed to construct the multi-level contrastive optimization objectives. We build a Knowledge Inference (KI) component to capture the keyword knowledge from the reference during training and exploit such information to encourage the generation of informative words.
arXiv Detail & Related papers (2020-09-19T02:41:04Z)
Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks: next session prediction, utterance restoration, incoherence detection, and consistency discrimination. We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner. Experimental results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
Ranking Enhanced Dialogue Generation [77.8321855074999]
Effectively utilizing the dialogue history is a crucial problem in multi-turn dialogue generation. Previous works usually employ various neural network architectures to model the history. This paper proposes a Ranking Enhanced Dialogue generation framework.
arXiv Detail & Related papers (2020-08-13T01:49:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.