Deal, or no deal (or who knows)? Forecasting Uncertainty in
Conversations using Large Language Models
- URL: http://arxiv.org/abs/2402.03284v1
- Date: Mon, 5 Feb 2024 18:39:47 GMT
- Title: Deal, or no deal (or who knows)? Forecasting Uncertainty in
Conversations using Large Language Models
- Authors: Anthony Sicilia, Hyunwoo Kim, Khyathi Raghavi Chandu, Malihe Alikhani,
Jack Hessel
- Abstract summary: How well can language models represent inherent uncertainty in conversations?
We propose FortUne Dial, an expansion of the long-standing "conversation forecasting" task.
We study two ways in which language models potentially represent outcome uncertainty.
- Score: 45.41542983671774
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Effective interlocutors account for the uncertain goals, beliefs, and
emotions of others. But even the best human conversationalist cannot perfectly
anticipate the trajectory of a dialogue. How well can language models represent
inherent uncertainty in conversations? We propose FortUne Dial, an expansion of
the long-standing "conversation forecasting" task: instead of just accuracy,
evaluation is conducted with uncertainty-aware metrics, effectively enabling
abstention on individual instances. We study two ways in which language models
potentially represent outcome uncertainty (internally, using scores and
directly, using tokens) and propose fine-tuning strategies to improve
calibration of both representations. Experiments on eight difficult negotiation
corpora demonstrate that our proposed fine-tuning strategies (a traditional
supervision strategy and an off-policy reinforcement learning strategy) can
calibrate smaller open-source models to compete with pre-trained models 10x
their size.
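The two uncertainty representations mentioned in the abstract (an internal score read off the model's logits vs. a probability verbalized directly in tokens) can be sketched as follows, together with a calibration-sensitive metric. This is a minimal illustration under stated assumptions, not the paper's actual implementation: the function names, the Yes/No outcome-token choice, the percentage-parsing format, and the use of the Brier score are all assumptions.

```python
import math
import re


def score_based_probability(yes_logit: float, no_logit: float) -> float:
    """Internal representation: softmax-normalize the model's logits for
    hypothetical 'Yes'/'No' outcome tokens into a success probability."""
    m = max(yes_logit, no_logit)  # subtract max for numerical stability
    e_yes = math.exp(yes_logit - m)
    e_no = math.exp(no_logit - m)
    return e_yes / (e_yes + e_no)


def token_based_probability(generated_text: str):
    """Direct representation: parse a verbalized percentage
    (e.g. 'I estimate a 70% chance of a deal') from generated tokens."""
    match = re.search(r"(\d{1,3})\s*%", generated_text)
    if match is None:
        return None  # no parseable estimate: treat as an abstention
    return min(int(match.group(1)), 100) / 100.0


def brier_score(forecasts, outcomes) -> float:
    """Calibration-sensitive metric: mean squared error between forecast
    probabilities and binary outcomes (lower is better)."""
    return sum((p - y) ** 2 for p, y in zip(forecasts, outcomes)) / len(forecasts)
```

In this sketch, either representation yields a probability per conversation, and a metric like the Brier score rewards calibrated forecasts rather than raw accuracy, which is what permits abstention on uncertain instances.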
Related papers
- Trustworthy Alignment of Retrieval-Augmented Large Language Models via Reinforcement Learning [84.94709351266557]
We focus on the trustworthiness of language models with respect to retrieval augmentation.
We argue that retrieval-augmented language models have the inherent capability to supply responses according to both contextual and parametric knowledge.
Inspired by aligning language models with human preferences, we take the first step towards aligning retrieval-augmented language models to a state where they respond relying solely on external evidence.
arXiv Detail & Related papers (2024-10-22T09:25:21Z)
- Speechworthy Instruction-tuned Language Models [71.8586707840169]
We show that both prompting and preference learning increase the speech-suitability of popular instruction-tuned LLMs.
We share lexical, syntactical, and qualitative analyses to showcase how each method contributes to improving the speech-suitability of generated responses.
arXiv Detail & Related papers (2024-09-23T02:34:42Z)
- Large Language Model based Situational Dialogues for Second Language Learning [7.450328495455734]
In second language learning, scenario-based conversation practice is important for language learners to achieve fluency in speaking.
To support such practice, we propose situational dialogue models for students to engage in conversational practice.
Our situational dialogue models are fine-tuned on large language models (LLMs), with the aim of combining the engaging nature of an open-ended conversation with the focused practice of scenario-based tasks.
arXiv Detail & Related papers (2024-03-29T06:43:55Z)
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
- Controllable Mixed-Initiative Dialogue Generation through Prompting [50.03458333265885]
Mixed-initiative dialogue tasks involve repeated exchanges of information and conversational control.
Agents gain control by generating responses that follow particular dialogue intents or strategies, prescribed by a policy planner.
The standard approach has been to fine-tune pre-trained language models to perform generation conditioned on these intents.
We instead prompt large language models as a drop-in replacement for fine-tuning on conditional generation.
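The prompting setup described in this summary can be sketched roughly as below; the prompt wording and function name are illustrative assumptions, not the paper's actual template.

```python
def build_intent_prompt(history: list[str], intent: str) -> str:
    """Compose a prompt that conditions an off-the-shelf LLM's next response
    on a dialogue intent chosen by the policy planner, in place of fine-tuning."""
    transcript = "\n".join(history)
    return (
        "Conversation so far:\n"
        f"{transcript}\n"
        f"Write the next assistant turn following this intent: {intent}\n"
        "Assistant:"
    )
```

The point of the design is that swapping the intent string changes the conditioning signal without any parameter updates, so one frozen model can serve every strategy the planner prescribes.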
arXiv Detail & Related papers (2023-05-06T23:11:25Z)
- The Interplay of Task Success and Dialogue Quality: An in-depth Evaluation in Task-Oriented Visual Dialogues [6.02280861819024]
We show that in the popular end-to-end approach, this choice prevents the model from learning to generate linguistically richer dialogues.
We show that in GuessWhat, models could increase their accuracy if they also learn to ground, encode, and decode words that occur infrequently in the training set.
arXiv Detail & Related papers (2021-03-20T10:13:30Z)
- I Beg to Differ: A study of constructive disagreement in online conversations [15.581515781839656]
We construct a corpus of 7,425 Wikipedia Talk page conversations that contain content disputes.
We define the task of predicting whether disagreements will be escalated to mediation by a moderator.
We develop a variety of neural models and show that taking into account the structure of the conversation improves predictive accuracy.
arXiv Detail & Related papers (2021-01-26T16:36:43Z)
- Knowledge-Grounded Dialogue Generation with Pre-trained Language Models [74.09352261943911]
We study knowledge-grounded dialogue generation with pre-trained language models.
We propose equipping response generation defined by a pre-trained language model with a knowledge selection module.
arXiv Detail & Related papers (2020-10-17T16:49:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.