OnGoal: Tracking and Visualizing Conversational Goals in Multi-Turn Dialogue with Large Language Models
- URL: http://arxiv.org/abs/2508.21061v1
- Date: Thu, 28 Aug 2025 17:58:29 GMT
- Title: OnGoal: Tracking and Visualizing Conversational Goals in Multi-Turn Dialogue with Large Language Models
- Authors: Adam Coscia, Shunan Guo, Eunyee Koh, Alex Endert
- Abstract summary: We present OnGoal, an LLM chat interface that helps users better manage goal progress. OnGoal provides real-time feedback on goal alignment through LLM-assisted evaluation. Using OnGoal, participants spent less time and effort to achieve their goals.
- Score: 23.174462069020695
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As multi-turn dialogues with large language models (LLMs) grow longer and more complex, how can users better evaluate and review progress on their conversational goals? We present OnGoal, an LLM chat interface that helps users better manage goal progress. OnGoal provides real-time feedback on goal alignment through LLM-assisted evaluation, explanations for evaluation results with examples, and overviews of goal progression over time, enabling users to navigate complex dialogues more effectively. Through a study with 20 participants on a writing task, we evaluate OnGoal against a baseline chat interface without goal tracking. Using OnGoal, participants spent less time and effort to achieve their goals while exploring new prompting strategies to overcome miscommunication, suggesting tracking and visualizing goals can enhance engagement and resilience in LLM dialogues. Our findings inspired design implications for future LLM chat interfaces that improve goal communication, reduce cognitive load, enhance interactivity, and enable feedback to improve LLM performance.
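The goal-tracking loop the abstract describes (evaluate each goal against every response, keep a history for the progress overview) can be sketched as follows. This is an illustrative assumption, not OnGoal's actual implementation: the `Goal` class, `track_goals`, and the `keyword_judge` stand-in are hypothetical, and a real system would prompt an LLM for each alignment verdict instead of matching keywords.

```python
from dataclasses import dataclass, field

@dataclass
class Goal:
    text: str
    history: list = field(default_factory=list)  # per-turn alignment verdicts

def track_goals(goals, response, judge):
    """Evaluate each goal against the latest response and record progress."""
    feedback = {}
    for goal in goals:
        verdict = judge(goal.text, response)  # e.g. "met" or "unmet"
        goal.history.append(verdict)          # retained for the over-time overview
        feedback[goal.text] = verdict
    return feedback

# Stand-in judge; a real system would send goal and response to an LLM here.
def keyword_judge(goal_text, response):
    return "met" if goal_text.lower() in response.lower() else "unmet"

goals = [Goal("use a formal tone"), Goal("mention deadlines")]
fb = track_goals(goals, "We will mention deadlines explicitly.", keyword_judge)
```

Because the judge is passed in as a callable, the keyword stand-in can be swapped for an LLM call without changing the tracking loop.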
Related papers
- GameTalk: Training LLMs for Strategic Conversation [51.29670609281524]
We introduce GameTalk, a framework for training LLMs to make strategic decisions via multi-turn interactions. Unlike prior work that focuses on single-turn objectives or static action prediction, we train LLMs to optimize a global objective across full conversations. We evaluate this approach on a suite of increasingly complex games designed to stress different aspects of reasoning, coordination, and opponent modeling.
arXiv Detail & Related papers (2026-01-22T19:18:39Z) - DialogXpert: Driving Intelligent and Emotion-Aware Conversations through Online Value-Based Reinforcement Learning with LLM Priors [19.83349341267686]
Large-language-model (LLM) agents excel at reactive dialogue but struggle with proactive, goal-driven interactions. We introduce DialogXpert, which proposes a small, high-quality set of candidate actions per turn. By tracking the user's emotions, DialogXpert tailors each decision to advance the task while nurturing a genuine, empathetic connection.
arXiv Detail & Related papers (2025-05-23T12:12:40Z) - Enhancing User-Oriented Proactivity in Open-Domain Dialogues with Critic Guidance [35.15965694815852]
Open-domain dialogue systems aim to generate natural and engaging conversations. Existing large language models (LLMs) fall short in proactively understanding the user's chatting preferences. We propose a User-oriented Proactive (UPC) method to enhance user-oriented proactivity.
arXiv Detail & Related papers (2025-05-18T09:59:22Z) - Conversational User-AI Intervention: A Study on Prompt Rewriting for Improved LLM Response Generation [16.8514748768591]
This paper investigates aspects in which user queries fall short of expressing information needs, and the potential of using LLMs to rewrite suboptimal user prompts. Our findings demonstrate that rephrasing ineffective prompts can elicit better responses from a conversational system, while preserving the user's original intent.
arXiv Detail & Related papers (2025-03-21T02:01:02Z) - SAGE: Steering Dialog Generation with Future-Aware State-Action Augmentation [9.95917154889491]
We present a novel approach called SAGE that uses latent variables to control long-horizon behavior in dialogue generation. At the core of our method is the State-Action Chain (SAC), which augments standard language model fine-tuning. Our experimental results show that models trained with this approach demonstrate improved performance in emotional intelligence metrics.
arXiv Detail & Related papers (2025-03-04T22:45:24Z) - Evaluating Very Long-Term Conversational Memory of LLM Agents [95.84027826745609]
We introduce a machine-human pipeline to generate high-quality, very long-term dialogues.
We equip each agent with the capability of sharing and reacting to images.
The generated conversations are verified and edited by human annotators for long-range consistency.
arXiv Detail & Related papers (2024-02-27T18:42:31Z) - Think Before You Speak: Cultivating Communication Skills of Large Language Models via Inner Monologue [73.69510478736483]
Large language models (LLMs) can generate fluent, coherent, and diverse responses.
However, they lack a crucial ability: communication skills.
This article aims to empower LLMs with communication skills through inner monologues.
Experimental results show that the proposed CSIM strategy improves the backbone models and outperforms the baselines.
arXiv Detail & Related papers (2023-11-13T16:19:42Z) - Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations [70.7884839812069]
Large language models (LLMs) have emerged as powerful and general solutions to many natural language tasks.
However, many of the most important applications of language generation are interactive, where an agent has to talk to a person to reach a desired outcome.
In this work, we explore a new method for adapting LLMs with RL for such goal-directed dialogue.
arXiv Detail & Related papers (2023-11-09T18:45:16Z) - Enhancing Large Language Model Induced Task-Oriented Dialogue Systems Through Look-Forward Motivated Goals [76.69419538047813]
The ProToD approach anticipates future dialogue actions and incorporates a goal-oriented reward signal to enhance ToD systems.
We present a novel evaluation method that assesses ToD systems based on goal-driven dialogue simulations.
Empirical experiments conducted on the MultiWOZ 2.1 dataset demonstrate that our model can achieve superior performance using only 10% of the data.
arXiv Detail & Related papers (2023-09-16T10:56:00Z) - Unlocking the Potential of User Feedback: Leveraging Large Language Model as User Simulator to Enhance Dialogue System [65.93577256431125]
We propose an alternative approach called User-Guided Response Optimization (UGRO), which uses an LLM as an annotation-free user simulator to assess dialogue responses, combining it with a smaller fine-tuned end-to-end task-oriented dialogue (TOD) model.
Our approach outperforms previous state-of-the-art (SOTA) results.
arXiv Detail & Related papers (2023-06-16T13:04:56Z) - Rethinking the Evaluation for Conversational Recommendation in the Era of Large Language Models [115.7508325840751]
The recent success of large language models (LLMs) has shown great potential to develop more powerful conversational recommender systems (CRSs).
In this paper, we embark on an investigation into the utilization of ChatGPT for conversational recommendation, revealing the inadequacy of the existing evaluation protocol.
We propose an interactive evaluation approach based on LLMs, named iEvaLM, that harnesses LLM-based user simulators.
arXiv Detail & Related papers (2023-05-22T15:12:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.