Taking Action Towards Graceful Interaction: The Effects of Performing Actions on Modelling Policies for Instruction Clarification Requests
- URL: http://arxiv.org/abs/2401.17039v1
- Date: Tue, 30 Jan 2024 14:18:31 GMT
- Title: Taking Action Towards Graceful Interaction: The Effects of Performing Actions on Modelling Policies for Instruction Clarification Requests
- Authors: Brielen Madureira, David Schlangen
- Abstract summary: Transformer-based models fail to learn good policies for when to ask Instruction CRs.
We discuss the shortcomings of the data-driven paradigm for learning meta-communication acts.
- Score: 23.405917899107767
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Clarification requests are a mechanism to help solve communication problems,
e.g. due to ambiguity or underspecification, in instruction-following
interactions. Despite their importance, even skilful models struggle with
producing or interpreting such repair acts. In this work, we test three
hypotheses concerning the effects of action taking as an auxiliary task in
modelling iCR policies. Contrary to initial expectations, we conclude that its
contribution to learning an iCR policy is limited, but some information can
still be extracted from prediction uncertainty. We present further evidence
that even well-motivated, Transformer-based models fail to learn good policies
for when to ask Instruction CRs (iCRs), while the task of determining what to
ask about can be more successfully modelled. Considering the implications of
these findings, we further discuss the shortcomings of the data-driven paradigm
for learning meta-communication acts.
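To make the setup concrete, here is a rough Python sketch of a policy model with an auxiliary action-prediction head, using the entropy of the predicted action distribution as the uncertainty signal mentioned above. All module names, dimensions, and the threshold are illustrative assumptions, not details taken from the paper:

    import torch
    import torch.nn as nn

    # Hypothetical multi-task model: a shared encoder (a Transformer over the
    # dialogue/game state in the paper; a linear layer here) feeds two heads,
    # one deciding whether to ask an iCR and one predicting the next action.
    class ICRPolicy(nn.Module):
        def __init__(self, input_dim=512, hidden_dim=256, n_actions=20):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
            self.cr_head = nn.Linear(hidden_dim, 2)              # ask iCR: yes / no
            self.action_head = nn.Linear(hidden_dim, n_actions)  # auxiliary task

        def forward(self, x):
            h = self.encoder(x)
            return self.cr_head(h), self.action_head(h)

    model = ICRPolicy()
    x = torch.randn(1, 512)  # stand-in for an encoded instruction-following state
    cr_logits, action_logits = model(x)

    # "Some information can still be extracted from prediction uncertainty":
    # one common proxy is the entropy of the action distribution.
    probs = torch.softmax(action_logits, dim=-1)
    entropy = -(probs * probs.log()).sum(dim=-1)
    ask_icr = entropy.item() > 2.0  # illustrative threshold, not from the paper
    print(f"action entropy={entropy.item():.2f}, ask iCR: {ask_icr}")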
Related papers
- Interactive Agents to Overcome Ambiguity in Software Engineering [61.40183840499932]
AI agents are increasingly being deployed to automate tasks, often based on ambiguous and underspecified user instructions.
Making unwarranted assumptions and failing to ask clarifying questions can lead to suboptimal outcomes.
We study the ability of LLM agents to handle ambiguous instructions in interactive code generation settings, evaluating the performance of both proprietary and open-weight models.
arXiv Detail & Related papers (2025-02-18T17:12:26Z)
- Retrieved In-Context Principles from Previous Mistakes [55.109234526031884]
In-context learning (ICL) has been instrumental in adapting Large Language Models (LLMs) to downstream tasks using correct input-output examples.
Recent advances have attempted to improve model performance through principles derived from mistakes.
We propose Retrieved In-Context Principles (RICP), a novel teacher-student framework.
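A minimal sketch of the retrieval step, assuming a bank of past mistakes annotated with teacher-written principles; the examples and the crude lexical similarity below are invented stand-ins for RICP's actual components:

    from difflib import SequenceMatcher

    # Hypothetical mistake bank: questions a student model previously got
    # wrong, each paired with a principle derived from the mistake.
    mistake_bank = [
        ("What is 15% of 80?", "Convert percentages to decimals before multiplying."),
        ("If all A are B, are all B A?", "A conditional does not imply its converse."),
    ]

    def retrieve_principles(question, bank, k=1):
        # Crude lexical similarity as a stand-in for embedding-based retrieval.
        scored = sorted(bank, reverse=True,
                        key=lambda qp: SequenceMatcher(None, question, qp[0]).ratio())
        return [principle for _, principle in scored[:k]]

    question = "What is 20% of 45?"
    principles = retrieve_principles(question, mistake_bank)
    prompt = ("Principles:\n" + "\n".join(f"- {p}" for p in principles)
              + f"\n\nQuestion: {question}")
    print(prompt)  # the prompt an LLM would then answer in-context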
arXiv Detail & Related papers (2024-07-08T07:32:26Z)
- It Couldn't Help But Overhear: On the Limits of Modelling Meta-Communicative Grounding Acts with Supervised Learning [19.812562421377706]
Overhearers are deprived of the privilege of performing grounding acts and can only conjecture about intended meanings.
We present evidence pointing to the impossibility of properly modelling human meta-communicative acts with data-driven learning models.
Most importantly, we wish to bring this topic back to the community's table, encouraging discussion on the consequences of having models designed to only "listen in".
arXiv Detail & Related papers (2024-05-02T09:55:19Z)
- Evaluating Interventional Reasoning Capabilities of Large Language Models [58.52919374786108]
Large language models (LLMs) are used to automate decision-making tasks.
In this paper, we evaluate whether LLMs can accurately update their knowledge of a data-generating process in response to an intervention.
We create benchmarks that span diverse causal graphs (e.g., confounding, mediation) and variable types.
These benchmarks allow us to isolate the ability of LLMs to accurately predict the effects of interventions, as opposed to relying on memorized facts or other shortcuts.
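As a toy illustration of what such a benchmark item can test, consider a confounded graph Z -> X, Z -> Y, X -> Y with invented probabilities; the ground-truth interventional quantity is computed below, and an LLM's natural-language answer would be scored against it:

    # P(Z=1); all numbers are invented for illustration.
    p_z = 0.5

    def p_y_given(x, z):
        # P(Y=1 | X=x, Z=z), a simple linear mechanism.
        return 0.3 + 0.5 * x + 0.1 * z

    # Under do(X=1) the edge Z -> X is cut, so Z keeps its prior:
    #   P(Y=1 | do(X=1)) = sum_z P(Z=z) * P(Y=1 | X=1, Z=z)
    p_y_do_x1 = (1 - p_z) * p_y_given(1, 0) + p_z * p_y_given(1, 1)
    print(f"P(Y=1 | do(X=1)) = {p_y_do_x1:.2f}")  # 0.85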
arXiv Detail & Related papers (2024-04-08T14:15:56Z)
- Ask more, know better: Reinforce-Learned Prompt Questions for Decision Making with Large Language Models [18.409654309062027]
Large language models (LLMs) can combine action-based policies with chain-of-thought (CoT) reasoning.
Human intervention is also required to develop grounding functions that ensure low-level controllers appropriately process CoT reasoning.
We propose a comprehensive training framework for complex task-solving, incorporating human prior knowledge into the learning of action policies.
arXiv Detail & Related papers (2023-10-27T13:19:19Z)
- Interpretable Imitation Learning with Dynamic Causal Relations [65.18456572421702]
We propose to expose captured knowledge in the form of a directed acyclic causal graph.
We also design this causal discovery process to be state-dependent, enabling it to model the dynamics in latent causal graphs.
The proposed framework is composed of three parts: a dynamic causal discovery module, a causality encoding module, and a prediction module, and is trained in an end-to-end manner.
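The composition might be wired together along the following lines; every module body below is an invented placeholder meant only to show the end-to-end structure, not the paper's architecture:

    import torch
    import torch.nn as nn

    class CausalImitator(nn.Module):
        def __init__(self, n_vars=8, hidden=32, n_actions=4):
            super().__init__()
            # (1) dynamic causal discovery: state-dependent soft adjacency logits
            self.discovery = nn.Linear(n_vars, n_vars * n_vars)
            # (2) causality encoding: fuse the graph with the current state
            self.encoder = nn.Linear(n_vars * n_vars + n_vars, hidden)
            # (3) prediction: imitate the demonstrator's action
            self.predictor = nn.Linear(hidden, n_actions)

        def forward(self, state):
            graph = torch.sigmoid(self.discovery(state))  # per-state latent graph
            h = torch.relu(self.encoder(torch.cat([graph, state], dim=-1)))
            return self.predictor(h), graph

    model = CausalImitator()
    action_logits, graph = model(torch.randn(2, 8))
    print(action_logits.shape, graph.shape)  # torch.Size([2, 4]) torch.Size([2, 64])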
arXiv Detail & Related papers (2023-09-30T20:59:42Z)
- Post Hoc Explanations of Language Models Can Improve Language Models [43.2109029463221]
We present a novel framework, Amplifying Model Performance by Leveraging In-Context Learning with Post Hoc Explanations (AMPLIFY).
We leverage post hoc explanation methods which output attribution scores (explanations) capturing the influence of each of the input features on model predictions.
Our framework, AMPLIFY, leads to prediction accuracy improvements of about 10-25% over a wide range of tasks.
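A small sketch of the prompt-construction idea: attribution scores for a previously misclassified example are turned into a rationale inside a few-shot demonstration. The example text, scores, and template are invented; in AMPLIFY the scores come from post hoc explanation methods run on a proxy model:

    # Hypothetical attribution scores (e.g., gradient-based saliency) for the
    # tokens of one misclassified example; values are invented.
    example = {
        "text": "the movie was painfully slow but beautifully shot",
        "label": "negative",
        "attributions": {"painfully": 0.62, "slow": 0.55, "beautifully": -0.30},
    }

    def format_shot(ex, top_k=2):
        # Keep the k tokens with the largest absolute influence and surface
        # them as an explicit rationale in the in-context demonstration.
        top = sorted(ex["attributions"].items(),
                     key=lambda kv: abs(kv[1]), reverse=True)[:top_k]
        keywords = ", ".join(tok for tok, _ in top)
        return (f"Input: {ex['text']}\n"
                f"Key words for the label: {keywords}\n"
                f"Label: {ex['label']}")

    print(format_shot(example))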
arXiv Detail & Related papers (2023-05-19T04:46:04Z)
- Learning to Generate All Feasible Actions [4.333208181196761]
We introduce action mapping, a novel approach that divides the learning process into two steps: first learning feasibility and subsequently the objective.
This paper focuses on the feasibility part by learning to generate all feasible actions through self-supervised querying of the feasibility model.
We demonstrate the agent's proficiency in generating actions across disconnected feasible action sets.
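A toy version of the querying loop, with a hand-written feasibility function over a 1-D action space standing in for the learned feasibility model (the real method trains a generator on the accepted samples):

    import numpy as np

    def feasible(a):
        # Stand-in feasibility model with two disconnected feasible intervals.
        return (0.1 < a < 0.3) or (0.7 < a < 0.9)

    # Self-supervised querying: propose candidate actions, keep those the
    # feasibility model accepts; these become targets for the generator.
    rng = np.random.default_rng(0)
    candidates = rng.uniform(0.0, 1.0, size=5000)
    mask = np.array([feasible(a) for a in candidates])
    feasible_actions = candidates[mask]

    print(f"kept {mask.sum()} of {len(candidates)} candidates")
    print("covers both modes:",
          (feasible_actions < 0.5).any() and (feasible_actions > 0.5).any())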
arXiv Detail & Related papers (2023-01-26T23:15:51Z)
- Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions.
By modelling agents as learners who continually update their beliefs about the perceived effects of their actions, we cast the policy inference problem as the inverse of this online learning problem.
We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them.
Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z)
- Behavior Priors for Efficient Reinforcement Learning [97.81587970962232]
We consider how information and architectural constraints can be combined with ideas from the probabilistic modeling literature to learn behavior priors.
We discuss how such latent variable formulations connect to related work on hierarchical reinforcement learning (HRL) and mutual information and curiosity based objectives.
We demonstrate the effectiveness of our framework by applying it to a range of simulated continuous control domains.
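The core objective such formulations build on is KL-regularized expected reward, J = E[r(s, a)] - alpha * KL(pi(.|s) || pi_0(.|s)), where pi_0 is the behavior prior. A toy PyTorch version with invented logits and rewards:

    import torch
    import torch.nn.functional as F

    logits_pi = torch.randn(4, 6, requires_grad=True)  # policy logits (4 states, 6 actions)
    logits_prior = torch.randn(4, 6)                   # behavior-prior logits (assumed pretrained)
    rewards = torch.randn(4, 6)                        # toy per-action rewards

    pi = F.softmax(logits_pi, dim=-1)
    log_pi = F.log_softmax(logits_pi, dim=-1)
    log_prior = F.log_softmax(logits_prior, dim=-1)

    alpha = 0.1
    expected_reward = (pi * rewards).sum(-1)
    kl = (pi * (log_pi - log_prior)).sum(-1)  # KL(pi || pi_0) per state
    loss = -(expected_reward - alpha * kl).mean()
    loss.backward()  # gradients flow into the policy logits
    print(f"loss = {loss.item():.3f}")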
arXiv Detail & Related papers (2020-10-27T13:17:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.