Taking Action Towards Graceful Interaction: The Effects of Performing Actions on Modelling Policies for Instruction Clarification Requests
- URL: http://arxiv.org/abs/2401.17039v1
- Date: Tue, 30 Jan 2024 14:18:31 GMT
- Title: Taking Action Towards Graceful Interaction: The Effects of Performing Actions on Modelling Policies for Instruction Clarification Requests
- Authors: Brielen Madureira, David Schlangen
- Abstract summary: Transformer-based models fail to learn good policies for when to ask Instruction CRs.
We discuss the shortcomings of the data-driven paradigm for learning meta-communication acts.
- Score: 23.405917899107767
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Clarification requests are a mechanism to help solve communication problems,
e.g. due to ambiguity or underspecification, in instruction-following
interactions. Despite their importance, even skilful models struggle with
producing or interpreting such repair acts. In this work, we test three
hypotheses concerning the effects of action taking as an auxiliary task in
modelling iCR policies. Contrary to initial expectations, we conclude that its
contribution to learning an iCR policy is limited, but some information can
still be extracted from prediction uncertainty. We present further evidence
that even well-motivated, Transformer-based models fail to learn good policies
for when to ask Instruction CRs (iCRs), while the task of determining what to
ask about can be more successfully modelled. Considering the implications of
these findings, we further discuss the shortcomings of the data-driven paradigm
for learning meta-communication acts.
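To make the setup concrete, here is a rough Python sketch of a policy model with an auxiliary action-prediction head, using the entropy of the predicted action distribution as the uncertainty signal mentioned above. All module names, dimensions, and the threshold are illustrative assumptions, not details taken from the paper:

    import torch
    import torch.nn as nn

    # Hypothetical multi-task model: a shared encoder (a Transformer over the
    # dialogue/game state in the paper; a linear layer here) feeds two heads,
    # one deciding whether to ask an iCR and one predicting the next action.
    class ICRPolicy(nn.Module):
        def __init__(self, input_dim=512, hidden_dim=256, n_actions=20):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
            self.cr_head = nn.Linear(hidden_dim, 2)              # ask iCR: yes / no
            self.action_head = nn.Linear(hidden_dim, n_actions)  # auxiliary task

        def forward(self, x):
            h = self.encoder(x)
            return self.cr_head(h), self.action_head(h)

    model = ICRPolicy()
    x = torch.randn(1, 512)  # stand-in for an encoded instruction-following state
    cr_logits, action_logits = model(x)

    # "Some information can still be extracted from prediction uncertainty":
    # one common proxy is the entropy of the action distribution.
    probs = torch.softmax(action_logits, dim=-1)
    entropy = -(probs * probs.log()).sum(dim=-1)
    ask_icr = entropy.item() > 2.0  # illustrative threshold, not from the paper
    print(f"action entropy={entropy.item():.2f}, ask iCR: {ask_icr}")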
Related papers
- Interactive Agents to Overcome Ambiguity in Software Engineering [61.40183840499932]
AI agents are increasingly being deployed to automate tasks, often based on ambiguous and underspecified user instructions.
Making unwarranted assumptions and failing to ask clarifying questions can lead to suboptimal outcomes.
We study the ability of LLM agents to handle ambiguous instructions in interactive code generation settings, evaluating the performance of both proprietary and open-weight models.
arXiv Detail & Related papers (2025-02-18T17:12:26Z)
- Retrieved In-Context Principles from Previous Mistakes [55.109234526031884]
In-context learning (ICL) has been instrumental in adapting Large Language Models (LLMs) to downstream tasks using correct input-output examples.
Recent advances have attempted to improve model performance through principles derived from mistakes.
We propose Retrieved In-Context Principles (RICP), a novel teacher-student framework.
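A minimal sketch of the retrieval step, assuming a bank of past mistakes annotated with teacher-written principles; the examples and the crude lexical similarity below are invented stand-ins for RICP's actual components:

    from difflib import SequenceMatcher

    # Hypothetical mistake bank: questions a student model previously got
    # wrong, each paired with a principle derived from the mistake.
    mistake_bank = [
        ("What is 15% of 80?", "Convert percentages to decimals before multiplying."),
        ("If all A are B, are all B A?", "A conditional does not imply its converse."),
    ]

    def retrieve_principles(question, bank, k=1):
        # Crude lexical similarity as a stand-in for embedding-based retrieval.
        scored = sorted(bank, reverse=True,
                        key=lambda qp: SequenceMatcher(None, question, qp[0]).ratio())
        return [principle for _, principle in scored[:k]]

    question = "What is 20% of 45?"
    principles = retrieve_principles(question, mistake_bank)
    prompt = ("Principles:\n" + "\n".join(f"- {p}" for p in principles)
              + f"\n\nQuestion: {question}")
    print(prompt)  # the prompt an LLM would then answer in-context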
arXiv Detail & Related papers (2024-07-08T07:32:26Z)
- It Couldn't Help But Overhear: On the Limits of Modelling Meta-Communicative Grounding Acts with Supervised Learning [19.812562421377706]
Overhearers are deprived of the privilege of performing grounding acts and can only conjecture about intended meanings.
We present evidence pointing to the impossibility of properly modelling human meta-communicative acts with data-driven learning models.
Most importantly, we wish to bring this topic back to the community's table, encouraging discussion on the consequences of having models designed to only "listen in".
arXiv Detail & Related papers (2024-05-02T09:55:19Z)
- Evaluating Interventional Reasoning Capabilities of Large Language Models [58.52919374786108]
Large language models (LLMs) are used to automate decision-making tasks.
In this paper, we evaluate whether LLMs can accurately update their knowledge of a data-generating process in response to an intervention.
We create benchmarks that span diverse causal graphs (e.g., confounding, mediation) and variable types.
These benchmarks allow us to isolate the ability of LLMs to accurately predict the effects of interventions, as opposed to relying on memorized facts or other shortcuts.
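As a toy illustration of what such a benchmark item can test, consider a confounded graph Z -> X, Z -> Y, X -> Y with invented probabilities; the ground-truth interventional quantity is computed below, and an LLM's natural-language answer would be scored against it:

    # P(Z=1); all numbers are invented for illustration.
    p_z = 0.5

    def p_y_given(x, z):
        # P(Y=1 | X=x, Z=z), a simple linear mechanism.
        return 0.3 + 0.5 * x + 0.1 * z

    # Under do(X=1) the edge Z -> X is cut, so Z keeps its prior:
    #   P(Y=1 | do(X=1)) = sum_z P(Z=z) * P(Y=1 | X=1, Z=z)
    p_y_do_x1 = (1 - p_z) * p_y_given(1, 0) + p_z * p_y_given(1, 1)
    print(f"P(Y=1 | do(X=1)) = {p_y_do_x1:.2f}")  # 0.85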
arXiv Detail & Related papers (2024-04-08T14:15:56Z)
- Ask more, know better: Reinforce-Learned Prompt Questions for Decision Making with Large Language Models [18.409654309062027]
Large language models (LLMs) can combine action-based policies with chain-of-thought (CoT) reasoning.
Human intervention is also required to develop grounding functions that ensure low-level controllers appropriately process CoT reasoning.
We propose a comprehensive training framework for complex task-solving, incorporating human prior knowledge into the learning of action policies.
arXiv Detail & Related papers (2023-10-27T13:19:19Z)
- Interpretable Imitation Learning with Dynamic Causal Relations [65.18456572421702]
We propose to expose captured knowledge in the form of a directed acyclic causal graph.
We also design this causal discovery process to be state-dependent, enabling it to model the dynamics in latent causal graphs.
The proposed framework is composed of three parts: a dynamic causal discovery module, a causality encoding module, and a prediction module, and is trained in an end-to-end manner.
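The composition might be wired together along the following lines; every module body below is an invented placeholder meant only to show the end-to-end structure, not the paper's architecture:

    import torch
    import torch.nn as nn

    class CausalImitator(nn.Module):
        def __init__(self, n_vars=8, hidden=32, n_actions=4):
            super().__init__()
            # (1) dynamic causal discovery: state-dependent soft adjacency logits
            self.discovery = nn.Linear(n_vars, n_vars * n_vars)
            # (2) causality encoding: fuse the graph with the current state
            self.encoder = nn.Linear(n_vars * n_vars + n_vars, hidden)
            # (3) prediction: imitate the demonstrator's action
            self.predictor = nn.Linear(hidden, n_actions)

        def forward(self, state):
            graph = torch.sigmoid(self.discovery(state))  # per-state latent graph
            h = torch.relu(self.encoder(torch.cat([graph, state], dim=-1)))
            return self.predictor(h), graph

    model = CausalImitator()
    action_logits, graph = model(torch.randn(2, 8))
    print(action_logits.shape, graph.shape)  # torch.Size([2, 4]) torch.Size([2, 64])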
arXiv Detail & Related papers (2023-09-30T20:59:42Z)
- Post Hoc Explanations of Language Models Can Improve Language Models [43.2109029463221]
We present a novel framework, Amplifying Model Performance by Leveraging In-Context Learning with Post Hoc Explanations (AMPLIFY).
We leverage post hoc explanation methods which output attribution scores (explanations) capturing the influence of each of the input features on model predictions.
Our framework, AMPLIFY, leads to prediction accuracy improvements of about 10-25% over a wide range of tasks.
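A small sketch of the prompt-construction idea: attribution scores for a previously misclassified example are turned into a rationale inside a few-shot demonstration. The example text, scores, and template are invented; in AMPLIFY the scores come from post hoc explanation methods run on a proxy model:

    # Hypothetical attribution scores (e.g., gradient-based saliency) for the
    # tokens of one misclassified example; values are invented.
    example = {
        "text": "the movie was painfully slow but beautifully shot",
        "label": "negative",
        "attributions": {"painfully": 0.62, "slow": 0.55, "beautifully": -0.30},
    }

    def format_shot(ex, top_k=2):
        # Keep the k tokens with the largest absolute influence and surface
        # them as an explicit rationale in the in-context demonstration.
        top = sorted(ex["attributions"].items(),
                     key=lambda kv: abs(kv[1]), reverse=True)[:top_k]
        keywords = ", ".join(tok for tok, _ in top)
        return (f"Input: {ex['text']}\n"
                f"Key words for the label: {keywords}\n"
                f"Label: {ex['label']}")

    print(format_shot(example))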
arXiv Detail & Related papers (2023-05-19T04:46:04Z)
- Learning to Generate All Feasible Actions [4.333208181196761]
We introduce action mapping, a novel approach that divides the learning process into two steps: first learning feasibility and subsequently the objective.
This paper focuses on the feasibility part by learning to generate all feasible actions through self-supervised querying of the feasibility model.
We demonstrate the agent's proficiency in generating actions across disconnected feasible action sets.
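A toy version of the querying loop, with a hand-written feasibility function over a 1-D action space standing in for the learned feasibility model (the real method trains a generator on the accepted samples):

    import numpy as np

    def feasible(a):
        # Stand-in feasibility model with two disconnected feasible intervals.
        return (0.1 < a < 0.3) or (0.7 < a < 0.9)

    # Self-supervised querying: propose candidate actions, keep those the
    # feasibility model accepts; these become targets for the generator.
    rng = np.random.default_rng(0)
    candidates = rng.uniform(0.0, 1.0, size=5000)
    mask = np.array([feasible(a) for a in candidates])
    feasible_actions = candidates[mask]

    print(f"kept {mask.sum()} of {len(candidates)} candidates")
    print("covers both modes:",
          (feasible_actions < 0.5).any() and (feasible_actions > 0.5).any())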
arXiv Detail & Related papers (2023-01-26T23:15:51Z)
- Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions.
By modelling agents as learners who continually update their beliefs about the perceived effects of their actions, we cast the policy inference problem as the inverse of this online learning problem.
We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them.
Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z)
- Behavior Priors for Efficient Reinforcement Learning [97.81587970962232]
We consider how information and architectural constraints can be combined with ideas from the probabilistic modeling literature to learn behavior priors.
We discuss how such latent variable formulations connect to related work on hierarchical reinforcement learning (HRL) and mutual information and curiosity based objectives.
We demonstrate the effectiveness of our framework by applying it to a range of simulated continuous control domains.
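The core objective such formulations build on is KL-regularized expected reward, J = E[r(s, a)] - alpha * KL(pi(.|s) || pi_0(.|s)), where pi_0 is the behavior prior. A toy PyTorch version with invented logits and rewards:

    import torch
    import torch.nn.functional as F

    logits_pi = torch.randn(4, 6, requires_grad=True)  # policy logits (4 states, 6 actions)
    logits_prior = torch.randn(4, 6)                   # behavior-prior logits (assumed pretrained)
    rewards = torch.randn(4, 6)                        # toy per-action rewards

    pi = F.softmax(logits_pi, dim=-1)
    log_pi = F.log_softmax(logits_pi, dim=-1)
    log_prior = F.log_softmax(logits_prior, dim=-1)

    alpha = 0.1
    expected_reward = (pi * rewards).sum(-1)
    kl = (pi * (log_pi - log_prior)).sum(-1)  # KL(pi || pi_0) per state
    loss = -(expected_reward - alpha * kl).mean()
    loss.backward()  # gradients flow into the policy logits
    print(f"loss = {loss.item():.3f}")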
arXiv Detail & Related papers (2020-10-27T13:17:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.