Shaking the foundations: delusions in sequence models for interaction and control
- URL: http://arxiv.org/abs/2110.10819v1
- Date: Wed, 20 Oct 2021 23:31:05 GMT
- Title: Shaking the foundations: delusions in sequence models for interaction and control
- Authors: Pedro A. Ortega, Markus Kunesch, Grégoire Delétang, Tim Genewein, Jordi Grau-Moya, Joel Veness, Jonas Buchli, Jonas Degrave, Bilal Piot, Julien Perolat, Tom Everitt, Corentin Tallec, Emilio Parisotto, Tom Erez, Yutian Chen, Scott Reed, Marcus Hutter, Nando de Freitas, Shane Legg
- Abstract summary: We show that sequence models "lack the understanding of the cause and effect of their actions", leading them to draw incorrect inferences due to auto-suggestive delusions.
We show that in supervised learning, one can teach a system to condition or intervene on data by training with factual and counterfactual error signals, respectively.
- Score: 45.34593341136043
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent phenomenal success of language models has reinvigorated machine learning research, and large sequence models such as transformers are being applied to a variety of domains. One important problem class that has remained relatively elusive, however, is purposeful adaptive behavior. Currently there is a common perception that sequence models "lack the understanding of the cause and effect of their actions", leading them to draw incorrect inferences due to auto-suggestive delusions. In this report we explain where this mismatch originates, and show that it can be resolved by treating actions as causal interventions. Finally, we show that in supervised learning, one can teach a system to condition or intervene on data by training with factual and counterfactual error signals, respectively.
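To make the abstract's recipe concrete, here is a minimal sketch (PyTorch, not the authors' implementation) of training an autoregressive sequence model with factual targets at observation positions and counterfactual targets at action positions; `SeqModel`, `expert_policy`, the vocabulary size, and the masking scheme are all illustrative assumptions.

```python
import torch
import torch.nn as nn

VOCAB, DIM = 32, 64

class SeqModel(nn.Module):
    """Tiny causal GRU model over interleaved (observation, action) tokens."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        self.rnn = nn.GRU(DIM, DIM, batch_first=True)
        self.head = nn.Linear(DIM, VOCAB)

    def forward(self, tokens):               # tokens: (B, T)
        h, _ = self.rnn(self.emb(tokens))
        return self.head(h)                  # logits: (B, T, VOCAB)

def expert_policy(context):
    """Hypothetical oracle: the action an expert would take given `context`."""
    return torch.randint(0, VOCAB, (context.shape[0],))

def training_step(model, tokens, is_action, optimiser):
    """tokens: (B, T) logged trajectory; is_action: (B, T) boolean mask."""
    logits = model(tokens[:, :-1])           # predict token t+1 from tokens <= t
    targets = tokens[:, 1:].clone()
    mask = is_action[:, 1:]
    # Factual signal for observations: the logged token is the target.
    # Counterfactual signal for actions: replace the logged action with the
    # expert's action for the same context, so the error reflects what should
    # have been done rather than what merely co-occurred in the data.
    for t in range(targets.shape[1]):
        if mask[:, t].any():
            cf = expert_policy(tokens[:, : t + 1])
            targets[mask[:, t], t] = cf[mask[:, t]]
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB), targets.reshape(-1))
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()
```

The design choice doing the work is at the action positions: the error signal comes from what the expert would do in the model's context, which teaches the model to treat its own actions as interventions rather than as evidence about the latent state of the world.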
Related papers
- Counterfactual Generative Modeling with Variational Causal Inference [1.9287470458589586]
We present a novel variational Bayesian causal inference framework to handle counterfactual generative modeling tasks.
In experiments, we demonstrate the advantage of our framework compared to state-of-the-art models in counterfactual generative modeling.
arXiv Detail & Related papers (2024-10-16T16:44:12Z)
- Implicit Causal Representation Learning via Switchable Mechanisms [11.870185425476429]
Implicit learning of causal mechanisms typically involves two categories of interventional data: hard and soft interventions.
In this paper, we tackle the challenges of learning causal models using soft interventions while retaining implicit modelling.
We propose ICLR-SM, which models the effects of soft interventions by employing a causal mechanism switch variable designed to toggle between different causal mechanisms.
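As a rough illustration of the switch-variable idea (not the ICLR-SM code; names and sizes are assumptions), a learned gate can interpolate between two candidate mechanisms for the same variable:

```python
import torch
import torch.nn as nn

class SwitchableMechanism(nn.Module):
    """Toy mechanism switch: a gate toggles between two causal mechanisms."""
    def __init__(self, dim=16):
        super().__init__()
        self.f0 = nn.Linear(dim, dim)      # mechanism without intervention
        self.f1 = nn.Linear(dim, dim)      # mechanism under a soft intervention
        self.switch = nn.Linear(dim, 1)    # infers which mechanism is active

    def forward(self, parents):
        s = torch.sigmoid(self.switch(parents))    # P(mechanism 1 | parents)
        return s * self.f1(parents) + (1 - s) * self.f0(parents)

print(SwitchableMechanism()(torch.randn(8, 16)).shape)  # torch.Size([8, 16])
```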
arXiv Detail & Related papers (2024-02-16T23:17:00Z)
- Limitations of Agents Simulated by Predictive Models [1.6649383443094403]
We outline two structural reasons why predictive models can fail when turned into agents.
We show that both failures are fixed by including a feedback loop from the environment.
Our treatment provides a unifying view of these failure modes and informs the question of why fine-tuning offline-learned policies with online learning makes them more effective.
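A minimal sketch of that feedback loop (toy interfaces, not the paper's code): rather than letting the model hallucinate its own future observations in a pure autoregressive rollout, the agent re-grounds on a real observation from the environment at every step.

```python
import random

class ToyEnv:
    """Stand-in environment: returns a fresh observation for any action."""
    def reset(self):
        return random.random()

    def step(self, action):
        return random.random()

class ToyModel:
    """Stand-in predictive model: maps a history of floats to an action."""
    def predict(self, history):
        return sum(history) % 1.0

def rollout_with_feedback(model, env, horizon):
    obs = env.reset()
    history = [obs]
    for _ in range(horizon):
        action = model.predict(history)   # the model acts...
        obs = env.step(action)            # ...but the next observation comes
        history += [action, obs]          # from the environment, not the model
    return history

print(len(rollout_with_feedback(ToyModel(), ToyEnv(), horizon=5)))  # 11
```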
arXiv Detail & Related papers (2024-02-08T17:08:08Z)
- Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations [62.48505112245388]
We take an in-depth look at the causal awareness of modern representations of agent interactions.
We show that recent representations are already partially resilient to perturbations of non-causal agents.
We propose a metric learning approach that regularizes latent representations with causal annotations.
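One way such a regularizer could look (an illustrative sketch under assumed annotations, not the paper's loss): pull together latents of scenes whose annotated causal effect is similar and push apart the rest, on top of the usual task loss.

```python
import torch

def causal_metric_loss(z, causal_effect, margin=1.0):
    """z: (B, D) latent codes; causal_effect: (B,) annotated effect sizes."""
    d_z = torch.cdist(z, z)                            # latent pairwise distances
    d_c = torch.cdist(causal_effect[:, None], causal_effect[:, None])
    similar = (d_c < d_c.median()).float()             # crude similarity labels
    # Contrastive form: similar pairs attract, dissimilar pairs repel to a margin.
    return (similar * d_z ** 2 +
            (1 - similar) * torch.clamp(margin - d_z, min=0) ** 2).mean()

print(causal_metric_loss(torch.randn(8, 32), torch.rand(8)))
```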
arXiv Detail & Related papers (2023-12-07T18:57:03Z)
- Interpretable Imitation Learning with Dynamic Causal Relations [65.18456572421702]
We propose to expose captured knowledge in the form of a directed acyclic causal graph.
We also design this causal discovery process to be state-dependent, enabling it to model the dynamics in latent causal graphs.
The proposed framework is composed of three parts: a dynamic causal discovery module, a causality encoding module, and a prediction module, and is trained in an end-to-end manner.
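A hedged structural sketch of that three-part pipeline (module names and sizes are assumptions, and the soft adjacency below does not enforce acyclicity as a real causal-discovery module would):

```python
import torch
import torch.nn as nn

class InterpretableImitator(nn.Module):
    def __init__(self, n_vars=8, dim=32, n_actions=4):
        super().__init__()
        # 1) dynamic causal discovery: state-dependent adjacency logits
        self.discover = nn.Linear(n_vars, n_vars * n_vars)
        # 2) causality encoding: mixes variables along the inferred graph
        self.encode = nn.Linear(n_vars, dim)
        # 3) prediction: maps the encoding to an action distribution
        self.predict = nn.Linear(dim, n_actions)

    def forward(self, state):                            # state: (B, n_vars)
        B, n = state.shape
        adj = torch.sigmoid(self.discover(state)).view(B, n, n)
        mixed = torch.bmm(adj, state.unsqueeze(-1)).squeeze(-1)
        return self.predict(torch.relu(self.encode(mixed))), adj

logits, graph = InterpretableImitator()(torch.randn(2, 8))
print(logits.shape, graph.shape)  # torch.Size([2, 4]) torch.Size([2, 8, 8])
```

Because all three modules are differentiable, one imitation loss on `logits` trains the pipeline end to end, as the summary describes.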
arXiv Detail & Related papers (2023-09-30T20:59:42Z)
- Towards Out-of-Distribution Sequential Event Prediction: A Causal Treatment [72.50906475214457]
The goal of sequential event prediction is to estimate the next event based on a sequence of historical events.
In practice, next-event prediction models are trained on sequential data collected at a single point in time, which leaves them exposed to out-of-distribution sequences at deployment.
We propose a framework with hierarchical branching structures for learning context-specific representations.
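A rough sketch of such hierarchical branching (illustrative, not the paper's architecture): a shared trunk feeds one branch per context, selected at run time.

```python
import torch
import torch.nn as nn

class BranchingEncoder(nn.Module):
    def __init__(self, in_dim=16, dim=32, n_contexts=3):
        super().__init__()
        self.trunk = nn.Linear(in_dim, dim)      # shared representation
        self.branches = nn.ModuleList(
            nn.Linear(dim, dim) for _ in range(n_contexts))  # context-specific heads

    def forward(self, x, context_id):
        return self.branches[context_id](torch.relu(self.trunk(x)))

enc = BranchingEncoder()
print(enc(torch.randn(2, 16), context_id=1).shape)  # torch.Size([2, 32])
```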
arXiv Detail & Related papers (2022-10-24T07:54:13Z)
- Sequential Causal Imitation Learning with Unobserved Confounders [82.22545916247269]
"Monkey see monkey do" is an age-old adage, referring to na"ive imitation without a deep understanding of a system's underlying mechanics.
This paper investigates the problem of causal imitation learning in sequential settings, where the imitator must make multiple decisions per episode.
arXiv Detail & Related papers (2022-08-12T13:53:23Z)
- Graceful Degradation and Related Fields [0.0]
Graceful degradation refers to the optimisation of model performance as it encounters out-of-distribution data.
This work presents a definition and discussion of graceful degradation and where it can be applied in deployed visual systems.
arXiv Detail & Related papers (2021-06-21T13:56:41Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
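One simple form a diversity-enforcing loss could take (an illustrative sketch, not the paper's objective): penalise a set of candidate latent perturbations for being close to one another.

```python
import torch

def diversity_loss(perturbations):
    """perturbations: (K, D) candidate latent perturbations, K >= 2."""
    sims = torch.pdist(perturbations)   # pairwise L2 distances between candidates
    return torch.exp(-sims).mean()      # small distances incur a high penalty

print(diversity_loss(torch.randn(4, 8)))
```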
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Feedback in Imitation Learning: Confusion on Causality and Covariate Shift [12.93527098342393]
We argue that conditioning policies on previous actions leads to a dramatic divergence between "held-out" error and the learner's performance in situ.
We analyze existing benchmarks used to test imitation learning approaches.
We find, in a surprising contrast with previous literature, that naive behavioral cloning provides excellent results.
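The contrast between the two policy parameterisations is easy to state in code (illustrative dimensions): naive behavioural cloning conditions on the current observation only, while the problematic variant also conditions on the previous action.

```python
import torch
import torch.nn as nn

obs_dim, act_dim = 16, 4
naive_bc = nn.Linear(obs_dim, act_dim)                  # pi(a_t | o_t)
history_policy = nn.Linear(obs_dim + act_dim, act_dim)  # pi(a_t | o_t, a_{t-1})

o_t, a_prev = torch.randn(1, obs_dim), torch.randn(1, act_dim)
print(naive_bc(o_t).shape)                                     # (1, 4)
print(history_policy(torch.cat([o_t, a_prev], dim=-1)).shape)  # (1, 4)
```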
arXiv Detail & Related papers (2021-02-04T20:18:56Z)