Outcome-Guided Counterfactuals for Reinforcement Learning Agents from a
Jointly Trained Generative Latent Space
- URL: http://arxiv.org/abs/2207.07710v1
- Date: Fri, 15 Jul 2022 19:09:54 GMT
- Authors: Eric Yeh, Pedro Sequeira, Jesse Hostetler, Melinda Gervasio
- Abstract summary: We present a novel generative method for producing unseen and plausible counterfactual examples for reinforcement learning (RL) agents.
Our approach uses a variational autoencoder to train a latent space that jointly encodes information about the observations and outcome variables pertaining to an agent's behavior.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel generative method for producing unseen and plausible
counterfactual examples for reinforcement learning (RL) agents based upon
outcome variables that characterize agent behavior. Our approach uses a
variational autoencoder to train a latent space that jointly encodes
information about the observations and outcome variables pertaining to an
agent's behavior. Counterfactuals are generated using traversals in this latent
space, via gradient-driven updates as well as latent interpolations against
cases drawn from a pool of examples. These include updates to raise the
likelihood of generated examples, which improves the plausibility of generated
counterfactuals. From experiments in three RL environments, we show that these
methods produce counterfactuals that are more plausible and proximal to their
queries compared to purely outcome-driven or case-based baselines. Finally, we
show that a latent jointly trained to reconstruct both the input observations
and behavioral outcome variables produces higher-quality counterfactuals over
latents trained solely to reconstruct the observation inputs.
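The two latent traversals the abstract mentions (gradient-driven updates and interpolation toward a case from a pool) can be illustrated with a toy sketch. This is not the authors' implementation: the linear outcome head, the function names, the step sizes, and the stopping rules below are all hypothetical stand-ins for a trained VAE and outcome predictor, and the likelihood-raising updates described in the abstract are omitted.

```python
from typing import Callable, List, Sequence, Tuple


def predict_outcome(z: Sequence[float], w: Sequence[float], b: float) -> float:
    """Toy linear outcome head (assumption): stands in for a learned predictor."""
    return sum(wi * zi for wi, zi in zip(w, z)) + b


def gradient_counterfactual(z_query: Sequence[float], w: Sequence[float], b: float,
                            target: float, lr: float = 0.1, steps: int = 200,
                            tol: float = 1e-3) -> List[float]:
    """Move a latent point by gradient descent on (f(z) - target)^2."""
    z = list(z_query)
    for _ in range(steps):
        err = predict_outcome(z, w, b) - target
        if abs(err) < tol:
            break
        # For a linear f, d/dz (f(z) - target)^2 = 2 * err * w.
        z = [zi - lr * 2.0 * err * wi for zi, wi in zip(z, w)]
    return z


def interpolation_counterfactual(z_query: Sequence[float], z_case: Sequence[float],
                                 outcome_fn: Callable[[Sequence[float]], float],
                                 reached: Callable[[float], bool],
                                 n_steps: int = 20) -> Tuple[List[float], float]:
    """Walk from the query latent toward a pool case; stop at the first
    interpolation step whose predicted outcome satisfies `reached`."""
    for k in range(1, n_steps + 1):
        alpha = k / n_steps
        z = [(1 - alpha) * q + alpha * c for q, c in zip(z_query, z_case)]
        if reached(outcome_fn(z)):
            return z, alpha
    return list(z_case), 1.0


# Hypothetical 2-D latent space and outcome weights.
w, b = [1.0, -0.5], 0.0
z_query = [0.0, 0.0]

z_grad = gradient_counterfactual(z_query, w, b, target=1.0)
z_interp, alpha = interpolation_counterfactual(
    z_query, [2.0, 0.0],
    outcome_fn=lambda z: predict_outcome(z, w, b),
    reached=lambda y: y >= 1.0,
)
```

In a real system the resulting latent would be passed through the decoder to obtain the counterfactual observation. Stopping at the first step that reaches the target outcome is one simple way to keep the counterfactual close to the query.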
Related papers
- Trajectory Forecasting through Low-Rank Adaptation of Discrete Latent Codes [36.12653178844828]
Trajectory forecasting is crucial for video surveillance analytics, as it enables the anticipation of future movements for a set of agents.
We introduce Vector Quantized Variational Autoencoders (VQ-VAEs), which utilize a discrete latent space to tackle the issue of posterior collapse.
We show that such a two-fold framework, augmented with instance-level discretization, leads to accurate and diverse forecasts.
(2024-05-31T10:13:17Z)
- Reframing the Relationship in Out-of-Distribution Detection [4.182518087792777]
We introduce a novel approach that integrates the agent paradigm into the Out-of-distribution (OOD) detection task.
Our proposed method, Concept Matching with Agent (CMA), employs neutral prompts as agents to augment the CLIP-based OOD detection process.
Our extensive experimental results showcase the superior performance of CMA over both zero-shot and training-required methods.
(2024-05-27T02:27:28Z)
- Continual Offline Reinforcement Learning via Diffusion-based Dual Generative Replay [16.269591842495892]
We study a practical paradigm that facilitates forward transfer and mitigates catastrophic forgetting to tackle sequential offline tasks.
We propose a dual generative replay framework that retains previous knowledge by concurrent replay of generated pseudo-data.
(2024-04-16T15:39:11Z)
- Time-series Generation by Contrastive Imitation [87.51882102248395]
We study a generative framework that seeks to combine the strengths of both: Motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy.
At inference, the learned policy serves as the generator for iterative sampling, and the learned energy serves as a trajectory-level measure for evaluating sample quality.
(2023-11-02T16:45:25Z)
- Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
(2021-10-24T07:58:13Z)
- Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of this behaviour on the learned representations and also the consequences of fixing it by introducing a notion of self consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
(2020-12-07T14:16:14Z)
- Disentangling Action Sequences: Discovering Correlated Samples [6.179793031975444]
We demonstrate that the data itself, rather than the factors, plays a crucial role in disentanglement, and that the disentangled representations align the latent variables with the action sequences.
We propose a novel framework, the fractional variational autoencoder (FVAE), to disentangle action sequences of differing significance step by step.
Experimental results on dSprites and 3D Chairs show that FVAE improves the stability of disentanglement.
(2020-10-17T07:37:50Z)
- Unsupervised Controllable Generation with Self-Training [90.04287577605723]
Controllable generation with GANs remains a challenging research problem.
We propose an unsupervised framework to learn a distribution of latent codes that control the generator through self-training.
Our framework exhibits better disentanglement compared to other variants such as the variational autoencoder.
(2020-07-17T21:50:35Z)
- Estimating the Effects of Continuous-valued Interventions using Generative Adversarial Networks [103.14809802212535]
We build on the generative adversarial networks (GANs) framework to address the problem of estimating the effect of continuous-valued interventions.
Our model, SCIGAN, is flexible and capable of simultaneously estimating counterfactual outcomes for several different continuous interventions.
To address the challenges presented by shifting to continuous interventions, we propose a novel architecture for our discriminator.
(2020-02-27T18:46:21Z)
- When Relation Networks meet GANs: Relation GANs with Triplet Loss [110.7572918636599]
Training stability is still a lingering concern of generative adversarial networks (GANs).
In this paper, we explore a relation network architecture for the discriminator and design a triplet loss that achieves better generalization and stability.
Experiments on benchmark datasets show that the proposed relation discriminator and new loss can provide significant improvement on a variety of vision tasks.
(2020-02-24T11:35:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.