From Pixels to Factors: Learning Independently Controllable State Variables for Reinforcement Learning
- URL: http://arxiv.org/abs/2510.02484v1
- Date: Thu, 02 Oct 2025 18:43:20 GMT
- Title: From Pixels to Factors: Learning Independently Controllable State Variables for Reinforcement Learning
- Authors: Rafael Rodriguez-Sanchez, Cameron Allen, George Konidaris
- Abstract summary: Action-Controllable Factorization (ACF) is a contrastive learning approach that uncovers independently controllable latent variables.
ACF consistently outperforms baseline disentanglement algorithms.
- Score: 10.819503014571671
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Algorithms that exploit factored Markov decision processes are far more sample-efficient than factor-agnostic methods, yet they assume a factored representation is known a priori -- a requirement that breaks down when the agent sees only high-dimensional observations. Conversely, deep reinforcement learning handles such inputs but cannot benefit from factored structure. We address this representation problem with Action-Controllable Factorization (ACF), a contrastive learning approach that uncovers independently controllable latent variables -- state components each action can influence separately. ACF leverages sparsity: actions typically affect only a subset of variables, while the rest evolve under the environment's dynamics, yielding informative data for contrastive training. ACF recovers the ground truth controllable factors directly from pixel observations on three benchmarks with known factored structure -- Taxi, FourRooms, and MiniGrid-DoorKey -- consistently outperforming baseline disentanglement algorithms.
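The abstract describes a contrastive objective over transition data. As a minimal, hypothetical sketch (not ACF's actual objective), an InfoNCE-style loss scores each anchor embedding against its own positive while treating the rest of the batch as negatives; the function name and temperature parameter are illustrative assumptions:

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss (illustrative sketch).

    Each anchor should score highest against its own positive pair,
    with the other positives in the batch serving as negatives.
    """
    # Normalize embeddings so dot products become cosine similarities.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                # (batch, batch) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Correct pairings lie on the diagonal of the similarity matrix.
    return -np.mean(np.diag(log_probs))
```

In the paper's setting, positive pairs would come from transitions where an action influenced a given latent factor; the sketch above shows only the generic contrastive scoring step.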
Related papers
- Learning in Markov Decision Processes with Exogenous Dynamics [39.6376520918509]
We study a structured class of MDPs characterized by state components whose transitions are independent of the agent's actions.
We show that exploiting this structure yields significantly improved learning guarantees.
We empirically validate our approach across classical toy settings and real-world-inspired environments.
arXiv Detail & Related papers (2026-03-03T11:10:45Z)
- Efficient Solution and Learning of Robust Factored MDPs [57.2416302384766]
Learning r-MDPs from interactions with an unknown environment enables the synthesis of robust policies with provable guarantees on performance.
We propose novel methods for solving and learning r-MDPs based on factored state representations.
arXiv Detail & Related papers (2025-08-01T15:23:15Z)
- Variable Importance in High-Dimensional Settings Requires Grouping [19.095605415846187]
Conditional Permutation Importance (CPI) bypasses PI's limitations in such cases.
Grouping variables, either statistically via clustering or using prior knowledge, recovers some of that power.
We show that the approach extended with stacking controls the type-I error even with highly-correlated groups.
arXiv Detail & Related papers (2023-12-18T00:21:47Z)
- Sample-Efficient Reinforcement Learning in the Presence of Exogenous Information [77.19830787312743]
In real-world reinforcement learning applications, the learner's observation space is typically high-dimensional, containing both relevant and irrelevant information about the task at hand.
We introduce a new problem setting for reinforcement learning, the Exogenous Decision Process (ExoMDP), in which the state space admits an (unknown) factorization into a small controllable component and a large irrelevant component.
We provide a new algorithm, ExoRL, which learns a near-optimal policy with sample complexity in the size of the endogenous component.
arXiv Detail & Related papers (2022-06-09T05:19:32Z)
- Learning Disentangled Representations for Counterfactual Regression via Mutual Information Minimization [25.864029391642422]
We propose Disentangled Representations for Counterfactual Regression via Mutual Information Minimization (MIM-DRCFR).
We use a multi-task learning framework to share information when learning the latent factors and incorporate MI minimization criteria to ensure the independence of these factors.
Experiments on public benchmarks and real-world industrial user-growth datasets demonstrate that our method outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-06-02T12:49:41Z)
- Confounder Identification-free Causal Visual Feature Learning [84.28462256571822]
We propose a novel Confounder Identification-free Causal Visual Feature Learning (CICF) method, which obviates the need for identifying confounders.
CICF models the interventions among different samples based on the front-door criterion, and then approximates the global-scope intervening effect from the instance-level interventions.
We uncover the relation between CICF and the popular meta-learning strategy MAML, and provide an interpretation of why MAML works from the theoretical perspective.
arXiv Detail & Related papers (2021-11-26T10:57:47Z)
- Visual Representation Learning Does Not Generalize Strongly Within the Same Domain [41.66817277929783]
We test whether 17 unsupervised, weakly supervised, and fully supervised representation learning approaches correctly infer the generative factors of variation in simple datasets.
We train and test 2000+ models and observe that all of them struggle to learn the underlying mechanism regardless of supervision signal and architectural bias.
arXiv Detail & Related papers (2021-07-17T11:24:18Z)
- CausalX: Causal Explanations and Block Multilinear Factor Analysis [3.087360758008569]
We propose a unified multilinear model of wholes and parts.
We introduce an incremental bottom-up computational alternative, the Incremental M-mode Block SVD.
The resulting object representation is an interpretable choice of intrinsic causal factor representations related to an object's hierarchy of wholes and parts.
arXiv Detail & Related papers (2021-02-25T13:49:01Z)
- Disentangling Observed Causal Effects from Latent Confounders using Method of Moments [67.27068846108047]
We provide guarantees on identifiability and learnability under mild assumptions.
We develop efficient algorithms based on coupled tensor decomposition with linear constraints to obtain scalable and guaranteed solutions.
arXiv Detail & Related papers (2021-01-17T07:48:45Z)
- Learning Causal Semantic Representation for Out-of-Distribution Prediction [125.38836464226092]
We propose a Causal Semantic Generative model (CSG) based on causal reasoning, so that the two factors are modeled separately.
We show that CSG can identify the semantic factor by fitting training data, and this semantic-identification guarantees the boundedness of OOD generalization error.
arXiv Detail & Related papers (2020-11-03T13:16:05Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
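The Dynamic Federated Learning summary describes rounds in which a random subset of agents performs local updates. As a minimal illustrative sketch (a FedAvg-style round on a least-squares objective; the function name, `participation` rate, and single local gradient step are assumptions, not the paper's exact scheme):

```python
import numpy as np

def federated_round(global_model, agent_data, participation=0.5, lr=0.1, rng=None):
    """One round of a simple federated-averaging scheme (illustrative sketch).

    A random subset of agents runs one local gradient step on a
    least-squares objective; the server then averages the models
    returned by the participating agents.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    active = rng.random(len(agent_data)) < participation  # random agent subset
    updates = []
    for is_active, (X, y) in zip(active, agent_data):
        if not is_active:
            continue
        # Local least-squares gradient: d/dw 0.5*||Xw - y||^2 / n
        grad = X.T @ (X @ global_model - y) / len(y)
        updates.append(global_model - lr * grad)
    if not updates:
        return global_model  # no agent participated this round
    return np.mean(updates, axis=0)
```

Repeated rounds drive the averaged model toward the common minimizer when agents share the same objective, matching the summary's picture of per-round random participation.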
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.