Information-Theoretic State Variable Selection for Reinforcement Learning
- URL: http://arxiv.org/abs/2401.11512v1
- Date: Sun, 21 Jan 2024 14:51:09 GMT
- Title: Information-Theoretic State Variable Selection for Reinforcement Learning
- Authors: Charles Westphal, Stephen Hailes, Mirco Musolesi
- Abstract summary: We introduce the Transfer Entropy Redundancy Criterion (TERC), an information-theoretic criterion.
TERC determines whether entropy is transferred from state variables to actions during training.
We define an algorithm based on TERC that provably excludes variables from the state that have no effect on the final performance of the agent.
- Score: 4.2050490361120465
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Identifying the most suitable variables to represent the state is a
fundamental challenge in Reinforcement Learning (RL). These variables must
efficiently capture the information necessary for making optimal decisions. In
order to address this problem, in this paper, we introduce the Transfer Entropy
Redundancy Criterion (TERC), an information-theoretic criterion, which
determines if there is entropy transferred from state variables to
actions during training. We define an algorithm based on TERC that provably
excludes variables from the state that have no effect on the final performance
of the agent, resulting in more sample efficient learning. Experimental results
show that this speed-up is present across three different algorithm classes
(represented by tabular Q-learning, Actor-Critic, and Proximal Policy
Optimization (PPO)) in a variety of environments. Furthermore, to highlight the
differences between the proposed methodology and the current state-of-the-art
feature selection approaches, we present a series of controlled experiments on
synthetic data, before generalizing to real-world decision-making tasks. We
also introduce a Bayesian-network representation of the problem that compactly
captures the transfer of information from state variables to actions.
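The abstract does not spell out the estimator, so the following is a loose illustration only: the quantity TERC reasons about can be approximated from logged rollouts by a plug-in conditional mutual information estimate between actions and one candidate state variable, given the remaining variables. The sketch below assumes small discrete state and action spaces; the function name and toy data are ours, not the paper's.
```python
import numpy as np
from collections import Counter

def estimate_cmi(actions, x, rest):
    """Plug-in estimate of I(A; X | Z) from paired discrete samples.

    actions, x : 1-D arrays of discrete symbols.
    rest       : 2-D array, one row per sample (the remaining state variables).
    """
    z_keys = [tuple(row) for row in rest]
    n = len(actions)
    p_axz = Counter(zip(actions, x, z_keys))
    p_az = Counter(zip(actions, z_keys))
    p_xz = Counter(zip(x, z_keys))
    p_z = Counter(z_keys)
    cmi = 0.0
    for (a, xv, z), c in p_axz.items():
        p1 = c / n
        cmi += p1 * np.log2((p1 * (p_z[z] / n)) /
                            ((p_az[(a, z)] / n) * (p_xz[(xv, z)] / n)))
    return cmi

# Toy check: the policy copies x0 and ignores x1, so only x0 should score high
# and x1 would be a candidate for exclusion from the state.
rng = np.random.default_rng(0)
s = rng.integers(0, 2, size=(5000, 2))
a = s[:, 0]
for i in range(2):
    rest = np.delete(s, i, axis=1)
    print(f"I(A; x{i} | rest) = {estimate_cmi(a, s[:, i], rest):.3f} bits")
```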
Related papers
- Preventing Local Pitfalls in Vector Quantization via Optimal Transport [77.15924044466976]
We introduce OptVQ, a novel vector quantization method that employs the Sinkhorn algorithm to solve an optimal transport problem.
Our experiments on image reconstruction tasks demonstrate that OptVQ achieves 100% codebook utilization and surpasses current state-of-the-art VQNs in reconstruction quality.
arXiv Detail & Related papers (2024-12-19T18:58:14Z)
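OptVQ's exact objective is in the paper; the sketch below shows only the generic Sinkhorn iteration it builds on, computing an entropy-regularized transport plan between encoder outputs and codebook entries under uniform marginals. The regularization strength and toy shapes are assumptions.
```python
import numpy as np

def sinkhorn(cost, eps=0.1, n_iters=200):
    """Entropy-regularized optimal transport with uniform marginals.

    cost : (n, m) pairwise cost matrix (e.g. encodings vs. code vectors).
    Returns an (n, m) transport plan with row sums ~1/n and column sums ~1/m.
    """
    n, m = cost.shape
    K = np.exp(-cost / eps)                  # Gibbs kernel
    r, c = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    v = np.ones(m)
    for _ in range(n_iters):                 # alternating scaling updates
        u = r / (K @ v)
        v = c / (K.T @ u)
    return u[:, None] * K * v[None, :]

# Toy usage: assign 8 random 2-D "encodings" to a 4-entry "codebook".
rng = np.random.default_rng(0)
z, codes = rng.normal(size=(8, 2)), rng.normal(size=(4, 2))
cost = ((z[:, None, :] - codes[None, :, :]) ** 2).sum(-1)
plan = sinkhorn(cost)
print(plan.sum(axis=1))                      # ~0.125 each: full utilization
```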
- Structural Entropy Guided Probabilistic Coding [52.01765333755793]
We propose a novel structural entropy-guided probabilistic coding model, named SEPC.
We incorporate the relationship between latent variables into the optimization by proposing a structural entropy regularization loss.
Experimental results across 12 natural language understanding tasks, including both classification and regression tasks, demonstrate the superior performance of SEPC.
arXiv Detail & Related papers (2024-12-12T00:37:53Z)
- Switchable Decision: Dynamic Neural Generation Networks [98.61113699324429]
We propose a switchable decision to accelerate inference by dynamically assigning resources for each data instance.
Our method reduces inference cost while maintaining the same accuracy.
arXiv Detail & Related papers (2024-05-07T17:44:54Z)
- State Sequences Prediction via Fourier Transform for Representation Learning [111.82376793413746]
We propose State Sequences Prediction via Fourier Transform (SPF), a novel method for learning expressive representations efficiently.
We theoretically analyze the existence of structural information in state sequences, which is closely related to policy performance and signal regularity.
Experiments demonstrate that the proposed method outperforms several state-of-the-art algorithms in terms of both sample efficiency and performance.
arXiv Detail & Related papers (2023-10-24T14:47:02Z)
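SPF's training loss is defined in the paper; as a hedged illustration of the core idea, the snippet below extracts low-frequency discrete Fourier coefficients of a state sequence, the kind of compact sequence summary a representation could be trained to predict. All names and shapes are illustrative.
```python
import numpy as np

def fourier_targets(states, k=8):
    """Low-frequency DFT coefficients of a state sequence.

    states : (T, d) array of consecutive states.
    Returns a (k, d) complex array: the first k frequency components per
    state dimension, a compact summary of the sequence's structure.
    """
    coeffs = np.fft.rfft(states, axis=0)       # (T//2 + 1, d)
    return coeffs[:k]

# Toy usage: a noisy oscillating 2-D trajectory; a representation could be
# trained to regress the (real, imag) parts of these targets.
rng = np.random.default_rng(0)
t = np.arange(128)[:, None]
states = np.concatenate([np.sin(0.2 * t), np.cos(0.2 * t)], axis=1)
states += 0.05 * rng.normal(size=states.shape)
print(fourier_targets(states).shape)           # (8, 2)
```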
- Distributionally Robust Model-based Reinforcement Learning with Large State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are complex dynamical systems with large state spaces, costly data acquisition, and the deviation of real-world dynamics from the training environment at deployment.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
arXiv Detail & Related papers (2023-09-05T13:42:11Z)
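The paper's method couples Gaussian-process dynamics models with robust planning; the fragment below sketches only the standard dual form of a KL-robust expectation, the building block of a distributionally robust Bellman backup, solved by a crude grid search over the dual temperature. The grid and radius are assumptions.
```python
import numpy as np

def kl_robust_value(values, nominal_probs, delta,
                    betas=np.logspace(-2, 2, 200)):
    """Worst-case expected value over a KL ball around a nominal model.

    Uses the dual form:
        inf_{KL(P||P0) <= delta} E_P[V]
          = max_{beta > 0} ( -beta * log E_{P0}[exp(-V/beta)] - beta * delta )
    evaluated here by grid search over the dual variable beta.
    """
    best = -np.inf
    for beta in betas:
        obj = (-beta * np.log(np.dot(nominal_probs, np.exp(-values / beta)))
               - beta * delta)
        best = max(best, obj)
    return best

# Toy usage: next-state values under a nominal transition distribution.
v = np.array([0.0, 1.0, 5.0])
p0 = np.array([0.1, 0.3, 0.6])
print(np.dot(p0, v))                 # nominal expectation: 3.3
print(kl_robust_value(v, p0, 0.1))   # pessimistic value, strictly smaller
```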
- A feature selection method based on Shapley values robust to concept shift in regression [0.0]
In this paper, we establish a direct relationship between Shapley values and prediction errors.
We show that our proposed algorithm significantly outperforms state-of-the-art feature selection methods in concept shift scenarios.
We also perform three analyses of standard situations to assess the algorithm's robustness in the absence of shifts.
arXiv Detail & Related papers (2023-04-28T11:34:59Z)
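The paper's specific Shapley-error relationship is not reproduced here; the sketch below is just the generic Monte Carlo permutation estimator of per-feature Shapley values for a model score, with the scoring model and all names assumed for illustration.
```python
import numpy as np

def shapley_importance(score_fn, n_features, n_perms=50, seed=0):
    """Monte Carlo (permutation) estimate of per-feature Shapley values.

    score_fn(mask) must evaluate a model restricted to the features where
    mask is True and return a score (higher is better).
    """
    rng = np.random.default_rng(seed)
    phi = np.zeros(n_features)
    for _ in range(n_perms):
        perm = rng.permutation(n_features)
        mask = np.zeros(n_features, dtype=bool)
        prev = score_fn(mask)                 # score with no features
        for j in perm:
            mask[j] = True
            cur = score_fn(mask)
            phi[j] += cur - prev              # marginal contribution of j
            prev = cur
    return phi / n_perms

# Toy usage: y depends on features 0 and 1 only; feature 2 is pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 2 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=500)

def score(mask):
    if not mask.any():
        return -np.mean((y - y.mean()) ** 2)
    w, *_ = np.linalg.lstsq(X[:, mask], y, rcond=None)
    return -np.mean((y - X[:, mask] @ w) ** 2)   # negative MSE

print(np.round(shapley_importance(score, 3), 2))  # ~[4, 1, 0]
```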
- Dynamic Selection in Algorithmic Decision-making [9.172670955429906]
This paper identifies and addresses dynamic selection problems in online learning algorithms with endogenous data.
A novel bias (self-fulfilling bias) arises because the endogeneity of the data influences the choice of decisions.
We propose an instrumental-variable-based algorithm to correct for the bias.
arXiv Detail & Related papers (2021-08-28T01:41:37Z)
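The paper's algorithm targets online decision-making; as a generic illustration of the instrumental-variable correction it relies on, the snippet below runs plain two-stage least squares on synthetic endogenous data, where an instrument correlated with the regressor but not with the noise removes the bias. All data and coefficients are made up.
```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
z = rng.normal(size=n)                        # instrument
u = rng.normal(size=n)                        # unobserved confounder
x = z + u + 0.5 * rng.normal(size=n)          # endogenous regressor
y = 2.0 * x + u + 0.5 * rng.normal(size=n)    # true effect of x on y is 2.0

# Naive OLS is biased because x and the error term share u.
# (No intercepts needed: all variables are zero-mean by construction.)
beta_ols = (x @ y) / (x @ x)

# Two-stage least squares: project x on z, then regress y on the projection.
x_hat = ((z @ x) / (z @ z)) * z
beta_2sls = (x_hat @ y) / (x_hat @ x_hat)

print(f"OLS:  {beta_ols:.3f}  (biased)")
print(f"2SLS: {beta_2sls:.3f}  (close to 2.0)")
```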
- More Powerful Conditional Selective Inference for Generalized Lasso by Parametric Programming [20.309302270008146]
Conditional selective inference (SI) has been studied intensively as a new statistical inference framework for data-driven hypotheses.
We propose a more powerful and general conditional SI method for a class of problems that can be converted into quadratic parametric programming.
arXiv Detail & Related papers (2021-05-11T10:12:00Z)
- Greedy Search Algorithms for Unsupervised Variable Selection: A Comparative Study [3.4888132404740797]
This paper focuses on dimensionality reduction via unsupervised variable selection.
We present a critical evaluation of seven unsupervised greedy variable selection algorithms.
We introduce and evaluate, for the first time, a lazy implementation of the variance-explained-based forward selection component analysis (FSCA) algorithm.
arXiv Detail & Related papers (2021-03-03T21:10:26Z)
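FSCA itself (and its lazy variant) is specified in the paper; the sketch below is a minimal greedy forward selection loop that repeatedly adds the variable whose inclusion explains the most variance of the full data matrix, the criterion this family of methods optimizes. The lazy-evaluation speedups studied in the paper are omitted, and the toy data are ours.
```python
import numpy as np

def forward_select(X, k):
    """Greedy unsupervised selection of k columns that best explain X.

    At each step, pick the column maximizing the variance of X explained
    by least-squares reconstruction from the chosen columns.
    """
    n, d = X.shape
    selected = []
    for _ in range(k):
        best_j, best_score = None, -np.inf
        for j in range(d):
            if j in selected:
                continue
            cols = X[:, selected + [j]]
            w, *_ = np.linalg.lstsq(cols, X, rcond=None)
            score = 1 - ((X - cols @ w) ** 2).sum() / (X ** 2).sum()
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Toy usage: 6 variables driven by 2 latent factors; 2 picks should suffice.
rng = np.random.default_rng(0)
f = rng.normal(size=(200, 2))
X = f @ rng.normal(size=(2, 6)) + 0.01 * rng.normal(size=(200, 6))
print(forward_select(X, 2))
```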
- Transfer Reinforcement Learning under Unobserved Contextual Information [16.895704973433382]
We study a transfer reinforcement learning problem where the state transitions and rewards are affected by the environmental context.
We develop a method to obtain causal bounds on the transition and reward functions using the demonstrator's data.
We propose new Q-learning and UCB-Q-learning algorithms that converge to the true value function without bias.
arXiv Detail & Related papers (2020-03-09T22:00:04Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
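As a toy rendering of the setting analyzed above (not the paper's analysis), the sketch below tracks a randomly drifting minimizer with federated averaging in which a random subset of agents performs a few local gradient steps per round; the drift scale, step size, and subset size are all assumptions. The tracking error settles at a level governed by the gradient noise, the drift, and the learning rate, consistent with the three-factor characterization quoted above.
```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, dim, mu, rounds = 20, 5, 0.1, 300
w_true = np.zeros(dim)            # non-stationary minimizer (random walk)
w = np.zeros(dim)                 # global model
errs = []
for t in range(rounds):
    w_true += 0.01 * rng.normal(size=dim)            # minimizer drifts
    active = rng.choice(n_agents, size=5, replace=False)
    updates = []
    for _ in active:                                  # agents share one
        w_i = w.copy()                                # quadratic objective
        for _ in range(3):                            # local SGD steps
            grad = (w_i - w_true) + 0.1 * rng.normal(size=dim)
            w_i -= mu * grad
        updates.append(w_i)
    w = np.mean(updates, axis=0)                      # server aggregation
    errs.append(np.linalg.norm(w - w_true))
print(f"steady-state tracking error ~ {np.mean(errs[100:]):.3f}")
```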