Information-Theoretic State Variable Selection for Reinforcement Learning
- URL: http://arxiv.org/abs/2401.11512v1
- Date: Sun, 21 Jan 2024 14:51:09 GMT
- Title: Information-Theoretic State Variable Selection for Reinforcement Learning
- Authors: Charles Westphal, Stephen Hailes, Mirco Musolesi
- Abstract summary: We introduce the Transfer Entropy Redundancy Criterion (TERC), an information-theoretic criterion.
TERC determines whether entropy is transferred from state variables to actions during training.
We define an algorithm based on TERC that provably excludes variables from the state that have no effect on the final performance of the agent.
- Score: 4.2050490361120465
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Identifying the most suitable variables to represent the state is a
fundamental challenge in Reinforcement Learning (RL). These variables must
efficiently capture the information necessary for making optimal decisions. In
order to address this problem, in this paper, we introduce the Transfer Entropy
Redundancy Criterion (TERC), an information-theoretic criterion, which
determines if there is entropy transferred from state variables to
actions during training. We define an algorithm based on TERC that provably
excludes variables from the state that have no effect on the final performance
of the agent, resulting in more sample efficient learning. Experimental results
show that this speed-up is present across three different algorithm classes
(represented by tabular Q-learning, Actor-Critic, and Proximal Policy
Optimization (PPO)) in a variety of environments. Furthermore, to highlight the
differences between the proposed methodology and the current state-of-the-art
feature selection approaches, we present a series of controlled experiments on
synthetic data, before generalizing to real-world decision-making tasks. We
also introduce a Bayesian-network representation of the problem that compactly
captures the transfer of information from state variables to actions.
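The abstract does not spell out the estimator, so the following is a loose illustration only: the quantity TERC reasons about can be approximated from logged rollouts by a plug-in conditional mutual information estimate between actions and one candidate state variable, given the remaining variables. The sketch below assumes small discrete state and action spaces; the function name and toy data are ours, not the paper's.
```python
import numpy as np
from collections import Counter

def estimate_cmi(actions, x, rest):
    """Plug-in estimate of I(A; X | Z) from paired discrete samples.

    actions, x : 1-D arrays of discrete symbols.
    rest       : 2-D array, one row per sample (the remaining state variables).
    """
    z_keys = [tuple(row) for row in rest]
    n = len(actions)
    p_axz = Counter(zip(actions, x, z_keys))
    p_az = Counter(zip(actions, z_keys))
    p_xz = Counter(zip(x, z_keys))
    p_z = Counter(z_keys)
    cmi = 0.0
    for (a, xv, z), c in p_axz.items():
        p1 = c / n
        cmi += p1 * np.log2((p1 * (p_z[z] / n)) /
                            ((p_az[(a, z)] / n) * (p_xz[(xv, z)] / n)))
    return cmi

# Toy check: the policy copies x0 and ignores x1, so only x0 should score high
# and x1 would be a candidate for exclusion from the state.
rng = np.random.default_rng(0)
s = rng.integers(0, 2, size=(5000, 2))
a = s[:, 0]
for i in range(2):
    rest = np.delete(s, i, axis=1)
    print(f"I(A; x{i} | rest) = {estimate_cmi(a, s[:, i], rest):.3f} bits")
```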
Related papers
- Preventing Local Pitfalls in Vector Quantization via Optimal Transport [77.15924044466976]
We introduce OptVQ, a novel vector quantization method that employs the Sinkhorn algorithm to solve an optimal transport problem.
Our experiments on image reconstruction tasks demonstrate that OptVQ achieves 100% codebook utilization and surpasses current state-of-the-art VQNs in reconstruction quality.
arXiv Detail & Related papers (2024-12-19T18:58:14Z)
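OptVQ's exact objective is in the paper; the sketch below shows only the generic Sinkhorn iteration it builds on, computing an entropy-regularized transport plan between encoder outputs and codebook entries under uniform marginals. The regularization strength and toy shapes are assumptions.
```python
import numpy as np

def sinkhorn(cost, eps=0.1, n_iters=200):
    """Entropy-regularized optimal transport with uniform marginals.

    cost : (n, m) pairwise cost matrix (e.g. encodings vs. code vectors).
    Returns an (n, m) transport plan with row sums ~1/n and column sums ~1/m.
    """
    n, m = cost.shape
    K = np.exp(-cost / eps)                  # Gibbs kernel
    r, c = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    v = np.ones(m)
    for _ in range(n_iters):                 # alternating scaling updates
        u = r / (K @ v)
        v = c / (K.T @ u)
    return u[:, None] * K * v[None, :]

# Toy usage: assign 8 random 2-D "encodings" to a 4-entry "codebook".
rng = np.random.default_rng(0)
z, codes = rng.normal(size=(8, 2)), rng.normal(size=(4, 2))
cost = ((z[:, None, :] - codes[None, :, :]) ** 2).sum(-1)
plan = sinkhorn(cost)
print(plan.sum(axis=1))                      # ~0.125 each: full utilization
```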
- Structural Entropy Guided Probabilistic Coding [52.01765333755793]
We propose a novel structural entropy-guided probabilistic coding model, named SEPC.
We incorporate the relationship between latent variables into the optimization by proposing a structural entropy regularization loss.
Experimental results across 12 natural language understanding tasks, including both classification and regression tasks, demonstrate the superior performance of SEPC.
arXiv Detail & Related papers (2024-12-12T00:37:53Z)
- Switchable Decision: Dynamic Neural Generation Networks [98.61113699324429]
We propose a switchable decision to accelerate inference by dynamically assigning resources for each data instance.
Our method reduces inference cost while maintaining the same accuracy.
arXiv Detail & Related papers (2024-05-07T17:44:54Z)
- State Sequences Prediction via Fourier Transform for Representation Learning [111.82376793413746]
We propose State Sequences Prediction via Fourier Transform (SPF), a novel method for learning expressive representations efficiently.
We theoretically analyze the existence of structural information in state sequences, which is closely related to policy performance and signal regularity.
Experiments demonstrate that the proposed method outperforms several state-of-the-art algorithms in terms of both sample efficiency and performance.
arXiv Detail & Related papers (2023-10-24T14:47:02Z)
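SPF's training loss is defined in the paper; as a hedged illustration of the core idea, the snippet below extracts low-frequency discrete Fourier coefficients of a state sequence, the kind of compact sequence summary a representation could be trained to predict. All names and shapes are illustrative.
```python
import numpy as np

def fourier_targets(states, k=8):
    """Low-frequency DFT coefficients of a state sequence.

    states : (T, d) array of consecutive states.
    Returns a (k, d) complex array: the first k frequency components per
    state dimension, a compact summary of the sequence's structure.
    """
    coeffs = np.fft.rfft(states, axis=0)       # (T//2 + 1, d)
    return coeffs[:k]

# Toy usage: a noisy oscillating 2-D trajectory; a representation could be
# trained to regress the (real, imag) parts of these targets.
rng = np.random.default_rng(0)
t = np.arange(128)[:, None]
states = np.concatenate([np.sin(0.2 * t), np.cos(0.2 * t)], axis=1)
states += 0.05 * rng.normal(size=states.shape)
print(fourier_targets(states).shape)           # (8, 2)
```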
- Distributionally Robust Model-based Reinforcement Learning with Large State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are complex dynamical systems with large state spaces, costly data acquisition, and the deviation of real-world dynamics from the training environment at deployment.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
arXiv Detail & Related papers (2023-09-05T13:42:11Z)
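The paper's method couples Gaussian-process dynamics models with robust planning; the fragment below sketches only the standard dual form of a KL-robust expectation, the building block of a distributionally robust Bellman backup, solved by a crude grid search over the dual temperature. The grid and radius are assumptions.
```python
import numpy as np

def kl_robust_value(values, nominal_probs, delta,
                    betas=np.logspace(-2, 2, 200)):
    """Worst-case expected value over a KL ball around a nominal model.

    Uses the dual form:
        inf_{KL(P||P0) <= delta} E_P[V]
          = max_{beta > 0} ( -beta * log E_{P0}[exp(-V/beta)] - beta * delta )
    evaluated here by grid search over the dual variable beta.
    """
    best = -np.inf
    for beta in betas:
        obj = (-beta * np.log(np.dot(nominal_probs, np.exp(-values / beta)))
               - beta * delta)
        best = max(best, obj)
    return best

# Toy usage: next-state values under a nominal transition distribution.
v = np.array([0.0, 1.0, 5.0])
p0 = np.array([0.1, 0.3, 0.6])
print(np.dot(p0, v))                 # nominal expectation: 3.3
print(kl_robust_value(v, p0, 0.1))   # pessimistic value, strictly smaller
```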
- A feature selection method based on Shapley values robust to concept shift in regression [0.0]
In this paper, we establish a direct relationship between Shapley values and prediction errors.
We show that our proposed algorithm significantly outperforms state-of-the-art feature selection methods in concept shift scenarios.
We also perform three analyses of standard situations to assess the algorithm's robustness in the absence of shifts.
arXiv Detail & Related papers (2023-04-28T11:34:59Z)
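The paper's specific Shapley-error relationship is not reproduced here; the sketch below is just the generic Monte Carlo permutation estimator of per-feature Shapley values for a model score, with the scoring model and all names assumed for illustration.
```python
import numpy as np

def shapley_importance(score_fn, n_features, n_perms=50, seed=0):
    """Monte Carlo (permutation) estimate of per-feature Shapley values.

    score_fn(mask) must evaluate a model restricted to the features where
    mask is True and return a score (higher is better).
    """
    rng = np.random.default_rng(seed)
    phi = np.zeros(n_features)
    for _ in range(n_perms):
        perm = rng.permutation(n_features)
        mask = np.zeros(n_features, dtype=bool)
        prev = score_fn(mask)                 # score with no features
        for j in perm:
            mask[j] = True
            cur = score_fn(mask)
            phi[j] += cur - prev              # marginal contribution of j
            prev = cur
    return phi / n_perms

# Toy usage: y depends on features 0 and 1 only; feature 2 is pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 2 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=500)

def score(mask):
    if not mask.any():
        return -np.mean((y - y.mean()) ** 2)
    w, *_ = np.linalg.lstsq(X[:, mask], y, rcond=None)
    return -np.mean((y - X[:, mask] @ w) ** 2)   # negative MSE

print(np.round(shapley_importance(score, 3), 2))  # ~[4, 1, 0]
```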
- Dynamic Selection in Algorithmic Decision-making [9.172670955429906]
This paper identifies and addresses dynamic selection problems in online learning algorithms with endogenous data.
A novel bias (self-fulfilling bias) arises because the endogeneity of the data influences the choice of decisions.
We propose an instrumental-variable-based algorithm to correct for the bias.
arXiv Detail & Related papers (2021-08-28T01:41:37Z)
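The paper's algorithm targets online decision-making; as a generic illustration of the instrumental-variable correction it relies on, the snippet below runs plain two-stage least squares on synthetic endogenous data, where an instrument correlated with the regressor but not with the noise removes the bias. All data and coefficients are made up.
```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
z = rng.normal(size=n)                        # instrument
u = rng.normal(size=n)                        # unobserved confounder
x = z + u + 0.5 * rng.normal(size=n)          # endogenous regressor
y = 2.0 * x + u + 0.5 * rng.normal(size=n)    # true effect of x on y is 2.0

# Naive OLS is biased because x and the error term share u.
# (No intercepts needed: all variables are zero-mean by construction.)
beta_ols = (x @ y) / (x @ x)

# Two-stage least squares: project x on z, then regress y on the projection.
x_hat = ((z @ x) / (z @ z)) * z
beta_2sls = (x_hat @ y) / (x_hat @ x_hat)

print(f"OLS:  {beta_ols:.3f}  (biased)")
print(f"2SLS: {beta_2sls:.3f}  (close to 2.0)")
```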
- More Powerful Conditional Selective Inference for Generalized Lasso by Parametric Programming [20.309302270008146]
Conditional selective inference (SI) has been studied intensively as a new statistical inference framework for data-driven hypotheses.
We propose a more powerful and general conditional SI method for a class of problems that can be converted into quadratic parametric programming.
arXiv Detail & Related papers (2021-05-11T10:12:00Z)
- Greedy Search Algorithms for Unsupervised Variable Selection: A Comparative Study [3.4888132404740797]
This paper focuses on dimensionality reduction via unsupervised variable selection.
We present a critical evaluation of seven unsupervised greedy variable selection algorithms.
We introduce and evaluate, for the first time, a lazy implementation of the variance-explained-based forward selection component analysis (FSCA) algorithm.
arXiv Detail & Related papers (2021-03-03T21:10:26Z)
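FSCA itself (and its lazy variant) is specified in the paper; the sketch below is a minimal greedy forward selection loop that repeatedly adds the variable whose inclusion explains the most variance of the full data matrix, the criterion this family of methods optimizes. The lazy-evaluation speedups studied in the paper are omitted, and the toy data are ours.
```python
import numpy as np

def forward_select(X, k):
    """Greedy unsupervised selection of k columns that best explain X.

    At each step, pick the column maximizing the variance of X explained
    by least-squares reconstruction from the chosen columns.
    """
    n, d = X.shape
    selected = []
    for _ in range(k):
        best_j, best_score = None, -np.inf
        for j in range(d):
            if j in selected:
                continue
            cols = X[:, selected + [j]]
            w, *_ = np.linalg.lstsq(cols, X, rcond=None)
            score = 1 - ((X - cols @ w) ** 2).sum() / (X ** 2).sum()
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Toy usage: 6 variables driven by 2 latent factors; 2 picks should suffice.
rng = np.random.default_rng(0)
f = rng.normal(size=(200, 2))
X = f @ rng.normal(size=(2, 6)) + 0.01 * rng.normal(size=(200, 6))
print(forward_select(X, 2))
```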
- Transfer Reinforcement Learning under Unobserved Contextual Information [16.895704973433382]
We study a transfer reinforcement learning problem where the state transitions and rewards are affected by the environmental context.
We develop a method to obtain causal bounds on the transition and reward functions using the demonstrator's data.
We propose new Q-learning and UCB-Q-learning algorithms that converge to the true value function without bias.
arXiv Detail & Related papers (2020-03-09T22:00:04Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
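As a toy rendering of the setting analyzed above (not the paper's analysis), the sketch below tracks a randomly drifting minimizer with federated averaging in which a random subset of agents performs a few local gradient steps per round; the drift scale, step size, and subset size are all assumptions. The tracking error settles at a level governed by the gradient noise, the drift, and the learning rate, consistent with the three-factor characterization quoted above.
```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, dim, mu, rounds = 20, 5, 0.1, 300
w_true = np.zeros(dim)            # non-stationary minimizer (random walk)
w = np.zeros(dim)                 # global model
errs = []
for t in range(rounds):
    w_true += 0.01 * rng.normal(size=dim)            # minimizer drifts
    active = rng.choice(n_agents, size=5, replace=False)
    updates = []
    for _ in active:                                  # agents share one
        w_i = w.copy()                                # quadratic objective
        for _ in range(3):                            # local SGD steps
            grad = (w_i - w_true) + 0.1 * rng.normal(size=dim)
            w_i -= mu * grad
        updates.append(w_i)
    w = np.mean(updates, axis=0)                      # server aggregation
    errs.append(np.linalg.norm(w - w_true))
print(f"steady-state tracking error ~ {np.mean(errs[100:]):.3f}")
```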