Causal Reinforcement Learning: An Instrumental Variable Approach
- URL: http://arxiv.org/abs/2103.04021v1
- Date: Sat, 6 Mar 2021 03:57:46 GMT
- Title: Causal Reinforcement Learning: An Instrumental Variable Approach
- Authors: Jin Li and Ye Luo and Xiaowei Zhang
- Abstract summary: We show that the dynamic interaction between data generation and data analysis leads to a new type of bias -- reinforcement bias -- that exacerbates the endogeneity problem in standard data analysis.
A key contribution of the paper is the development of new techniques that allow for the analysis of the algorithms in general settings where noises feature time-dependency.
- Score: 8.881788084913147
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In the standard data analysis framework, data is first collected (once
and for all), and then data analysis is carried out. With the advancement of digital
technology, decision-makers constantly analyze past data and generate new data
through the decisions they make. In this paper, we model this as a Markov
decision process and show that the dynamic interaction between data generation
and data analysis leads to a new type of bias -- reinforcement bias -- that
exacerbates the endogeneity problem in standard data analysis.
We propose a class of instrumental variable (IV)-based reinforcement learning
(RL) algorithms to correct for the bias and establish their asymptotic
properties by incorporating them into a two-timescale stochastic approximation
framework. A key contribution of the paper is the development of new techniques
that allow for the analysis of the algorithms in general settings where noises
feature time-dependency.
We use the techniques to derive sharper results on finite-time trajectory
stability bounds: with a polynomial rate, the entire future trajectory of the
iterates from the algorithm falls within a ball that is centered at the true
parameter and is shrinking at a (different) polynomial rate. We also use the
techniques to provide formulas for statistical inference, which is rarely
carried out for RL algorithms. These formulas highlight how the strength of the IV and the degree
of the noise's time dependency affect the inference.
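The core idea of IV-based stochastic approximation can be illustrated on a toy endogenous linear model. The sketch below is a minimal illustration of the mechanism, not the paper's actual two-timescale algorithm; the data-generating process, step-size schedule, and all names are assumptions for exposition:

```python
import numpy as np

rng = np.random.default_rng(0)
theta_true = 2.0

theta_iv, theta_ols = 0.0, 0.0
for k in range(200_000):
    u = rng.normal()        # unobserved confounder
    z = rng.normal()        # instrument: correlated with x, independent of the error
    x = z + u               # regressor is endogenous (correlated with the error u)
    y = theta_true * x + u  # outcome; u acts as the error term

    alpha = 1.0 / (k + 10)  # Robbins-Monro step size
    # IV update: correlate the residual with the instrument, not the regressor
    theta_iv += alpha * z * (y - theta_iv * x)
    # Naive (OLS-style) update: converges to a biased limit under endogeneity
    theta_ols += alpha * x * (y - theta_ols * x)

print(theta_iv, theta_ols)  # IV estimate ≈ 2.0; naive estimate ≈ 2.5 (biased)
```

The naive update solves E[x(y - θx)] = 0 and absorbs the correlation between x and the error, while the IV update solves E[z(y - θx)] = 0, whose root is the true parameter.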
Related papers
- Towards stable real-world equation discovery with assessing differentiating quality influence [52.2980614912553]
We propose alternatives to the commonly used finite differences-based method.
We evaluate these methods in terms of their applicability to problems similar to real-world ones, and their ability to ensure the convergence of equation discovery algorithms.
arXiv Detail & Related papers (2023-11-09T23:32:06Z)
- An analysis of Universal Differential Equations for data-driven discovery of Ordinary Differential Equations [7.48176340790825]
We make a contribution by testing the UDE framework in the context of Ordinary Differential Equations (ODEs) discovery.
We highlight some of the issues arising when combining data-driven approaches and numerical solvers.
We believe that our analysis represents a significant contribution in investigating the capabilities and limitations of Physics-informed Machine Learning frameworks.
arXiv Detail & Related papers (2023-06-17T12:26:50Z)
- Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
arXiv Detail & Related papers (2023-04-08T07:55:36Z)
- Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
- Deep Active Learning with Noise Stability [24.54974925491753]
Uncertainty estimation for unlabeled data is crucial to active learning.
We propose a novel algorithm that leverages noise stability to estimate data uncertainty.
Our method is generally applicable in various tasks, including computer vision, natural language processing, and structural data analysis.
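One plausible reading of the noise-stability idea is to perturb the model's weights with small Gaussian noise and treat the resulting output variability as the uncertainty score. The sketch below is a hypothetical illustration, not the authors' implementation; the model, names, and hyperparameters are assumptions:

```python
import numpy as np

def noise_stability_uncertainty(w, x, sigma=0.1, n_perturb=200, seed=0):
    """Uncertainty of a linear-sigmoid model at input x, measured as the
    standard deviation of its output under small Gaussian weight noise."""
    rng = np.random.default_rng(seed)
    outputs = []
    for _ in range(n_perturb):
        w_noisy = w + sigma * rng.normal(size=w.shape)
        outputs.append(1.0 / (1.0 + np.exp(-w_noisy @ x)))
    return float(np.std(outputs))

w = np.array([1.0, 0.0])
near_boundary = np.array([0.0, 1.0])      # w @ x = 0: model is unsure
far_from_boundary = np.array([5.0, 0.0])  # w @ x = 5: model is confident

# Points near the decision boundary are less stable under weight noise,
# so an active learner would prioritize them for labeling.
print(noise_stability_uncertainty(w, near_boundary) >
      noise_stability_uncertainty(w, far_from_boundary))  # True
```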
arXiv Detail & Related papers (2022-05-26T13:21:01Z)
- DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework to solve the above two challenges simultaneously.
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
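Mixup itself is a standard augmentation technique; below is a minimal sketch of how a single client could mix its local samples. The function name and hyperparameters are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def local_mixup(x, y, alpha=0.4, seed=0):
    """Mix each local sample with a randomly chosen partner from the same
    client's batch; labels are mixed with the same coefficient."""
    rng = np.random.default_rng(seed)
    lam = rng.beta(alpha, alpha, size=(len(x), 1))  # per-sample mixing weights
    perm = rng.permutation(len(x))                  # random partners
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y + (1 - lam) * y[perm]
    return x_mix, y_mix

x = np.array([[0.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0]])
x_mix, y_mix = local_mixup(x, y)
# Every mixed point is a convex combination of two originals, so it stays
# on the segment between them.
print(np.all((x_mix >= 0.0) & (x_mix <= 1.0)))  # True
```

Because the mixing happens inside each client, no raw data crosses organizational boundaries.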
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
- A Priori Denoising Strategies for Sparse Identification of Nonlinear Dynamical Systems: A Comparative Study [68.8204255655161]
We investigate and compare the performance of several local and global smoothing techniques to a priori denoise the state measurements.
We show that, in general, global methods, which use the entire measurement data set, outperform local methods, which employ a neighboring data subset around a local point.
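The local-vs-global distinction can be illustrated with a moving average as the local method and Fourier truncation as the global one. This is a sketch of the distinction under assumed parameters, not the specific techniques compared in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 512)
clean = np.sin(t)
noisy = clean + 0.3 * rng.normal(size=t.size)

# Local: moving average uses only a neighborhood around each point.
window = 11
local = np.convolve(noisy, np.ones(window) / window, mode="same")

# Global: Fourier truncation uses the entire measurement set at once.
coeffs = np.fft.rfft(noisy)
coeffs[10:] = 0.0  # keep only the lowest frequencies
global_ = np.fft.irfft(coeffs, n=t.size)

mse = lambda est: float(np.mean((est - clean) ** 2))
print(mse(noisy), mse(local), mse(global_))  # both smoothers beat the raw signal
```

The global method can exploit structure across the whole record (here, that the signal is low-frequency), which is one intuition for why global smoothers tend to win in the paper's comparison.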
arXiv Detail & Related papers (2022-01-29T23:31:25Z)
- Dynamic Selection in Algorithmic Decision-making [9.172670955429906]
This paper identifies and addresses dynamic selection problems in online learning algorithms with endogenous data.
A novel bias (self-fulfilling bias) arises because the endogeneity of the data influences the choices of decisions.
We propose an instrumental-variable-based algorithm to correct for the bias.
arXiv Detail & Related papers (2021-08-28T01:41:37Z)
- Towards Handling Uncertainty-at-Source in AI -- A Review and Next Steps for Interval Regression [6.166295570030645]
This paper focuses on linear regression for interval-valued data as a recent growth area.
We conduct an in-depth analysis of state-of-the-art methods, elucidating their behaviour, advantages, and pitfalls when applied to datasets with different properties.
arXiv Detail & Related papers (2021-04-15T05:31:10Z)
- Online Robust and Adaptive Learning from Data Streams [22.319483572757097]
In online learning, it is necessary to be robust to outliers and to adapt quickly to changes in the underlying data-generating mechanism.
In this paper, we refer to the former attribute of online learning algorithms as robustness and to the latter as adaptivity.
We propose a novel stochastic approximation-based robustness-adaptivity algorithm (SRA) to evaluate the tradeoff.
arXiv Detail & Related papers (2020-07-23T17:49:04Z)
- Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Stochastic optimization is central to modern machine learning, but the role of noise in its success is still unclear.
We show that heavy tails commonly arise in the parameters due to multiplicative noise.
A detailed analysis describes key factors, including step size and data, and shows that state-of-the-art neural network models exhibit similar heavy-tailed behavior.
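A classical mechanism by which multiplicative noise produces heavy tails is the Kesten-type recursion x_{t+1} = a_t x_t + b_t, where the multiplicative factor contracts on average but occasionally expands. The simulation below is an illustrative sketch of this phenomenon, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Multiplicative factor: contracting on average (E[log a] = -0.1 < 0),
# but expanding (a > 1) with positive probability.
x = np.zeros(n)
for _ in range(300):                      # iterate to near-stationarity
    a = np.exp(rng.normal(-0.1, 0.5, n))  # multiplicative noise
    b = rng.normal(0.0, 1.0, n)           # additive noise
    x = a * x + b

# Heavy tails: a far larger share of samples lies many medians out
# than a Gaussian would allow.
scale = np.median(np.abs(x))
tail_frac = np.mean(np.abs(x) > 10 * scale)
g = np.abs(rng.normal(size=n))
gauss_frac = np.mean(g > 10 * np.median(g))
print(tail_frac, gauss_frac)  # heavy-tailed fraction far exceeds the Gaussian one
```

Even though both noise sources are Gaussian, the stationary distribution of the iterates has power-law tails, which is the qualitative effect the paper studies in stochastic optimizers.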
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.