Doubly Inhomogeneous Reinforcement Learning
- URL: http://arxiv.org/abs/2211.03983v1
- Date: Tue, 8 Nov 2022 03:41:14 GMT
- Title: Doubly Inhomogeneous Reinforcement Learning
- Authors: Liyuan Hu, Mengbing Li, Chengchun Shi, Zhenke Wu, and Piotr Fryzlewicz
- Abstract summary: We propose an original algorithm to determine the "best data chunks" that display similar dynamics over time and across individuals for policy learning.
Our method is general, and works with a wide range of clustering and change point detection algorithms.
- Score: 4.334006170547247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies reinforcement learning (RL) in doubly inhomogeneous
environments under temporal non-stationarity and subject heterogeneity. In a
number of applications, it is commonplace to encounter datasets generated by
system dynamics that may change over time and population, challenging
high-quality sequential decision making. Nonetheless, most existing RL
solutions require either temporal stationarity or subject homogeneity, which
would result in sub-optimal policies if both assumptions were violated. To
address both challenges simultaneously, we propose an original algorithm to
determine the "best data chunks" that display similar dynamics over time and
across individuals for policy learning, which alternates between most recent
change point detection and cluster identification. Our method is general, and
works with a wide range of clustering and change point detection algorithms. It
is multiply robust in the sense that it takes multiple initial estimators as
input and only requires one of them to be consistent. Moreover, by borrowing
information over time and population, it allows us to detect weaker signals and
has better convergence properties when compared to applying the clustering
algorithm per time or the change point detection algorithm per subject.
Empirically, we demonstrate the usefulness of our method through extensive
simulations and a real data application.
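The alternation between cluster identification and most recent change point detection described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a toy stand-in that assumes one scalar signal per subject and time step, exactly two clusters, and at most one mean shift per cluster. The helper names (`two_means_1d`, `last_change_point`, `alternate`), the 2-means clustering, the least-squares split, and the 50% cost-reduction threshold are all assumptions chosen for illustration.

```python
import numpy as np


def two_means_1d(x, n_iter=50):
    """Tiny 1-D 2-means (illustrative); label 0 is the lower-valued cluster."""
    centers = np.array([x.min(), x.max()], dtype=float)
    labels = np.zeros(len(x), dtype=int)
    for _ in range(n_iter):
        # Assign each point to the nearer center, then update the centers.
        labels = (np.abs(x - centers[1]) < np.abs(x - centers[0])).astype(int)
        for c in (0, 1):
            if np.any(labels == c):
                centers[c] = x[labels == c].mean()
    return labels


def last_change_point(block, min_seg=5, shrink=0.5):
    """Least-squares single split of the series averaged over a cluster.

    Averaging over subjects pools information across the cluster.
    Returns 0 (use all data) unless the best split cuts the squared
    error below `shrink` times the no-split error (assumed threshold).
    """
    m = block.mean(axis=0)
    T = len(m)
    no_split = ((m - m.mean()) ** 2).sum()
    best_tau, best_cost = 0, no_split
    for tau in range(min_seg, T - min_seg + 1):
        left, right = m[:tau], m[tau:]
        cost = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if cost < best_cost:
            best_tau, best_cost = tau, cost
    return best_tau if best_cost < shrink * no_split else 0


def alternate(data, max_iter=10):
    """Alternate clustering and change point detection until both stabilise."""
    n = data.shape[0]
    labels = np.zeros(n, dtype=int)
    cps = np.zeros(2, dtype=int)  # one change point per cluster
    for _ in range(max_iter):
        # Cluster subjects on the mean of their data after the current change point.
        feats = np.array([data[i, cps[labels[i]]:].mean() for i in range(n)])
        new_labels = two_means_1d(feats)
        # Re-estimate each cluster's change point from its pooled subjects.
        new_cps = cps.copy()
        for c in (0, 1):
            if np.any(new_labels == c):
                new_cps[c] = last_change_point(data[new_labels == c])
        converged = np.array_equal(new_labels, labels) and np.array_equal(new_cps, cps)
        labels, cps = new_labels, new_cps
        if converged:
            break
    return labels, cps


# Synthetic data: 10 subjects per cluster, 60 time steps, one mean shift each.
rng = np.random.default_rng(0)
t = np.arange(60)
mean0 = np.where(t < 30, 0.0, 2.0)  # cluster 0 shifts at t = 30
mean1 = np.where(t < 20, 5.0, 8.0)  # cluster 1 shifts at t = 20
means = np.vstack([np.tile(mean0, (10, 1)), np.tile(mean1, (10, 1))])
data = means + rng.normal(scale=0.5, size=means.shape)

labels, cps = alternate(data)
print("cluster labels:", labels)
print("change points per cluster:", cps)
```

In this sketch the "best data chunk" for cluster `c` would be `data[labels == c, cps[c]:]`, that is, the observations of the subjects in that cluster after its estimated change point. A policy learner would then be fit on that chunk only.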
Related papers
- Detection of Anomalies in Multivariate Time Series Using Ensemble Techniques [3.2422067155309806]
We propose an ensemble technique that combines multiple base models toward the final decision.
A semi-supervised approach using a Logistic Regressor to combine the base models' outputs is also proposed.
The performance improvement in terms of anomaly detection accuracy reaches 2% for the unsupervised and at least 10% for the semi-supervised models.
arXiv Detail & Related papers (2023-08-06T17:51:22Z)
- Implicit neural representation for change detection [15.741202788959075]
Most commonly used approaches to detecting changes in point clouds are based on supervised methods.
We propose an unsupervised approach that comprises two components: Implicit Neural Representation (INR) for continuous shape reconstruction and a Gaussian Mixture Model for categorising changes.
We apply our method to a benchmark dataset comprising simulated LiDAR point clouds for urban sprawling.
arXiv Detail & Related papers (2023-07-28T09:26:00Z)
- Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower time upper bound complexity and reduces the memory complexity from O(n^2) to O(kn) with k << n.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
- Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models [61.10851158749843]
Key insights can be obtained by discovering lead-lag relationships inherent in the data.
We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models.
arXiv Detail & Related papers (2023-05-11T10:30:35Z)
- Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
- Granger Causality Based Hierarchical Time Series Clustering for State Estimation [8.384689499720515]
Clustering is useful when working with a large volume of unlabeled data.
We propose a hierarchical time series clustering technique based on symbolic dynamic filtering and Granger causality.
A new distance metric based on Granger causality is proposed and used for the time series clustering, as well as validated on empirical data sets.
arXiv Detail & Related papers (2021-04-09T06:14:54Z)
- Conjugate Mixture Models for Clustering Multimodal Data [24.640116037967985]
The problem of multimodal clustering arises whenever the data are gathered with several physically different sensors.
We show that multimodal clustering can be addressed within a novel framework, namely conjugate mixture models.
arXiv Detail & Related papers (2020-12-09T10:13:22Z)
- From Time Series to Euclidean Spaces: On Spatial Transformations for Temporal Clustering [5.220940151628734]
We show that neither traditional clustering methods, nor time-series-specific methods, nor even deep learning-based alternatives generalise well when both varying sampling rates and high dimensionality are present in the input data.
We propose a novel approach to temporal clustering, in which we transform the input time series into a distance-based projected representation.
arXiv Detail & Related papers (2020-10-02T09:08:16Z)
- TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs).
To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics.
To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
arXiv Detail & Related papers (2020-09-16T15:52:04Z)
- FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as the Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)
- Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal Clustering and Large-Scale Heterogeneous Environment Synthesis [76.46004354572956]
We introduce an unsupervised domain adaptation approach for person re-identification.
Experimental results show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance.
arXiv Detail & Related papers (2020-01-14T17:43:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.