Doubly Inhomogeneous Reinforcement Learning
- URL: http://arxiv.org/abs/2211.03983v3
- Date: Sun, 16 Mar 2025 17:25:47 GMT
- Title: Doubly Inhomogeneous Reinforcement Learning
- Authors: Liyuan Hu, Mengbing Li, Chengchun Shi, Zhenke Wu, Piotr Fryzlewicz
- Abstract summary: We propose an original algorithm to determine the "best data chunks" that display similar dynamics over time and across individuals for policy learning. Our method is general and works with a wide range of clustering and change point detection algorithms.
- Score: 9.131331650922878
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies reinforcement learning (RL) in doubly inhomogeneous environments under temporal non-stationarity and subject heterogeneity. In a number of applications, it is commonplace to encounter datasets generated by system dynamics that may change over time and population, challenging high-quality sequential decision making. Nonetheless, most existing RL solutions require either temporal stationarity or subject homogeneity, which would result in sub-optimal policies if both assumptions were violated. To address both challenges simultaneously, we propose an original algorithm to determine the "best data chunks" that display similar dynamics over time and across individuals for policy learning, which alternates between most recent change point detection and cluster identification. Our method is general, and works with a wide range of clustering and change point detection algorithms. It is multiply robust in the sense that it takes multiple initial estimators as input and only requires one of them to be consistent. Moreover, by borrowing information over time and population, it allows us to detect weaker signals and has better convergence properties when compared to applying the clustering algorithm per time or the change point detection algorithm per subject. Empirically, we demonstrate the usefulness of our method through extensive simulations and a real data application.
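The alternation described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes scalar, noise-free observations per subject, swaps in a one-dimensional 2-means step for clustering and a scaled mean-difference scan for change point detection, and every name here (`two_means`, `most_recent_change_point`, `alternate`, `threshold`) is hypothetical.

```python
from statistics import mean


def two_means(values, n_iter=20):
    """Toy 1-D 2-means: returns a 0/1 label per value (assumed clustering step)."""
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(n_iter):
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        g0 = [v for v, lab in zip(values, labels) if lab == 0]
        g1 = [v for v, lab in zip(values, labels) if lab == 1]
        new_lo = mean(g0) if g0 else lo
        new_hi = mean(g1) if g1 else hi
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return labels


def most_recent_change_point(series, threshold):
    """Return the start index of the last segment with no detected mean shift."""
    start = 0
    while True:
        n = len(series) - start
        best_k, best_stat = None, 0.0
        # Keep at least two points on each side of a candidate split.
        for k in range(start + 2, len(series) - 1):
            left, right = series[start:k], series[k:]
            weight = (len(left) * len(right) / n) ** 0.5
            stat = weight * abs(mean(left) - mean(right))
            if stat > best_stat:
                best_k, best_stat = k, stat
        if best_k is None or best_stat <= threshold:
            return start
        start = best_k  # keep searching to the right of the detected change


def alternate(data, threshold=1.0, n_iter=10):
    """Alternate clustering and pooled change point detection until stable."""
    n, horizon = len(data), len(data[0])
    tau = [0] * n  # per-subject start of the data treated as "recent"
    labels = [0] * n
    for _ in range(n_iter):
        # Cluster subjects using only data after their current change point.
        summaries = [mean(row[t0:]) for row, t0 in zip(data, tau)]
        new_labels = two_means(summaries)
        # Pool subjects within each cluster, then locate the latest change.
        new_tau = list(tau)
        for c in set(new_labels):
            members = [i for i in range(n) if new_labels[i] == c]
            pooled = [mean([data[i][t] for i in members]) for t in range(horizon)]
            cp = most_recent_change_point(pooled, threshold)
            for i in members:
                new_tau[i] = cp
        if (new_labels, new_tau) == (labels, tau):
            break
        labels, tau = new_labels, new_tau
    return labels, tau


# Two groups of subjects whose mean shifts at different times (synthetic data).
group_a = [0.0] * 12 + [2.0] * 8
group_b = [0.0] * 6 + [-2.0] * 14
data = [list(group_a) for _ in range(3)] + [list(group_b) for _ in range(3)]
labels, tau = alternate(data)
```

On this synthetic input the loop separates the two groups and assigns each subject the start index of its group's last stable segment. Pooling subjects within a cluster before scanning for a change is the part meant to mirror the abstract's idea of borrowing information across the population.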
Related papers
- A system identification approach to clustering vector autoregressive time series [50.66782357329375]
Clustering time series based on their underlying dynamics continues to attract researchers because of its value in modelling complex systems. Most current time series clustering methods handle only scalar time series, treat them as white noise, or rely on domain knowledge for high-quality feature construction. Instead of relying on feature/metric construction, the system identification approach allows treating vector time series clustering by explicitly considering their underlying autoregressive dynamics.
arXiv Detail & Related papers (2025-05-20T14:31:44Z) - FCPCA: Fuzzy clustering of high-dimensional time series based on common principal component analysis [11.138320457692288]
This work introduces a novel fuzzy clustering approach based on common principal component analysis. We show that our proposed clustering method outperforms several existing approaches in the literature. An interesting application involving brain signals from different drivers recorded from a simulated driving experiment illustrates the potential of the approach.
arXiv Detail & Related papers (2025-05-12T06:59:17Z) - Detection of Anomalies in Multivariate Time Series Using Ensemble Techniques [3.2422067155309806]
We propose an ensemble technique that combines multiple base models toward the final decision.
A semi-supervised approach using a Logistic Regressor to combine the base models' outputs is also proposed.
The performance improvement in terms of anomaly detection accuracy reaches 2% for the unsupervised and at least 10% for the semi-supervised models.
arXiv Detail & Related papers (2023-08-06T17:51:22Z) - Implicit neural representation for change detection [15.741202788959075]
Most commonly used approaches to detecting changes in point clouds are based on supervised methods.
We propose an unsupervised approach that comprises two components: Implicit Neural Representation (INR) for continuous shape reconstruction and a Gaussian Mixture Model for categorising changes.
We apply our method to a benchmark dataset comprising simulated LiDAR point clouds for urban sprawling.
arXiv Detail & Related papers (2023-07-28T09:26:00Z) - Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower upper bound on time complexity and reduces the memory complexity from O(n^2) to O(kn) with k ≪ n.
arXiv Detail & Related papers (2023-07-26T16:19:19Z) - Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models [61.10851158749843]
Key insights can be obtained by discovering lead-lag relationships inherent in the data.
We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models.
arXiv Detail & Related papers (2023-05-11T10:30:35Z) - Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z) - Granger Causality Based Hierarchical Time Series Clustering for State Estimation [8.384689499720515]
Clustering is useful when working with a large volume of unlabeled data.
We propose a hierarchical time series clustering technique based on symbolic dynamic filtering and Granger causality.
A new distance metric based on Granger causality is proposed and used for the time series clustering, as well as validated on empirical data sets.
arXiv Detail & Related papers (2021-04-09T06:14:54Z) - Conjugate Mixture Models for Clustering Multimodal Data [24.640116037967985]
The problem of multimodal clustering arises whenever the data are gathered with several physically different sensors.
We show that multimodal clustering can be addressed within a novel framework, namely conjugate mixture models.
arXiv Detail & Related papers (2020-12-09T10:13:22Z) - From Time Series to Euclidean Spaces: On Spatial Transformations for Temporal Clustering [5.220940151628734]
We show that neither traditional clustering methods, time-series-specific methods, nor even deep learning-based alternatives generalise well when both varying sampling rates and high dimensionality are present in the input data.
We propose a novel approach to temporal clustering, in which we transform the input time series into a distance-based projected representation.
arXiv Detail & Related papers (2020-10-02T09:08:16Z) - TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs).
To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics.
To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
arXiv Detail & Related papers (2020-09-16T15:52:04Z) - FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as the Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z) - Learning to Accelerate Heuristic Searching for Large-Scale Maximum Weighted b-Matching Problems in Online Advertising [51.97494906131859]
Bipartite b-matching is fundamental in algorithm design and has been widely applied in economic markets, labor markets, etc.
Existing exact and approximate algorithms usually fail in such settings because they require either intolerable running time or too many computational resources.
We propose NeuSearcher, which leverages the knowledge learned from previous instances to solve new problem instances.
arXiv Detail & Related papers (2020-05-09T02:48:23Z) - Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal Clustering and Large-Scale Heterogeneous Environment Synthesis [76.46004354572956]
We introduce an unsupervised domain adaptation approach for person re-identification.
Experimental results show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance.
arXiv Detail & Related papers (2020-01-14T17:43:52Z)