Related papers: Optimal Sampling Designs for Multi-dimensional Streaming Time Series with Application to Power Grid Sensor Data

Optimal Sampling Designs for Multi-dimensional Streaming Time Series with Application to Power Grid Sensor Data

URL: http://arxiv.org/abs/2303.08242v1
Date: Tue, 14 Mar 2023 21:26:30 GMT
Title: Optimal Sampling Designs for Multi-dimensional Streaming Time Series with Application to Power Grid Sensor Data
Authors: Rui Xie, Shuyang Bai and Ping Ma
Abstract summary: We study the data-dependent sample selection and online inference problem for a multi-dimensional streaming time series. Inspired by D-optimality criterion in design of experiments, we propose a class of online data reduction methods. We show that the optimal solution amounts to a strategy that is a mixture of Bernoulli sampling and leverage score sampling.
Score: 4.891140022708977
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: The Internet of Things (IoT) system generates massive high-speed temporally correlated streaming data and is often connected with online inference tasks under computational or energy constraints. Online analysis of these streaming time series data often faces a trade-off between statistical efficiency and computational cost. One important approach to balance this trade-off is sampling, where only a small portion of the sample is selected for the model fitting and update. Motivated by the demands of dynamic relationship analysis of IoT system, we study the data-dependent sample selection and online inference problem for a multi-dimensional streaming time series, aiming to provide low-cost real-time analysis of high-speed power grid electricity consumption data. Inspired by D-optimality criterion in design of experiments, we propose a class of online data reduction methods that achieve an optimal sampling criterion and improve the computational efficiency of the online analysis. We show that the optimal solution amounts to a strategy that is a mixture of Bernoulli sampling and leverage score sampling. The leverage score sampling involves auxiliary estimations that have a computational advantage over recursive least squares updates. Theoretical properties of the auxiliary estimations involved are also discussed. When applied to European power grid consumption data, the proposed leverage score based sampling methods outperform the benchmark sampling method in online estimation and prediction. The general applicability of the sampling-assisted online estimation method is assessed via simulation studies.

Related papers

Parallel simulation for sampling under isoperimetry and score-based diffusion models [56.39904484784127]
As data size grows, reducing the iteration cost becomes an important goal. Inspired by the success of the parallel simulation of the initial value problem in scientific computation, we propose parallel Picard methods for sampling tasks. Our work highlights the potential advantages of simulation methods in scientific computation for dynamics-based sampling and diffusion models.
arXiv Detail & Related papers (2024-12-10T11:50:46Z)
A Distribution-Aware Flow-Matching for Generating Unstructured Data for Few-Shot Reinforcement Learning [1.0709300917082865]
We introduce a distribution-aware flow matching, designed to generate synthetic unstructured data tailored for few-shot reinforcement learning (RL) on embedded processors. We apply feature weighting through Random Forests to prioritize critical data aspects, thereby improving the precision of the generated synthetic data. Our method provides a stable convergence based on max Q-value while enhancing frame rate by 30% in the very beginning first timestamps.
arXiv Detail & Related papers (2024-09-21T15:50:59Z)
Automatic AI Model Selection for Wireless Systems: Online Learning via Digital Twinning [50.332027356848094]
AI-based applications are deployed at intelligent controllers to carry out functionalities like scheduling or power control. The mapping between context and AI model parameters is ideally done in a zero-shot fashion. This paper introduces a general methodology for the online optimization of AMS mappings.
arXiv Detail & Related papers (2024-06-22T11:17:50Z)
Energy-Efficient Federated Edge Learning with Streaming Data: A Lyapunov Optimization Approach [34.00679567444125]
We develop a dynamic scheduling and resource allocation algorithm to address the inherent randomness in data arrivals and resource availability under long-term energy constraints. Our proposed algorithm makes adaptive decisions on device scheduling, computational capacity adjustment, and allocation of bandwidth and transmit power in every round. The effectiveness of our scheme is verified through simulation results, demonstrating improved learning performance and energy efficiency as compared to baseline schemes.
arXiv Detail & Related papers (2024-05-20T14:13:22Z)
Online Tensor Inference [0.0]
Traditional offline learning, involving the storage and utilization of all data in each computational iteration, becomes impractical for high-dimensional tensor data. Existing low-rank tensor methods lack the capability for statistical inference in an online fashion. Our approach employs Gradient Descent (SGD) to enable efficient real-time data processing without extensive memory requirements.
arXiv Detail & Related papers (2023-12-28T16:37:48Z)
Soft Random Sampling: A Theoretical and Empirical Analysis [59.719035355483875]
Soft random sampling (SRS) is a simple yet effective approach for efficient deep neural networks when dealing with massive data. It selects a uniformly speed at random with replacement from each data set in each epoch. It is shown to be a powerful and competitive strategy with significant and competitive performance on real-world industrial scale.
arXiv Detail & Related papers (2023-11-21T17:03:21Z)
Low-Frequency Load Identification using CNN-BiLSTM Attention Mechanism [0.0]
Non-intrusive Load Monitoring (NILM) is an established technique for effective and cost-efficient electricity consumption management. This paper presents a hybrid learning approach, consisting of a convolutional neural network (CNN) and a bidirectional long short-term memory (BILSTM) CNN-BILSTM model is adept at extracting both temporal (time-related) and spatial (location-related) features, allowing it to precisely identify energy consumption patterns at the appliance level.
arXiv Detail & Related papers (2023-11-14T21:02:27Z)
Online Network Source Optimization with Graph-Kernel MAB [62.6067511147939]
We propose Grab-UCB, a graph- kernel multi-arms bandit algorithm to learn online the optimal source placement in large scale networks. We describe the network processes with an adaptive graph dictionary model, which typically leads to sparse spectral representations. We derive the performance guarantees that depend on network parameters, which further influence the learning curve of the sequential decision strategy.
arXiv Detail & Related papers (2023-07-07T15:03:42Z)
Dynamic Scheduling for Federated Edge Learning with Streaming Data [56.91063444859008]
We consider a Federated Edge Learning (FEEL) system where training data are randomly generated over time at a set of distributed edge devices with long-term energy constraints. Due to limited communication resources and latency requirements, only a subset of devices is scheduled for participating in the local training process in every iteration.
arXiv Detail & Related papers (2023-05-02T07:41:16Z)
Reinforced Approximate Exploratory Data Analysis [7.974685452145769]
We are first to consider the impact of sampling in interactive data exploration settings as they introduce approximation errors. We propose a Deep Reinforcement Learning (DRL) based framework which can optimize the sample selection in order to keep the analysis and insight generation flow intact.
arXiv Detail & Related papers (2022-12-12T20:20:22Z)
Statistical Inference After Adaptive Sampling for Longitudinal Data [9.468593929311867]
We develop novel methods to perform a variety of statistical analyses on adaptively sampled data via Z-estimation. We develop novel theoretical tools for empirical processes on non-i.i.d., adaptively sampled longitudinal data which may be of independent interest.
arXiv Detail & Related papers (2022-02-14T23:48:13Z)
Straggler-Resilient Federated Learning: Leveraging the Interplay Between Statistical Accuracy and System Heterogeneity [57.275753974812666]
Federated learning involves learning from data samples distributed across a network of clients while the data remains local. In this paper, we propose a novel straggler-resilient federated learning method that incorporates statistical characteristics of the clients' data to adaptively select the clients in order to speed up the learning procedure.
arXiv Detail & Related papers (2020-12-28T19:21:14Z)
A Generative Learning Approach for Spatio-temporal Modeling in Connected Vehicular Network [55.852401381113786]
This paper proposes LaMI (Latency Model Inpainting), a novel framework to generate a comprehensive-temporal quality framework for wireless access latency of connected vehicles. LaMI adopts the idea from image inpainting and synthesizing and can reconstruct the missing latency samples by a two-step procedure. In particular, it first discovers the spatial correlation between samples collected in various regions using a patching-based approach and then feeds the original and highly correlated samples into a Varienational Autocoder (VAE)
arXiv Detail & Related papers (2020-03-16T03:43:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.