Iterative Forgetting: Online Data Stream Regression Using Database-Inspired Adaptive Granulation
- URL: http://arxiv.org/abs/2403.09588v1
- Date: Thu, 14 Mar 2024 17:26:00 GMT
- Title: Iterative Forgetting: Online Data Stream Regression Using Database-Inspired Adaptive Granulation
- Authors: Niket Kathiriya, Hossein Haeri, Cindy Chen, Kshitij Jerath,
- Abstract summary: We present a database-inspired datastream regression model that uses inspiration from R*-trees to create granules from incoming datastreams.
Experiments demonstrate that the ability of this method to discard data produces a significant order-of-magnitude improvement in latency and training time.
- Score: 1.6874375111244329
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Many modern systems, such as financial, transportation, and telecommunications systems, are time-sensitive in the sense that they demand low-latency predictions for real-time decision-making. Such systems often have to contend with continuous unbounded data streams as well as concept drift, which are challenging requirements that traditional regression techniques are unable to cater to. There exists a need to create novel data stream regression methods that can handle these scenarios. We present a database-inspired datastream regression model that (a) uses inspiration from R*-trees to create granules from incoming datastreams such that relevant information is retained, (b) iteratively forgets granules whose information is deemed to be outdated, thus maintaining a list of only recent, relevant granules, and (c) uses the recent data and granules to provide low-latency predictions. The R*-tree-inspired approach also makes the algorithm amenable to integration with database systems. Our experiments demonstrate that the ability of this method to discard data produces a significant order-of-magnitude improvement in latency and training time when evaluated against the most accurate state-of-the-art algorithms, while the R*-tree-inspired granulation technique provides competitively accurate predictions
Related papers
- Computationally and Memory-Efficient Robust Predictive Analytics Using Big Data [0.0]
This study navigates through the challenges of data uncertainties, storage limitations, and predictive data-driven modeling using big data.
We utilize Robust Principal Component Analysis (RPCA) for effective noise reduction and outlier elimination, and Optimal Sensor Placement (OSP) for efficient data compression and storage.
arXiv Detail & Related papers (2024-03-27T22:39:08Z) - A Temporally Disentangled Contrastive Diffusion Model for Spatiotemporal Imputation [35.46631415365955]
We introduce a conditional diffusion framework called C$2$TSD, which incorporates disentangled temporal (trend and seasonality) representations as conditional information.
Our experiments on three real-world datasets demonstrate the superior performance of our approach compared to a number of state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-18T11:59:04Z) - GenQ: Quantization in Low Data Regimes with Generative Synthetic Data [30.489005912126544]
GenQ is a novel approach employing an advanced Generative AI model to generate high-resolution synthetic data.
In case of limited data availability, the actual data is used to guide the synthetic data generation process.
Through rigorous experimentation, GenQ establishes new benchmarks in data-free and data-scarce quantization.
arXiv Detail & Related papers (2023-12-07T23:31:42Z) - Deep Ensembles Meets Quantile Regression: Uncertainty-aware Imputation
for Time Series [49.992908221544624]
Time series data often exhibit numerous missing values, which is the time series imputation task.
Previous deep learning methods have been shown to be effective for time series imputation.
We propose a non-generative time series imputation method that produces accurate imputations with inherent uncertainty.
arXiv Detail & Related papers (2023-12-03T05:52:30Z) - Online Evolutionary Neural Architecture Search for Multivariate
Non-Stationary Time Series Forecasting [72.89994745876086]
This work presents the Online Neuro-Evolution-based Neural Architecture Search (ONE-NAS) algorithm.
ONE-NAS is a novel neural architecture search method capable of automatically designing and dynamically training recurrent neural networks (RNNs) for online forecasting tasks.
Results demonstrate that ONE-NAS outperforms traditional statistical time series forecasting methods.
arXiv Detail & Related papers (2023-02-20T22:25:47Z) - Convolutional generative adversarial imputation networks for
spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GANs) and GAN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method as Con Conval Generative Adversarial Imputation Nets (Conv-GAIN)
arXiv Detail & Related papers (2021-11-03T03:50:48Z) - A Meta-learning Approach to Reservoir Computing: Time Series Prediction
with Limited Data [0.0]
We present a data-driven approach to automatically extract an appropriate model structure from experimentally observed processes.
We demonstrate our approach on a simple benchmark problem, where it beats the state of the art meta-learning techniques.
arXiv Detail & Related papers (2021-10-07T18:23:14Z) - Automated Machine Learning Techniques for Data Streams [91.3755431537592]
This paper surveys the state-of-the-art open-source AutoML tools, applies them to data collected from streams, and measures how their performance changes over time.
The results show that off-the-shelf AutoML tools can provide satisfactory results but in the presence of concept drift, detection or adaptation techniques have to be applied to maintain the predictive accuracy over time.
arXiv Detail & Related papers (2021-06-14T11:42:46Z) - DeepRite: Deep Recurrent Inverse TreatmEnt Weighting for Adjusting
Time-varying Confounding in Modern Longitudinal Observational Data [68.29870617697532]
We propose Deep Recurrent Inverse TreatmEnt weighting (DeepRite) for time-varying confounding in longitudinal data.
DeepRite is shown to recover the ground truth from synthetic data, and estimate unbiased treatment effects from real data.
arXiv Detail & Related papers (2020-10-28T15:05:08Z) - Improved Predictive Deep Temporal Neural Networks with Trend Filtering [22.352437268596674]
We propose a new prediction framework based on deep neural networks and a trend filtering.
We reveal that the predictive performance of deep temporal neural networks improves when the training data is temporally processed by a trend filtering.
arXiv Detail & Related papers (2020-10-16T08:29:36Z) - Real-Time Regression with Dividing Local Gaussian Processes [62.01822866877782]
Local Gaussian processes are a novel, computationally efficient modeling approach based on Gaussian process regression.
Due to an iterative, data-driven division of the input space, they achieve a sublinear computational complexity in the total number of training points in practice.
A numerical evaluation on real-world data sets shows their advantages over other state-of-the-art methods in terms of accuracy as well as prediction and update speed.
arXiv Detail & Related papers (2020-06-16T18:43:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.