Two ways towards combining Sequential Neural Network and Statistical
Methods to Improve the Prediction of Time Series
- URL: http://arxiv.org/abs/2110.00082v1
- Date: Thu, 30 Sep 2021 20:34:58 GMT
- Title: Two ways towards combining Sequential Neural Network and Statistical
Methods to Improve the Prediction of Time Series
- Authors: Jingwei Li
- Abstract summary: We propose two different directions for integrating the two: a decomposition-based method and a method exploiting the statistical extraction of data features.
We evaluate the proposal using time series data with varying degrees of stability.
Performance results show that both methods can outperform existing schemes that use models and learning separately.
- Score: 0.34265828682659694
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Statistical modeling and data-driven learning are two vital fields that
attract much attention. Statistical models aim to capture and interpret the
relationships among variables, while data-based learning attempts to extract
information directly from the data without passing it through complex
pre-specified models. Given the extensive studies in both fields, a subtle issue
is how to properly integrate data-based methods with existing knowledge or
models. In this paper, based on time series data, we propose two different
directions for integrating the two: a decomposition-based method and a method
exploiting the statistical extraction of data features. The first decomposes the
data into linear stable, nonlinear stable, and unstable parts, where suitable
statistical models are used for the linear stable and nonlinear stable parts
while appropriate machine learning tools are used for the unstable part. The
second applies statistical models to extract statistical features of the data
and feeds them as additional inputs into the machine learning platform for
training. The most critical and challenging task is determining which valuable
information to extract from mathematical or statistical models to boost the
performance of machine learning algorithms. We evaluate the proposal using time
series data with varying degrees of stability. Performance results show that
both methods can outperform existing schemes that use models and learning
separately, with improvements of over 60%. Both proposed methods are promising
in bridging the gap between model-based and data-driven schemes and in
integrating the two to provide higher overall learning performance.
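The paper itself gives no implementation, so the following is a minimal sketch of the first (decomposition-based) direction, assuming an STL decomposition stands in for the paper's linear-stable/nonlinear-stable/unstable split, an ARIMA model and a repeated seasonal cycle serve as the statistical models for the stable parts, and a small MLP handles the unstable residual. The decomposition, the model orders, and the `hybrid_forecast` helper are illustrative choices, not the authors' method.

```python
# Sketch of direction 1: decompose, model stable parts statistically,
# learn the unstable residual. All concrete choices here (STL, ARIMA(1,1,0),
# a one-hidden-layer MLP) are assumptions, not the paper's specification.
import numpy as np
from sklearn.neural_network import MLPRegressor
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.seasonal import STL

def hybrid_forecast(y, period=12, horizon=8, n_lags=12):
    y = np.asarray(y, dtype=float)
    parts = STL(y, period=period).fit()
    trend = np.asarray(parts.trend)
    seasonal = np.asarray(parts.seasonal)
    resid = np.asarray(parts.resid)

    # Stable parts: ARIMA extrapolates the trend; the seasonal component
    # is continued by repeating its last full cycle.
    trend_fc = np.asarray(ARIMA(trend, order=(1, 1, 0)).fit().forecast(horizon))
    seasonal_fc = np.tile(seasonal[-period:], horizon // period + 1)[:horizon]

    # Unstable part: an MLP trained on lagged windows of the residual.
    X = np.array([resid[i:i + n_lags] for i in range(len(resid) - n_lags)])
    mlp = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                       random_state=0).fit(X, resid[n_lags:])

    # Roll the residual forecast forward one step at a time.
    window = list(resid[-n_lags:])
    for _ in range(horizon):
        window.append(mlp.predict(np.array(window[-n_lags:])[None, :])[0])
    resid_fc = np.array(window[n_lags:])

    # Recombine the three component forecasts.
    return trend_fc + seasonal_fc + resid_fc
```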
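For the second direction, here is a minimal sketch under the assumption that a rolling mean, rolling standard deviation, and lag-1 autocorrelation stand in for the paper's unspecified statistical features; the point is only the mechanics of appending model-derived statistics to the learner's inputs.

```python
# Sketch of direction 2: augment lagged inputs with statistical features
# before training. The specific features are assumptions for illustration.
import numpy as np
from sklearn.neural_network import MLPRegressor

def with_statistical_features(y, n_lags=12):
    """Design matrix of lagged values plus per-window statistics."""
    y = np.asarray(y, dtype=float)
    X, target = [], []
    for i in range(n_lags, len(y)):
        w = y[i - n_lags:i]
        ar1 = np.corrcoef(w[:-1], w[1:])[0, 1]  # crude lag-1 autocorrelation
        X.append(np.concatenate([w, [w.mean(), w.std(), ar1]]))
        target.append(y[i])
    return np.array(X), np.array(target)

# Usage: any learner consumes the augmented inputs unchanged, e.g.
# X, t = with_statistical_features(series)
# model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000).fit(X, t)
```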
Related papers
- Data Augmentation for Sparse Multidimensional Learning Performance Data Using Generative AI [17.242331892899543]
Learning performance data describe correct and incorrect answers or problem-solving attempts in adaptive learning.
Learning performance data tend to be highly sparse (80%-90% missing observations) in most real-world applications due to adaptive item selection.
This article proposes a systematic framework for augmenting learner data to address data sparsity in learning performance data.
arXiv Detail & Related papers (2024-09-24T00:25:07Z)
- Data Shapley in One Training Run [88.59484417202454]
Data Shapley provides a principled framework for attributing data's contribution within machine learning contexts.
Existing approaches require re-training models on different data subsets, which is computationally intensive.
This paper introduces In-Run Data Shapley, which addresses these limitations by offering scalable data attribution for a target model of interest.
arXiv Detail & Related papers (2024-06-16T17:09:24Z)
- Unlearning Information Bottleneck: Machine Unlearning of Systematic Patterns and Biases [6.936871609178494]
We present Unlearning Information Bottleneck (UIB), a novel information-theoretic framework designed to enhance the process of machine unlearning.
By proposing a variational upper bound, we recalibrate the model parameters through a dynamic prior that integrates changes in data distribution with an affordable computational cost.
Our experiments across various datasets, models, and unlearning methods demonstrate that our approach effectively removes systematic patterns and biases while maintaining the performance of models post-unlearning.
arXiv Detail & Related papers (2024-05-22T21:54:05Z)
- An Entropy-Based Model for Hierarchical Learning [3.1473798197405944]
A common feature among real-world datasets is that data domains are multiscale.
We propose a learning model that exploits this multiscale data structure.
The hierarchical learning model is inspired by the logical and progressive easy-to-hard learning mechanism of human beings.
arXiv Detail & Related papers (2022-12-30T13:14:46Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- A Data-Driven Method for Automated Data Superposition with Applications in Soft Matter Science [0.0]
We develop a data-driven, non-parametric method for superposing experimental data with arbitrary coordinate transformations.
Our method produces interpretable data-driven models that may inform applications such as materials classification, design, and discovery.
arXiv Detail & Related papers (2022-04-20T14:58:04Z)
- Distilling Interpretable Models into Human-Readable Code [71.11328360614479]
Human-readability is an important and desirable standard for machine-learned model interpretability.
We propose to train interpretable models using conventional methods, and then distill them into concise, human-readable code.
We describe a piecewise-linear curve-fitting algorithm that produces high-quality results efficiently and reliably across a broad range of use cases.
arXiv Detail & Related papers (2021-01-21T01:46:36Z)
- Learning physically consistent mathematical models from data using group sparsity [2.580765958706854]
In areas like biology, high noise levels, sensor-induced correlations, and strong inter-system variability can render data-driven models nonsensical or physically inconsistent.
We show several applications from systems biology that demonstrate the benefits of enforcing priors in data-driven modeling.
arXiv Detail & Related papers (2020-12-11T14:45:38Z)
- Graph Embedding with Data Uncertainty [113.39838145450007]
Spectral-based subspace learning is a common data preprocessing step in many machine learning pipelines.
Most subspace learning methods do not take into consideration possible measurement inaccuracies or artifacts that can lead to data with high uncertainty.
arXiv Detail & Related papers (2020-09-01T15:08:23Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
The proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
- How Training Data Impacts Performance in Learning-based Control [67.7875109298865]
This paper derives an analytical relationship between the density of the training data and the control performance.
We formulate a quality measure for the data set, which we refer to as the ρ-gap.
We show how the ρ-gap can be applied to a feedback linearizing control law.
arXiv Detail & Related papers (2020-05-25T12:13:49Z)