Exploring Bayesian Surprise to Prevent Overfitting and to Predict Model
Performance in Non-Intrusive Load Monitoring
- URL: http://arxiv.org/abs/2009.07756v1
- Date: Wed, 16 Sep 2020 15:39:08 GMT
- Title: Exploring Bayesian Surprise to Prevent Overfitting and to Predict Model
Performance in Non-Intrusive Load Monitoring
- Authors: Richard Jones, Christoph Klemenjak, Stephen Makonin, Ivan V. Bajic
- Abstract summary: Non-Intrusive Load Monitoring (NILM) is a field of research focused on segregating constituent electrical loads in a system based only on their aggregated signal.
We quantify the degree of surprise in the predictive distribution (termed postdictive surprise) and in the transitional probabilities (termed transitional surprise), before and after a window of observations.
This work provides clear evidence that a point of diminishing returns of model performance with respect to dataset size exists.
- Score: 25.32973996508579
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Non-Intrusive Load Monitoring (NILM) is a field of research focused on
segregating constituent electrical loads in a system based only on their
aggregated signal. Significant computational resources and research time are
spent training models, often using as much data as possible, perhaps driven by
the preconception that more data equates to more accurate models and better
performing algorithms. When has enough prior training been done? When has a
NILM algorithm encountered new, unseen data? This work applies the notion of
Bayesian surprise to answer these questions which are important for both
supervised and unsupervised algorithms. We quantify the degree of surprise in
the predictive distribution (termed postdictive surprise) and in the
transitional probabilities (termed transitional surprise), before and after
a window of observations. We compare the performance of several benchmark NILM
algorithms supported by NILMTK, in order to establish a useful threshold on the
two combined measures of surprise. We validate the use of transitional surprise
by exploring the performance of a popular Hidden Markov Model as a function of
surprise threshold. Finally, we explore the use of a surprise threshold as a
regularization technique to avoid overfitting in cross-dataset performance.
Although the generality of the specific surprise threshold discussed herein may
be suspect without further testing, this work provides clear evidence that a
point of diminishing returns of model performance with respect to dataset size
exists. This has implications for future model development, dataset
acquisition, as well as aiding in model flexibility during deployment.
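Bayesian surprise is conventionally defined as the KL divergence between the belief held before a window of observations and the belief held after it. The following is a minimal sketch of that computation for a discrete belief over appliance states; the function names and the two-state example are illustrative assumptions, not code from the paper:

```python
import math

def kl_divergence(p, q):
    # KL(p || q) for discrete distributions given as probability lists
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def bayesian_surprise(prior, posterior):
    # Bayesian surprise: how far a window of observations moved the belief
    return kl_divergence(posterior, prior)

# Belief over a two-state appliance (off/on) before and after a window
prior = [0.9, 0.1]
posterior = [0.6, 0.4]
print(round(bayesian_surprise(prior, posterior), 4))  # → 0.3112
```

When the posterior equals the prior the surprise is zero, which is the intuition behind using a surprise threshold to decide when further training data adds nothing new.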
Related papers
- Posterior Uncertainty Quantification in Neural Networks using Data Augmentation [3.9860047080844807]
We show that deep ensembling is a fundamentally mis-specified model class, since it assumes that future data are supported on existing observations only.
We propose MixupMP, a method that constructs a more realistic predictive distribution using popular data augmentation techniques.
Our empirical analysis showcases that MixupMP achieves superior predictive performance and uncertainty quantification on various image classification datasets.
arXiv Detail & Related papers (2024-03-18T17:46:07Z)
- Informed Spectral Normalized Gaussian Processes for Trajectory Prediction [0.0]
We propose a novel regularization-based continual learning method for SNGPs.
Our proposal builds upon well-established methods and requires no rehearsal memory or parameter expansion.
We apply our informed SNGP model to the trajectory prediction problem in autonomous driving by integrating prior drivability knowledge.
arXiv Detail & Related papers (2024-03-18T17:05:24Z)
- A Meta-Learning Approach to Predicting Performance and Data Requirements [163.4412093478316]
We propose an approach to estimate the number of samples required for a model to reach a target performance.
We find that the power law, the de facto principle used to estimate model performance, leads to large error when extrapolating from a small dataset.
We introduce a novel piecewise power law (PPL) that handles the small-data and large-data regimes differently.
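A single power law of the form error(n) = a * n^(-b) can be fit from two measurements and inverted to estimate how many samples reach a target error. This sketch illustrates that baseline (the piecewise variant adds a separate fit per regime); the numbers are made up for illustration:

```python
import math

def fit_power_law(n1, e1, n2, e2):
    # Fit error(n) = a * n**(-b) through two (dataset size, error) points
    b = math.log(e1 / e2) / math.log(n2 / n1)
    a = e1 * n1 ** b
    return a, b

def samples_for_target(a, b, target_error):
    # Invert the power law to estimate the dataset size reaching target_error
    return (a / target_error) ** (1.0 / b)

# Error halves (0.20 -> 0.10) when data grows 10x (1k -> 10k)
a, b = fit_power_law(1_000, 0.20, 10_000, 0.10)
print(round(samples_for_target(a, b, 0.05)))  # → 100000
```

The extrapolation error the paper reports comes precisely from assuming one (a, b) pair holds across all dataset sizes.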
arXiv Detail & Related papers (2023-03-02T21:48:22Z)
- Anomaly Detection with Test Time Augmentation and Consistency Evaluation [13.709281244889691]
We propose a simple, yet effective anomaly detection algorithm named Test Time Augmentation Anomaly Detection (TTA-AD).
We observe that in-distribution data enjoy more consistent predictions for their original and augmented versions on a trained network than out-of-distribution data.
Experiments on various high-resolution image benchmark datasets demonstrate that TTA-AD achieves comparable or better detection performance.
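The consistency idea above can be sketched as scoring an input by how often its augmented versions receive the same prediction as the original. The toy model and augmentations below are illustrative stand-ins, not the paper's networks:

```python
def consistency_score(predict, x, augmentations):
    # Fraction of augmentations whose predicted label matches the original's;
    # low consistency suggests out-of-distribution input (TTA-AD intuition)
    base = predict(x)
    matches = sum(1 for aug in augmentations if predict(aug(x)) == base)
    return matches / len(augmentations)

# Toy stand-in classifier: labels by sign, so it is unstable near zero
predict = lambda x: int(x > 0)
augs = [lambda x, d=d: x + d for d in (-0.5, -0.1, 0.1, 0.5)]

print(consistency_score(predict, 5.0, augs))  # → 1.0 (in-distribution-like)
print(consistency_score(predict, 0.2, augs))  # → 0.75 (less consistent)
```

Thresholding this score gives a detector with no access to outlier data at training time.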
arXiv Detail & Related papers (2022-06-06T04:27:06Z)
- Deep Generative model with Hierarchical Latent Factors for Time Series Anomaly Detection [40.21502451136054]
This work presents DGHL, a new family of generative models for time series anomaly detection.
A top-down Convolution Network maps a novel hierarchical latent space to time series windows, exploiting temporal dynamics to encode information efficiently.
Our method outperformed current state-of-the-art models on four popular benchmark datasets.
arXiv Detail & Related papers (2022-02-15T17:19:44Z)
- Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs).
PFNs leverage in-context learning and large-scale machine learning techniques to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems.
arXiv Detail & Related papers (2021-12-20T13:07:39Z)
- Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance-guided stochastic gradient descent (IGSGD) method to train inference models on inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z)
- Robust Out-of-Distribution Detection on Deep Probabilistic Generative Models [0.06372261626436676]
Out-of-distribution (OOD) detection is an important task in machine learning systems.
Deep probabilistic generative models facilitate OOD detection by estimating the likelihood of a data sample.
We propose a new detection metric that operates without outlier exposure.
arXiv Detail & Related papers (2021-06-15T06:36:10Z)
- Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We evaluate prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
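The core of prediction-time batch normalization is to normalize a test batch with its own statistics rather than the running mean and variance accumulated during training. A minimal one-feature sketch, with hypothetical function and parameter names:

```python
import statistics

def prediction_time_batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize the test batch with its OWN mean/variance instead of the
    # training-time running statistics, countering covariate shift
    mean = statistics.fmean(batch)
    var = statistics.pvariance(batch, mu=mean)
    return [gamma * (x - mean) / (var + eps) ** 0.5 + beta for x in batch]

# A shifted test batch is re-centered regardless of training statistics
shifted = [10.1, 10.3, 9.9, 10.5, 9.7]
normalized = prediction_time_batch_norm(shifted)
print(sum(normalized))  # close to 0: the batch is re-centered
```

Using training-time running statistics on such a shifted batch would instead produce activations far from zero mean, which is the calibration failure the method targets.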
arXiv Detail & Related papers (2020-06-19T05:08:43Z)
- Ambiguity in Sequential Data: Predicting Uncertain Futures with Recurrent Models [110.82452096672182]
We propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data.
We also introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties.
arXiv Detail & Related papers (2020-03-10T09:15:42Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.