Scalable computation of prediction intervals for neural networks via matrix sketching
- URL: http://arxiv.org/abs/2205.03194v1
- Date: Fri, 6 May 2022 13:18:31 GMT
- Title: Scalable computation of prediction intervals for neural networks via matrix sketching
- Authors: Alexander Fishkov and Maxim Panov
- Abstract summary: Existing algorithms for uncertainty estimation require modifying the model architecture and training procedure.
This work proposes a new algorithm that can be applied to a given trained neural network and produces approximate prediction intervals.
- Score: 79.44177623781043
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accounting for the uncertainty in the predictions of modern neural networks
is a challenging and important task in many domains. Existing algorithms for
uncertainty estimation require modifying the model architecture and training
procedure (e.g., Bayesian neural networks) or dramatically increase the
computational cost of predictions, as in approaches based on ensembling. This
work proposes a new algorithm that can be applied to a given trained neural
network and produces approximate prediction intervals. The method is based on
the classical delta method in statistics but achieves computational efficiency
by using matrix sketching to approximate the Jacobian matrix. The resulting
algorithm is competitive with state-of-the-art approaches for constructing
predictive intervals on various regression datasets from the UCI repository.
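To make the recipe concrete, the sketch below assembles a delta-method interval around a toy trained network. Everything here is illustrative and not the authors' exact algorithm: a plain Gaussian sketch stands in for the paper's sketching transform, the Gauss-Newton matrix gets a small ridge term for stability, and the noise variance is estimated from training residuals.

```python
import torch

torch.manual_seed(0)

# A small trained regression network stands in for "a given trained model"
# (hypothetical toy model and data, for illustration only).
net = torch.nn.Sequential(torch.nn.Linear(4, 16), torch.nn.Tanh(),
                          torch.nn.Linear(16, 1))
X_train = torch.randn(200, 4)
y_train = X_train.sum(dim=1) + 0.1 * torch.randn(200)

def param_grad(x):
    """Flattened gradient of the scalar prediction w.r.t. all parameters."""
    out = net(x.unsqueeze(0)).squeeze()
    grads = torch.autograd.grad(out, list(net.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

# Jacobian of the training predictions w.r.t. the parameters, J: (n, p).
J = torch.stack([param_grad(x) for x in X_train])

# Gaussian sketch S: (k, n) with k << n, so that (S J)^T (S J) ~ J^T J
# without working with all n rows of J downstream.
k = 50
S = torch.randn(k, J.shape[0]) / k ** 0.5
SJ = S @ J
H = SJ.T @ SJ + 1e-3 * torch.eye(J.shape[1])  # ridge term: illustrative choice

with torch.no_grad():
    sigma2 = ((net(X_train).squeeze() - y_train) ** 2).mean()  # residual variance

def prediction_interval(x, z=1.96):
    """Approximate delta-method interval for a single test input x."""
    g = param_grad(x)
    var = sigma2 * (1.0 + g @ torch.linalg.solve(H, g))
    center = net(x.unsqueeze(0)).item()
    half = z * var.sqrt().item()
    return center - half, center + half

print(prediction_interval(torch.randn(4)))
```

The sketch compresses the n-row Jacobian before forming the Gauss-Newton matrix, which is where the computational savings come from when n is large.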
Related papers
- Predicting Probabilities of Error to Combine Quantization and Early Exiting: QuEE [68.6018458996143]
We propose QuEE, a more general dynamic network that can combine both quantization and early exiting.
Our algorithm can be seen as a form of soft early exiting or input-dependent compression.
The crucial factor of our approach is accurate prediction of the potential accuracy improvement achievable through further computation.
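A heavily simplified sketch of the early-exit half of that idea follows; the block/exit/gate names are hypothetical, the gate here is untrained, and QuEE's actual mechanism (jointly choosing quantization and exits from predicted error probabilities) is richer than this gating rule.

```python
import torch

# Toy backbone: four blocks, each with its own exit head and a tiny "gate"
# that scores the expected gain from continuing (illustrative stand-ins).
blocks = torch.nn.ModuleList(
    [torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU()) for _ in range(4)])
exits = torch.nn.ModuleList([torch.nn.Linear(16, 10) for _ in range(4)])
gates = torch.nn.ModuleList([torch.nn.Linear(16, 1) for _ in range(4)])

def forward_with_early_exit(x, gain_threshold=0.05):
    """Stop computing once the predicted benefit of further blocks is small."""
    h = x
    for i, block in enumerate(blocks):
        h = block(h)
        predicted_gain = torch.sigmoid(gates[i](h)).item()  # trained in practice
        if predicted_gain < gain_threshold or i == len(blocks) - 1:
            return exits[i](h), i  # class logits and the exit taken

logits, exit_taken = forward_with_early_exit(torch.randn(1, 16))
```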
arXiv Detail & Related papers (2024-06-20T15:25:13Z)
- Discrete Neural Algorithmic Reasoning [18.497863598167257]
We propose to force neural reasoners to maintain the execution trajectory as a combination of finite predefined states.
Trained with supervision on the algorithm's state transitions, such models are able to align perfectly with the original algorithm.
arXiv Detail & Related papers (2024-02-18T16:03:04Z)
- Randomized Polar Codes for Anytime Distributed Machine Learning [66.46612460837147]
We present a novel distributed computing framework that is robust to slow compute nodes, and is capable of both approximate and exact computation of linear operations.
We propose a sequential decoding algorithm designed to handle real-valued data while maintaining low computational complexity for recovery.
We demonstrate the potential applications of this framework in various contexts, such as large-scale matrix multiplication and black-box optimization.
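A toy flavor of such coded computation is sketched below; a plain random linear code over three row blocks stands in for the paper's randomized polar codes, and the sequential real-valued decoder is replaced by a direct matrix inversion.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((9, 20))
B = rng.standard_normal((20, 5))
blocks = np.split(A, 3)  # three row blocks of A

# Encode the blocks for 5 workers so that any 3 responses suffice
# (a generic random linear code, NOT the paper's polar-code construction).
G = rng.standard_normal((5, 3))
coded = [sum(G[w, i] * blocks[i] for i in range(3)) for w in range(5)]

# Each worker multiplies its coded block by B; workers 1 and 4 straggle.
results = {w: coded[w] @ B for w in (0, 2, 3)}

# Decode from the 3 returned workers by inverting the relevant rows of G.
idx = sorted(results)
Ginv = np.linalg.inv(G[idx])
decoded = np.concatenate(
    [sum(Ginv[i, j] * results[idx[j]] for j in range(3)) for i in range(3)])
assert np.allclose(decoded, A @ B)  # exact recovery despite the stragglers
```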
arXiv Detail & Related papers (2023-09-01T18:02:04Z)
- A new approach to generalisation error of machine learning algorithms: Estimates and convergence [0.0]
We introduce a new approach to the estimation of the (generalisation) error and to convergence.
Our results include estimates of the error without any structural assumption on the neural networks.
arXiv Detail & Related papers (2023-06-23T20:57:31Z)
- A predictive physics-aware hybrid reduced order model for reacting flows [65.73506571113623]
A new hybrid predictive Reduced Order Model (ROM) is proposed to solve reacting flow problems.
The number of degrees of freedom is reduced from thousands of temporal points to a few POD modes with their corresponding temporal coefficients.
Two different deep learning architectures have been tested to predict the temporal coefficients.
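The POD step this summary refers to is just a truncated SVD of the snapshot matrix. The sketch below uses synthetic snapshot data and hypothetical dimensions, and leaves out the deep-learning surrogate that would predict the temporal coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 2000)[:, None]   # spatial grid (illustrative)
t = np.linspace(0.0, 10.0, 400)[None, :]   # temporal points

# Synthetic snapshot matrix: a few coherent structures plus noise.
snapshots = (np.sin(6 * x) * np.cos(2 * t)
             + 0.5 * np.cos(14 * x) * np.sin(5 * t)
             + 0.01 * rng.standard_normal((2000, 400)))

# POD via thin SVD: columns of U are spatial modes; the scaled rows of Vt
# are the temporal coefficients a deep-learning surrogate would predict.
U, s, Vt = np.linalg.svd(snapshots, full_matrices=False)
r = 4                                  # thousands of dof -> a few POD modes
modes = U[:, :r]                       # (2000, r)
coeffs = np.diag(s[:r]) @ Vt[:r]       # (r, 400)

reconstruction = modes @ coeffs
rel_err = np.linalg.norm(snapshots - reconstruction) / np.linalg.norm(snapshots)
print(f"relative reconstruction error with {r} modes: {rel_err:.3f}")
```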
arXiv Detail & Related papers (2023-01-24T08:39:20Z)
- Confidence-Nets: A Step Towards better Prediction Intervals for regression Neural Networks on small datasets [0.0]
We propose an ensemble method that attempts to estimate the uncertainty of predictions, increase their accuracy and provide an interval for the expected variation.
The proposed method is tested on various datasets, and a significant improvement in the performance of the neural network model is seen.
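As a rough illustration of the ensemble-plus-interval idea (not the Confidence-Nets method itself), the sketch below bootstraps a handful of small networks and reads an approximate interval off the spread of their predictions:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(300)  # toy small dataset

# Train an ensemble on bootstrap resamples (hypothetical recipe).
members = []
for seed in range(10):
    idx = rng.integers(0, len(X), len(X))
    m = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=seed)
    members.append(m.fit(X[idx], y[idx]))

X_test = np.linspace(-3, 3, 50).reshape(-1, 1)
preds = np.stack([m.predict(X_test) for m in members])  # (members, points)

mean = preds.mean(axis=0)
std = preds.std(axis=0)
lower, upper = mean - 1.96 * std, mean + 1.96 * std  # crude Gaussian interval
```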
arXiv Detail & Related papers (2022-10-31T06:38:40Z)
- Analytically Tractable Inference in Deep Neural Networks [0.0]
The Tractable Approximate Gaussian Inference (TAGI) algorithm was shown to be a viable and scalable alternative to backpropagation for shallow fully-connected neural networks.
We demonstrate how TAGI matches or exceeds the performance of backpropagation for training classic deep neural network architectures.
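The flavor of TAGI's analytic treatment can be seen in how Gaussian moments propagate through a single linear layer. The sketch below shows just this forward half under diagonal-Gaussian assumptions; the analytic backward parameter updates, the core of TAGI, are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out = 8, 4

# Diagonal Gaussian beliefs over weights and inputs (illustrative values).
Mw = rng.standard_normal((d_out, d_in))   # weight means
Vw = np.full((d_out, d_in), 0.1)          # weight variances
mu_x = rng.standard_normal(d_in)          # input means
var_x = np.full(d_in, 0.05)               # input variances

# Exact moments of z = W x for independent diagonal Gaussians (bias omitted):
mu_z = Mw @ mu_x
var_z = (Mw ** 2) @ var_x + Vw @ (mu_x ** 2 + var_x)
```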
arXiv Detail & Related papers (2021-03-09T14:51:34Z)
- AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network [75.44925576268052]
The linear-chain Conditional Random Field (CRF) model is one of the most widely-used neural sequence labeling approaches.
Exact probabilistic inference algorithms are typically applied in the training and prediction stages of the CRF model.
We propose to employ a parallelizable approximate variational inference algorithm for the CRF model.
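A minimal version of such an approximate inference step, here plain parallel mean-field updates on a toy chain CRF with made-up scores, looks like this (the AIN paper's trained inference network is more involved):

```python
import numpy as np

rng = np.random.default_rng(0)
T, K = 6, 4                                # sequence length, label count
unary = rng.standard_normal((T, K))        # per-token label scores
trans = rng.standard_normal((K, K))        # label-transition scores

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

q = softmax(unary)                         # initialize marginals from unaries
for _ in range(10):                        # parallelizable mean-field sweeps
    msg = np.zeros_like(unary)
    msg[1:] += q[:-1] @ trans              # expected score from left neighbor
    msg[:-1] += q[1:] @ trans.T            # expected score from right neighbor
    q = softmax(unary + msg)

labels = q.argmax(axis=1)                  # approximate MAP labeling
```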
arXiv Detail & Related papers (2020-09-17T12:18:43Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study large-scale distributed stochastic AUC maximization with deep neural networks.
In theory, our algorithm requires far fewer communication rounds than naive parallelization while retaining the same convergence guarantees.
Our experiments on several datasets demonstrate its effectiveness and confirm the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
- The committee machine: Computational to statistical gaps in learning a two-layers neural network [29.86621613621785]
Heuristic tools from statistical physics have been used to locate the phase transitions and compute the optimal learning and generalization errors in the teacher-student scenario.
We introduce a version of the approximate message passing (AMP) algorithm for the committee machine that allows optimal learning to be performed in polynomial time for a large set of parameters.
We find that there are regimes in which a low generalization error is information-theoretically achievable while the AMP algorithm fails to deliver it, strongly suggesting that no efficient algorithm exists for those cases.
arXiv Detail & Related papers (2018-06-14T10:22:04Z)