Three Learning Stages and Accuracy-Efficiency Tradeoff of Restricted
Boltzmann Machines
- URL: http://arxiv.org/abs/2209.00873v1
- Date: Fri, 2 Sep 2022 08:20:34 GMT
- Title: Three Learning Stages and Accuracy-Efficiency Tradeoff of Restricted
Boltzmann Machines
- Authors: Lennart Dabelow and Masahito Ueda
- Abstract summary: Restricted Boltzmann Machines (RBMs) offer a versatile architecture for unsupervised machine learning.
For training and eventual applications, it is desirable to have a sampler that is both accurate and efficient.
We identify and quantitatively characterize three regimes of RBM learning: independent learning, correlation learning, and degradation.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Restricted Boltzmann Machines (RBMs) offer a versatile architecture for
unsupervised machine learning that can in principle approximate any target
probability distribution with arbitrary accuracy. However, the RBM model is
usually not directly accessible due to its computational complexity, and
Markov-chain sampling is invoked to analyze the learned probability
distribution. For training and eventual applications, it is thus desirable to
have a sampler that is both accurate and efficient. We highlight that these two
goals generally compete with each other and cannot be achieved simultaneously.
More specifically, we identify and quantitatively characterize three regimes of
RBM learning: independent learning, where the accuracy improves without losing
efficiency; correlation learning, where higher accuracy entails lower
efficiency; and degradation, where both accuracy and efficiency no longer
improve or even deteriorate. These findings are based on numerical experiments
and heuristic arguments.
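The Markov-chain sampling mentioned in the abstract can be illustrated with a minimal sketch. It assumes a binary RBM with energy E(v, h) = -a·v - b·h - vᵀWh and uses plain block Gibbs sampling; the paper may use a different sampler, and the function names, toy weights, and chain counts below are illustrative choices, not taken from the paper. For a tiny model the exact visible marginals can be enumerated, so the sampler's accuracy can be checked directly.

```python
import itertools

import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gibbs_sample(W, a, b, n_steps, n_chains, rng):
    """Run block Gibbs sampling on a binary RBM (hypothetical helper).

    Energy convention (an assumption): E(v, h) = -a.v - b.h - v^T W h.
    Returns the visible states of all chains after n_steps sweeps.
    """
    n_visible, _ = W.shape
    # Start every chain from a uniformly random visible configuration.
    v = rng.integers(0, 2, size=(n_chains, n_visible)).astype(float)
    for _ in range(n_steps):
        # Hidden units are conditionally independent given the visible units.
        p_h = sigmoid(v @ W + b)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        # Visible units are conditionally independent given the hidden units.
        p_v = sigmoid(h @ W.T + a)
        v = (rng.random(p_v.shape) < p_v).astype(float)
    return v


def exact_visible_marginals(W, a, b):
    """Exact p(v_i = 1) by brute-force enumeration (tiny RBMs only)."""
    n_visible, _ = W.shape
    vs = np.array(list(itertools.product([0, 1], repeat=n_visible)), dtype=float)
    # log of unnormalized p(v): a.v + sum_j log(1 + exp(b_j + (v W)_j))
    log_p = vs @ a + np.sum(np.logaddexp(0.0, vs @ W + b), axis=1)
    p = np.exp(log_p - log_p.max())
    p /= p.sum()
    return p @ vs


# Toy model: 4 visible and 3 hidden units with weak random couplings.
rng = np.random.default_rng(0)
W = 0.5 * rng.standard_normal((4, 3))
a = np.zeros(4)
b = np.zeros(3)

samples = gibbs_sample(W, a, b, n_steps=50, n_chains=5000, rng=rng)
estimated = samples.mean(axis=0)
exact = exact_visible_marginals(W, a, b)
max_error = np.abs(estimated - exact).max()
print("estimated marginals:", estimated)
print("exact marginals:    ", exact)
print("max abs error:      ", max_error)
```

In this sketch, `n_steps` is the efficiency knob: each additional sweep costs more computation, and the number of sweeps needed for the chains to mix typically grows as the couplings in `W` become stronger.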
Related papers
- Accurate and Reliable Predictions with Mutual-Transport Ensemble [46.368395985214875]
We propose a co-trained auxiliary model and adaptively regularize the cross-entropy loss using Kullback-Leibler (KL) divergence.
We show that MTE can simultaneously enhance both accuracy and uncertainty calibration.
For example, on the CIFAR-100 dataset, our MTE method on ResNet34/50 achieved significant improvements compared to the previous state-of-the-art method.
arXiv Detail & Related papers (2024-05-30T03:15:59Z) - MixedNUTS: Training-Free Accuracy-Robustness Balance via Nonlinearly Mixed Classifiers [41.56951365163419]
"MixedNUTS" is a training-free method where the output logits of a robust classifier are processed by nonlinear transformations with only three parameters.
MixedNUTS then converts the transformed logits into probabilities and mixes them as the overall output.
On CIFAR-10, CIFAR-100, and ImageNet datasets, experimental results with custom strong adaptive attacks demonstrate MixedNUTS's vastly improved accuracy and near-SOTA robustness.
arXiv Detail & Related papers (2024-02-03T21:12:36Z) - Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials [25.091146216183144]
Active learning uses biased or unbiased molecular dynamics to generate candidate pools.
Existing biased and unbiased MD-simulation methods are prone to miss either rare events or extrapolative regions.
This work demonstrates that MD, when biased by the MLIP's energy uncertainty, simultaneously captures extrapolative regions and rare events.
arXiv Detail & Related papers (2023-12-03T14:39:14Z) - Calibrated ensembles can mitigate accuracy tradeoffs under distribution
shift [108.30303219703845]
We find that ID-calibrated ensembles outperform the prior state of the art (based on self-training) on both ID and OOD accuracy.
We analyze this method in stylized settings, and identify two important conditions for ensembles to perform well both ID and OOD.
arXiv Detail & Related papers (2022-07-18T23:14:44Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z) - Worst Case Matters for Few-Shot Recognition [27.023352955311502]
Few-shot recognition learns a recognition model with very few (e.g., 1 or 5) images per category.
Current few-shot learning methods focus on improving the average accuracy over many episodes.
We argue that in real-world applications we may often only try one episode instead of many, and hence maximizing the worst-case accuracy is more important than maximizing the average accuracy.
arXiv Detail & Related papers (2022-03-13T05:39:40Z) - Robustness and Accuracy Could Be Reconcilable by (Proper) Definition [109.62614226793833]
The trade-off between robustness and accuracy has been widely studied in the adversarial literature.
We find that it may stem from the improperly defined robust error, which imposes an inductive bias of local invariance.
By definition, SCORE facilitates the reconciliation between robustness and accuracy, while still handling the worst-case uncertainty.
arXiv Detail & Related papers (2022-02-21T10:36:09Z) - Uncertainty Estimation and Calibration with Finite-State Probabilistic
RNNs [29.84563789289183]
Uncertainty quantification is crucial for building reliable and trustable machine learning systems.
We propose to estimate uncertainty in recurrent neural networks (RNNs) via discrete state transitions over recurrent timesteps.
The uncertainty of the model can be quantified by running a prediction several times, each time sampling from the recurrent state transition distribution.
arXiv Detail & Related papers (2020-11-24T10:35:28Z) - Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via
Higher-Order Influence Functions [121.10450359856242]
We develop a frequentist procedure that utilizes influence functions of a model's loss functional to construct a jackknife (or leave-one-out) estimator of predictive confidence intervals.
The DJ satisfies (1) and (2), is applicable to a wide range of deep learning models, is easy to implement, and can be applied in a post-hoc fashion without interfering with model training or compromising its accuracy.
arXiv Detail & Related papers (2020-06-29T13:36:52Z) - Machine learning for causal inference: on the use of cross-fit
estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties.
We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE).
When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z) - Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by
Enabling Input-Adaptive Inference [119.19779637025444]
Deep networks were recently suggested to face a trade-off between accuracy (on clean natural images) and robustness (on adversarially perturbed images).
This paper studies multi-exit networks associated with input-adaptive inference, showing their strong promise in achieving a "sweet point" in cooptimizing model accuracy, robustness and efficiency.
arXiv Detail & Related papers (2020-02-24T00:40:22Z)