On Resource-Efficient Bayesian Network Classifiers and Deep Neural
Networks
- URL: http://arxiv.org/abs/2010.11773v2
- Date: Wed, 22 Sep 2021 12:59:45 GMT
- Title: On Resource-Efficient Bayesian Network Classifiers and Deep Neural
Networks
- Authors: Wolfgang Roth, Günther Schindler, Holger Fröning, Franz Pernkopf
- Abstract summary: We present two methods to reduce the complexity of Bayesian network (BN) classifiers.
First, we introduce quantization-aware training using the straight-through gradient estimator to quantize the parameters of BNs to few bits.
Second, we extend a recently proposed differentiable tree-augmented naive Bayes (TAN) structure learning approach by also considering the model size.
- Score: 14.540226579203207
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present two methods to reduce the complexity of Bayesian network (BN)
classifiers. First, we introduce quantization-aware training using the
straight-through gradient estimator to quantize the parameters of BNs to few
bits. Second, we extend a recently proposed differentiable tree-augmented naive
Bayes (TAN) structure learning approach by also considering the model size.
Both methods are motivated by recent developments in the deep learning
community, and they provide effective means to trade off between model size and
prediction accuracy, which is demonstrated in extensive experiments.
Furthermore, we contrast quantized BN classifiers with quantized deep neural
networks (DNNs) for small-scale scenarios, which have hardly been investigated
in the literature. We show Pareto optimal models with respect to model size,
number of operations, and test error and find that both model classes are
viable options.
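
To make the first method more concrete, the following sketch shows quantization-aware training with the straight-through gradient estimator (STE) in PyTorch. It is a minimal, hypothetical illustration under assumed details (uniform few-bit quantization of unnormalized log-probability parameters of a simple naive-Bayes-style classifier, toy random data), not the authors' implementation.

# Minimal sketch (not the paper's code): STE-based quantization-aware training.
# Real-valued parameters are kept for the optimizer; the forward pass uses a
# few-bit quantized copy, and rounding is treated as the identity in the
# backward pass.
import torch

def quantize_ste(w, num_bits=4, w_max=8.0):
    """Uniformly quantize w to num_bits over [-w_max, w_max] with an STE backward."""
    scale = w_max / (2 ** (num_bits - 1) - 1)
    w_quant = torch.round(torch.clamp(w, -w_max, w_max) / scale) * scale
    # Straight-through trick: forward returns w_quant, backward sees the identity.
    return w + (w_quant - w).detach()

# Hypothetical discrete classifier with unnormalized log-probability parameters
# theta[c, d, v] for class c, feature d, feature value v, plus class log-priors.
num_classes, num_features, num_values = 3, 5, 4
theta = torch.randn(num_classes, num_features, num_values, requires_grad=True)
log_prior = torch.zeros(num_classes, requires_grad=True)
optimizer = torch.optim.Adam([theta, log_prior], lr=1e-2)

x = torch.randint(0, num_values, (32, num_features))   # toy integer-valued features
y = torch.randint(0, num_classes, (32,))                # toy class labels

for step in range(100):
    theta_q = quantize_ste(theta, num_bits=4)            # few-bit forward pass
    log_cond = torch.log_softmax(theta_q, dim=-1)        # normalize per (class, feature)
    # scores[n, c] = log_prior[c] + sum_d log_cond[c, d, x[n, d]]
    log_cond_nc = log_cond.unsqueeze(0).expand(x.size(0), -1, -1, -1)        # (N, C, D, V)
    idx = x.unsqueeze(1).expand(-1, num_classes, -1).unsqueeze(-1)           # (N, C, D, 1)
    scores = log_prior + log_cond_nc.gather(3, idx).squeeze(-1).sum(dim=-1)  # (N, C)
    loss = torch.nn.functional.cross_entropy(scores, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

After training, the real-valued parameters can be rounded once more with the same scale and stored with num_bits bits each, which is the model-size side of the size/accuracy trade-off discussed in the abstract.

The second method, size-aware differentiable TAN structure learning, can be sketched in a similar spirit: relax each feature's parent choice to softmax weights and add a differentiable penalty on the expected number of conditional-probability-table parameters. The snippet below is a hypothetical illustration of such a size term only, not the paper's objective; the feature cardinalities and the penalty weight lambda_size are made up, and self-loop/acyclicity constraints are omitted for brevity.

# Hypothetical size term for a relaxed structure-learning objective.
import torch

num_classes = 3
cardinalities = torch.tensor([2.0, 3.0, 4.0, 2.0, 5.0])   # made-up feature cardinalities
parent_logits = torch.randn(5, 5, requires_grad=True)     # logits over candidate parents

def expected_num_parameters(parent_logits, cardinalities, num_classes):
    probs = torch.softmax(parent_logits, dim=-1)           # soft parent choice per feature
    expected_parent_card = probs @ cardinalities           # E[|X_parent(d)|] for each feature d
    # CPT of feature d has num_classes * |X_d| * |X_parent(d)| entries.
    return (num_classes * cardinalities * expected_parent_card).sum()

# total_loss = classification_loss + lambda_size * expected_num_parameters(parent_logits, cardinalities, num_classes)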
Related papers
- B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable [53.848005910548565]
'B-cosification' is a novel approach for transforming existing pre-trained models into inherently interpretable ones.
We find that B-cosification can yield models that are on par with B-cos models trained from scratch in terms of interpretability.
arXiv Detail & Related papers (2024-11-01T16:28:11Z)
- BEND: Bagging Deep Learning Training Based on Efficient Neural Network Diffusion [56.9358325168226]
We propose a Bagging deep learning training algorithm based on Efficient Neural network Diffusion (BEND).
Our approach is simple but effective: we first use the weights and biases of multiple trained models as inputs to train an autoencoder and a latent diffusion model.
Our proposed BEND algorithm can consistently outperform the mean and median accuracies of both the original trained model and the diffused model.
arXiv Detail & Related papers (2024-03-23T08:40:38Z)
- Reconciliation of Pre-trained Models and Prototypical Neural Networks in Few-shot Named Entity Recognition [35.34238362639678]
We propose a one-line-code normalization method to reconcile such a mismatch with empirical and theoretical grounds.
Our work also provides an analytical viewpoint for addressing general problems in few-shot named entity recognition.
arXiv Detail & Related papers (2022-11-07T02:33:45Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Fully differentiable model discovery [0.0]
We propose an approach that combines neural-network-based surrogates with Sparse Bayesian Learning.
Our work expands PINNs to various types of neural network architectures, and connects neural network-based surrogates to the rich field of Bayesian parameter inference.
arXiv Detail & Related papers (2021-06-09T08:11:23Z)
- A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
arXiv Detail & Related papers (2020-10-27T17:56:14Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)