Differentiable TAN Structure Learning for Bayesian Network Classifiers
- URL: http://arxiv.org/abs/2008.09566v1
- Date: Fri, 21 Aug 2020 16:22:47 GMT
- Title: Differentiable TAN Structure Learning for Bayesian Network Classifiers
- Authors: Wolfgang Roth and Franz Pernkopf
- Abstract summary: We consider learning of tree-augmented naive Bayes (TAN) structures for Bayesian network classifiers with discrete input features.
Instead of performing a combinatorial optimization over the space of possible graph structures, the proposed method learns a distribution over graph structures.
Our method consistently outperforms random TAN structures and Chow-Liu TAN structures.
- Score: 19.30562170076368
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning the structure of Bayesian networks is a difficult combinatorial
optimization problem. In this paper, we consider learning of tree-augmented
naive Bayes (TAN) structures for Bayesian network classifiers with discrete
input features. Instead of performing a combinatorial optimization over the
space of possible graph structures, the proposed method learns a distribution
over graph structures. After training, we select the most probable structure of
this distribution. This allows for a joint training of the Bayesian network
parameters along with its TAN structure using gradient-based optimization. The
proposed method is agnostic to the specific loss and only requires that it is
differentiable. We perform extensive experiments using a hybrid
generative-discriminative loss based on the discriminative probabilistic
margin. Our method consistently outperforms random TAN structures and Chow-Liu
TAN structures.
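To make the mechanism concrete, below is a minimal, hypothetical sketch of the core idea: keep trainable logits over candidate conditioning parents for each feature, relax the discrete choice with a Gumbel-softmax so structure and parameters train jointly by gradient descent, and read off the most probable structure afterwards. The fixed feature ordering, tensor shapes, and plain log-likelihood loss are simplifications; the paper's exact spanning-tree parameterization and hybrid generative-discriminative margin loss are not reproduced.
```python
import torch
import torch.nn.functional as F

n_features, n_vals, n_classes = 5, 3, 2

# One categorical over candidate parents j < i for each feature i >= 1
# (feature 0 keeps only the class as parent; the fixed ordering is a
# simplification that guarantees acyclicity).
struct_logits = [torch.zeros(i, requires_grad=True) for i in range(1, n_features)]

# cpt[i, j, c, v_par] holds logits over the n_vals values of x_i,
# conditioned on class c and the value v_par of candidate parent j.
cpt = torch.randn(n_features, n_features, n_classes, n_vals, n_vals,
                  requires_grad=True)
class_logits = torch.zeros(n_classes, requires_grad=True)

def log_joint(x, y, tau=0.5):
    lp = F.log_softmax(class_logits, dim=0)[y]
    for i in range(n_features):
        if i == 0:
            lp = lp + F.log_softmax(cpt[0, 0, y, 0], dim=-1)[x[0]]
            continue
        w = F.gumbel_softmax(struct_logits[i - 1], tau=tau)  # soft parent pick
        terms = torch.stack(
            [F.log_softmax(cpt[i, j, y, x[j]], dim=-1)[x[i]] for j in range(i)]
        )
        lp = lp + (w * terms).sum()
    return lp

x = torch.randint(0, n_vals, (n_features,))
loss = -log_joint(x, y=1)   # any differentiable loss would work here
loss.backward()             # gradients reach both CPTs and structure logits
# After training: parent of feature i = argmax of struct_logits[i - 1].
```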
Related papers
- Generalized Naive Bayes [0.0]
We introduce the so-called Generalized Naive Bayes structure as an extension of the Naive Bayes structure.
We prove that this fits the data at least as well as the probability distribution determined by the classical Naive Bayes (NB).
arXiv Detail & Related papers (2024-08-28T16:36:18Z)
- SE-GSL: A General and Effective Graph Structure Learning Framework through Structural Entropy Optimization [67.28453445927825]
Graph Neural Networks (GNNs) are de facto solutions to structural data learning.
Existing graph structure learning (GSL) frameworks still lack robustness and interpretability.
This paper proposes a general GSL framework, SE-GSL, through structural entropy and the graph hierarchy abstracted in the encoding tree.
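As background for the entropy objective, the snippet below computes one-dimensional structural entropy (the degree-distribution special case); SE-GSL itself optimizes higher-dimensional structural entropy over an encoding tree, which this sketch does not attempt.
```python
import numpy as np

def structural_entropy_1d(adj):
    """One-dimensional structural entropy of an undirected graph."""
    deg = adj.sum(axis=1)
    vol = deg.sum()                       # equals 2m for an undirected graph
    p = deg[deg > 0] / vol
    return float(-(p * np.log2(p)).sum())

A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)    # a 3-node star
print(structural_entropy_1d(A))           # ~1.5 bits
```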
arXiv Detail & Related papers (2023-03-17T05:20:24Z)
- Tree ensemble kernels for Bayesian optimization with known constraints over mixed-feature spaces [54.58348769621782]
Tree ensembles can be well-suited for black-box optimization tasks such as algorithm tuning and neural architecture search.
Two well-known challenges in using tree ensembles for black-box optimization are (i) effectively quantifying model uncertainty for exploration and (ii) optimizing over the piece-wise constant acquisition function.
Our framework performs as well as state-of-the-art methods for unconstrained black-box optimization over continuous/discrete features and outperforms competing methods for problems combining mixed-variable feature spaces and known input constraints.
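A minimal sketch, not the paper's kernel-based method, of the two challenges named above: per-tree disagreement in a random forest acts as a crude uncertainty proxy, and the acquisition is optimized by scoring a finite candidate pool, which sidesteps its piece-wise constant surface.
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(30, 2))
y = (X ** 2).sum(axis=1)                    # toy black-box objective (minimize)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

candidates = rng.uniform(-2, 2, size=(1000, 2))
per_tree = np.stack([t.predict(candidates) for t in forest.estimators_])
mu, sigma = per_tree.mean(axis=0), per_tree.std(axis=0)

acq = mu - 1.0 * sigma                      # lower confidence bound
x_next = candidates[acq.argmin()]           # next point to evaluate
```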
arXiv Detail & Related papers (2022-07-02T16:59:37Z)
- A Differentiable Approach to Combinatorial Optimization using Dataless Neural Networks [20.170140039052455]
We propose a radically different approach in that no data is required for training the neural networks that produce the solution.
In particular, we reduce the optimization problem to a neural network and employ a dataless training scheme to refine the parameters of the network such that those parameters yield the structure of interest.
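A toy instance of the dataless idea, here for maximum independent set on a 4-cycle: the "network" reduces to a vector of trainable node logits and the loss encodes the problem itself, so no training data appears anywhere. The penalty weight and 0.5 threshold are illustrative choices, not the paper's construction.
```python
import torch

torch.manual_seed(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]             # a 4-cycle
theta = (0.5 * torch.randn(4)).requires_grad_(True)  # random init breaks symmetry
opt = torch.optim.Adam([theta], lr=0.1)

for _ in range(500):
    x = torch.sigmoid(theta)                 # soft node-selection vector in [0, 1]
    conflict = sum(x[i] * x[j] for i, j in edges)
    loss = -x.sum() + 2.0 * conflict         # reward set size, penalize edges
    opt.zero_grad(); loss.backward(); opt.step()

print((torch.sigmoid(theta) > 0.5).int())    # expect an alternating set, e.g. [1,0,1,0]
```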
arXiv Detail & Related papers (2022-03-15T19:21:31Z)
- Hybrid Bayesian network discovery with latent variables by scoring multiple interventions [5.994412766684843]
We present the hybrid mFGS-BS (majority rule and Fast Greedy Equivalence Search with Bayesian Scoring) algorithm for structure learning from discrete data.
The algorithm assumes causal insufficiency in the presence of latent variables and produces a Partial Ancestral Graph (PAG).
Experimental results show that mFGS-BS improves structure learning accuracy relative to the state-of-the-art and it is computationally efficient.
arXiv Detail & Related papers (2021-12-20T14:54:41Z)
- A Sparse Structure Learning Algorithm for Bayesian Network Identification from Discrete High-Dimensional Data [0.40611352512781856]
This paper addresses the problem of learning a sparse structure Bayesian network from high-dimensional discrete data.
We propose a score function that satisfies the sparsity and the DAG property simultaneously.
Specifically, we use a variance-reducing method in our optimization algorithm to make the algorithm work efficiently on high-dimensional data.
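For intuition, here is a sketch of a score that combines a fit term, an L1 sparsity penalty, and the NOTEARS-style acyclicity term h(W) = tr(e^{W∘W}) − d, which is zero exactly when W is a DAG; the paper's actual score function and variance-reduced optimizer are not reproduced.
```python
import numpy as np
from scipy.linalg import expm

def sparse_dag_score(W, X, lam=0.1, rho=10.0):
    d = W.shape[0]
    fit = 0.5 * ((X - X @ W) ** 2).mean()   # illustrative linear-model fit term
    sparsity = lam * np.abs(W).sum()        # L1 keeps the graph sparse
    h = np.trace(expm(W * W)) - d           # h(W) = 0 iff W is acyclic
    return fit + sparsity + rho * h ** 2

X = np.random.randn(100, 3)
W = np.zeros((3, 3)); W[0, 1] = 0.8         # a single edge 0 -> 1
print(sparse_dag_score(W, X))
```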
arXiv Detail & Related papers (2021-08-21T12:21:01Z)
- DiBS: Differentiable Bayesian Structure Learning [38.01659425023988]
We propose a general, fully differentiable framework for Bayesian structure learning (DiBS).
DiBS operates in the continuous space of a latent probabilistic graph representation.
Contrary to existing work, DiBS is agnostic to the form of the local conditional distributions.
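A sketch of the latent representation idea, assuming edge probabilities derived from inner products of per-node source/target embeddings; DiBS's Stein variational inference over these latents and its gradient estimators through discrete sampling are omitted.
```python
import torch

d, k, alpha = 4, 8, 5.0
U = torch.randn(d, k, requires_grad=True)   # node embeddings as edge sources
V = torch.randn(d, k, requires_grad=True)   # node embeddings as edge targets

edge_probs = torch.sigmoid(alpha * U @ V.T)      # (d, d): P(edge i -> j)
edge_probs = edge_probs * (1 - torch.eye(d))     # no self-loops
graph = torch.bernoulli(edge_probs)              # one discrete graph draw
```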
arXiv Detail & Related papers (2021-05-25T11:23:08Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computation.
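A sketch of the decomposition suggested by the title: each convolution kernel is a learned linear combination of a small shared bank of kernel atoms, so parameters scale with the atom bank plus coefficients rather than full kernels. Names and sizes below are illustrative, not the paper's exact scheme.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AtomConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, k=3, n_atoms=6):
        super().__init__()
        self.atoms = nn.Parameter(0.1 * torch.randn(n_atoms, k, k))  # shared basis
        self.coeff = nn.Parameter(0.1 * torch.randn(out_ch, in_ch, n_atoms))

    def forward(self, x):
        # Reassemble full kernels from atoms and per-kernel coefficients.
        weight = torch.einsum('oia,akl->oikl', self.coeff, self.atoms)
        return F.conv2d(x, weight, padding=1)

layer = AtomConv2d(16, 32)
out = layer(torch.randn(1, 16, 8, 8))    # -> (1, 32, 8, 8)
```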
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
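A toy version of the strategy with an illustrative architecture encoding: a small graph network embeds a sub-network's adjacency matrix and one-hot node operations and regresses its accuracy; candidates can then be ranked by prediction and checked against ground truth with a rank correlation such as Kendall's tau.
```python
import torch
import torch.nn as nn

class ArchPredictor(nn.Module):
    def __init__(self, n_ops, hidden=32):
        super().__init__()
        self.embed = nn.Linear(n_ops, hidden)
        self.gcn = nn.Linear(hidden, hidden)
        self.out = nn.Linear(hidden, 1)

    def forward(self, adj, ops_onehot):
        h = torch.relu(self.embed(ops_onehot))
        a_hat = adj + torch.eye(adj.shape[0])       # add self-loops
        a_hat = a_hat / a_hat.sum(dim=1, keepdim=True)
        h = torch.relu(self.gcn(a_hat @ h))         # one message-passing round
        return self.out(h.mean(dim=0))              # predicted accuracy

pred = ArchPredictor(n_ops=5)
adj = torch.tensor([[0., 1., 0.], [0., 0., 1.], [0., 0., 0.]])
ops = torch.eye(5)[torch.tensor([0, 2, 4])]         # one-hot op per node
print(pred(adj, ops))
```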
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
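For reference, a numpy sketch of one truncated max-product (equivalently min-sum, in cost form) sweep on a chain MRF, the kind of fixed-iteration inference a BP-Layer unrolls; the learned terms and end-to-end backpropagation of the actual layer are omitted.
```python
import numpy as np

T, L = 5, 3                        # chain length, number of labels
unary = np.random.rand(T, L)       # per-node label costs
pairwise = np.abs(np.subtract.outer(np.arange(L), np.arange(L)))  # |l - l'|

msg = np.zeros((T, L))             # forward messages m_{t-1 -> t}
for t in range(1, T):
    # min-sum recursion: best previous label for each current label
    msg[t] = np.min(unary[t - 1] + msg[t - 1] + pairwise.T, axis=1)

beliefs = unary + msg              # truncated: a single forward sweep only
labels = beliefs.argmin(axis=1)    # per-node label decisions
```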
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
- Learning with Differentiable Perturbed Optimizers [54.351317101356614]
We propose a systematic method to transform optimizers into operations that are differentiable and never locally constant.
Our approach relies on stochastically perturbed optimizers, and can be used readily together with existing solvers.
We show how this framework can be connected to a family of losses developed in structured prediction, and give theoretical guarantees for their use in learning tasks.
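A small illustration of the underlying trick: an argmax is piecewise constant in its input, but its expectation under injected noise is smooth, so a Monte Carlo average of perturbed argmaxes behaves like a differentiable relaxation. Gaussian noise and the sample size are illustrative choices here.
```python
import numpy as np

rng = np.random.default_rng(0)

def perturbed_argmax(theta, sigma=0.5, n=10000):
    """Monte Carlo estimate of E[argmax(theta + sigma * Z)], smooth in theta."""
    z = rng.standard_normal((n, theta.size))
    idx = (theta + sigma * z).argmax(axis=1)
    return np.eye(theta.size)[idx].mean(axis=0)

theta = np.array([1.0, 1.1, 0.5])
print(perturbed_argmax(theta))   # soft weights instead of a hard one-hot
```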
arXiv Detail & Related papers (2020-02-20T11:11:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.