Differentiable TAN Structure Learning for Bayesian Network Classifiers
- URL: http://arxiv.org/abs/2008.09566v1
- Date: Fri, 21 Aug 2020 16:22:47 GMT
- Title: Differentiable TAN Structure Learning for Bayesian Network Classifiers
- Authors: Wolfgang Roth and Franz Pernkopf
- Abstract summary: We consider learning of tree-augmented naive Bayes (TAN) structures for Bayesian network classifiers with discrete input features.
Instead of performing a combinatorial optimization over the space of possible graph structures, the proposed method learns a distribution over graph structures.
Our method consistently outperforms random TAN structures and Chow-Liu TAN structures.
- Score: 19.30562170076368
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning the structure of Bayesian networks is a difficult combinatorial
optimization problem. In this paper, we consider learning of tree-augmented
naive Bayes (TAN) structures for Bayesian network classifiers with discrete
input features. Instead of performing a combinatorial optimization over the
space of possible graph structures, the proposed method learns a distribution
over graph structures. After training, we select the most probable structure of
this distribution. This allows for a joint training of the Bayesian network
parameters along with its TAN structure using gradient-based optimization. The
proposed method is agnostic to the specific loss and only requires that it is
differentiable. We perform extensive experiments using a hybrid
generative-discriminative loss based on the discriminative probabilistic
margin. Our method consistently outperforms random TAN structures and Chow-Liu
TAN structures.
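To make the mechanism concrete, below is a minimal, hypothetical sketch of the core idea: keep trainable logits over candidate conditioning parents for each feature, relax the discrete choice with a Gumbel-softmax so structure and parameters train jointly by gradient descent, and read off the most probable structure afterwards. The fixed feature ordering, tensor shapes, and plain log-likelihood loss are simplifications; the paper's exact spanning-tree parameterization and hybrid generative-discriminative margin loss are not reproduced.
```python
import torch
import torch.nn.functional as F

n_features, n_vals, n_classes = 5, 3, 2

# One categorical over candidate parents j < i for each feature i >= 1
# (feature 0 keeps only the class as parent; the fixed ordering is a
# simplification that guarantees acyclicity).
struct_logits = [torch.zeros(i, requires_grad=True) for i in range(1, n_features)]

# cpt[i, j, c, v_par] holds logits over the n_vals values of x_i,
# conditioned on class c and the value v_par of candidate parent j.
cpt = torch.randn(n_features, n_features, n_classes, n_vals, n_vals,
                  requires_grad=True)
class_logits = torch.zeros(n_classes, requires_grad=True)

def log_joint(x, y, tau=0.5):
    lp = F.log_softmax(class_logits, dim=0)[y]
    for i in range(n_features):
        if i == 0:
            lp = lp + F.log_softmax(cpt[0, 0, y, 0], dim=-1)[x[0]]
            continue
        w = F.gumbel_softmax(struct_logits[i - 1], tau=tau)  # soft parent pick
        terms = torch.stack(
            [F.log_softmax(cpt[i, j, y, x[j]], dim=-1)[x[i]] for j in range(i)]
        )
        lp = lp + (w * terms).sum()
    return lp

x = torch.randint(0, n_vals, (n_features,))
loss = -log_joint(x, y=1)   # any differentiable loss would work here
loss.backward()             # gradients reach both CPTs and structure logits
# After training: parent of feature i = argmax of struct_logits[i - 1].
```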
Related papers
- Generalized Naive Bayes [0.0]
We introduce the so-called Generalized Naive Bayes structure as an extension of the Naive Bayes structure.
We prove that this fits the data at least as well as the probability distribution determined by the classical Naive Bayes (NB).
arXiv Detail & Related papers (2024-08-28T16:36:18Z)
- SE-GSL: A General and Effective Graph Structure Learning Framework through Structural Entropy Optimization [67.28453445927825]
Graph Neural Networks (GNNs) are de facto solutions to structural data learning.
Existing graph structure learning (GSL) frameworks still lack robustness and interpretability.
This paper proposes a general GSL framework, SE-GSL, through structural entropy and the graph hierarchy abstracted in the encoding tree.
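As background for the entropy objective, the snippet below computes one-dimensional structural entropy (the degree-distribution special case); SE-GSL itself optimizes higher-dimensional structural entropy over an encoding tree, which this sketch does not attempt.
```python
import numpy as np

def structural_entropy_1d(adj):
    """One-dimensional structural entropy of an undirected graph."""
    deg = adj.sum(axis=1)
    vol = deg.sum()                       # equals 2m for an undirected graph
    p = deg[deg > 0] / vol
    return float(-(p * np.log2(p)).sum())

A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)    # a 3-node star
print(structural_entropy_1d(A))           # ~1.5 bits
```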
arXiv Detail & Related papers (2023-03-17T05:20:24Z)
- Tree ensemble kernels for Bayesian optimization with known constraints over mixed-feature spaces [54.58348769621782]
Tree ensembles can be well-suited for black-box optimization tasks such as algorithm tuning and neural architecture search.
Two well-known challenges in using tree ensembles for black-box optimization are (i) effectively quantifying model uncertainty for exploration and (ii) optimizing over the piece-wise constant acquisition function.
Our framework performs as well as state-of-the-art methods for unconstrained black-box optimization over continuous/discrete features and outperforms competing methods for problems combining mixed-variable feature spaces and known input constraints.
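A minimal sketch, not the paper's kernel-based method, of the two challenges named above: per-tree disagreement in a random forest acts as a crude uncertainty proxy, and the acquisition is optimized by scoring a finite candidate pool, which sidesteps its piece-wise constant surface.
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(30, 2))
y = (X ** 2).sum(axis=1)                    # toy black-box objective (minimize)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

candidates = rng.uniform(-2, 2, size=(1000, 2))
per_tree = np.stack([t.predict(candidates) for t in forest.estimators_])
mu, sigma = per_tree.mean(axis=0), per_tree.std(axis=0)

acq = mu - 1.0 * sigma                      # lower confidence bound
x_next = candidates[acq.argmin()]           # next point to evaluate
```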
arXiv Detail & Related papers (2022-07-02T16:59:37Z)
- A Differentiable Approach to Combinatorial Optimization using Dataless Neural Networks [20.170140039052455]
We propose a radically different approach in that no data is required for training the neural networks that produce the solution.
In particular, we reduce the optimization problem to a neural network and employ a dataless training scheme to refine the parameters of the network such that those parameters yield the structure of interest.
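A toy instance of the dataless idea, here for maximum independent set on a 4-cycle: the "network" reduces to a vector of trainable node logits and the loss encodes the problem itself, so no training data appears anywhere. The penalty weight and 0.5 threshold are illustrative choices, not the paper's construction.
```python
import torch

torch.manual_seed(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]             # a 4-cycle
theta = (0.5 * torch.randn(4)).requires_grad_(True)  # random init breaks symmetry
opt = torch.optim.Adam([theta], lr=0.1)

for _ in range(500):
    x = torch.sigmoid(theta)                 # soft node-selection vector in [0, 1]
    conflict = sum(x[i] * x[j] for i, j in edges)
    loss = -x.sum() + 2.0 * conflict         # reward set size, penalize edges
    opt.zero_grad(); loss.backward(); opt.step()

print((torch.sigmoid(theta) > 0.5).int())    # expect an alternating set, e.g. [1,0,1,0]
```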
arXiv Detail & Related papers (2022-03-15T19:21:31Z)
- Hybrid Bayesian network discovery with latent variables by scoring multiple interventions [5.994412766684843]
We present the hybrid mFGS-BS (majority rule and Fast Greedy Equivalence Search with Bayesian Scoring) algorithm for structure learning from discrete data.
The algorithm assumes causal insufficiency in the presence of latent variables and produces a Partial Ancestral Graph (PAG).
Experimental results show that mFGS-BS improves structure learning accuracy relative to the state-of-the-art and it is computationally efficient.
arXiv Detail & Related papers (2021-12-20T14:54:41Z)
- A Sparse Structure Learning Algorithm for Bayesian Network Identification from Discrete High-Dimensional Data [0.40611352512781856]
This paper addresses the problem of learning a sparse structure Bayesian network from high-dimensional discrete data.
We propose a score function that satisfies the sparsity and the DAG property simultaneously.
Specifically, we use a variance-reducing method in our optimization algorithm to make the algorithm work efficiently on high-dimensional data.
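For intuition, here is a sketch of a score that combines a fit term, an L1 sparsity penalty, and the NOTEARS-style acyclicity term h(W) = tr(e^{W∘W}) − d, which is zero exactly when W is a DAG; the paper's actual score function and variance-reduced optimizer are not reproduced.
```python
import numpy as np
from scipy.linalg import expm

def sparse_dag_score(W, X, lam=0.1, rho=10.0):
    d = W.shape[0]
    fit = 0.5 * ((X - X @ W) ** 2).mean()   # illustrative linear-model fit term
    sparsity = lam * np.abs(W).sum()        # L1 keeps the graph sparse
    h = np.trace(expm(W * W)) - d           # h(W) = 0 iff W is acyclic
    return fit + sparsity + rho * h ** 2

X = np.random.randn(100, 3)
W = np.zeros((3, 3)); W[0, 1] = 0.8         # a single edge 0 -> 1
print(sparse_dag_score(W, X))
```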
arXiv Detail & Related papers (2021-08-21T12:21:01Z)
- DiBS: Differentiable Bayesian Structure Learning [38.01659425023988]
We propose a general, fully differentiable framework for Bayesian structure learning (DiBS).
DiBS operates in the continuous space of a latent probabilistic graph representation.
Contrary to existing work, DiBS is agnostic to the form of the local conditional distributions.
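A sketch of the latent representation idea, assuming edge probabilities derived from inner products of per-node source/target embeddings; DiBS's Stein variational inference over these latents and its gradient estimators through discrete sampling are omitted.
```python
import torch

d, k, alpha = 4, 8, 5.0
U = torch.randn(d, k, requires_grad=True)   # node embeddings as edge sources
V = torch.randn(d, k, requires_grad=True)   # node embeddings as edge targets

edge_probs = torch.sigmoid(alpha * U @ V.T)      # (d, d): P(edge i -> j)
edge_probs = edge_probs * (1 - torch.eye(d))     # no self-loops
graph = torch.bernoulli(edge_probs)              # one discrete graph draw
```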
arXiv Detail & Related papers (2021-05-25T11:23:08Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computation.
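A sketch of the decomposition suggested by the title: each convolution kernel is a learned linear combination of a small shared bank of kernel atoms, so parameters scale with the atom bank plus coefficients rather than full kernels. Names and sizes below are illustrative, not the paper's exact scheme.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AtomConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, k=3, n_atoms=6):
        super().__init__()
        self.atoms = nn.Parameter(0.1 * torch.randn(n_atoms, k, k))  # shared basis
        self.coeff = nn.Parameter(0.1 * torch.randn(out_ch, in_ch, n_atoms))

    def forward(self, x):
        # Reassemble full kernels from atoms and per-kernel coefficients.
        weight = torch.einsum('oia,akl->oikl', self.coeff, self.atoms)
        return F.conv2d(x, weight, padding=1)

layer = AtomConv2d(16, 32)
out = layer(torch.randn(1, 16, 8, 8))    # -> (1, 32, 8, 8)
```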
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
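A toy version of the strategy with an illustrative architecture encoding: a small graph network embeds a sub-network's adjacency matrix and one-hot node operations and regresses its accuracy; candidates can then be ranked by prediction and checked against ground truth with a rank correlation such as Kendall's tau.
```python
import torch
import torch.nn as nn

class ArchPredictor(nn.Module):
    def __init__(self, n_ops, hidden=32):
        super().__init__()
        self.embed = nn.Linear(n_ops, hidden)
        self.gcn = nn.Linear(hidden, hidden)
        self.out = nn.Linear(hidden, 1)

    def forward(self, adj, ops_onehot):
        h = torch.relu(self.embed(ops_onehot))
        a_hat = adj + torch.eye(adj.shape[0])       # add self-loops
        a_hat = a_hat / a_hat.sum(dim=1, keepdim=True)
        h = torch.relu(self.gcn(a_hat @ h))         # one message-passing round
        return self.out(h.mean(dim=0))              # predicted accuracy

pred = ArchPredictor(n_ops=5)
adj = torch.tensor([[0., 1., 0.], [0., 0., 1.], [0., 0., 0.]])
ops = torch.eye(5)[torch.tensor([0, 2, 4])]         # one-hot op per node
print(pred(adj, ops))
```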
arXiv Detail & Related papers (2020-04-17T19:12:39Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
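For reference, a numpy sketch of one truncated max-product (equivalently min-sum, in cost form) sweep on a chain MRF, the kind of fixed-iteration inference a BP-Layer unrolls; the learned terms and end-to-end backpropagation of the actual layer are omitted.
```python
import numpy as np

T, L = 5, 3                        # chain length, number of labels
unary = np.random.rand(T, L)       # per-node label costs
pairwise = np.abs(np.subtract.outer(np.arange(L), np.arange(L)))  # |l - l'|

msg = np.zeros((T, L))             # forward messages m_{t-1 -> t}
for t in range(1, T):
    # min-sum recursion: best previous label for each current label
    msg[t] = np.min(unary[t - 1] + msg[t - 1] + pairwise.T, axis=1)

beliefs = unary + msg              # truncated: a single forward sweep only
labels = beliefs.argmin(axis=1)    # per-node label decisions
```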
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
- Learning with Differentiable Perturbed Optimizers [54.351317101356614]
We propose a systematic method to transform optimizers into operations that are differentiable and never locally constant.
Our approach relies on stochastically perturbed optimizers, and can be used readily together with existing solvers.
We show how this framework can be connected to a family of losses developed in structured prediction, and give theoretical guarantees for their use in learning tasks.
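A small illustration of the underlying trick: an argmax is piecewise constant in its input, but its expectation under injected noise is smooth, so a Monte Carlo average of perturbed argmaxes behaves like a differentiable relaxation. Gaussian noise and the sample size are illustrative choices here.
```python
import numpy as np

rng = np.random.default_rng(0)

def perturbed_argmax(theta, sigma=0.5, n=10000):
    """Monte Carlo estimate of E[argmax(theta + sigma * Z)], smooth in theta."""
    z = rng.standard_normal((n, theta.size))
    idx = (theta + sigma * z).argmax(axis=1)
    return np.eye(theta.size)[idx].mean(axis=0)

theta = np.array([1.0, 1.1, 0.5])
print(perturbed_argmax(theta))   # soft weights instead of a hard one-hot
```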
arXiv Detail & Related papers (2020-02-20T11:11:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.