Learning the Finer Things: Bayesian Structure Learning at the
Instantiation Level
- URL: http://arxiv.org/abs/2303.04339v1
- Date: Wed, 8 Mar 2023 02:31:49 GMT
- Title: Learning the Finer Things: Bayesian Structure Learning at the
Instantiation Level
- Authors: Chase Yakaboski and Eugene Santos Jr
- Abstract summary: Successful machine learning methods require a trade-off between memorization and generalization.
We present a novel probabilistic graphical model structure learning approach that can learn, generalize and explain in elusive domains.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Successful machine learning methods require a trade-off between memorization
and generalization. Too much memorization and the model cannot generalize to
unobserved examples. Too much generalization and we risk under-fitting the
data. While we commonly measure their performance through cross-validation and
accuracy metrics, how should these algorithms cope in extremely
under-determined domains where accuracy is always unsatisfactory? We present
a novel probabilistic graphical model structure learning approach that can
learn, generalize and explain in these elusive domains by operating at the
random variable instantiation level. Using Minimum Description Length (MDL)
analysis, we propose a new decomposition of the learning problem over all
training exemplars, fusing together minimal entropy inferences to construct a
final knowledge base. By leveraging Bayesian Knowledge Bases (BKBs), a
framework that operates at the instantiation level and inherently subsumes
Bayesian Networks (BNs), we develop both a theoretical MDL score and associated
structure learning algorithm that demonstrates significant improvements over
learned BNs on 40 benchmark datasets. Further, our algorithm incorporates
recent off-the-shelf DAG learning techniques, enabling tractable results even on
large problems. We then demonstrate the utility of our approach in a
significantly under-determined domain by learning gene regulatory networks on
breast cancer gene mutational data available from The Cancer Genome Atlas
(TCGA).
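For orientation, the two-part MDL score referenced in the abstract can be written in its standard textbook form below; the paper's actual per-exemplar BKB score and its fusion of minimal-entropy inferences are developed in the full text, so this is only the decomposition it builds on.

```latex
% Standard two-part MDL score: description length of the model M plus the
% length of the data D encoded under M; the data term decomposes over the
% n training exemplars d_1, ..., d_n (textbook form, not the paper's score).
\mathrm{MDL}(M, D) = L(M) + L(D \mid M),
\qquad
L(D \mid M) = \sum_{i=1}^{n} L(d_i \mid M)
```

Here $L(\cdot)$ denotes code length in bits; minimizing the sum trades model complexity against fit, which is the memorization/generalization trade-off the abstract opens with.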
Related papers
- On the Generalization Capability of Temporal Graph Learning Algorithms:
Theoretical Insights and a Simpler Method [59.52204415829695]
Temporal Graph Learning (TGL) has become a prevalent technique across diverse real-world applications.
This paper investigates the generalization ability of different TGL algorithms.
We propose a simplified TGL network, which enjoys a small generalization error, improved overall performance, and lower model complexity.
arXiv Detail & Related papers (2024-02-26T08:22:22Z)
- Maximizing Model Generalization for Machine Condition Monitoring with
Self-Supervised Learning and Federated Learning [4.214064911004321]
Deep Learning can diagnose faults and assess machine health from raw condition monitoring data without manually designed statistical features.
Traditional supervised learning may struggle to learn compact, discriminative representations that generalize to unseen target domains.
This study proposes maximizing feature generality on the source domain and applying transfer learning (TL) via weight transfer to copy the model to the target domain.
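To make the weight-transfer step concrete, here is a minimal PyTorch sketch; the architecture, dimensions, and freezing policy are hypothetical illustrations, not taken from the paper.

```python
import torch.nn as nn

def make_model(in_dim=64, hidden=128, n_classes=4):
    # Hypothetical encoder + diagnosis head for condition monitoring signals.
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),  # feature encoder
        nn.Linear(hidden, n_classes),          # fault-class head
    )

source_model = make_model()
# ... pre-train source_model (e.g., with self-supervised / federated learning) ...

# Weight transfer: copy source-trained parameters into the target-domain model.
target_model = make_model()
target_model.load_state_dict(source_model.state_dict())

# Optionally freeze the encoder's weights so only the head adapts to the target.
for p in target_model[0].parameters():
    p.requires_grad = False
```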
arXiv Detail & Related papers (2023-04-27T17:57:54Z)
- Learning Large-scale Neural Fields via Context Pruned Meta-Learning [60.93679437452872]
We introduce an efficient optimization-based meta-learning technique for large-scale neural field training.
We show how gradient re-scaling at meta-test time allows the learning of extremely high-quality neural fields.
Our framework is model-agnostic, intuitive, straightforward to implement, and shows significant reconstruction improvements for a wide range of signals.
arXiv Detail & Related papers (2023-02-01T17:32:16Z)
- Hierarchically Structured Task-Agnostic Continual Learning [0.0]
We take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle.
We propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths.
Our approach can operate in a task-agnostic way, i.e., it does not require task-specific knowledge, as is the case with many existing continual learning algorithms.
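As a rough, generic illustration of routing inputs through a set of processing paths, here is a plain gated mixture-of-experts layer; it is not the paper's Mixture-of-Variational-Experts layer, whose variational components are omitted here.

```python
import torch
import torch.nn as nn

class MixtureOfExpertsLayer(nn.Module):
    """Generic gated mixture of experts: each expert is a separate processing
    path, and a learned gate softly routes each input across the paths."""
    def __init__(self, dim, n_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.gate = nn.Linear(dim, n_experts)

    def forward(self, x):
        weights = torch.softmax(self.gate(x), dim=-1)             # (batch, n_experts)
        outputs = torch.stack([e(x) for e in self.experts], -1)   # (batch, dim, n_experts)
        return (outputs * weights.unsqueeze(1)).sum(-1)           # (batch, dim)
```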
arXiv Detail & Related papers (2022-11-14T19:53:15Z)
- Network Gradient Descent Algorithm for Decentralized Federated Learning [0.2867517731896504]
We study a fully decentralized federated learning algorithm: a novel network gradient descent (NGD) algorithm executed over a communication network.
In the NGD method, only statistics (e.g., parameter estimates) need to be communicated, minimizing privacy risk.
We find that both the learning rate and the network structure play significant roles in determining the NGD estimator's statistical efficiency.
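A minimal sketch of the communication pattern described above, assuming a doubly stochastic mixing matrix and per-node gradient oracles (both illustrative assumptions, not the paper's exact formulation): each node mixes its neighbors' parameter estimates and then takes a local gradient step.

```python
import numpy as np

def network_gradient_descent(W, local_grad, theta0, lr=0.1, steps=100):
    """Sketch of decentralized gradient descent over a communication network.

    W          : (n, n) doubly stochastic mixing matrix; W[i, j] > 0 only if
                 nodes i and j communicate.
    local_grad : local_grad(i, theta_i) -> gradient of node i's local loss.
    theta0     : (n, d) initial parameter estimates, one row per node.
    """
    theta = theta0.copy()
    for _ in range(steps):
        # Only parameter estimates (statistics) cross the network.
        mixed = W @ theta
        grads = np.stack([local_grad(i, mixed[i]) for i in range(theta.shape[0])])
        theta = mixed - lr * grads
    return theta
```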
arXiv Detail & Related papers (2022-05-06T02:53:31Z)
- Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
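For orientation, here is a minimal NumPy sketch of sparse GP regression with inducing points, using the classical Subset-of-Regressors predictive mean; unlike IGN, the inducing points Z below are fixed rather than learned jointly with the feature space.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    """RBF kernel matrix between row-vector sets A (n, d) and B (m, d)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def sor_predict(X, y, Z, Xstar, noise=0.1):
    """Subset-of-Regressors predictive mean with inducing points Z."""
    Kuf = rbf(Z, X)                    # (m, n) inducing-to-training covariances
    Kuu = rbf(Z, Z)                    # (m, m) inducing-point covariances
    A = Kuf @ Kuf.T + noise**2 * Kuu   # (m, m) regularized Gram matrix
    w = np.linalg.solve(A, Kuf @ y)    # (m,)  weights on the inducing points
    return rbf(Xstar, Z) @ w           # predictive mean at the test inputs
```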
arXiv Detail & Related papers (2022-04-21T05:27:09Z)
- On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z)
- Efficient and Scalable Structure Learning for Bayesian Networks:
Algorithms and Applications [33.8980356362084]
We propose a new structure learning algorithm, LEAST, which comprehensively fulfills our business requirements.
LEAST is deployed and serves more than 20 applications with thousands of executions per day.
We show that LEAST unlocks the possibility of applying BN structure learning in new areas, such as large-scale gene expression data analysis and explainable recommendation systems.
arXiv Detail & Related papers (2020-12-07T09:11:08Z)
- Information Theoretic Meta Learning with Gaussian Processes [74.54485310507336]
We formulate meta learning using information theoretic concepts; namely, mutual information and the information bottleneck.
By making use of variational approximations to the mutual information, we derive a general and tractable framework for meta learning.
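For reference, the information bottleneck mentioned above has the standard Lagrangian form below; whether the paper uses exactly this objective is not stated in the summary, so treat it as the textbook formulation.

```latex
% Standard information bottleneck Lagrangian: learn a representation Z of
% the input X that is maximally compressed (small I(X;Z)) while remaining
% predictive of the target Y (large I(Z;Y)); beta sets the trade-off.
\min_{p(z \mid x)} \; I(X; Z) - \beta \, I(Z; Y)
```

The variational approximations mentioned in the summary replace the intractable mutual information terms with tractable bounds.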
arXiv Detail & Related papers (2020-09-07T16:47:30Z)
- Fast Learning of Graph Neural Networks with Guaranteed Generalizability:
One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice.
We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
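In its simplest form, the one-hidden-layer setting analyzed above is a single neighborhood-aggregation layer with a ReLU nonlinearity followed by a linear readout; the mean aggregation and regression readout below are illustrative assumptions, not necessarily the paper's exact architecture.

```python
import numpy as np

def one_layer_gnn(A, X, W1, w2):
    """One-hidden-layer GNN: aggregate neighbors, apply ReLU, then read out.

    A  : (n, n) adjacency matrix,  X : (n, d) node features,
    W1 : (d, h) hidden weights,    w2 : (h,) readout weights.
    """
    deg = A.sum(1, keepdims=True).clip(min=1)
    H = np.maximum((A / deg) @ X @ W1, 0.0)  # mean aggregation + ReLU
    return H @ w2                            # per-node prediction (regression)
```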
arXiv Detail & Related papers (2020-06-25T00:45:52Z)
- Information-theoretic analysis for transfer learning [5.081241420920605]
We give an information-theoretic analysis on the generalization error and the excess risk of transfer learning algorithms.
Our results suggest, perhaps as expected, that the Kullback-Leibler divergence $D(\mu \,\|\, \mu')$ plays an important role in characterizing the generalization error.
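For concreteness, the divergence above is the Kullback-Leibler divergence between the source distribution $\mu$ and target distribution $\mu'$; the second equality below assumes both admit densities $p$ and $p'$.

```latex
% Kullback-Leibler divergence between source mu and target mu'; it appears
% as a penalty term in the paper's generalization-error bounds.
D(\mu \,\|\, \mu') = \int \log\!\frac{d\mu}{d\mu'} \, d\mu
                   = \int p(x) \log\frac{p(x)}{p'(x)} \, dx
```

It is asymmetric and equals zero exactly when the two distributions coincide, which matches the intuition that transfer is easiest when source and target match.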
arXiv Detail & Related papers (2020-05-18T13:23:20Z)