Hierarchical Bayesian Flow Networks for Molecular Graph Generation
- URL: http://arxiv.org/abs/2510.10211v2
- Date: Sat, 08 Nov 2025 00:57:35 GMT
- Title: Hierarchical Bayesian Flow Networks for Molecular Graph Generation
- Authors: Yida Xiong, Jiameng Chen, Kun Li, Hongzhi Zhang, Xiantao Cai, Wenbin Hu
- Abstract summary: GraphBFN is a novel hierarchical coarse-to-fine framework based on Bayesian Flow Networks that operates on the parameters of distributions. We demonstrate that our method achieves superior performance and faster generation, setting new state-of-the-art results on the QM9 and ZINC250k molecular graph generation benchmarks.
- Score: 15.495256638671284
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Molecular graph generation is fundamentally a classification problem, aimed at predicting the categories of atoms and bonds. However, prevailing paradigms such as continuous diffusion models are trained to predict continuous numerical values, treating training as a regression task. The final generation step then requires rounding these predictions back into discrete categories, which is intrinsically a classification operation. Because this rounding step is not incorporated during training, there is a significant discrepancy between the model's training objective and its inference procedure. As a consequence, an excessive emphasis on point-wise precision leads to overfitting and inefficient learning, since considerable effort is devoted to capturing intra-bin variations that are ultimately irrelevant to the discrete nature of the task. This flaw diminishes molecular diversity and constrains the model's generalization. To address this fundamental limitation, we propose GraphBFN, a novel hierarchical coarse-to-fine framework based on Bayesian Flow Networks that operates on the parameters of distributions. By introducing the cumulative distribution function (CDF), GraphBFN can compute the probability of selecting the correct category, unifying the training objective with the rounding operation performed at sampling time. We demonstrate that our method achieves superior performance and faster generation, setting new state-of-the-art results on the QM9 and ZINC250k molecular graph generation benchmarks.
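The core idea, training against the probability that a continuous prediction rounds to the correct discrete category rather than against the raw value, can be illustrated with a short sketch. The snippet below is a minimal illustration of that CDF trick, assuming a Gaussian output distribution per atom/bond slot; all names, shapes, and the function `category_log_probs` are hypothetical and are not taken from the GraphBFN implementation.

```python
import torch
from torch.distributions import Normal

def category_log_probs(mu, sigma, num_classes):
    """Log-probability that a Gaussian prediction rounds to each class.

    Integrates the predicted Normal(mu, sigma) over each unit-width bin
    centred on a class index, so training sees the same rounding that is
    applied at sampling time. All names and shapes here are illustrative,
    not GraphBFN's actual API.
    """
    # Bin edges at k - 0.5 for k = 0..num_classes; outer edges open to +/- inf
    edges = torch.arange(num_classes + 1, dtype=mu.dtype) - 0.5
    edges[0], edges[-1] = float("-inf"), float("inf")
    dist = Normal(mu.unsqueeze(-1), sigma.unsqueeze(-1))  # shape (..., 1)
    cdf = dist.cdf(edges)                                 # shape (..., num_classes + 1)
    probs = (cdf[..., 1:] - cdf[..., :-1]).clamp_min(1e-12)
    return probs.log()

# Toy usage: cross-entropy on rounded-bin probabilities instead of MSE on raw values
mu = torch.randn(8, 4)                 # e.g. 8 nodes x 4 slots (made-up shapes)
sigma = torch.full_like(mu, 0.3)
targets = torch.randint(0, 5, (8, 4))  # true atom/bond categories in {0,...,4}
log_p = category_log_probs(mu, sigma, num_classes=5)
loss = -log_p.gather(-1, targets.unsqueeze(-1)).squeeze(-1).mean()
```

Training on these bin probabilities means the loss is indifferent to intra-bin variation, which is exactly the mismatch the abstract identifies in regression-style diffusion objectives.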
Related papers
- Two Birds with One Stone: Enhancing Uncertainty Quantification and Interpretability with Graph Functional Neural Process [27.760002432327962]
Graph neural networks (GNNs) are powerful tools for graph data. However, their predictions are mis-calibrated and lack interpretability. We propose a new uncertainty-aware and interpretable graph classification model.
arXiv Detail & Related papers (2025-08-23T17:48:05Z)
- A Bayesian Flow Network Framework for Chemistry Tasks [0.0]
We introduce ChemBFN, a language model that handles chemistry tasks based on Bayesian flow networks. A new accuracy schedule is proposed to improve sampling quality. We show evidence that our method is appropriate for generating molecules with satisfactory diversity even when a small number of sampling steps is used.
arXiv Detail & Related papers (2024-07-28T04:46:32Z)
- A Metalearned Neural Circuit for Nonparametric Bayesian Inference [4.767884267554628]
Most applications of machine learning to classification assume a closed set of balanced classes.
This is at odds with the real world, where class occurrence statistics often follow a long-tailed power-law distribution.
We present a method for extracting the inductive bias from a nonparametric Bayesian model and transferring it to an artificial neural network.
arXiv Detail & Related papers (2023-11-24T16:43:17Z)
- Time-series Generation by Contrastive Imitation [87.51882102248395]
We study a generative framework that seeks to combine the strengths of both approaches. Motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy.
At inference, the learned policy serves as the generator for iterative sampling, and the learned energy serves as a trajectory-level measure for evaluating sample quality.
arXiv Detail & Related papers (2023-11-02T16:45:25Z)
- RanPAC: Random Projections and Pre-trained Models for Continual Learning [59.07316955610658]
Continual learning (CL) aims to learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones.
We propose a concise and effective approach for CL with pre-trained models.
arXiv Detail & Related papers (2023-07-05T12:49:02Z)
- Variational Classification [51.2541371924591]
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency: we induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
arXiv Detail & Related papers (2023-05-17T17:47:19Z)
- Bayesian Layer Graph Convolutional Network for Hyperspectral Image Classification [24.91896527342631]
Graph convolutional network (GCN) based models have shown impressive performance.
Deep learning frameworks based on point estimation suffer from low generalization and cannot quantify the uncertainty of their classification results.
In this paper, we propose a Bayesian layer that can be inserted into point-estimation-based neural networks.
A Generative Adversarial Network (GAN) is built to address the sample imbalance problem of the hyperspectral image (HSI) dataset.
arXiv Detail & Related papers (2022-11-14T12:56:56Z)
- Similarity-aware Positive Instance Sampling for Graph Contrastive Pre-training [82.68805025636165]
We propose to select positive graph instances directly from existing graphs in the training set.
Our selection is based on certain domain-specific pair-wise similarity measurements.
Besides, we develop an adaptive node-level pre-training method to dynamically mask nodes to distribute them evenly in the graph.
arXiv Detail & Related papers (2022-06-23T20:12:51Z)
- Task-agnostic Continual Learning with Hybrid Probabilistic Models [75.01205414507243]
We propose HCL, a Hybrid generative-discriminative approach to Continual Learning for classification.
A normalizing flow is used to learn the data distribution, perform classification, identify task changes, and avoid forgetting.
We demonstrate the strong performance of HCL on a range of continual learning benchmarks such as split-MNIST, split-CIFAR, and SVHN-MNIST.
arXiv Detail & Related papers (2021-06-24T05:19:26Z)
- Last Layer Marginal Likelihood for Invariance Learning [12.00078928875924]
We introduce a new lower bound to the marginal likelihood, which allows us to perform inference for a larger class of likelihood functions.
We work towards bringing this approach to neural networks by using an architecture with a Gaussian process in the last layer.
arXiv Detail & Related papers (2021-06-14T15:40:51Z)
- Churn Reduction via Distillation [54.5952282395487]
We show an equivalence between training with distillation using the base model as the teacher and training with an explicit constraint on the predictive churn.
We then show that distillation outperforms a number of recent baselines at low-churn training.
arXiv Detail & Related papers (2021-06-04T18:03:31Z)
- A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
arXiv Detail & Related papers (2020-10-27T17:56:14Z)
- Regularizing Class-wise Predictions via Self-knowledge Distillation [80.76254453115766]
We propose a new regularization method that penalizes the predictive distribution between similar samples.
This results in regularizing the dark knowledge (i.e., the knowledge on wrong predictions) of a single network.
Our experimental results on various image classification tasks demonstrate that this simple yet powerful method significantly improves generalization.
arXiv Detail & Related papers (2020-03-31T06:03:51Z)