Sliced Denoising: A Physics-Informed Molecular Pre-Training Method
- URL: http://arxiv.org/abs/2311.02124v1
- Date: Fri, 3 Nov 2023 07:58:05 GMT
- Title: Sliced Denoising: A Physics-Informed Molecular Pre-Training Method
- Authors: Yuyan Ni, Shikun Feng, Wei-Ying Ma, Zhi-Ming Ma, Yanyan Lan
- Abstract summary: This paper proposes a new method for molecular pre-training, called sliced denoising (SliDe)
SliDe uses a novel noise strategy that perturbs bond lengths, angles, and torsion angles to achieve better sampling over conformations.
It shows a 42% improvement in the accuracy of estimated force fields compared to current state-of-the-art denoising methods.
- Score: 29.671249096191726
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: While molecular pre-training has shown great potential in enhancing drug
discovery, the lack of a solid physical interpretation in current methods
raises concerns about whether the learned representation truly captures the
underlying explanatory factors in observed data, ultimately resulting in
limited generalization and robustness. Although denoising methods offer a
physical interpretation, their accuracy is often compromised by ad-hoc noise
design, leading to inaccurate learned force fields. To address this limitation,
this paper proposes a new method for molecular pre-training, called sliced
denoising (SliDe), which is based on the classical mechanical intramolecular
potential theory. SliDe utilizes a novel noise strategy that perturbs bond
lengths, angles, and torsion angles to achieve better sampling over
conformations. Additionally, it introduces a random slicing approach that
circumvents the computationally expensive calculation of the Jacobian matrix,
which is otherwise essential for estimating the force field. By aligning with
physical principles, SliDe shows a 42% improvement in the accuracy of
estimated force fields compared to current state-of-the-art denoising methods,
and thus outperforms traditional baselines on various molecular property
prediction tasks.
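The internal-coordinate noise and the random-slicing idea from the abstract can be illustrated with a toy sketch. Everything below is an illustrative assumption, not the paper's implementation: the planar-chain geometry, the noise scales, and the finite-difference estimator are all stand-ins. The sketch only shows (1) how perturbing bond lengths and angles differs from adding isotropic Cartesian noise, and (2) how a Jacobian-vector product along a random direction avoids ever materialising the full Jacobian dx/dq.

```python
import numpy as np

def cartesian(q, n_atoms=4):
    """Build 2D coordinates of a planar chain from internal coordinates.

    q packs (n_atoms - 1) bond lengths followed by (n_atoms - 2)
    bond angles; a toy stand-in for a real molecular geometry.
    """
    lengths = q[: n_atoms - 1]
    angles = q[n_atoms - 1 :]
    pos = [np.zeros(2), np.array([lengths[0], 0.0])]
    heading = 0.0
    for i in range(n_atoms - 2):
        heading += np.pi - angles[i]          # exterior turn at each joint
        step = lengths[i + 1] * np.array([np.cos(heading), np.sin(heading)])
        pos.append(pos[-1] + step)
    return np.concatenate(pos)

rng = np.random.default_rng(0)
n_atoms = 4
q0 = np.concatenate([np.full(n_atoms - 1, 1.5),                 # bond lengths
                     np.full(n_atoms - 2, np.deg2rad(109.5))])  # bond angles

# 1) Internal-coordinate noise: perturb lengths and angles with
#    per-type scales instead of adding isotropic Cartesian noise.
sigma = np.concatenate([np.full(n_atoms - 1, 0.01),
                        np.full(n_atoms - 2, np.deg2rad(2.0))])
q_noisy = q0 + sigma * rng.standard_normal(q0.shape)

# 2) "Slicing": a Jacobian-vector product along one random direction,
#    estimated here by central differences -- the full Jacobian dx/dq
#    is never formed.
u = rng.standard_normal(q0.shape)
u /= np.linalg.norm(u)
eps = 1e-5
jvp = (cartesian(q0 + eps * u) - cartesian(q0 - eps * u)) / (2 * eps)
print(jvp.shape)  # (8,) -- one Cartesian-space vector per random slice
```

In a real setting the finite-difference probe would be replaced by an automatic-differentiation JVP, but the point is the same: each random slice costs one directional derivative instead of a full Jacobian.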
Related papers
- Pre-training with Fractional Denoising to Enhance Molecular Property Prediction [26.93248595345132]
We introduce a molecular pre-training framework called fractional denoising (Frad), which decouples noise design from the constraints imposed by force learning equivalence.
Experiments demonstrate that our framework consistently outperforms existing methods, establishing state-of-the-art results across force prediction, quantum chemical properties, and binding affinity tasks.
arXiv Detail & Related papers (2024-07-14T11:09:42Z) - Debias the Training of Diffusion Models [53.49637348771626]
We provide theoretical evidence that the prevailing practice of using a constant loss weight strategy in diffusion models leads to biased estimation during the training phase.
We propose an elegant and effective weighting strategy grounded in the theoretically unbiased principle.
These analyses are expected to advance our understanding and demystify the inner workings of diffusion models.
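The core of the loss-weighting question can be sketched in a few lines. The beta schedule is the standard DDPM linear schedule, and the SNR-based weight below is a generic placeholder for "down-weight some timesteps" -- it is not the paper's derived unbiased weighting, only an illustration of replacing the constant weight.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 1000
# Linear beta schedule -> cumulative alpha_bar, as in standard DDPM setups.
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)
snr = alpha_bar / (1.0 - alpha_bar)          # signal-to-noise ratio per step

def weighted_denoising_loss(eps_hat, eps, t, weight_fn):
    """Per-sample squared error, reweighted across timesteps."""
    per_sample = ((eps_hat - eps) ** 2).mean(axis=-1)
    return (weight_fn(t) * per_sample).mean()

constant_w = lambda t: np.ones_like(t, dtype=float)
# Placeholder reweighting: keep full weight at high-SNR steps, shrink the
# rest; the paper's actual unbiased weights are derived differently.
snr_w = lambda t: np.minimum(snr[t], 5.0) / 5.0

batch = 64
t = rng.integers(0, T, size=batch)
eps = rng.standard_normal((batch, 16))
eps_hat = eps + 0.1 * rng.standard_normal((batch, 16))  # stand-in prediction

loss_const = weighted_denoising_loss(eps_hat, eps, t, constant_w)
loss_snr = weighted_denoising_loss(eps_hat, eps, t, snr_w)
```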
arXiv Detail & Related papers (2023-10-12T16:04:41Z) - Fractional Denoising for 3D Molecular Pre-training [29.671249096191726]
Coordinate denoising is a promising 3D molecular pre-training method, which has achieved remarkable performance in various downstream drug discovery tasks.
There are two challenges for coordinate denoising to learn an effective force field, i.e., low-coverage samples and an isotropic force field.
We propose a novel hybrid noise strategy that applies noise to both dihedral angles and Cartesian coordinates.
arXiv Detail & Related papers (2023-07-20T08:20:12Z) - End-To-End Latent Variational Diffusion Models for Inverse Problems in High Energy Physics [61.44793171735013]
We introduce a novel unified architecture, termed latent variational diffusion models, which combines the latent learning of cutting-edge generative art approaches with an end-to-end variational framework.
Our unified approach achieves a distribution-free distance to the truth more than 20 times smaller than that of the non-latent state-of-the-art baseline.
arXiv Detail & Related papers (2023-05-17T17:43:10Z) - Aspects of scaling and scalability for flow-based sampling of lattice QCD [137.23107300589385]
Recent applications of machine-learned normalizing flows to sampling in lattice field theory suggest that such methods may be able to mitigate critical slowing down and topological freezing.
It remains to be determined whether they can be applied to state-of-the-art lattice quantum chromodynamics calculations.
arXiv Detail & Related papers (2022-11-14T17:07:37Z) - Disentangling the Predictive Variance of Deep Ensembles through the Neural Tangent Kernel [4.515606790756141]
We study deep ensembles with large layer widths operating in simplified linear training regimes.
We identify two sources of noise, each inducing a distinct inductive bias in the predictive variance.
We propose practical ways to eliminate part of these noise sources, leading to significant changes and improved out-of-distribution (OOD) detection in trained deep ensembles.
arXiv Detail & Related papers (2022-10-18T12:55:53Z) - Pre-training via Denoising for Molecular Property Prediction [53.409242538744444]
We describe a pre-training technique that utilizes large datasets of 3D molecular structures at equilibrium.
Inspired by recent advances in noise regularization, our pre-training objective is based on denoising.
arXiv Detail & Related papers (2022-05-31T22:28:34Z) - Point Cloud Denoising via Momentum Ascent in Gradient Fields [72.93429911044903]
A gradient-based method was proposed to estimate the gradient fields of noisy point clouds using neural networks.
We develop a momentum gradient ascent method that leverages the information of previous iterations in determining the trajectories of the points.
Experiments demonstrate that the proposed method outperforms state-of-the-art approaches with a variety of point clouds, noise types, and noise levels.
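The momentum-ascent update can be sketched on a toy 2D point cloud. The analytic circle-attracting gradient field below stands in for a learned score network and is purely an assumption for illustration; only the update rule (an exponential moving average of past gradients driving the points) reflects the summary above.

```python
import numpy as np

rng = np.random.default_rng(2)

def gradient_field(points, radius=1.0):
    """Analytic stand-in for a learned gradient field: each point is
    pulled toward the circle of the given radius (the clean surface)."""
    r = np.linalg.norm(points, axis=1, keepdims=True)
    return (radius / r - 1.0) * points  # outward if inside, inward if outside

# Noisy samples of a unit circle.
theta = rng.uniform(0, 2 * np.pi, 256)
clean = np.stack([np.cos(theta), np.sin(theta)], axis=1)
noisy = clean + 0.1 * rng.standard_normal(clean.shape)

def momentum_ascent(points, steps=50, lr=0.1, beta=0.9):
    """Move points up the gradient field, smoothing each trajectory with
    an exponential moving average of past gradients (momentum)."""
    x = points.copy()
    m = np.zeros_like(x)
    for _ in range(steps):
        m = beta * m + (1.0 - beta) * gradient_field(x)
        x = x + lr * m
    return x

denoised = momentum_ascent(noisy)
err_before = np.abs(np.linalg.norm(noisy, axis=1) - 1.0).mean()
err_after = np.abs(np.linalg.norm(denoised, axis=1) - 1.0).mean()
```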
arXiv Detail & Related papers (2022-02-21T10:21:40Z) - Reintroducing Straight-Through Estimators as Principled Methods for Stochastic Binary Networks [85.94999581306827]
Training neural networks with binary weights and activations is a challenging problem due to the lack of gradients and difficulty of optimization over discrete weights.
Many successful experimental results have been achieved with empirical straight-through (ST) approaches.
At the same time, ST methods can be truly derived as estimators in the stochastic binary network (SBN) model with Bernoulli weights.
arXiv Detail & Related papers (2020-06-11T23:58:18Z)
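The straight-through idea itself is easy to sketch: binarise the weights on the forward pass, then let the gradient pass through to the real-valued weights on the backward pass (clipped to the unit interval). The toy regression below is an illustration of the empirical ST trick, not the SBN derivation from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy regression with binary weights: the forward pass uses sign(w); the
# backward pass pretends d sign(w)/dw == 1 inside [-1, 1] (clipped ST).
X = rng.standard_normal((200, 8))
w_true = np.sign(rng.standard_normal(8))
y = X @ w_true

w = 0.1 * rng.standard_normal(8)             # real-valued latent weights
lr = 0.05
for _ in range(300):
    b = np.sign(w)                           # binarised forward weights
    err = X @ b - y
    grad_b = X.T @ err / len(X)              # exact gradient w.r.t. b
    grad_w = grad_b * (np.abs(w) <= 1.0)     # ST: copy it through, clipped
    w -= lr * grad_w

accuracy = (np.sign(w) == w_true).mean()
```

With gradients through a hard sign being zero almost everywhere, the ST surrogate is what lets the latent weights move at all; the clip keeps already-saturated weights from drifting.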
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.