Shift-Robust Molecular Relational Learning with Causal Substructure
- URL: http://arxiv.org/abs/2305.18451v3
- Date: Thu, 20 Jul 2023 23:59:38 GMT
- Title: Shift-Robust Molecular Relational Learning with Causal Substructure
- Authors: Namkyeong Lee, Kanghoon Yoon, Gyoung S. Na, Sein Kim, Chanyoung Park
- Abstract summary: We propose CMRL that is robust to the distributional shift in molecular relational learning.
We first assume a causal relationship based on the domain knowledge of molecular sciences and construct a structural causal model (SCM) that reveals the relationship between variables.
Our model successfully learns from the causal substructure and alleviates the confounding effect of shortcut substructures that are spuriously correlated to chemical reactions.
- Score: 5.319553007291377
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, molecular relational learning, whose goal is to predict the
interaction behavior between molecular pairs, got a surge of interest in
molecular sciences due to its wide range of applications. In this work, we
propose CMRL that is robust to the distributional shift in molecular relational
learning by detecting the core substructure that is causally related to
chemical reactions. To do so, we first assume a causal relationship based on
the domain knowledge of molecular sciences and construct a structural causal
model (SCM) that reveals the relationship between variables. Based on the SCM,
we introduce a novel conditional intervention framework whose intervention is
conditioned on the paired molecule. With the conditional intervention
framework, our model successfully learns from the causal substructure and
alleviates the confounding effect of shortcut substructures that are spuriously
correlated to chemical reactions. Extensive experiments on various tasks with
real-world and synthetic datasets demonstrate the superiority of CMRL over
state-of-the-art baseline models. Our code is available at
https://github.com/Namkyeong/CMRL.
Related papers
- Learning Cell-Aware Hierarchical Multi-Modal Representations for Robust Molecular Modeling [74.25438319700929]
We propose CHMR (Cell-aware Hierarchical Multi-modal Representations), a robust framework that models local-global dependencies between molecules and cellular responses.<n> evaluated on nine public benchmarks spanning 728 tasks, CHMR outperforms state-of-the-art baselines.<n>Results demonstrate the advantage of hierarchy-aware, multimodal learning for reliable and biologically grounded molecular representations.
arXiv Detail & Related papers (2025-11-26T07:15:00Z) - Mamba-driven multi-perspective structural understanding for molecular ground-state conformation prediction [69.32436472760712]
We propose an approach of Mamba-driven multi-perspective structural understanding (MPSU-Mamba) to localize molecular ground-state conformation.<n>For complex and diverse molecules, three different kinds of dedicated scanning strategies are explored to construct a comprehensive perception of corresponding molecular structures.<n> Experimental results on QM9 and Molecule3D datasets indicate that MPSU-Mamba significantly outperforms existing methods.
arXiv Detail & Related papers (2025-11-10T11:18:32Z) - Root Cause Analysis of Hydrogen Bond Separation in Spatio-Temporal Molecular Dynamics using Causal Models [12.86220226072]
A critical research gap lies in identifying underlying causes of hydrogen bond formation and separation.<n>We propose leveraging data analytics and machine learning models to enhance the detection of these phenomena.
arXiv Detail & Related papers (2025-08-17T21:23:12Z) - UniIF: Unified Molecule Inverse Folding [67.60267592514381]
We propose a unified model UniIF for inverse folding of all molecules.
Our proposed method surpasses state-of-the-art methods on all tasks.
arXiv Detail & Related papers (2024-05-29T10:26:16Z) - MolTC: Towards Molecular Relational Modeling In Language Models [28.960416816491392]
We propose a novel framework for Molecular inTeraction prediction following Chain-of-Thought (CoT) theory termed MolTC.
Our experiments, conducted across various datasets involving over 4,000,000 molecular pairs, exhibit the superiority of our method over current GNN and LLM-based baselines.
arXiv Detail & Related papers (2024-02-06T07:51:56Z) - Targeted Reduction of Causal Models [55.11778726095353]
Causal Representation Learning offers a promising avenue to uncover interpretable causal patterns in simulations.
We introduce Targeted Causal Reduction (TCR), a method for condensing complex intervenable models into a concise set of causal factors.
Its ability to generate interpretable high-level explanations from complex models is demonstrated on toy and mechanical systems.
arXiv Detail & Related papers (2023-11-30T15:46:22Z) - From molecules to scaffolds to functional groups: building context-dependent molecular representation via multi-channel learning [10.025809630976065]
This paper introduces a novel pre-training framework that learns robust and generalizable chemical knowledge.
Our approach demonstrates competitive performance across various molecular property benchmarks.
arXiv Detail & Related papers (2023-11-05T23:47:52Z) - Towards Predicting Equilibrium Distributions for Molecular Systems with
Deep Learning [60.02391969049972]
We introduce a novel deep learning framework, called Distributional Graphormer (DiG), in an attempt to predict the equilibrium distribution of molecular systems.
DiG employs deep neural networks to transform a simple distribution towards the equilibrium distribution, conditioned on a descriptor of a molecular system.
arXiv Detail & Related papers (2023-06-08T17:12:08Z) - Bi-level Contrastive Learning for Knowledge-Enhanced Molecule
Representations [55.42602325017405]
We propose a novel method called GODE, which takes into account the two-level structure of individual molecules.
By pre-training two graph neural networks (GNNs) on different graph structures, combined with contrastive learning, GODE fuses molecular structures with their corresponding knowledge graph substructures.
When fine-tuned across 11 chemical property tasks, our model outperforms existing benchmarks, registering an average ROC-AUC uplift of 13.8% for classification tasks and an average RMSE/MAE enhancement of 35.1% for regression tasks.
arXiv Detail & Related papers (2023-06-02T15:49:45Z) - Conditional Graph Information Bottleneck for Molecular Relational
Learning [9.56625683182106]
We propose a novel relational learning framework, CGIB, that predicts the interaction behavior between a pair of graphs by detecting core subgraphs therein.
Our proposed method mimics the nature of chemical reactions, i.e., the core substructure of a molecule varies depending on which other molecule it interacts with.
arXiv Detail & Related papers (2023-04-29T01:17:43Z) - Implicit Geometry and Interaction Embeddings Improve Few-Shot Molecular
Property Prediction [53.06671763877109]
We develop molecular embeddings that encode complex molecular characteristics to improve the performance of few-shot molecular property prediction.
Our approach leverages large amounts of synthetic data, namely the results of molecular docking calculations.
On multiple molecular property prediction benchmarks, training from the embedding space substantially improves Multi-Task, MAML, and Prototypical Network few-shot learning performance.
arXiv Detail & Related papers (2023-02-04T01:32:40Z) - Discovery of structure-property relations for molecules via
hypothesis-driven active learning over the chemical space [0.0]
We introduce a novel approach for the active learning over the chemical spaces based on hypothesis learning.
We construct the hypotheses on the possible relationships between structures and functionalities of interest based on a small subset of data.
This approach combines the elements from the symbolic regression methods such as SISSO and active learning into a single framework.
arXiv Detail & Related papers (2023-01-06T14:22:43Z) - Semi-Supervised GCN for learning Molecular Structure-Activity
Relationships [4.468952886990851]
We propose to train graph-to-graph neural network using semi-supervised learning for attributing structure-property relationships.
As final goal, our approach could represent a valuable tool to deal with problems such as activity cliffs, lead optimization and de-novo drug design.
arXiv Detail & Related papers (2022-01-25T09:09:43Z) - Chemical-Reaction-Aware Molecule Representation Learning [88.79052749877334]
We propose using chemical reactions to assist learning molecule representation.
Our approach is proven effective to 1) keep the embedding space well-organized and 2) improve the generalization ability of molecule embeddings.
Experimental results demonstrate that our method achieves state-of-the-art performance in a variety of downstream tasks.
arXiv Detail & Related papers (2021-09-21T00:08:43Z) - ASGN: An Active Semi-supervised Graph Neural Network for Molecular
Property Prediction [61.33144688400446]
We propose a novel framework called Active Semi-supervised Graph Neural Network (ASGN) by incorporating both labeled and unlabeled molecules.
In the teacher model, we propose a novel semi-supervised learning method to learn general representation that jointly exploits information from molecular structure and molecular distribution.
At last, we proposed a novel active learning strategy in terms of molecular diversities to select informative data during the whole framework learning.
arXiv Detail & Related papers (2020-07-07T04:22:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.