CAST: Cross Attention based multimodal fusion of Structure and Text for materials property prediction
- URL: http://arxiv.org/abs/2502.06836v1
- Date: Thu, 06 Feb 2025 02:29:39 GMT
- Title: CAST: Cross Attention based multimodal fusion of Structure and Text for materials property prediction
- Authors: Jaewan Lee, Changyoung Park, Hongjun Yang, Sungbin Lim, Sehui Han
- Abstract summary: Graph neural networks (GNNs) stand out due to their ability to represent crystal structures as graphs.
These methods often lose critical global information, such as crystal systems and repetitive unit connectivity.
We propose CAST, a cross-attention-based multimodal fusion model that integrates graph and text modalities to preserve essential material information.
- Abstract: Recent advancements in AI have revolutionized property prediction in materials science and accelerated material discovery. Graph neural networks (GNNs) stand out due to their ability to represent crystal structures as graphs, effectively capturing local interactions and delivering superior predictions. However, these methods often lose critical global information, such as crystal systems and repetitive unit connectivity. To address this, we propose CAST, a cross-attention-based multimodal fusion model that integrates graph and text modalities to preserve essential material information. CAST combines node- and token-level features using cross-attention mechanisms, surpassing previous approaches reliant on material-level embeddings like graph mean-pooling or [CLS] tokens. A masked node prediction pretraining strategy further enhances atomic-level information integration. Our method achieved up to 22.9% improvement in property prediction across four crystal properties, including band gap, compared to methods like CrysMMNet and MultiMat. Pretraining was key to aligning node and text embeddings, with attention maps confirming its effectiveness in capturing relationships between nodes and tokens. This study highlights the potential of multimodal learning in materials science, paving the way for more robust predictive models that incorporate both local and global information.
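The fusion step described in the abstract can be sketched as scaled dot-product cross-attention in which each graph node (query) attends over all text tokens (keys/values). This is a minimal NumPy sketch under stated assumptions, not the authors' implementation: CAST additionally uses learned query/key/value projections, multiple heads, and the masked node prediction pretraining stage, all omitted here.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    """Scaled dot-product cross-attention: each node embedding (query)
    attends over all token embeddings (keys/values)."""
    d_k = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d_k)  # (n_nodes, n_tokens)
    attn = softmax(scores, axis=-1)                  # each row sums to 1
    fused = attn @ keys_values                       # (n_nodes, d)
    return fused, attn

rng = np.random.default_rng(0)
node_emb = rng.normal(size=(4, 8))   # 4 atoms in a crystal graph, 8-dim features
token_emb = rng.normal(size=(6, 8))  # 6 tokens from a text description
fused, attn = cross_attention(node_emb, token_emb)
print(fused.shape, attn.shape)       # (4, 8) (4, 6)
```

The attention map `attn` is the node-token alignment the abstract refers to: row i shows how strongly atom i attends to each text token, and `fused` carries token-level text information back into the node representations before property prediction.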
Related papers
- Characterizing Massive Activations of Attention Mechanism in Graph Neural Networks [0.9499648210774584]
Recently, attention mechanisms have been integrated into Graph Neural Networks (GNNs) to improve their ability to capture complex patterns.
This paper presents the first comprehensive study revealing the emergence of Massive Activations (MAs) within attention layers.
Our study assesses various GNN models using benchmark datasets, including ZINC, TOX21, and PROTEINS.
arXiv Detail & Related papers (2024-09-05T12:19:07Z)
- Cross-Modal Learning for Chemistry Property Prediction: Large Language Models Meet Graph Machine Learning [0.0]
We introduce a Multi-Modal Fusion (MMF) framework that harnesses the analytical prowess of Graph Neural Networks (GNNs) and the linguistic generative and predictive abilities of Large Language Models (LLMs)
Our framework combines the effectiveness of GNNs in modeling graph-structured data with the zero-shot and few-shot learning capabilities of LLMs, enabling improved predictions while reducing the risk of overfitting.
arXiv Detail & Related papers (2024-08-27T11:10:39Z)
- Enhancing material property prediction with ensemble deep graph convolutional networks [9.470117608423957]
Recent efforts have focused on employing advanced ML algorithms, including deep learning-based graph neural networks, for property prediction.
Our research provides an in-depth evaluation of ensemble strategies for deep learning-based graph neural networks, specifically targeting material property prediction tasks.
By testing the Crystal Graph Convolutional Neural Network (CGCNN) and its multitask version, MT-CGCNN, we demonstrated that ensemble techniques, especially prediction averaging, substantially improve precision beyond traditional metrics.
arXiv Detail & Related papers (2024-07-26T16:12:06Z)
- Benchmark on Drug Target Interaction Modeling from a Structure Perspective [48.60648369785105]
Drug-target interaction prediction is crucial to drug discovery and design.
Recent methods, such as those based on graph neural networks (GNNs) and Transformers, demonstrate exceptional performance across various datasets.
We conduct a comprehensive survey and benchmark for drug-target interaction modeling from a structure perspective, via integrating tens of explicit (i.e., GNN-based) and implicit (i.e., Transformer-based) structure learning algorithms.
arXiv Detail & Related papers (2024-07-04T16:56:59Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- Probabilistic Graph Attention Network with Conditional Kernels for Pixel-Wise Prediction [158.88345945211185]
We present a novel approach that advances the state of the art on pixel-level prediction in a fundamental aspect, i.e. structured multi-scale features learning and fusion.
We propose a probabilistic graph attention network structure based on a novel Attention-Gated Conditional Random Fields (AG-CRFs) model for learning and fusing multi-scale representations in a principled manner.
arXiv Detail & Related papers (2021-01-08T04:14:29Z)
- Distance-aware Molecule Graph Attention Network for Drug-Target Binding Affinity Prediction [54.93890176891602]
We propose a diStance-aware Molecule graph Attention Network (S-MAN) tailored to drug-target binding affinity prediction.
As a dedicated solution, we first propose a position encoding mechanism to integrate the topological structure and spatial position information into the constructed pocket-ligand graph.
We also propose a novel edge-node hierarchical attentive aggregation structure which has edge-level aggregation and node-level aggregation.
arXiv Detail & Related papers (2020-12-17T17:44:01Z)
- Neural Networks Enhancement with Logical Knowledge [83.9217787335878]
We propose an extension of KENN for relational data.
The results show that KENN can improve the performance of the underlying neural network even in the presence of relational data.
arXiv Detail & Related papers (2020-09-13T21:12:20Z)
- Graph Neural Network for Hamiltonian-Based Material Property Prediction [56.94118357003096]
We present and compare several different graph convolution networks that are able to predict the band gap for inorganic materials.
The models are developed to incorporate two different features: the information of each orbital itself and the interactions between orbitals.
The results show that our model achieves promising prediction accuracy under cross-validation.
arXiv Detail & Related papers (2020-05-27T13:32:10Z)
- Global Attention based Graph Convolutional Neural Networks for Improved Materials Property Prediction [8.371766047183739]
We develop a novel model, GATGNN, for predicting inorganic material properties based on graph neural networks.
We show that our method is able to both outperform the previous models' predictions and provide insight into the crystallization of the material.
arXiv Detail & Related papers (2020-03-11T07:43:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.