Related papers: In-Context Learning of Physical Properties: Few-Shot Adaptation to Out-of-Distribution Molecular Graphs

In-Context Learning of Physical Properties: Few-Shot Adaptation to Out-of-Distribution Molecular Graphs

URL: http://arxiv.org/abs/2406.01808v1
Date: Mon, 3 Jun 2024 21:59:21 GMT
Title: In-Context Learning of Physical Properties: Few-Shot Adaptation to Out-of-Distribution Molecular Graphs
Authors: Grzegorz Kaszuba, Amirhossein D. Naghdi, Dario Massa, Stefanos Papanikolaou, Andrzej Jaszkiewicz, Piotr Sankowski,
Abstract summary: In-context learning allows for performing nontrivial machine learning tasks during inference only. In this work, we address the question: can we leverage in-context learning to predict out-of-distribution materials properties? We employ a compound model in which GPT-2 acts on the output of geometry-aware graph neural networks to adapt in-context information.
Score: 1.8635507597668244
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models manifest the ability of few-shot adaptation to a sequence of provided examples. This behavior, known as in-context learning, allows for performing nontrivial machine learning tasks during inference only. In this work, we address the question: can we leverage in-context learning to predict out-of-distribution materials properties? However, this would not be possible for structure property prediction tasks unless an effective method is found to pass atomic-level geometric features to the transformer model. To address this problem, we employ a compound model in which GPT-2 acts on the output of geometry-aware graph neural networks to adapt in-context information. To demonstrate our model's capabilities, we partition the QM9 dataset into sequences of molecules that share a common substructure and use them for in-context learning. This approach significantly improves the performance of the model on out-of-distribution examples, surpassing the one of general graph neural network models.

Related papers

An unified approach to link prediction in collaboration networks [0.0]
This article investigates and compares three approaches to link prediction in colaboration networks. The ERGM is employed to capture general structural patterns within the network. The GCN and Word2Vec+MLP models leverage deep learning techniques to learn adaptive structural representations of nodes and their relationships.
arXiv Detail & Related papers (2024-11-01T22:40:39Z)
Transferable Post-training via Inverse Value Learning [83.75002867411263]
We propose modeling changes at the logits level during post-training using a separate neural network (i.e., the value network) After training this network on a small base model using demonstrations, this network can be seamlessly integrated with other pre-trained models during inference. We demonstrate that the resulting value network has broad transferability across pre-trained models of different parameter sizes.
arXiv Detail & Related papers (2024-10-28T13:48:43Z)
Discovering interpretable elastoplasticity models via the neural polynomial method enabled symbolic regressions [0.0]
Conventional neural network elastoplasticity models are often perceived as lacking interpretability. This paper introduces a two-step machine learning approach that returns mathematical models interpretable by human experts.
arXiv Detail & Related papers (2023-07-24T22:22:32Z)
FAENet: Frame Averaging Equivariant GNN for Materials Modeling [123.19473575281357]
We introduce a flexible framework relying on frameaveraging (SFA) to make any model E(3)-equivariant or invariant through data transformations. We prove the validity of our method theoretically and empirically demonstrate its superior accuracy and computational scalability in materials modeling.
arXiv Detail & Related papers (2023-04-28T21:48:31Z)
Robust Graph Representation Learning via Predictive Coding [46.22695915912123]
Predictive coding is a message-passing framework initially developed to model information processing in the brain. In this work, we build models that rely on the message-passing rule of predictive coding. We show that the proposed models are comparable to standard ones in terms of performance in both inductive and transductive tasks.
arXiv Detail & Related papers (2022-12-09T03:58:22Z)
Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems. Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored. We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z)
Curvature-informed multi-task learning for graph networks [56.155331323304]
State-of-the-art graph neural networks attempt to predict multiple properties simultaneously. We investigate a potential explanation for this phenomenon: the curvature of each property's loss surface significantly varies, leading to inefficient learning.
arXiv Detail & Related papers (2022-08-02T18:18:41Z)
A Physics-Guided Neural Operator Learning Approach to Model Biological Tissues from Digital Image Correlation Measurements [3.65211252467094]
We present a data-driven correlation to biological tissue modeling, which aims to predict the displacement field based on digital image correlation (DIC) measurements under unseen loading scenarios. A material database is constructed from the DIC displacement tracking measurements of multiple biaxial stretching protocols on a porcine tricuspid valve leaflet. The material response is modeled as a solution operator from the loading to the resultant displacement field, with the material properties learned implicitly from the data and naturally embedded in the network parameters.
arXiv Detail & Related papers (2022-04-01T04:56:41Z)
Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning. Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
Model-agnostic multi-objective approach for the evolutionary discovery of mathematical models [55.41644538483948]
In modern data science, it is more interesting to understand the properties of the model, which parts could be replaced to obtain better results. We use multi-objective evolutionary optimization for composite data-driven model learning to obtain the algorithm's desired properties.
arXiv Detail & Related papers (2021-07-07T11:17:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.