Unifying Molecular and Textual Representations via Multi-task Language
Modelling
- URL: http://arxiv.org/abs/2301.12586v2
- Date: Thu, 18 May 2023 00:37:00 GMT
- Title: Unifying Molecular and Textual Representations via Multi-task Language
Modelling
- Authors: Dimitrios Christofidellis, Giorgio Giannone, Jannis Born, Ole Winther,
Teodoro Laino, Matteo Manica
- Abstract summary: We propose the first multi-domain, multi-task language model that can solve a wide range of tasks in both the chemical and natural language domains.
Our model can handle chemical and natural language concurrently, without requiring expensive pre-training on single domains or task-specific models.
Our work suggests that such models can robustly and efficiently accelerate discovery in physical sciences.
- Score: 11.474894472719543
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent advances in neural language models have also been successfully
applied to the field of chemistry, offering generative solutions for classical
problems in molecular design and synthesis planning. These new methods have the
potential to fuel a new era of data-driven automation in scientific discovery.
However, specialized models are still typically required for each task, leading
to the need for problem-specific fine-tuning and neglecting task
interrelations. The main obstacle in this field is the lack of a unified
representation between natural language and chemical representations,
complicating and limiting human-machine interaction. Here, we propose the first
multi-domain, multi-task language model that can solve a wide range of tasks in
both the chemical and natural language domains. Our model can handle chemical
and natural language concurrently, without requiring expensive pre-training on
single domains or task-specific models. Interestingly, sharing weights across
domains remarkably improves our model when benchmarked against state-of-the-art
baselines on single-domain and cross-domain tasks. In particular, sharing
information across domains and tasks gives rise to large improvements in
cross-domain tasks, the magnitude of which increases with scale, as measured by
more than a dozen relevant metrics. Our work suggests that such models can
robustly and efficiently accelerate discovery in physical sciences by
superseding problem-specific fine-tuning and enhancing human-model
interactions.
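As a concrete, hedged sketch of the multi-task setup described in the abstract, the snippet below shows how a single T5-style sequence-to-sequence checkpoint could be prompted with task prefixes to translate between molecular (SMILES) and natural language representations. The checkpoint name, task prefixes, and helper function are illustrative assumptions, not the exact interface released with the paper.

```python
# Minimal sketch of multi-task prompting for a unified text/chemistry
# seq2seq model (T5-style). The checkpoint name and task prefixes are
# illustrative placeholders, not the paper's exact identifiers.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "your-org/multitask-text-chem-t5"  # hypothetical checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def run_task(prefix: str, payload: str, max_new_tokens: int = 128) -> str:
    """Prepend a task prefix so one shared model can route between
    single-domain and cross-domain tasks."""
    inputs = tokenizer(f"{prefix}: {payload}", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Cross-domain examples (prefixes are assumptions for illustration):
caption = run_task("caption molecule", "CC(=O)Oc1ccccc1C(=O)O")      # SMILES -> text
smiles = run_task("generate molecule", "an aspirin-like analgesic")  # text -> SMILES
```

The same model and tokenizer serve both directions; only the task prefix changes, which is what weight sharing across domains and tasks relies on.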
Related papers
- Exploring the Benefits of Domain-Pretraining of Generative Large Language Models for Chemistry [5.4665365335928024]
We investigate the trade-offs of leveraging off-the-shelf versus more targeted foundation models for scientific domains.
In this work, we examine the benefits of in-domain pre-training for a given scientific domain, chemistry, and compare these to open-source, off-the-shelf models with zero-shot and few-shot prompting.
Our results show not only that in-domain base models perform reasonably well on in-domain tasks in a zero-shot setting, but also that further adaptation via instruction fine-tuning yields impressive performance on chemistry-specific tasks.
arXiv Detail & Related papers (2024-11-05T22:45:10Z) - LICO: Large Language Models for In-Context Molecular Optimization [33.5918976228562]
We introduce LICO, a general-purpose model that extends arbitrary base LLMs for black-box optimization.
We train the model to perform in-context predictions on a diverse set of functions defined over the domain.
Once trained, LICO can generalize to unseen molecule properties simply via in-context prompting (see the hedged sketch after this list).
arXiv Detail & Related papers (2024-06-27T02:43:18Z) - Scalable Language Model with Generalized Continual Learning [58.700439919096155]
The Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z) - nach0: Multimodal Natural and Chemical Languages Foundation Model [7.815497069231599]
This paper introduces a new foundation model, nach0, capable of solving various chemical and biological tasks.
nach0 is a multi-domain and multi-task encoder-decoder LLM pre-trained on unlabeled text from scientific literature, patents, and molecule strings.
arXiv Detail & Related papers (2023-11-21T07:56:30Z) - MechAgents: Large language model multi-agent collaborations can solve
mechanics problems, generate new data, and integrate knowledge [0.6708125191843434]
A set of AI agents can solve mechanics tasks, here demonstrated for elasticity problems, via autonomous collaborations.
A two-agent team can effectively write, execute, and self-correct code in order to apply finite element methods to classical elasticity problems.
For more complex tasks, we construct a larger group of agents with enhanced division of labor among planning, formulating, coding, executing and criticizing the process and results.
arXiv Detail & Related papers (2023-11-14T13:49:03Z) - Solving Quantitative Reasoning Problems with Language Models [53.53969870599973]
We introduce Minerva, a large language model pretrained on general natural language data and further trained on technical content.
The model achieves state-of-the-art performance on technical benchmarks without the use of external tools.
We also evaluate our model on over two hundred undergraduate-level problems in physics, biology, chemistry, economics, and other sciences.
arXiv Detail & Related papers (2022-06-29T18:54:49Z) - Sparse*BERT: Sparse Models Generalize To New tasks and Domains [79.42527716035879]
This paper studies how models pruned using Gradual Unstructured Magnitude Pruning can transfer between domains and tasks.
We demonstrate that our general sparse model Sparse*BERT can become SparseBioBERT simply by pretraining the compressed architecture on unstructured biomedical text.
arXiv Detail & Related papers (2022-05-25T02:51:12Z) - Set-based Meta-Interpolation for Few-Task Meta-Learning [79.4236527774689]
We propose a novel domain-agnostic task augmentation method, Meta-Interpolation, to densify the meta-training task distribution.
We empirically validate the efficacy of Meta-Interpolation on eight datasets spanning across various domains.
arXiv Detail & Related papers (2022-05-20T06:53:03Z) - High-Modality Multimodal Transformer: Quantifying Modality & Interaction
Heterogeneity for High-Modality Representation Learning [112.51498431119616]
This paper studies efficient representation learning for high-modality scenarios involving a large set of diverse modalities.
A single model, HighMMT, scales up to 10 modalities (text, image, audio, video, sensors, proprioception, speech, time-series, sets, and tables) and 15 tasks from 5 research areas.
arXiv Detail & Related papers (2022-03-02T18:56:20Z) - Reprogramming Language Models for Molecular Representation Learning [65.00999660425731]
We propose Representation Reprogramming via Dictionary Learning (R2DL) for adversarially reprogramming pretrained language models for molecular learning tasks.
The adversarial program learns a linear transformation between a dense source model input space (language data) and a sparse target model input space (e.g., chemical and biological molecule data) using a k-SVD solver.
R2DL matches the baseline established by state-of-the-art toxicity prediction models trained on domain-specific data and outperforms that baseline in the limited training-data setting.
arXiv Detail & Related papers (2020-12-07T05:50:27Z) - CALM: Continuous Adaptive Learning for Language Modeling [18.72860206714457]
Training large language representation models has become a standard in the natural language processing community.
We demonstrate that in practice these pre-trained models present performance deterioration in the form of catastrophic forgetting.
We propose CALM, Continuous Adaptive Learning for Language Modeling: techniques to render models which retain knowledge across multiple domains.
arXiv Detail & Related papers (2020-04-08T03:51:17Z)
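As a hedged illustration of the in-context prompting idea from the LICO entry above, the sketch below serializes a few (SMILES, property value) observations into a plain-text prompt and asks a base LLM to complete the value for a new candidate molecule. The prompt format and helper names are assumptions for illustration, not the exact templates from the LICO paper.

```python
# Hedged sketch of in-context prompting for an unseen molecular property:
# a few (SMILES, value) observations go into the context, and the model is
# asked to complete the value for a new candidate. Illustrative only.
from typing import List, Tuple

def build_icl_prompt(observations: List[Tuple[str, float]], query_smiles: str) -> str:
    lines = ["Predict the property value for the final molecule."]
    for smiles, value in observations:
        lines.append(f"molecule: {smiles} -> value: {value:.3f}")
    lines.append(f"molecule: {query_smiles} -> value:")
    return "\n".join(lines)

observations = [
    ("CCO", 0.42),                       # toy (SMILES, property) pairs
    ("CC(=O)Oc1ccccc1C(=O)O", 0.91),
]
prompt = build_icl_prompt(observations, query_smiles="c1ccccc1O")
print(prompt)  # feed this prompt to any base LLM's completion endpoint
```

Any base LLM completion endpoint could consume the resulting prompt; the unseen property is specified only through the in-context examples.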
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.