ProtAgents: Protein discovery via large language model multi-agent
collaborations combining physics and machine learning
- URL: http://arxiv.org/abs/2402.04268v1
- Date: Sat, 27 Jan 2024 20:19:49 GMT
- Title: ProtAgents: Protein discovery via large language model multi-agent
collaborations combining physics and machine learning
- Authors: A. Ghafarollahi, M.J. Buehler
- Abstract summary: ProtAgents is a platform for de novo protein design based on Large Language Models (LLMs).
Multiple AI agents with distinct capabilities collaboratively address complex tasks within a dynamic environment.
The flexibility in designing the agents, on the one hand, and their capacity for autonomous collaboration through the dynamic LLM-based multi-agent environment, on the other, unleash great potential.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Designing de novo proteins beyond those found in nature holds significant
promise for advancements in both scientific and engineering applications.
Current methodologies for protein design often rely on AI-based models, such as
surrogate models that address end-to-end problems by linking protein structure
to material properties or vice versa. However, these models frequently focus
on specific material objectives or structural properties, which limits their
flexibility when out-of-domain knowledge must be incorporated into the design
process or when comprehensive data analysis is required. In this study, we
introduce
ProtAgents, a platform for de novo protein design based on Large Language
Models (LLMs), where multiple AI agents with distinct capabilities
collaboratively address complex tasks within a dynamic environment. The
versatility in agent development allows for expertise in diverse domains,
including knowledge retrieval, protein structure analysis, physics-based
simulations, and results analysis. The dynamic collaboration between agents,
empowered by LLMs, provides a versatile approach to tackling protein design and
analysis problems, as demonstrated through diverse examples in this study. The
problems of interest encompass designing new proteins, analyzing protein
structures and obtaining new first-principles data -- natural vibrational
frequencies -- via physics simulations. The concerted effort of the system
allows for powerful automated and synergistic design of de novo proteins with
targeted mechanical properties. The flexibility in designing the agents, on
the one hand, and their capacity for autonomous collaboration through the
dynamic LLM-based multi-agent environment, on the other, unleash the great
potential of LLMs in addressing multi-objective materials problems and open
up new avenues for autonomous materials discovery and design.
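
The agent orchestration described in the abstract can be made concrete with a minimal sketch. Everything below is illustrative, not the ProtAgents implementation: the Agent class, the stub_llm stand-in, the placeholder tools, and the fixed-round loop are assumptions chosen only to show the collaboration pattern (role-specialized agents taking turns on a shared design task).

    # Minimal sketch of an LLM multi-agent loop for protein design.
    # All names (Agent, stub_llm, the placeholder tools) are illustrative;
    # ProtAgents itself couples real LLM backends with physics tools.
    from dataclasses import dataclass
    from typing import Callable

    def stub_llm(prompt: str) -> str:
        """Stand-in for a chat-completion call; replace with a real client."""
        return f"[model response to: {prompt[:40]}...]"

    @dataclass
    class Agent:
        name: str
        role: str                   # e.g. "knowledge retrieval", "physics simulation"
        tool: Callable[[str], str]  # domain-specific function the agent can run

        def act(self, task: str) -> str:
            # Each agent frames the shared task from its own role and may
            # invoke its tool (structure analysis, simulation, ...).
            thought = stub_llm(f"You are the {self.role} agent. Task: {task}")
            return f"{self.name}: {thought} | tool says: {self.tool(task)}"

    def retrieve(task: str) -> str:
        return "3 related de novo designs found"     # placeholder knowledge lookup

    def simulate(task: str) -> str:
        return "lowest natural frequency ~ 4.2 THz"  # placeholder physics result

    agents = [
        Agent("Retriever", "knowledge retrieval", retrieve),
        Agent("Physicist", "physics simulation", simulate),
    ]

    task = "design a de novo protein with a target natural frequency"
    transcript = []
    for _round in range(2):          # fixed rounds for simplicity; a planner
        for agent in agents:         # agent could instead decide who speaks next
            transcript.append(agent.act(task))
    print("\n".join(transcript))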
Related papers
- Long-context Protein Language Model [76.95505296417866]
Self-supervised training of language models (LMs) has seen great success for protein sequences in learning meaningful representations and for generative drug design.
Most protein LMs are based on the Transformer architecture trained on individual proteins with short context lengths.
We propose LC-PLM, based on an alternative protein LM architecture, BiMamba-S, built on selective structured state-space models.
We also introduce its graph-contextual variant, LC-PLM-G, which contextualizes the model within protein-protein interaction graphs in a second training stage.
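As a rough illustration of the bidirectional state-space idea (not BiMamba-S itself, which is a selective SSM with a far richer parameterization), a toy diagonal SSM scanned in both directions might look like this; all module names, shapes, and the sequential scan are assumptions:

    # Toy bidirectional diagonal state-space layer (illustrative only).
    import torch
    import torch.nn as nn

    class DiagonalSSM(nn.Module):
        """h_t = a * h_{t-1} + b * x_t ;  y_t = c * h_t   (per channel)."""
        def __init__(self, dim: int):
            super().__init__()
            self.log_a = nn.Parameter(torch.zeros(dim))  # decay in (0,1) via sigmoid
            self.b = nn.Parameter(torch.ones(dim))
            self.c = nn.Parameter(torch.ones(dim))

        def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, time, dim)
            a = torch.sigmoid(self.log_a)
            h = torch.zeros_like(x[:, 0])
            ys = []
            for t in range(x.size(1)):          # explicit scan for clarity
                h = a * h + self.b * x[:, t]
                ys.append(self.c * h)
            return torch.stack(ys, dim=1)

    class BiSSM(nn.Module):
        """Run the scan forward and backward and sum, for bidirectional context."""
        def __init__(self, dim: int):
            super().__init__()
            self.fwd, self.bwd = DiagonalSSM(dim), DiagonalSSM(dim)

        def forward(self, x):
            return self.fwd(x) + self.bwd(x.flip(1)).flip(1)

    y = BiSSM(8)(torch.randn(2, 16, 8))  # (batch=2, residues=16, channels=8)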
arXiv Detail & Related papers (2024-10-29T16:43:28Z) - ProteinBench: A Holistic Evaluation of Protein Foundation Models [53.59325047872512]
We introduce ProteinBench, a holistic evaluation framework for protein foundation models.
Our approach consists of three key components: (i) A taxonomic classification of tasks that broadly encompass the main challenges in the protein domain, based on the relationships between different protein modalities; (ii) A multi-metric evaluation approach that assesses performance across four key dimensions: quality, novelty, diversity, and robustness; and (iii) In-depth analyses from various user objectives, providing a holistic view of model performance.
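The novelty and diversity axes can be made concrete with simple embedding-space proxies. The functions below are generic illustrations under that assumption, not ProteinBench's actual scoring functions (its quality and robustness metrics are task-specific):

    # Illustrative embedding-space proxies for novelty and diversity.
    import numpy as np

    def novelty(generated: np.ndarray, reference: np.ndarray) -> float:
        """Mean distance from each generated embedding to its nearest reference."""
        d = np.linalg.norm(generated[:, None, :] - reference[None, :, :], axis=-1)
        return float(d.min(axis=1).mean())

    def diversity(generated: np.ndarray) -> float:
        """Mean pairwise distance among generated embeddings."""
        d = np.linalg.norm(generated[:, None, :] - generated[None, :, :], axis=-1)
        n = len(generated)
        return float(d.sum() / (n * (n - 1)))  # zero diagonal excluded by the count

    gen = np.random.randn(10, 32)  # stand-ins for model-derived protein embeddings
    ref = np.random.randn(50, 32)
    print(novelty(gen, ref), diversity(gen))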
arXiv Detail & Related papers (2024-09-10T06:52:33Z) - AtomAgents: Alloy design and discovery through physics-aware multi-modal multi-agent artificial intelligence [0.0]
The proposed physics-aware generative AI platform, AtomAgents, synergizes the intelligence of large language models (LLMs) with the precision of physics simulations.
Our results enable accurate prediction of key characteristics across alloys and highlight the crucial role of solid solution alloying to steer the development of advanced metallic alloys.
arXiv Detail & Related papers (2024-07-13T22:46:02Z) - ProteinEngine: Empower LLM with Domain Knowledge for Protein Engineering [5.474946062328154]
ProteinEngine is a human-centered platform aimed at amplifying the capabilities of large language models in protein engineering.
Uniquely, ProteinEngine assigns three distinct roles to LLMs, facilitating efficient task delegation, specialized task resolution, and effective communication of results.
Our findings highlight the potential of ProteinEngine to bridge the disconnected tools for future research in the protein engineering domain.
arXiv Detail & Related papers (2024-04-21T01:07:33Z) - X-LoRA: Mixture of Low-Rank Adapter Experts, a Flexible Framework for Large Language Models with Applications in Protein Mechanics and Molecular Design [0.0]
We report a mixture-of-experts strategy to create fine-tuned large language models using a deep layer-wise token-level approach based on low-rank adaptation (LoRA).
The design is inspired by the biological principles of universality and diversity, where neural network building blocks are reused in different hierarchical manifestations.
We develop a tailored X-LoRA model that offers scientific capabilities including forward/inverse analysis tasks and enhanced reasoning capability, focused on biomaterial analysis, protein mechanics and design.
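The core mechanism, token-level gating over several low-rank adapters at every layer, can be sketched roughly as follows; the gating network, tensor shapes, and initialization are illustrative assumptions, not the released X-LoRA code:

    # Sketch of token-level mixing over multiple LoRA adapters (the X-LoRA idea).
    import torch
    import torch.nn as nn

    class XLoRALinear(nn.Module):
        def __init__(self, d_in: int, d_out: int, n_experts: int, rank: int = 8):
            super().__init__()
            self.base = nn.Linear(d_in, d_out)       # frozen pretrained layer
            for p in self.base.parameters():
                p.requires_grad_(False)
            self.A = nn.Parameter(torch.randn(n_experts, rank, d_in) * 0.01)
            self.B = nn.Parameter(torch.zeros(n_experts, d_out, rank))
            self.gate = nn.Linear(d_in, n_experts)   # per-token expert scores

        def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, d_in)
            w = torch.softmax(self.gate(x), dim=-1)           # (b, s, experts)
            # Per-expert low-rank update: B_e @ (A_e @ x)
            lora = torch.einsum("bsd,erd->bser", x, self.A)
            lora = torch.einsum("bser,eor->bseo", lora, self.B)
            return self.base(x) + torch.einsum("bse,bseo->bso", w, lora)

    out = XLoRALinear(64, 64, n_experts=4)(torch.randn(2, 10, 64))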
arXiv Detail & Related papers (2024-02-11T10:23:34Z) - When large language models meet evolutionary algorithms [48.213640761641926]
Pre-trained large language models (LLMs) have powerful capabilities for generating creative natural text.
Evolutionary algorithms (EAs) can discover diverse solutions to complex real-world problems.
Motivated by the common collective and directionality of text generation and evolution, this paper illustrates the parallels between LLMs and EAs.
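One common way to realize that parallel is to use an LLM as the variation operator inside a standard evolutionary loop. In the sketch below the LLM call and the fitness function are stubs (the mutation is random rather than model-proposed), so it shows only the loop structure:

    # Evolutionary loop with an LLM as the variation operator; both the
    # llm_mutate stand-in and the toy fitness are assumptions, not any
    # specific method from the paper.
    import random

    def llm_mutate(parent: str) -> str:
        """Stand-in for prompting an LLM to propose a variant of `parent`."""
        pos = random.randrange(len(parent))
        return parent[:pos] + random.choice("ACDEFGHIKLMNPQRSTVWY") + parent[pos + 1:]

    def fitness(seq: str) -> float:
        """Toy objective: fraction of alanines (replace with a property model)."""
        return seq.count("A") / len(seq)

    population = ["MKTAYIAKQR" for _ in range(8)]
    for generation in range(20):
        offspring = [llm_mutate(random.choice(population)) for _ in range(8)]
        # (mu + lambda)-style survivor selection: keep the best 8 overall
        population = sorted(population + offspring, key=fitness, reverse=True)[:8]

    best = max(population, key=fitness)
    print(best, fitness(best))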
arXiv Detail & Related papers (2024-01-19T05:58:30Z) - Efficiently Predicting Protein Stability Changes Upon Single-point
Mutation with Large Language Models [51.57843608615827]
The ability to precisely predict protein thermostability is pivotal for various subfields and applications in biochemistry.
We introduce an ESM-assisted efficient approach that integrates protein sequence and structural features to predict thermostability changes in proteins upon single-point mutations.
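Generically, such a pipeline concatenates sequence-model embeddings with structural features and regresses the stability change. The sketch below uses random placeholder features and toy data throughout; it is not the paper's ESM-assisted model:

    # Generic sketch of a stability-change (ddG-style) predictor; every
    # component here is a placeholder stub.
    import numpy as np
    from sklearn.linear_model import Ridge

    def embed_sequence(seq: str) -> np.ndarray:
        """Placeholder for a protein-LM embedding (e.g. a mean-pooled layer)."""
        rng = np.random.default_rng(abs(hash(seq)) % 2**32)
        return rng.normal(size=64)

    def structural_features(structure_id: str) -> np.ndarray:
        """Placeholder for features such as solvent accessibility or contacts."""
        rng = np.random.default_rng(abs(hash(structure_id)) % 2**32)
        return rng.normal(size=8)

    # (wild type, mutant, measured stability change) -- toy data only.
    data = [("MKTA", "MKTV", -0.3), ("GGSA", "GGSL", 0.5)]
    X = np.stack([
        np.concatenate([embed_sequence(mut) - embed_sequence(wt),
                        structural_features(wt)])
        for wt, mut, _ in data
    ])
    y = np.array([ddg for _, _, ddg in data])
    model = Ridge().fit(X, y)
    print(model.predict(X))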
arXiv Detail & Related papers (2023-12-07T03:25:49Z) - MeLM, a generative pretrained language modeling framework that solves
forward and inverse mechanics problems [0.0]
We report a flexible multi-modal mechanics language model, MeLM, applied to solve various nonlinear forward and inverse problems.
The framework is applied to various examples including bio-inspired hierarchical honeycomb design and carbon nanotube mechanics.
arXiv Detail & Related papers (2023-06-30T10:28:20Z) - Generative Pretrained Autoregressive Transformer Graph Neural Network
applied to the Analysis and Discovery of Novel Proteins [0.0]
We report a flexible language-model based deep learning strategy, applied here to solve complex forward and inverse problems in protein modeling.
The model is applied to predict secondary structure content (per-residue level and overall content), protein solubility, and sequencing tasks.
We find that adding additional tasks yields emergent synergies that the model exploits in improving overall performance.
arXiv Detail & Related papers (2023-05-07T12:30:24Z) - Integration of Pre-trained Protein Language Models into Geometric Deep
Learning Networks [68.90692290665648]
We integrate knowledge learned by protein language models into several state-of-the-art geometric networks.
Our findings show an overall improvement of 20% over baselines.
Strong evidence indicates that the incorporation of protein language models' knowledge enhances geometric networks' capacity by a significant margin.
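The integration pattern, per-residue language-model embeddings concatenated into a geometric network's node features, can be sketched generically; the message-passing layer below is a stand-in, not any of the specific geometric architectures evaluated in the paper:

    # Generic sketch of fusing per-residue protein-LM embeddings into a
    # geometric network's node features; the aggregation is a simple stand-in.
    import torch
    import torch.nn as nn

    class FusionGNNLayer(nn.Module):
        def __init__(self, d_geo: int, d_plm: int, d_out: int):
            super().__init__()
            self.proj = nn.Linear(d_geo + d_plm, d_out)

        def forward(self, geo_feats, plm_feats, adjacency):
            # geo_feats: (n_res, d_geo) geometric node features
            # plm_feats: (n_res, d_plm) per-residue language-model embeddings
            # adjacency: (n_res, n_res) contact/neighborhood matrix
            h = self.proj(torch.cat([geo_feats, plm_feats], dim=-1))
            deg = adjacency.sum(dim=1, keepdim=True).clamp(min=1)
            return torch.relu(adjacency @ h / deg)  # mean over neighbors

    n = 5
    layer = FusionGNNLayer(d_geo=16, d_plm=32, d_out=24)
    out = layer(torch.randn(n, 16), torch.randn(n, 32),
                (torch.rand(n, n) > 0.5).float())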
arXiv Detail & Related papers (2022-12-07T04:04:04Z) - Learning Geometrically Disentangled Representations of Protein Folding
Simulations [72.03095377508856]
This work focuses on learning a generative neural network on a structural ensemble of a drug-target protein.
Model tasks involve characterizing the distinct structural fluctuations of the protein bound to various drug molecules.
Results show that our geometric learning-based method enjoys both accuracy and efficiency for generating complex structural variations.
arXiv Detail & Related papers (2022-05-20T19:38:00Z)