BioinspiredLLM: Conversational Large Language Model for the Mechanics of
Biological and Bio-inspired Materials
- URL: http://arxiv.org/abs/2309.08788v2
- Date: Mon, 11 Dec 2023 18:05:25 GMT
- Title: BioinspiredLLM: Conversational Large Language Model for the Mechanics of
Biological and Bio-inspired Materials
- Authors: Rachel K. Luu, Markus J. Buehler
- Abstract summary: Open-source autoregressive transformer large language model (LLM), BioinspiredLLM, is reported.
The model was finetuned with a corpus of over a thousand peer-reviewed articles in the field of structural biological and bio-inspired materials.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The study of biological materials and bio-inspired materials science is well
established; however, surprisingly little knowledge has been systematically
translated to engineering solutions. To accelerate discovery and guide
insights, an open-source autoregressive transformer large language model (LLM),
BioinspiredLLM, is reported. The model was finetuned with a corpus of over a
thousand peer-reviewed articles in the field of structural biological and
bio-inspired materials and can be prompted to recall information, assist with
research tasks, and function as an engine for creativity. The model has proven
able to accurately recall information about biological materials and is further
strengthened with improved reasoning ability, as well as with
retrieval-augmented generation to incorporate new data during generation, which
also helps to trace back sources, update the knowledge base, and connect
knowledge domains. BioinspiredLLM has also been shown to develop sound
hypotheses regarding biological materials design and remarkably so for
materials that have never been explicitly studied before. Lastly, the model
showed impressive promise in collaborating with other generative artificial
intelligence models in a workflow that can reshape the traditional materials
design process. This collaborative generative artificial intelligence method
can stimulate and enhance bio-inspired materials design workflows. Biological
materials are at a critical intersection of multiple scientific fields and
models like BioinspiredLLM help to connect knowledge domains.
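The retrieval-augmented generation (RAG) workflow described in the abstract can be sketched as follows. This is a minimal, illustrative sketch: the corpus entries, the bag-of-words overlap scoring, and the prompt format are assumptions for demonstration, whereas a real pipeline such as BioinspiredLLM's would retrieve embedding-similar chunks from the peer-reviewed corpus. The key idea shown is that retrieved passages carry source identifiers, which is what enables tracing answers back to their sources.

```python
# Minimal retrieval-augmented generation (RAG) sketch.
# Corpus contents, scoring, and prompt format are illustrative assumptions;
# a production pipeline would use embedding-based similarity over paper chunks.

def tokenize(text):
    """Crude whitespace tokenizer used for overlap scoring."""
    return set(text.lower().split())

def retrieve(query, corpus, k=2):
    """Rank corpus passages by word overlap with the query; return top k."""
    q = tokenize(query)
    scored = sorted(corpus,
                    key=lambda doc: len(q & tokenize(doc["text"])),
                    reverse=True)
    return scored[:k]

def build_prompt(query, passages):
    """Prepend retrieved passages, tagged with sources to allow traceback."""
    context = "\n".join(f"[{p['source']}] {p['text']}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Hypothetical two-document corpus.
corpus = [
    {"source": "paper-001",
     "text": "Nacre combines aragonite platelets with a soft biopolymer matrix."},
    {"source": "paper-042",
     "text": "Spider silk achieves toughness through hierarchical protein structure."},
]

query = "Why is nacre tough?"
prompt = build_prompt(query, retrieve(query, corpus, k=1))
```

Because the retrieved passage keeps its source tag in the prompt, the model's answer can cite "[paper-001]", and updating the knowledge base only requires adding documents to the corpus, not retraining the model.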
Related papers
- Polymetis: Large Language Modeling for Multiple Material Domains [11.396295878658924]
This paper proposes Polymetis, a large language model for multiple materials domains.
The model uses a dataset of about 2 million material knowledge instructions, and in the process of building the dataset, we developed the Intelligent Extraction Large Model.
We inject this data into the GLM4-9B model for learning to enhance its inference capabilities in a variety of material domains.
arXiv Detail & Related papers (2024-11-13T16:10:14Z)
- Cephalo: Multi-Modal Vision-Language Models for Bio-Inspired Materials Analysis and Design [0.0]
Cephalo is a series of vision large language models (V-LLMs) designed for materials science applications.
It is trained on integrated image and text data from thousands of scientific papers.
Generative applications include bio-inspired designs such as pollen-inspired architected materials.
arXiv Detail & Related papers (2024-05-29T13:34:32Z)
- BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments [112.25067497985447]
We introduce BioDiscoveryAgent, an agent that designs new experiments, reasons about their outcomes, and efficiently navigates the hypothesis space to reach desired solutions.
BioDiscoveryAgent can uniquely design new experiments without the need to train a machine learning model.
It achieves an average of 21% improvement in predicting relevant genetic perturbations across six datasets.
arXiv Detail & Related papers (2024-05-27T19:57:17Z)
- Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey [75.47055414002571]
The integration of biomolecules and natural language (BL) has emerged as a promising interdisciplinary area at the intersection of artificial intelligence, chemistry, and biology.
We provide an analysis of recent advancements achieved through cross modeling of biomolecules and natural language.
arXiv Detail & Related papers (2024-03-03T14:59:47Z)
- An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
We study the performance of large language models (LLMs) on a wide spectrum of crucial bioinformatics tasks.
These tasks include the identification of potential coding regions, extraction of named entities for genes and proteins, detection of antimicrobial and anti-cancer peptides, molecular optimization, and resolution of educational bioinformatics problems.
Our findings indicate that, given appropriate prompts, LLMs like GPT variants can successfully handle most of these tasks.
arXiv Detail & Related papers (2024-02-21T11:27:31Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
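The lightweight adapter idea above can be sketched in a few lines: a small bottleneck layer with a residual connection is inserted into an otherwise frozen pre-trained model, so only the adapter's few parameters need training. The dimensions, random weights, and plain-list linear algebra below are illustrative assumptions; real adapters in PLMs like PubMedBERT operate on transformer hidden states with learned weights.

```python
# Sketch of a lightweight adapter module: down-projection, nonlinearity,
# up-projection, plus a residual connection. Dimensions and random weights
# are illustrative; real adapters are trained on knowledge-graph data while
# the backbone model stays frozen.

import random

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

class Adapter:
    def __init__(self, dim, bottleneck, seed=0):
        rng = random.Random(seed)
        # Down-projection (bottleneck x dim) and up-projection (dim x bottleneck).
        self.down = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                     for _ in range(bottleneck)]
        self.up = [[rng.uniform(-0.1, 0.1) for _ in range(bottleneck)]
                   for _ in range(dim)]

    def __call__(self, h):
        z = [max(0.0, v) for v in matvec(self.down, h)]          # ReLU bottleneck
        return [hi + ui for hi, ui in zip(h, matvec(self.up, z))]  # residual add

adapter = Adapter(dim=8, bottleneck=2)
hidden = [1.0] * 8          # stands in for a transformer hidden state
out = adapter(hidden)       # adapted hidden state, same dimensionality
```

The residual connection means an untrained (or zeroed) adapter leaves the hidden state unchanged, which is why adapters can be added to a pre-trained model without disrupting it, keeping compute requirements low.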
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- Improving Biomedical Abstractive Summarisation with Knowledge Aggregation from Citation Papers [24.481854035628434]
Existing language models struggle to generate technical summaries that are on par with those produced by biomedical experts.
We propose a novel attention-based citation aggregation model that integrates domain-specific knowledge from citation papers.
Our model outperforms state-of-the-art approaches and achieves substantial improvements in abstractive biomedical text summarisation.
arXiv Detail & Related papers (2023-10-24T09:56:46Z)
- BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations [54.97423244799579]
BioT5 is a pre-training framework that enriches cross-modal integration in biology with chemical knowledge and natural language associations.
BioT5 distinguishes between structured and unstructured knowledge, leading to more effective utilization of information.
arXiv Detail & Related papers (2023-10-11T07:57:08Z)
- MatChat: A Large Language Model and Application Service Platform for Materials Science [18.55541324347915]
We harness the power of the LLaMA2-7B model and enhance it through a learning process that incorporates 13,878 pieces of structured material knowledge data.
This specialized AI model, named MatChat, focuses on predicting inorganic material synthesis pathways.
MatChat is now accessible online and open for use, with both the model and its application framework available as open source.
arXiv Detail & Related papers (2023-10-11T05:11:46Z)
- Scientific Language Models for Biomedical Knowledge Base Completion: An Empirical Study [62.376800537374024]
We study scientific LMs for KG completion, exploring whether we can tap into their latent knowledge to enhance biomedical link prediction.
We integrate the LM-based models with KG embedding models, using a router method that learns to assign each input example to either type of model and provides a substantial boost in performance.
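The router idea above can be sketched as follows. In this toy version the "router" is a hand-written rule (entities already in the KG go to the embedding-style model, unseen entities go to the LM), whereas the paper learns the assignment from data; the triple store, entity names, and model stand-ins are all illustrative assumptions.

```python
# Sketch of routing each link-prediction input to either a KG-embedding-style
# model or an LM-based model. The rule, triples, and model stand-ins are
# illustrative; the paper learns this assignment rather than hard-coding it.

def kg_model(head, relation):
    """Stand-in for a KG-embedding scorer: lookup over known triples."""
    triples = {("aspirin", "treats"): "headache"}
    return triples.get((head, relation))

def lm_model(head, relation):
    """Stand-in for a language model scoring textual entity descriptions."""
    return f"<LM prediction for {head} {relation}>"

def route(head, relation, kg_entities):
    """Send in-KG entities to the KG model, unseen entities to the LM."""
    if head in kg_entities:
        return kg_model(head, relation)
    return lm_model(head, relation)

kg_entities = {"aspirin"}
print(route("aspirin", "treats", kg_entities))         # prints "headache"
print(route("novel_compound", "treats", kg_entities))  # falls back to the LM
```

The intuition is that each model type is stronger on a different slice of inputs (dense graph neighborhoods vs. rich textual descriptions), so a learned per-example assignment can outperform either model alone.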
arXiv Detail & Related papers (2021-06-17T17:55:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.