ChemDFM: Dialogue Foundation Model for Chemistry
- URL: http://arxiv.org/abs/2401.14818v1
- Date: Fri, 26 Jan 2024 12:45:55 GMT
- Title: ChemDFM: Dialogue Foundation Model for Chemistry
- Authors: Zihan Zhao, Da Ma, Lu Chen, Liangtai Sun, Zihao Li, Hongshen Xu,
Zichen Zhu, Su Zhu, Shuai Fan, Guodong Shen, Xin Chen and Kai Yu
- Abstract summary: ChemDFM-13B is trained on 34B tokens from chemical literature, textbooks, and instructions as well as various data from the general domain.
It can store, understand, and reason over chemical knowledge and languages while still possessing advanced free-form language comprehension capabilities.
ChemDFM can also surpass GPT-4 on a substantial portion of chemical tasks, despite the significant size difference.
- Score: 27.804229420333137
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have achieved great success in the general
domain of natural language processing. Their emerging task generalization and
free-form dialogue capabilities can greatly help to design Chemical General
Intelligence (CGI) to assist real-world research in chemistry. However, the
existence of specialized language and knowledge in the field of chemistry, such
as the highly informative SMILES notation, hinders the performance of
general-domain LLMs in chemistry. To this end, we develop ChemDFM, the first
LLM towards CGI. ChemDFM-13B is trained on 34B tokens from chemical literature,
textbooks, and instructions as well as various data from the general domain.
Therefore, it can store, understand, and reason over chemical knowledge and
languages while still possessing advanced free-form language comprehension
capabilities. Extensive quantitative evaluation shows that ChemDFM can
significantly outperform the representative open-sourced LLMs. Moreover,
ChemDFM can also surpass GPT-4 on a substantial portion of chemical tasks, despite
the significant size difference. Further qualitative evaluations demonstrate
the efficiency and effectiveness of ChemDFM in real-world research scenarios.
We will open-source the ChemDFM model soon.
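The abstract singles out SMILES as the kind of dense, specialized notation that general-domain LLMs handle poorly. As a minimal illustration of how much structure one short string packs in, the following sketch parses two well-known molecules; RDKit is used purely for demonstration, since the paper does not name a specific toolkit.

```python
# Minimal illustration of SMILES, the line notation the abstract cites as a
# hurdle for general-domain LLMs. RDKit is an assumption for demonstration
# only; the paper does not specify any particular cheminformatics toolkit.
from rdkit import Chem
from rdkit.Chem import rdMolDescriptors

SMILES_EXAMPLES = {
    "aspirin": "CC(=O)Oc1ccccc1C(=O)O",
    "caffeine": "Cn1cnc2c1c(=O)n(C)c(=O)n2C",
}

for name, smi in SMILES_EXAMPLES.items():
    mol = Chem.MolFromSmiles(smi)  # returns None for an invalid string
    if mol is None:
        print(f"{name}: invalid SMILES")
        continue
    # One short string encodes the full heavy-atom connectivity.
    print(f"{name}: {smi} -> {mol.GetNumAtoms()} heavy atoms, "
          f"formula {rdMolDescriptors.CalcMolFormula(mol)}")
```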
Related papers
- Are large language models superhuman chemists? [5.1611032009738205]
"ChemBench" is an automated framework designed to rigorously evaluate the chemical knowledge and reasoning abilities of state-of-the-art models.
We curated more than 7,000 question-answer pairs for a wide array of subfields of the chemical sciences.
We found that, on average, the best models outperformed the best human chemists in our study.
arXiv Detail & Related papers (2024-04-01T20:56:25Z)
- An Autonomous Large Language Model Agent for Chemical Literature Data Mining [60.85177362167166]
We introduce an end-to-end AI agent framework capable of high-fidelity extraction from extensive chemical literature.
Our framework's efficacy is evaluated using accuracy, recall, and F1 score of reaction condition data.
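The summary names accuracy, recall, and F1 over extracted reaction condition data, but the matching protocol is not given in the abstract. Below is a hedged sketch of the usual set-based computation; treating each extracted (field, value) pair as one item is an assumption made for illustration.

```python
# Hedged sketch: one common way to score extracted reaction-condition records
# against a gold standard. The paper's exact matching protocol is not stated
# in the abstract; item-level set matching is assumed here for illustration.
def extraction_scores(predicted: set, gold: set) -> dict:
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}

# Hypothetical example: fields of one mined reaction-condition record.
gold = {("solvent", "THF"), ("temperature", "25 C"), ("catalyst", "Pd(PPh3)4")}
pred = {("solvent", "THF"), ("temperature", "25 C"), ("catalyst", "Pd/C")}
print(extraction_scores(pred, gold))  # precision = recall = f1 = 2/3
```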
arXiv Detail & Related papers (2024-02-20T13:21:46Z)
- LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset [13.063678216852473]
We show that large language models (LLMs) can achieve very strong results on a comprehensive set of chemistry tasks.
We propose SMolInstruct, a large-scale, comprehensive, and high-quality dataset for instruction tuning.
Using SMolInstruct, we fine-tune a set of open-source LLMs, among which, we find that Mistral serves as the best base model for chemistry tasks.
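For concreteness, a single instruction-tuning record for a chemistry task might look like the sketch below; the field names and example are assumptions for illustration, not the released SMolInstruct schema.

```python
# Hypothetical shape of one chemistry instruction-tuning record, in the
# general style a dataset like SMolInstruct might use. The schema and the
# example are illustrative assumptions, not the published format.
import json

record = {
    "task": "name_to_smiles",
    "instruction": "Give the SMILES string for the following compound.",
    "input": "benzoic acid",
    "output": "O=C(O)c1ccccc1",
}
print(json.dumps(record, indent=2))
```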
arXiv Detail & Related papers (2024-02-14T18:42:25Z)
- ChemLLM: A Chemical Large Language Model [49.308528569982805]
Large language models (LLMs) have made impressive progress in chemistry applications.
However, the community lacks an LLM specifically designed for chemistry.
Here, we introduce ChemLLM, a comprehensive framework that features the first LLM dedicated to chemistry.
arXiv Detail & Related papers (2024-02-10T01:11:59Z)
- Structured Chemistry Reasoning with Large Language Models [70.13959639460015]
Large Language Models (LLMs) excel in diverse areas, yet struggle with complex scientific reasoning, especially in chemistry.
We introduce StructChem, a simple yet effective prompting strategy that offers the desired guidance and substantially boosts the LLMs' chemical reasoning capability.
In tests across four chemistry areas (quantum chemistry, mechanics, physical chemistry, and kinetics), StructChem substantially enhances GPT-4's performance, with a peak improvement of up to 30%.
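The abstract does not include the authors' prompts; the following is a toy sketch of the general two-stage idea (elicit the relevant formulae first, then elicit reasoning grounded in them). `call_llm` is a placeholder for any chat-completion client, and the prompt wording is illustrative, not StructChem's released prompts.

```python
# Toy sketch in the spirit of structured chemistry prompting: stage 1 elicits
# the needed formulae, stage 2 elicits step-by-step reasoning that uses them.
# `call_llm` is a hypothetical stand-in for whatever LLM client is available.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def structured_chemistry_answer(question: str) -> str:
    formulae = call_llm(
        "List, without solving anything, every formula needed to answer "
        f"this chemistry problem:\n{question}"
    )
    return call_llm(
        f"Problem:\n{question}\n\nRelevant formulae:\n{formulae}\n\n"
        "Solve step by step, citing which formula justifies each step, "
        "then state the final answer."
    )
```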
arXiv Detail & Related papers (2023-11-16T08:20:36Z)
- Chemist-X: Large Language Model-empowered Agent for Reaction Condition Recommendation in Chemical Synthesis [57.70772230913099]
Chemist-X automates the reaction condition recommendation (RCR) task in chemical synthesis with retrieval-augmented generation (RAG) technology.
Chemist-X interrogates online molecular databases and distills critical data from the latest literature.
Chemist-X considerably reduces chemists' workload and allows them to focus on more fundamental and creative problems.
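A minimal retrieval-augmented generation loop in the spirit of this description is sketched below. The in-memory "database" and the crude string-similarity retriever are stand-ins invented for illustration; Chemist-X itself queries online molecular databases and the literature.

```python
# Hedged RAG sketch for reaction condition recommendation: retrieve the most
# similar stored reaction, then ask an LLM to recommend conditions. The toy
# records and difflib-based similarity are assumptions for illustration only.
from difflib import SequenceMatcher

LITERATURE = [  # (reaction SMILES, reported conditions) -- toy records
    ("c1ccccc1Br>>c1ccccc1C#N", "CuCN, DMF, 150 C"),
    ("c1ccccc1Br>>c1ccccc1N", "Pd2(dba)3, BINAP, NaOtBu, toluene"),
]

def retrieve(query_rxn: str, k: int = 1):
    """Rank stored reactions by string similarity to the query reaction."""
    scored = sorted(
        LITERATURE,
        key=lambda rec: SequenceMatcher(None, query_rxn, rec[0]).ratio(),
        reverse=True,
    )
    return scored[:k]

def recommend_conditions(query_rxn: str, call_llm) -> str:
    """Compose retrieved precedents into a prompt for any LLM client."""
    context = "\n".join(f"{rxn}: {cond}" for rxn, cond in retrieve(query_rxn))
    return call_llm(
        f"Known precedents:\n{context}\n\n"
        f"Recommend conditions for: {query_rxn}"
    )
```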
arXiv Detail & Related papers (2023-11-16T01:21:33Z)
- Empowering Molecule Discovery for Molecule-Caption Translation with Large Language Models: A ChatGPT Perspective [53.300288393173204]
Large Language Models (LLMs) have shown remarkable performance in various cross-modal tasks.
In this work, we propose MolReGPT, an in-context few-shot molecule learning paradigm for molecule-caption translation.
We evaluate the effectiveness of MolReGPT on molecule-caption translation, including molecule understanding and text-based molecule generation.
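The sketch below shows what an in-context few-shot prompt for molecule-to-caption translation can look like. The example pairs are illustrative placeholders, not neighbors retrieved by the paper's actual pipeline.

```python
# Hedged sketch of a few-shot molecule-captioning prompt. The demonstration
# pairs are invented for illustration; MolReGPT's own examples are retrieved
# from similar molecules rather than hard-coded.
FEW_SHOT_PAIRS = [
    ("CCO", "Ethanol, a simple primary alcohol."),
    ("CC(=O)O", "Acetic acid, a simple carboxylic acid."),
]

def build_caption_prompt(query_smiles: str) -> str:
    shots = "\n".join(f"SMILES: {s}\nCaption: {c}" for s, c in FEW_SHOT_PAIRS)
    return f"{shots}\nSMILES: {query_smiles}\nCaption:"

print(build_caption_prompt("c1ccccc1O"))  # phenol as the query molecule
```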
arXiv Detail & Related papers (2023-06-11T08:16:25Z)
- What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks [41.9830989458936]
Large Language Models (LLMs) with strong natural language processing abilities have emerged.
We aim to evaluate the capabilities of LLMs on a wide range of tasks across the chemistry domain.
arXiv Detail & Related papers (2023-05-27T14:17:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated content (including all information) and is not responsible for any consequences of its use.