SciGLM: Training Scientific Language Models with Self-Reflective
Instruction Annotation and Tuning
- URL: http://arxiv.org/abs/2401.07950v2
- Date: Tue, 12 Mar 2024 18:34:05 GMT
- Title: SciGLM: Training Scientific Language Models with Self-Reflective
Instruction Annotation and Tuning
- Authors: Dan Zhang and Ziniu Hu and Sining Zhoubian and Zhengxiao Du and Kaiyu
Yang and Zihan Wang and Yisong Yue and Yuxiao Dong and Jie Tang
- Abstract summary: SciGLM is a suite of scientific language models able to conduct college-level scientific reasoning.
We apply a self-reflective instruction annotation framework to generate step-by-step reasoning for unlabelled scientific questions.
We fine-tuned the ChatGLM family of language models with SciInstruct, enhancing their scientific and mathematical reasoning capabilities.
- Score: 60.14510984576027
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have shown promise in assisting scientific
discovery. However, such applications are currently limited by LLMs'
deficiencies in understanding intricate scientific concepts, deriving symbolic
equations, and solving advanced numerical calculations. To bridge these gaps,
we introduce SciGLM, a suite of scientific language models able to conduct
college-level scientific reasoning. Central to our approach is a novel
self-reflective instruction annotation framework to address the data scarcity
challenge in the science domain. This framework leverages existing LLMs to
generate step-by-step reasoning for unlabelled scientific questions, followed
by a process of self-reflective critic-and-revise. Applying this framework, we
curated SciInstruct, a diverse and high-quality dataset encompassing physics,
chemistry, math, and formal proofs. We fine-tuned the ChatGLM family of
language models with SciInstruct, enhancing their scientific and mathematical
reasoning capabilities. Remarkably, SciGLM consistently improves the base model
(ChatGLM3-6B-Base) by 4.87% and larger-scale models (32B) by 2.67%, without
sacrificing the language understanding capabilities of the base model.
This makes SciGLM a suitable foundational model to facilitate diverse
scientific discovery tasks. For the benefit of the wider research community, we
release SciInstruct, and SciGLM, alongside a self-reflective framework and
fine-tuning code at https://github.com/THUDM/SciGLM.
Related papers
- Improving Scientific Hypothesis Generation with Knowledge Grounded Large Language Models [20.648157071328807]
Large language models (LLMs) can identify novel research directions by analyzing existing knowledge.
LLMs are prone to generating "hallucinations", outputs that are plausible-sounding but factually incorrect.
We propose KG-CoI, a system that enhances LLM hypothesis generation by integrating external, structured knowledge from knowledge graphs.
arXiv Detail & Related papers (2024-11-04T18:50:00Z)
- A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery [68.48094108571432]
Large language models (LLMs) have revolutionized the way text and other modalities of data are handled.
We aim to provide a more holistic view of the research landscape by unveiling cross-field and cross-modal connections between scientific LLMs.
arXiv Detail & Related papers (2024-06-16T08:03:24Z)
- SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models [35.98892300665275]
We introduce the SciKnowEval benchmark, a framework that evaluates large language models (LLMs) across five progressive levels of scientific knowledge.
These levels aim to assess the breadth and depth of scientific knowledge in LLMs, including memory, comprehension, reasoning, discernment, and application.
We benchmark 26 advanced open-source and proprietary LLMs using zero-shot and few-shot prompting strategies.
arXiv Detail & Related papers (2024-06-13T13:27:52Z)
- SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature [80.49349719239584]
We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following demonstrations for 54 tasks.
SciRIFF is the first dataset focused on extracting and synthesizing information from research literature across a wide range of scientific fields.
arXiv Detail & Related papers (2024-06-10T21:22:08Z)
- LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery [141.39722070734737]
We propose to enhance the knowledge-driven, abstract reasoning abilities of Large Language Models with the computational strength of simulations.
We introduce Scientific Generative Agent (SGA), a bilevel optimization framework.
We conduct experiments to demonstrate our framework's efficacy in law discovery and molecular design.
arXiv Detail & Related papers (2024-05-16T03:04:10Z)
- A Survey on Self-Evolution of Large Language Models [116.54238664264928]
Large language models (LLMs) have significantly advanced in various fields and intelligent agent applications.
Self-evolution approaches, which enable LLMs to autonomously acquire, refine, and learn from experiences generated by the model itself, are rapidly growing.
arXiv Detail & Related papers (2024-04-22T17:43:23Z)
- Scientific Large Language Models: A Survey on Biological & Chemical Domains [47.97810890521825]
Large Language Models (LLMs) have emerged as a transformative power in enhancing natural language comprehension.
The application of LLMs extends beyond conventional linguistic boundaries, encompassing specialized linguistic systems developed within various scientific disciplines.
As a burgeoning area in the community of AI for Science, scientific LLMs warrant comprehensive exploration.
arXiv Detail & Related papers (2024-01-26T05:33:34Z)
- DARWIN Series: Domain Specific Large Language Models for Natural Science [20.864698325126735]
We present DARWIN, a series of tailored LLMs for natural science, mainly in physics, chemistry, and material science.
We fine-tuned the models using over 60,000 instruction data points, emphasizing factual correctness.
The DARWIN series not only achieves state-of-the-art results on various scientific tasks but also diminishes reliance on closed-source AI models.
arXiv Detail & Related papers (2023-08-25T01:40:48Z)
- SCITUNE: Aligning Large Language Models with Scientific Multimodal Instructions [0.7264378254137809]
In this work, we present SciTune as a tuning framework to improve the ability of LLMs to follow scientific multimodal instructions.
To test our methodology, we use a human-generated scientific instruction tuning dataset and train a large multimodal model LLaMA-SciTune.
In comparison to the models that are finetuned with machine generated data only, LLaMA-SciTune surpasses human performance on average and in many sub-categories on the ScienceQA benchmark.
arXiv Detail & Related papers (2023-07-03T16:25:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.