SciAgent: Tool-augmented Language Models for Scientific Reasoning
- URL: http://arxiv.org/abs/2402.11451v2
- Date: Wed, 21 Feb 2024 03:04:49 GMT
- Title: SciAgent: Tool-augmented Language Models for Scientific Reasoning
- Authors: Yubo Ma, Zhibin Gou, Junheng Hao, Ruochen Xu, Shuohang Wang, Liangming
Pan, Yujiu Yang, Yixin Cao, Aixin Sun, Hany Awadalla and Weizhu Chen
- Abstract summary: We introduce a new task setting named tool-augmented scientific reasoning.
This setting supplements Large Language Models with scalable toolsets.
We construct a tool-augmented training corpus named MathFunc which encompasses over 30,000 samples and roughly 6,000 tools.
Building on MathFunc, we develop SciAgent to retrieve, understand and, if necessary, use tools for scientific problem solving.
- Score: 129.51442677710452
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scientific reasoning poses a formidable challenge for even the most advanced
Large Language Models (LLMs). To make this task more practical and solvable for
LLMs, we introduce a new task setting named tool-augmented scientific
reasoning. This setting supplements LLMs with scalable toolsets, and shifts the
focus from pursuing an omniscient problem solver to a proficient tool-user. To
facilitate the research of such setting, we construct a tool-augmented training
corpus named MathFunc which encompasses over 30,000 samples and roughly 6,000
tools. Building on MathFunc, we develop SciAgent to retrieve, understand and,
if necessary, use tools for scientific problem solving. Additionally, we craft
a benchmark, SciToolBench, spanning five scientific domains to evaluate LLMs'
abilities with tool assistance. Extensive experiments on SciToolBench confirm
the effectiveness of SciAgent. Notably, SciAgent-Mistral-7B surpasses other
LLMs of the same size by more than 13% in absolute accuracy. Furthermore,
SciAgent-DeepMath-7B performs substantially better than ChatGPT.
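The abstract describes a retrieve-understand-use pipeline: given a problem, the agent first retrieves candidate tools from a toolset, then decides whether to apply one. A minimal sketch of that control flow is below; the toy toolset, the keyword-overlap retriever, and all names are illustrative assumptions, not SciAgent's actual implementation (which uses a trained retriever over roughly 6,000 tools).

```python
# Hypothetical sketch of a retrieve -> understand -> use loop.
# The toolset and keyword-overlap retriever are illustrative stand-ins.
import math

# A toy "toolset": each tool carries a docstring used for retrieval.
TOOLS = {
    "quadratic_roots": {
        "doc": "solve quadratic equation roots a b c",
        "fn": lambda a, b, c: (
            (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a),
            (-b - math.sqrt(b * b - 4 * a * c)) / (2 * a),
        ),
    },
    "circle_area": {
        "doc": "area of circle radius",
        "fn": lambda r: math.pi * r * r,
    },
}

def retrieve_tool(question: str):
    """Pick the tool whose documentation shares the most words with the question."""
    q_words = set(question.lower().split())
    scored = [
        (len(q_words & set(spec["doc"].split())), name)
        for name, spec in TOOLS.items()
    ]
    score, name = max(scored)
    return name if score > 0 else None  # "if necessary": no tool when nothing matches

def solve(question: str, *args):
    """Retrieve a tool and, if one matches, use it; otherwise fall back."""
    name = retrieve_tool(question)
    if name is None:
        return None  # a real agent would answer directly without tools here
    return TOOLS[name]["fn"](*args)
```

The key design point the abstract makes is the optional third step: the agent is a tool-user, not an omniscient solver, so retrieval returning nothing is a valid outcome rather than a failure.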
Related papers
- StepTool: A Step-grained Reinforcement Learning Framework for Tool Learning in LLMs [44.906714156993694]
We introduce StepTool, a novel step-grained reinforcement learning framework to improve tool learning in Large Language Models.
StepTool significantly outperforms existing methods in multi-step, tool-based tasks.
arXiv Detail & Related papers (2024-10-10T09:23:26Z)
- Efficient and Scalable Estimation of Tool Representations in Vector Space [34.767193045989515]
We present a framework for generating synthetic data for tool retrieval applications and an efficient data-driven tool retrieval strategy using small encoder models.
We create ToolBank, a new tool retrieval dataset that reflects real human usage.
With these new methods, we achieve improvements of up to 27.28 in Recall@K on the ToolBench dataset and 30.5 in Recall@K on ToolBank.
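The entry above describes embedding tool descriptions with a small encoder and ranking them against a query, evaluated by Recall@K. A toy illustration of that scheme follows; the hashed bag-of-words "embedding" is a stand-in assumption for a real encoder model, and the tool names are invented for the example.

```python
# Toy illustration of vector-space tool retrieval as described above.
# A real system embeds descriptions with a small encoder model; here a
# hashed bag-of-words vector stands in for those learned embeddings.
import math
from collections import Counter

DIM = 64  # fixed embedding dimension for the stand-in encoder

def embed(text: str) -> list:
    """Map text to a fixed-size vector via feature hashing (stand-in encoder)."""
    vec = [0.0] * DIM
    for word, count in Counter(text.lower().split()).items():
        vec[hash(word) % DIM] += count
    return vec

def cosine(u, v) -> float:
    """Cosine similarity between two vectors; 0.0 for zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve(query: str, tool_docs: dict, k: int = 3) -> list:
    """Return the k tool names whose descriptions are closest to the query."""
    q = embed(query)
    ranked = sorted(
        tool_docs,
        key=lambda name: cosine(q, embed(tool_docs[name])),
        reverse=True,
    )
    return ranked[:k]

def recall_at_k(retrieved: list, relevant: set) -> float:
    """Fraction of relevant tools that appear in the top-k retrieved list."""
    return len(relevant & set(retrieved)) / len(relevant)
```

Precomputing the tool-description embeddings once and comparing only vectors at query time is what makes this approach scalable to large toolsets.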
arXiv Detail & Related papers (2024-09-02T19:39:24Z)
- What Are Tools Anyway? A Survey from the Language Model Perspective [67.18843218893416]
Language models (LMs) are powerful, yet largely limited to text generation tasks.
We provide a unified definition of tools as external programs used by LMs.
We empirically study the efficiency of various tooling methods.
arXiv Detail & Related papers (2024-03-18T17:20:07Z)
- LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error [54.954211216847135]
Existing large language models (LLMs) reach a tool-use correctness rate of only 30% to 60%.
We propose a biologically inspired method for tool-augmented LLMs, simulated trial and error (STE).
STE orchestrates three key mechanisms behind successful tool-use behaviors in biological systems: trial and error, imagination, and memory.
arXiv Detail & Related papers (2024-03-07T18:50:51Z)
- EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction [56.02100384015907]
EasyTool is a framework that transforms diverse and lengthy tool documentation into unified, concise tool instructions.
It can significantly reduce token consumption and improve the performance of tool utilization in real-world scenarios.
arXiv Detail & Related papers (2024-01-11T15:45:11Z)
- MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use [82.24774504584066]
Large language models (LLMs) have garnered significant attention due to their impressive natural language processing (NLP) capabilities.
We introduce MetaTool, a benchmark designed to evaluate whether LLMs have tool usage awareness and can correctly choose tools.
We conduct experiments involving eight popular LLMs and find that the majority of them still struggle to effectively select tools.
arXiv Detail & Related papers (2023-10-04T19:39:26Z)
- CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models [74.22729793816451]
Large Language Models (LLMs) have made significant progress in utilizing tools, but their ability is limited by API availability.
We propose CREATOR, a novel framework that enables LLMs to create their own tools using documentation and code realization.
We evaluate CREATOR on the MATH and TabMWP benchmarks, which consist of challenging math competition problems and tabular math word problems, respectively.
arXiv Detail & Related papers (2023-05-23T17:51:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.