Fugu-MT Paper Translation (Abstract): BioTool: A Comprehensive Tool-Calling Dataset for Enhancing Biomedical Capabilities of Large Language Models

Paper overview: BioTool: A Comprehensive Tool-Calling Dataset for Enhancing Biomedical Capabilities of Large Language Models

arxiv url: http://arxiv.org/abs/2605.05758v1
Date: Thu, 07 May 2026 06:53:17 GMT
Status: Translation complete
Last updated in system: 2026-05-08 22:27:11.576312
Title: BioTool: A Comprehensive Tool-Calling Dataset for Enhancing Biomedical Capabilities of Large Language Models
Title (reference translation): BioTool: A Comprehensive Tool-Calling Dataset for Improving the Biomedical Capabilities of Large Language Models
Authors: Xin Gao, Ruiyi Zhang, Meixi Du, Peijia Qin, Pengtao Xie
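Abstract summary: BioTool is a biomedical tool-calling dataset for fine-tuning large language models (LLMs). It comprises 34 frequently used tools collected from the NCBI, Ensembl, and UniProt databases, along with 7,040 high-quality, human-verified query-API call pairs.
Reference score (independently computed attention score): 31.15706839420908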
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Despite the success of large language models (LLMs) on general-purpose tasks, their performance in highly specialized domains such as biomedicine remains unsatisfactory. A key limitation is the inability of LLMs to effectively leverage biomedical tools, which clinical experts and biomedical researchers rely on extensively in daily workflows. While recent general-domain tool-calling datasets have substantially improved the capabilities of LLM agents, existing efforts in the biomedical domain largely rely on in-context learning and restrict models to a small set of tools. To address this gap, we introduce BioTool, a comprehensive biomedical tool-calling dataset designed for fine-tuning LLMs. BioTool comprises 34 frequently used tools collected from the NCBI, Ensembl, and UniProt databases, along with 7,040 high-quality, human-verified query-API call pairs spanning variation, genomics, proteomics, evolution, and general biology. Fine-tuning a 4-billion-parameter LLM on BioTool yields substantial improvements in biomedical tool-calling performance, outperforming cutting-edge commercial LLMs such as GPT-5.1. Furthermore, human expert evaluations demonstrate that integrating a BioTool-fine-tuned tool caller significantly improves downstream answer quality compared to the same LLM without tool usage, highlighting the effectiveness of BioTool in enhancing the biomedical capabilities of LLMs. The full dataset and evaluation code are available at https://github.com/gxx27/BioTool
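Abstract (reference translation): Despite the success of large language models (LLMs) on general-purpose tasks, their performance in highly specialized domains such as biomedicine remains unsatisfactory. A key limitation is that LLMs cannot effectively use the biomedical tools that clinical experts and biomedical researchers rely on heavily in their daily workflows. Recent general-domain tool-calling datasets have greatly improved the capabilities of LLM agents, but existing work in the biomedical domain depends largely on in-context learning and restricts models to a small set of tools. To address this gap, we introduce BioTool, a comprehensive biomedical tool-calling dataset designed for fine-tuning LLMs. BioTool comprises 34 frequently used tools collected from the NCBI, Ensembl, and UniProt databases, along with 7,040 high-quality, human-verified query-API call pairs spanning variation, genomics, proteomics, evolution, and general biology. Fine-tuning a 4-billion-parameter LLM on BioTool substantially improves biomedical tool-calling performance, surpassing cutting-edge commercial LLMs such as GPT-5.1. In addition, human expert evaluations show that integrating a BioTool-fine-tuned tool caller significantly improves downstream answer quality compared with the same LLM without tool use, underscoring the effectiveness of BioTool in enhancing the biomedical capabilities of LLMs. The full dataset and evaluation code are available at https://github.com/gxx27/BioTool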
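As a rough illustration of what a "query-API call pair" could look like, the sketch below pairs a natural-language question with a structured call to the NCBI E-utilities ESearch endpoint. The record layout (`ToolCall`, `QueryCallPair`), the tool name `ncbi_esearch`, the example query, and the exact-match check are all assumptions made for illustration; they are not taken from the BioTool repository, and the paper's actual schema and metrics may differ. Only the ESearch base URL and its `db`, `term`, and `retmode` parameters come from NCBI's public API.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode

# Real NCBI E-utilities search endpoint; everything else below is illustrative.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


@dataclass
class ToolCall:
    """Hypothetical structured API call (not BioTool's actual schema)."""
    tool: str
    params: dict = field(default_factory=dict)


@dataclass
class QueryCallPair:
    """Hypothetical training record: a user query and its reference call."""
    query: str
    call: ToolCall


def to_url(call: ToolCall) -> str:
    """Render a call as a request URL; only the assumed ESearch tool is handled."""
    if call.tool != "ncbi_esearch":
        raise ValueError(f"unsupported tool in this sketch: {call.tool}")
    # Sort parameters so the same call always yields the same URL.
    return f"{ESEARCH_URL}?{urlencode(sorted(call.params.items()))}"


def exact_match(predicted: ToolCall, reference: ToolCall) -> bool:
    """Assumed scoring rule: tool name and all parameters must be identical."""
    return predicted == reference


# Made-up example pair, not drawn from the dataset.
EXAMPLE = QueryCallPair(
    query="Find the NCBI Gene record for human BRCA1.",
    call=ToolCall(
        tool="ncbi_esearch",
        params={
            "db": "gene",
            "term": "BRCA1[Gene Name] AND human[Organism]",
            "retmode": "json",
        },
    ),
)

if __name__ == "__main__":
    print(EXAMPLE.query)
    print(to_url(EXAMPLE.call))
```

A fine-tuned tool caller would be expected to map the query string to the structured call; the URL is built locally here, so the sketch makes no network request.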

Related papers

Advancing AI Research Assistants with Expert-Involved Learning [84.30323604785646]
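Large language models (LLMs) and large multimodal models (LMMs) promise to accelerate biomedical discovery, but their reliability remains unclear. ARIEL (AI Research Assistant for Expert-in-the-Loop Learning) is an open-source evaluation and optimization framework. LMMs struggle with detailed visual reasoning, whereas state-of-the-art models produce fluent but incomplete summaries.
Reference translation of paper (metadata) (2025-05-03T14:21:48Z)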
Benchmarking Large Language Models on Multiple Tasks in Bioinformatics NLP with Prompting [17.973195066083797]
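Large language models (LLMs) have become important tools for solving biological problems. We introduce a comprehensive prompt-based benchmarking framework called Bio-benchmark. We evaluated six mainstream LLMs, including GPT-4o and Llama-3.1-70b, using 0-shot and few-shot Chain-of-Thought settings.
Reference translation of paper (metadata) (2025-03-06T02:01:59Z)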
MedBioLM: Optimizing Medical and Biological QA with Fine-Tuned Large Language Models and Retrieval-Augmented Generation [0.0]
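This paper introduces MedBioLM, a domain-adapted biomedical question-answering model. MedBioLM integrates fine-tuning and retrieval-augmented generation (RAG) to dynamically incorporate domain-specific knowledge. Fine-tuning significantly improves accuracy on benchmark datasets, while RAG enhances factual consistency.
Reference translation of paper (metadata) (2025-02-05T08:58:35Z)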
Biology-Instructions: A Dataset and Benchmark for Multi-Omics Sequence Understanding Capability of Large Language Models [55.74944165932666]
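This paper introduces Biology-Instructions, a large-scale training dataset for biological sequences. The dataset bridges large language models (LLMs) and complex biological sequence-related tasks, enhancing their versatility and reasoning. It also highlights the significant limitations of current LLMs on multi-omics tasks without specialized training.
Reference translation of paper (metadata) (2024-12-26T12:12:23Z)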
Augmenting Biomedical Named Entity Recognition with General-domain Resources [47.24727904076347]
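Training neural network-based biomedical named entity recognition (BioNER) models usually requires extensive and costly human annotation. GERBERA is a simple yet effective method that uses general-domain NER datasets for training. We systematically evaluated GERBERA on five datasets covering eight entity types, comprising 81,410 instances.
Reference translation of paper (metadata) (2024-06-15T15:28:02Z)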
BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers [48.21255861863282]
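BMRetriever is a series of dense retrievers for enhancing biomedical retrieval. BMRetriever shows strong parameter efficiency, with the 410M variant outperforming baselines up to 11.7 times larger.
Reference translation of paper (metadata) (2024-04-29T05:40:08Z)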
An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
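This study examines the performance of large language models (LLMs) on a wide range of bioinformatics tasks. These tasks include identifying potential coding regions, extracting named entities for genes and proteins, detecting antimicrobial and anticancer peptides, molecular optimization, and solving educational bioinformatics problems. Our results suggest that LLMs such as the GPT variants can handle many of these tasks well.
Reference translation of paper (metadata) (2024-02-21T11:27:31Z)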
A Comprehensive Evaluation of Large Language Models on Benchmark Biomedical Text Processing Tasks [2.5027382653219155]
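This paper aims to evaluate the performance of large language models (LLMs) on benchmark biomedical tasks. To the best of our knowledge, this is the first work to conduct an extensive evaluation and comparison of various LLMs in the biomedical domain.
Reference translation of paper (metadata) (2023-10-06T14:16:28Z)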

The related papers list is generated automatically from the titles and abstracts of papers on this site.

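This page shows information on the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).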