Fugu-MT Paper Translation (Abstract): WildSci: Advancing Scientific Reasoning from In-the-Wild Literature

Paper overview: WildSci: Advancing Scientific Reasoning from In-the-Wild Literature

arxiv url: http://arxiv.org/abs/2601.05567v1
Date: Fri, 09 Jan 2026 06:35:23 GMT
Status: Translation complete
Last updated in system: 2026-01-12 17:41:49.868267
Title: WildSci: Advancing Scientific Reasoning from In-the-Wild Literature
Title (reference translation): WildSci: Advancing Scientific Reasoning
Authors: Tengxiao Liu, Deepak Nathani, Zekun Li, Kevin Yang, William Yang Wang
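Abstract summary: We introduce WildSci, a new dataset of domain-specific science questions automatically synthesized from peer-reviewed literature. By framing complex scientific reasoning tasks in a multiple-choice format, we enable scalable training with well-defined reward signals. Experiments on a suite of scientific benchmarks demonstrate the effectiveness of our dataset and approach.
Reference score (independently calculated attention level): 50.16160754134139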
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent progress in large language model (LLM) reasoning has focused on domains like mathematics and coding, where abundant high-quality data and objective evaluation metrics are readily available. In contrast, progress in LLM reasoning models remains limited in scientific domains such as medicine and materials science due to limited dataset coverage and the inherent complexity of open-ended scientific questions. To address these challenges, we introduce WildSci, a new dataset of domain-specific science questions automatically synthesized from peer-reviewed literature, covering 9 scientific disciplines and 26 subdomains. By framing complex scientific reasoning tasks in a multiple-choice format, we enable scalable training with well-defined reward signals. We further apply reinforcement learning to finetune models on these data and analyze the resulting training dynamics, including domain-specific performance changes, response behaviors, and generalization trends. Experiments on a suite of scientific benchmarks demonstrate the effectiveness of our dataset and approach. We release WildSci to enable scalable and sustainable research in scientific reasoning, available at https://huggingface.co/datasets/JustinTX/WildSci.
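Abstract (reference translation): Recent advances in large language model (LLM) reasoning have focused on domains such as mathematics and coding, where abundant high-quality data and objective evaluation metrics are readily available. In contrast, progress of LLM reasoning models remains limited in scientific fields such as medicine and materials science, owing to limited dataset coverage and the inherent complexity of open-ended scientific questions. To address these challenges, we introduce WildSci, a dataset of domain-specific science questions automatically synthesized from peer-reviewed literature, covering 9 scientific disciplines and 26 subdomains. By framing complex scientific reasoning tasks in a multiple-choice format, we enable scalable training with well-defined reward signals. We further apply reinforcement learning to fine-tune models on these data and analyze the resulting training dynamics, including domain-specific performance changes, response behaviors, and generalization trends. Experiments on a suite of scientific benchmarks demonstrate the effectiveness of our dataset and approach. WildSci is released at https://huggingface.co/datasets/JustinTX/WildSci to enable scalable and sustainable research in scientific reasoning.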
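
The abstract's point that a multiple-choice format yields well-defined reward signals can be illustrated with a minimal sketch. This is not the authors' implementation: the "Answer: <letter>" response convention, the A-J option range, and the function names extract_choice and multiple_choice_reward are all assumptions made here for illustration.

```python
import re
from typing import Optional

# Assumption: the model ends its response with "Answer: <letter>".
# The source does not specify the answer format WildSci training uses.
_ANSWER_RE = re.compile(r"answer\s*:\s*\(?([A-J])\b\)?", re.IGNORECASE)


def extract_choice(response: str) -> Optional[str]:
    """Return the last answer letter stated in the response, or None."""
    matches = _ANSWER_RE.findall(response)
    return matches[-1].upper() if matches else None


def multiple_choice_reward(response: str, gold: str) -> float:
    """Binary reward: 1.0 if the extracted letter equals the gold letter."""
    choice = extract_choice(response)
    if choice is None:
        return 0.0  # an unparseable response earns no reward
    return 1.0 if choice == gold.strip().upper() else 0.0


if __name__ == "__main__":
    print(multiple_choice_reward("The band gap narrows, so... Answer: B", "B"))
```

Because the reward depends only on an exact letter match, it can be computed automatically for every sampled response, which is what makes this kind of signal suitable for reinforcement learning at scale.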

Related papers