Fugu-MT Paper Translation (Abstract): AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale

Paper overview: AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale

arxiv url: http://arxiv.org/abs/2505.08311v2
Date: Sun, 25 May 2025 07:57:14 GMT
Status: Translation complete
Last updated in system: 2025-05-27 14:32:54.761515
Title: AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale
Title (reference translation): AM-Thinking-v1: Advancing the Frontier of Reasoning at the 32B Scale
Authors: Yunjie Ji, Xiaoyu Tian, Sitong Zhao, Haotian Wang, Shuaiting Chen, Yiping Peng, Han Zhao, Xiangang Li
Abstract summary: AM-Thinking-v1 is a 32B dense language model that advances the frontier of reasoning. Outperforming DeepSeek-R1 and rivaling leading Mixture-of-Experts (MoE) models such as Qwen3-235B-A22B and Seed1.5-Thinking, AM-Thinking-v1 achieves 85.3 on AIME 2024, 74.4 on AIME 2025, and 70.3 on LiveCodeBench.
Reference score (attention score computed by this site): 16.441081996257576
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present AM-Thinking-v1, a 32B dense language model that advances the frontier of reasoning, embodying the collaborative spirit of open-source innovation. Outperforming DeepSeek-R1 and rivaling leading Mixture-of-Experts (MoE) models like Qwen3-235B-A22B and Seed1.5-Thinking, AM-Thinking-v1 achieves impressive scores of 85.3 on AIME 2024, 74.4 on AIME 2025, and 70.3 on LiveCodeBench, showcasing state-of-the-art mathematical and coding capabilities among open-source models of similar scale. Built entirely from the open-source Qwen2.5-32B base model and publicly available queries, AM-Thinking-v1 leverages a meticulously crafted post-training pipeline - combining supervised fine-tuning and reinforcement learning - to deliver exceptional reasoning capabilities. This work demonstrates that the open-source community can achieve high performance at the 32B scale, a practical sweet spot for deployment and fine-tuning. By striking a balance between top-tier performance and real-world usability, we hope AM-Thinking-v1 inspires further collaborative efforts to harness mid-scale models, pushing reasoning boundaries while keeping accessibility at the core of innovation. We have open-sourced our model on Hugging Face (https://huggingface.co/a-m-team/AM-Thinking-v1).
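Abstract (reference translation): We present AM-Thinking-v1, a 32B dense language model that embodies the collaborative spirit of open-source innovation. Outperforming DeepSeek-R1 and rivaling leading Mixture-of-Experts (MoE) models such as Qwen3-235B-A22B and Seed1.5-Thinking, AM-Thinking-v1 scores 85.3 on AIME 2024, 74.4 on AIME 2025, and 70.3 on LiveCodeBench, showing state-of-the-art mathematical and coding capabilities among open-source models of similar scale. AM-Thinking-v1 is built entirely from the open-source Qwen2.5-32B base model and publicly available queries. This work demonstrates that the open-source community can achieve high performance at the 32B scale. By balancing top-tier performance with real-world usability, we hope AM-Thinking-v1 inspires further collaborative efforts to harness mid-scale models, pushing reasoning boundaries while keeping accessibility at the core of innovation. We have open-sourced our model.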

Related papers

Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction [95.91743732150233]
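Goedel-Prover-V2, a series of open-source language models, sets a new state of the art in automated theorem proving. We generate synthetic tasks of increasing difficulty to train the model to master increasingly complex theorems. Goedel-Prover-V2-32B achieves 88.1% on MiniF2F at pass@32 in standard mode and 90.4% in self-correction mode.
Paper reference translation (metadata) (2025-08-05T16:28:22Z)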
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization [74.04867639197445]
MiroMind-M1 is a set of fully open-source RLMs built on the Qwen-2.5 backbone. Our models are trained in two stages: SFT on a carefully curated corpus of 719K math-reasoning problems with verified CoT trajectories, followed by RLVR on 62K challenging and verifiable problems.
Paper reference translation (metadata) (2025-07-19T16:21:23Z)
KAT-V1: Kwai-AutoThink Technical Report [50.84483585850113]
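Kwaipilot-AutoThink (KAT) is an open-source 40B large language model developed to address the overthinking problem in reasoning-intensive tasks. KAT dynamically switches between reasoning and non-reasoning modes based on task complexity. We also propose Step-SRPO, a reinforcement learning algorithm that incorporates intermediate supervision into the GRPO framework.
Paper reference translation (metadata) (2025-07-11T04:07:10Z)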
Skywork Open Reasoner 1 Technical Report [51.403686909760914]
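We present Skywork-OR1, an effective and scalable reinforcement learning (RL) implementation for long Chain-of-Thought (CoT) models. Building on the DeepSeek-R1-Distill model series, our RL approach achieves notable performance gains. Our Skywork-OR1-32B model surpasses both DeepSeek-R1 and Qwen3-32B on the AIME24 and AIME25 benchmarks.
Paper reference translation (metadata) (2025-05-28T12:56:04Z)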
Not All Correct Answers Are Equal: Why Your Distillation Source Matters [16.441081996257576]
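Distillation has emerged as a practical and effective approach to enhancing the reasoning capabilities of open-source language models. We collected outputs from three state-of-the-art teacher models, AM-Thinking-v1, Qwen3-235B-A22B, and DeepSeek-R1, on a shared corpus of 1.89 million queries. Student models trained on each dataset are evaluated on reasoning benchmarks including AIME2024, AIME2025, MATH500, and LiveCodeBench.
Paper reference translation (metadata) (2025-05-20T15:00:51Z)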
Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning [17.233735911531117]
We present Skywork R1V2, a next-generation multimodal reasoning model. At its core, R1V2 introduces a hybrid reinforcement learning paradigm.
Paper reference translation (metadata) (2025-04-23T12:24:10Z)
Embodied-R: Collaborative Framework for Activating Embodied Spatial Reasoning in Foundation Models via Reinforcement Learning [58.86928947970342]
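Embodied-R is a framework that combines large-scale vision-language models for perception with small-scale language models for reasoning. After training on only 5k embodied video samples, Embodied-R with a 3B LM matches state-of-the-art multimodal reasoning models. Embodied-R also exhibits emergent thinking patterns such as systematic analysis and contextual integration.
Paper reference translation (metadata) (2025-04-17T06:16:11Z)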
Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning [231.11339402237903]
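We introduce Seed1.5-Thinking, a model capable of reasoning through thinking before responding. Seed1.5-Thinking achieves 86.7 on AIME 2024, 55.0 on Codeforces, and 77.3 on GPQA, demonstrating strong reasoning abilities in STEM and coding.
Paper reference translation (metadata) (2025-04-10T17:10:51Z)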
START: Self-taught Reasoner with Tools [51.38785489790888]
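We introduce START (Self-Taught Reasoner with Tools), a tool-integrated long Chain-of-Thought (CoT) reasoning LLM. START can perform complex computations, self-checking, exploration of diverse methods, and self-debugging. It significantly outperforms the base QwQ-32B and achieves performance comparable to the state-of-the-art open-weight model R1-Distill-Qwen-32B.
Paper reference translation (metadata) (2025-03-06T17:11:51Z)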
Hymba: A Hybrid-head Architecture for Small Language Models [65.94140459055244]
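Hymba is a family of small language models featuring a hybrid-head parallel architecture. We introduce learnable meta tokens that are prepended to prompts and store critical information. The model is further optimized by incorporating cross-layer key-value (KV) sharing and partial sliding window attention.
Paper reference translation (metadata) (2024-11-20T19:51:25Z)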
The AI Community Building the Future? A Quantitative Analysis of Development Activity on Hugging Face Hub [2.595302141947391]
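We analyze development activity on the Hugging Face (HF) Hub. For example, over 70% of models have 0 downloads, while 1% account for 99% of downloads. The community has a core-periphery structure, with a core of prolific developers and a majority of isolated developers.
Paper reference translation (metadata) (2024-05-20T11:10:49Z)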

The related-papers list is generated automatically from the titles and abstracts of papers on this site.

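This page shows information about the specified paper.
The operator of this site does not guarantee the quality of the site (including all information and translations) and accepts no responsibility for any outcome arising from its use.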