Fugu-MT paper translation (abstract): Cutting AI Research Costs: How Task-Aware Compression Makes Large Language Model Agents Affordable

Paper abstract: Cutting AI Research Costs: How Task-Aware Compression Makes Large Language Model Agents Affordable

arxiv url: http://arxiv.org/abs/2601.05191v1
Date: Thu, 08 Jan 2026 18:13:46 GMT
Status: Translation complete
Last updated in system: 2026-01-09 17:01:53.328406
Title: Cutting AI Research Costs: How Task-Aware Compression Makes Large Language Model Agents Affordable
Title (reference translation): Reducing AI Research Costs: How Task-Aware Compression Makes Large Language Model Agents Affordable
Authors: Zuhair Ahmed Khan Taha, Mohammed Mudassir Uddin, Shahnawaz Alam
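Abstract summary: A single research session using a 70-billion-parameter model can cost around $127 in cloud fees. We developed AgentCompress to tackle this problem head-on. Our system uses a small neural network to gauge how hard each incoming task will be.
Reference score (independently computed attention score): 0.0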
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: When researchers deploy large language models for autonomous tasks like reviewing literature or generating hypotheses, the computational bills add up quickly. A single research session using a 70-billion-parameter model can cost around $127 in cloud fees, putting these tools out of reach for many academic labs. We developed AgentCompress to tackle this problem head-on. The core idea came from a simple observation during our own work: writing a novel hypothesis clearly demands more from the model than reformatting a bibliography. Why should both tasks run at full precision? Our system uses a small neural network to gauge how hard each incoming task will be, based only on its opening words, then routes it to a suitably compressed model variant. The decision happens in under a millisecond. Testing across 500 research workflows in four scientific fields, we cut compute costs by 68.3% while keeping 96.2% of the original success rate. For labs watching their budgets, this could mean the difference between running experiments and sitting on the sidelines.
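Abstract (reference translation): When researchers deploy large language models for autonomous tasks such as reviewing literature or generating hypotheses, the compute bills grow quickly. A single research session with a 70-billion-parameter model costs around $127 in cloud fees, which puts it out of reach for many academic labs. We developed AgentCompress to tackle this problem head-on. Writing a new hypothesis clearly demands more from the model than reformatting a bibliography, so why should both tasks run at full precision? Our system uses a small neural network to estimate how hard each incoming task is, based only on its opening words, and routes it to a suitably compressed model variant. The decision takes under a millisecond. Across 500 research workflows in four scientific fields, compute costs fell by 68.3% while 96.2% of the original success rate was retained. For labs watching their budgets, this could mean the difference between running experiments and sitting on the sidelines.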
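
Illustrative sketch (not the authors' implementation): the abstract describes scoring a task's difficulty from its opening words and routing it to a compressed model variant. The Python below shows that control flow only. Everything specific in it is an assumption made for illustration: the variant names (int4, int8, fp16), their relative costs, the thresholds, the 8-word prefix, and the hand-set keyword weights, which stand in for the small trained neural network mentioned in the abstract. The last function applies the reported 68.3% reduction to the roughly $127 session cost as simple arithmetic.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str             # hypothetical variant label
    relative_cost: float  # hypothetical cost relative to full precision


# (upper bound on difficulty score, variant), checked in order.
# Names, costs, and thresholds are assumptions, not values from the paper.
VARIANTS = [
    (0.35, Variant("int4", 0.25)),
    (0.70, Variant("int8", 0.50)),
    (1.01, Variant("fp16", 1.00)),
]

# Hand-set keyword weights standing in for a trained difficulty predictor.
WEIGHTS = {
    "hypothesis": 2.0, "novel": 1.5, "propose": 1.5, "design": 1.0,
    "reformat": -2.0, "bibliography": -1.5, "convert": -1.0, "sort": -1.0,
}
PREFIX_WORDS = 8  # only the opening words of the task are inspected


def difficulty(task: str) -> float:
    """Return a difficulty score in (0, 1) from the task's opening words."""
    words = task.lower().split()[:PREFIX_WORDS]
    z = sum(WEIGHTS.get(w.strip(".,:;!?"), 0.0) for w in words)
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing


def route(task: str) -> Variant:
    """Pick the cheapest variant whose threshold covers the score."""
    score = difficulty(task)
    for upper, variant in VARIANTS:
        if score < upper:
            return variant
    return VARIANTS[-1][1]


def reduced_cost(baseline_usd: float, reduction: float) -> float:
    """Cost remaining after a fractional reduction (0.683 means 68.3%)."""
    return baseline_usd * (1.0 - reduction)


if __name__ == "__main__":
    for task in (
        "Reformat this bibliography into APA style",
        "Summarize the attached paper",
        "Write a novel hypothesis about protein folding",
    ):
        print(f"{route(task).name:5s} <- {task}")
    # 127 * (1 - 0.683) = 40.259, i.e. about $40.26 per session
    print(round(reduced_cost(127.0, 0.683), 2))
```

With these made-up weights, the bibliography task goes to the most compressed variant and the hypothesis task to full precision. Treating the 68.3% average reduction as if it applied to a single $127 session gives about $40.26; that is a back-of-the-envelope reading, not a figure reported in the abstract.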
