Fugu-MT Paper Translation (Abstract): In-IDE Toolkit for Developers of AI-Based Features

Paper Summary: In-IDE Toolkit for Developers of AI-Based Features

arxiv url: http://arxiv.org/abs/2605.14612v1
Date: Thu, 14 May 2026 09:28:14 GMT
Status: Translation complete
Last updated in system: 2026-05-15 21:45:34.751443
Title: In-IDE Toolkit for Developers of AI-Based Features
Title (reference translation): IDE Toolkit for Developing AI-Based Features
Authors: Yaroslav Sokolov, Yury Khudyakov, Lenar Sharipov, Andrei Gasparian, Parth Tiwary, Artem Trofimov
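Abstract summary: We present the AI Toolkit plugin for JetBrains IDEs, which brings tracing and evaluation directly into the Run/Debug loop. We detail the design and implementation of the AI Agents Debugger and AI Evaluation, report initial adoption telemetry, and outline next steps to broaden framework coverage and scale evaluations.
Reference score (independently computed attention score): 0.24629531282150877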
License: http://creativecommons.org/licenses/by/4.0/
Abstract: AI-enabled features built on LLMs and agentic workflows are difficult to test, debug, and reproduce, especially for product-focused software engineers without a machine learning background. We present the AI Toolkit plugin for JetBrains IDEs, which brings tracing and evaluation directly into the Run/Debug loop. A mixed methods study with practitioners presents three consistent needs: (1) make evaluation regular and repeatable, (2) expose traces at the moment of execution, and (3) minimize setup and context switching. Guided by these needs, the AI Toolkit introduces an IDE-native workflow: run-triggered trace capture; immediate, hierarchical inspection; one-click "Add to Dataset" from traces; and unit-test-like evaluations with pluggable metrics. The first release in PyCharm shows promising early signals - strong conversion when promoted at Run, sustained usage among those who capture traces, and low churn - suggesting that IDE-native observability lowers activation energy and helps developers adopt disciplined practices. We detail the design and implementation of the AI Agents Debugger and AI Evaluation, report initial adoption telemetry, and outline next steps to broaden framework coverage and scale evaluations. Together, these results indicate that integrating AI observability and evaluation into everyday IDE workflows can make modern AI development accessible to non-ML specialists while preserving software-engineering practices.
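Abstract (reference translation): AI-enabled features built on LLMs and agentic workflows are hard to test, debug, and reproduce, especially for product-focused software engineers with no machine learning background. We introduce the AI Toolkit plugin for JetBrains IDEs, which brings tracing and evaluation directly into the Run/Debug loop. A mixed-methods study with practitioners identifies three consistent needs: (1) make evaluation regular and repeatable, (2) expose traces at execution time, and (3) minimize setup and context switching. Guided by these needs, the AI Toolkit introduces an IDE-native workflow: run-triggered trace capture; immediate, hierarchical inspection; one-click "Add to Dataset" from traces; and unit-test-like evaluations with pluggable metrics. The first release in PyCharm shows promising early signals (strong conversion when promoted at Run, sustained usage among those who capture traces, and low churn), suggesting that IDE-native observability lowers activation energy and helps developers adopt disciplined practices. We detail the design and implementation of the AI Agents Debugger and AI Evaluation, report initial adoption telemetry, and outline next steps to broaden framework coverage and scale evaluations. Together, these results suggest that integrating AI observability and evaluation into everyday IDE workflows can make modern AI development accessible to non-ML specialists while preserving software-engineering practices.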
