Fugu-MT Paper Translation (Abstract): SutureAgent: Learning Surgical Trajectories via Goal-conditioned Offline RL in Pixel Space

Paper overview: SutureAgent: Learning Surgical Trajectories via Goal-conditioned Offline RL in Pixel Space

arxiv url: http://arxiv.org/abs/2603.26720v1
Date: Thu, 19 Mar 2026 01:36:07 GMT
Status: Translation complete
Last updated in system: 2026-04-06 02:36:13.10581
Title: SutureAgent: Learning Surgical Trajectories via Goal-conditioned Offline RL in Pixel Space
Authors: Huanrong Liu, Chunlin Tian, Tongyu Jia, Tailai Zhou, Qin Liu, Yu Gao, Yutong Ban, Yun Gu, Guy Rosman, Xin Ma, Qingbiao Li
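Abstract summary: Predicting needle trajectories from endoscopic video is essential for robot-assisted suturing. The paper formulates image-based needle trajectory prediction as a sequential decision-making problem. SutureAgent encodes variable-length clips using an observation encoder to capture both local spatial cues and long-range temporal dynamics.
Reference score (independently computed attention score): 22.300952176139948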
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Predicting surgical needle trajectories from endoscopic video is critical for robot-assisted suturing, enabling anticipatory planning, real-time guidance, and safer motion execution. Existing methods that directly learn motion distributions from visual observations tend to overlook the sequential dependency among adjacent motion steps. Moreover, sparse waypoint annotations often fail to provide sufficient supervision, further increasing the difficulty of supervised or imitation learning methods. To address these challenges, we formulate image-based needle trajectory prediction as a sequential decision-making problem, in which the needle tip is treated as an agent that moves step by step in pixel space. This formulation naturally captures the continuity of needle motion and enables the explicit modeling of physically plausible pixel-wise state transitions over time. From this perspective, we propose SutureAgent, a goal-conditioned offline reinforcement learning framework that turns sparse annotations into dense reward signals via cubic spline interpolation, encouraging the policy to exploit limited expert guidance while exploring plausible future motion paths. SutureAgent encodes variable-length clips using an observation encoder to capture both local spatial cues and long-range temporal dynamics, and autoregressively predicts future waypoints through actions composed of discrete directions and continuous magnitudes. To enable stable offline policy optimization from expert demonstrations, we adopt Conservative Q-Learning with Behavioral Cloning regularization. Experiments on a new kidney wound suturing dataset containing 1,158 trajectories from 50 patients show that SutureAgent reduces Average Displacement Error by 58.6% compared with the strongest baseline, demonstrating the effectiveness of modeling needle trajectory prediction as pixel-level sequential action learning.
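
Illustrative sketch (not from the paper): the abstract mentions two mechanisms that are easy to picture in code, namely densifying sparse waypoint annotations with cubic spline interpolation and moving the needle tip with an action made of a discrete direction plus a continuous magnitude. The Python below is a minimal, standard-library-only sketch of both. Every concrete choice in it is an assumption made for illustration: the natural boundary condition for the spline, the eight-way direction set (`NUM_DIRECTIONS`), the negative-distance reward, and all function names (`natural_cubic_spline`, `densify`, `step`, `reward`). The paper's actual reward shaping and action space may differ.

```python
import math
from bisect import bisect_right

NUM_DIRECTIONS = 8  # assumption: the abstract does not state the number of directions


def natural_cubic_spline(ts, ys):
    """Return f(t) that interpolates the knots (ts, ys) with a natural cubic spline."""
    n = len(ts) - 1
    if n < 1:
        raise ValueError("need at least two knots")
    h = [ts[i + 1] - ts[i] for i in range(n)]
    # Solve the tridiagonal system for the second derivatives m[1..n-1]
    # (Thomas algorithm); the natural boundary fixes m[0] = m[n] = 0.
    m = [0.0] * (n + 1)
    c_prime = [0.0] * (n + 1)
    d_prime = [0.0] * (n + 1)
    for i in range(1, n):
        a, b, c = h[i - 1], 2.0 * (h[i - 1] + h[i]), h[i]
        d = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
        denom = b - a * c_prime[i - 1]
        c_prime[i] = c / denom
        d_prime[i] = (d - a * d_prime[i - 1]) / denom
    for i in range(n - 1, 0, -1):
        m[i] = d_prime[i] - c_prime[i] * m[i + 1]

    def f(t):
        # Locate the segment [ts[i], ts[i + 1]] that contains t.
        i = min(max(bisect_right(ts, t) - 1, 0), n - 1)
        left, right = ts[i + 1] - t, t - ts[i]
        return ((m[i] * left ** 3 + m[i + 1] * right ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * left
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * right)

    return f


def densify(frames, waypoints):
    """Interpolate sparse (x, y) waypoints at every integer frame in range."""
    fx = natural_cubic_spline(frames, [p[0] for p in waypoints])
    fy = natural_cubic_spline(frames, [p[1] for p in waypoints])
    return [(fx(t), fy(t)) for t in range(frames[0], frames[-1] + 1)]


def step(pos, direction, magnitude):
    """Move the needle tip: discrete direction index, continuous pixel magnitude."""
    angle = 2.0 * math.pi * direction / NUM_DIRECTIONS
    return (pos[0] + magnitude * math.cos(angle),
            pos[1] + magnitude * math.sin(angle))


def reward(pos, reference):
    """Hypothetical dense reward: negative pixel distance to the spline reference."""
    return -math.hypot(pos[0] - reference[0], pos[1] - reference[1])


# Three annotated frames become an 11-frame reference path.
dense_path = densify([0, 5, 10], [(0.0, 0.0), (50.0, 20.0), (100.0, 0.0)])
next_pos = step(dense_path[0], direction=0, magnitude=10.0)
r = reward(next_pos, dense_path[1])
```

With a per-frame reference path like `dense_path`, every predicted step can be scored, not only the few annotated frames, which is the intuition behind replacing sparse supervision with a dense reward signal.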
