Fugu-MT paper translation (abstract): TETO: Tracking Events with Teacher Observation for Motion Estimation and Frame Interpolation

Paper overview: TETO: Tracking Events with Teacher Observation for Motion Estimation and Frame Interpolation

arxiv url: http://arxiv.org/abs/2603.23487v1
Date: Tue, 24 Mar 2026 17:53:41 GMT
Status: translation complete
System update date: 2026-03-25 19:53:37.626773
Title: TETO: Tracking Events with Teacher Observation for Motion Estimation and Frame Interpolation
Title (reference translation): TETO: Tracking Events with Teacher Observation for Motion Estimation and Frame Interpolation
Authors: Jini Yang, Eunbeen Hong, Soowon Son, Hyunkoo Lee, Sunghwan Hong, Sunok Kim, Seungryong Kim
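Abstract summary: Event cameras capture per-pixel brightness changes with microsecond resolution, providing continuous motion information that is lost between RGB frames. This paper proposes TETO, which learns event motion estimation from only $\sim$25 minutes of unannotated real-world recordings through knowledge distillation from a pretrained RGB tracker. It achieves state-of-the-art point tracking on EVIMO2 and optical flow on DSEC using orders of magnitude less training data, and shows that accurate motion estimation translates directly to superior frame interpolation quality on BS-ERGB and HQ-EVFI.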
Reference score (independently computed attention score): 39.17414948577463
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Event cameras capture per-pixel brightness changes with microsecond resolution, offering continuous motion information lost between RGB frames. However, existing event-based motion estimators depend on large-scale synthetic data that often suffers from a significant sim-to-real gap. We propose TETO (Tracking Events with Teacher Observation), a teacher-student framework that learns event motion estimation from only $\sim$25 minutes of unannotated real-world recordings through knowledge distillation from a pretrained RGB tracker. Our motion-aware data curation and query sampling strategy maximizes learning from limited data by disentangling object motion from dominant ego-motion. The resulting estimator jointly predicts point trajectories and dense optical flow, which we leverage as explicit motion priors to condition a pretrained video diffusion transformer for frame interpolation. We achieve state-of-the-art point tracking on EVIMO2 and optical flow on DSEC using orders of magnitude less training data, and demonstrate that accurate motion estimation translates directly to superior frame interpolation quality on BS-ERGB and HQ-EVFI.
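Abstract (reference translation): Event cameras capture per-pixel brightness changes with microsecond resolution and provide the continuous motion information that is lost between RGB frames. However, existing event-based motion estimators rely on large-scale synthetic data, which often suffers from a significant sim-to-real gap. This paper proposes TETO (Tracking Events with Teacher Observation), a teacher-student framework that learns event motion estimation from only $\sim$25 minutes of unannotated real-world recordings through knowledge distillation from a pretrained RGB tracker. The motion-aware data curation and query sampling strategy separates object motion from the dominant ego-motion and maximizes learning from limited data. The resulting estimator jointly predicts point trajectories and dense optical flow, which are used as explicit motion priors to condition a pretrained video diffusion transformer for frame interpolation. TETO achieves state-of-the-art point tracking on EVIMO2 and optical flow on DSEC using orders of magnitude less training data, and shows that accurate motion estimation translates directly to superior frame interpolation quality on BS-ERGB and HQ-EVFI.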

Related papers