Fugu-MT Paper Translation (Abstract): YUV20K: A Complexity-Driven Benchmark and Trajectory-Aware Alignment Model for Video Camouflaged Object Detection

Paper overview: YUV20K: A Complexity-Driven Benchmark and Trajectory-Aware Alignment Model for Video Camouflaged Object Detection

arxiv url: http://arxiv.org/abs/2604.09985v1
Date: Sat, 11 Apr 2026 01:57:58 GMT
Status: Translation complete
Last updated in system: 2026-04-14 20:13:15.778312
Title: YUV20K: A Complexity-Driven Benchmark and Trajectory-Aware Alignment Model for Video Camouflaged Object Detection
Title (reference translation): YUV20K: A Complexity-Driven Benchmark and Trajectory-Aware Alignment Model for Video Camouflaged Object Detection
Authors: Yiyu Liu, Shuo Ye, Chao Hao, Zitong Yu
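Abstract summary: This paper presents YUV20K, a pixel-level annotated, complexity-driven VCOD benchmark. It proposes a novel framework featuring two key modules: Motion Feature Stabilization (MFS) and Trajectory-Aware Alignment (TAA). The framework shows superior cross-domain generalization and robustness when confronting complex spatiotemporal scenarios.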
Reference score (independently computed attention score): 30.226415717379066
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Video Camouflaged Object Detection (VCOD) is currently constrained by the scarcity of challenging benchmarks and the limited robustness of models against erratic motion dynamics. Existing methods often struggle with Motion-Induced Appearance Instability and Temporal Feature Misalignment caused by complex motion scenarios. To address the data bottleneck, we present YUV20K, a pixel-level annotated, complexity-driven VCOD benchmark. Comprising 24,295 annotated frames across 91 scenes and 47 species, it specifically targets challenging scenarios such as large-displacement motion, camera motion, and four other scenario types. On the methodological front, we propose a novel framework featuring two key modules: Motion Feature Stabilization (MFS) and Trajectory-Aware Alignment (TAA). The MFS module utilizes frame-agnostic Semantic Basis Primitives to stabilize features, while the TAA module leverages trajectory-guided deformable sampling to ensure precise temporal alignment. Extensive experiments demonstrate that our method significantly outperforms state-of-the-art competitors on existing datasets and establishes a new baseline on the challenging YUV20K. Notably, our framework exhibits superior cross-domain generalization and robustness when confronting complex spatiotemporal scenarios. Our code and dataset will be available at https://github.com/K1NSA/YUV20K
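Abstract (reference translation): Video Camouflaged Object Detection (VCOD) is currently constrained by a shortage of challenging benchmarks and by the limited robustness of models to erratic motion dynamics. Existing methods often struggle with motion-induced appearance instability and temporal feature misalignment arising from complex motion scenarios. To address the data bottleneck, we propose YUV20K, a pixel-level annotated, complexity-driven VCOD benchmark. It contains 24,295 annotated frames across 91 scenes and 47 species, and targets difficult scenarios such as large-displacement motion, camera motion, and four other scenario types. We propose a novel framework featuring two key modules: Motion Feature Stabilization (MFS) and Trajectory-Aware Alignment (TAA). The MFS module stabilizes features using frame-agnostic Semantic Basis Primitives, while the TAA module uses trajectory-guided deformable sampling to ensure precise temporal alignment. Extensive experiments show that the method significantly outperforms state-of-the-art competitors on existing datasets and establishes a new baseline on the challenging YUV20K. In particular, it shows superior cross-domain generalization and robustness in complex spatiotemporal scenarios. Our code and dataset will be released at https://github.com/K1NSA/YUV20K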
