Fugu-MT Paper Translation (Abstract): VDPP: Video Depth Post-Processing for Speed and Scalability

Paper overview: VDPP: Video Depth Post-Processing for Speed and Scalability

arxiv url: http://arxiv.org/abs/2604.06665v1
Date: Wed, 08 Apr 2026 04:33:29 GMT
Status: Translation complete
Last updated in system: 2026-04-09 17:30:51.336097
Title: VDPP: Video Depth Post-Processing for Speed and Scalability
Authors: Daewon Yoon, Injun Baek, Sangyu Han, Yearim Kim, Nojun Kwak
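Abstract summary: Video depth estimation is essential for providing 3D scene structure in applications ranging from autonomous driving to mixed reality. This paper proposes VDPP (Video Depth Post-Processing), which aims to improve the speed and accuracy of post-processing methods for video depth estimation. The results show that VDPP offers a good balance of speed, accuracy, and memory efficiency, making it the most practical solution for real-time edge deployment.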
Reference score (independently computed attention score): 25.51919473064388
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Video depth estimation is essential for providing 3D scene structure in applications ranging from autonomous driving to mixed reality. Although current end-to-end (E2E) video depth models have achieved state-of-the-art performance, they function as tightly coupled systems that suffer from a significant adaptation lag whenever superior single-image depth estimators are released. To mitigate this issue, post-processing methods such as NVDS offer a modular plug-and-play alternative to incorporate any evolving image depth model without retraining. However, existing post-processing methods still struggle to match the efficiency and practicality of E2E systems due to limited speed, accuracy, and RGB reliance. In this work, we revitalize the role of post-processing by proposing VDPP (Video Depth Post-Processing), a framework that improves the speed and accuracy of post-processing methods for video depth estimation. By shifting the paradigm from computationally expensive scene reconstruction to targeted geometric refinement, VDPP operates purely on geometric refinements in low-resolution space. This design achieves exceptional speed (>43.5 FPS on NVIDIA Jetson Orin Nano) while matching the temporal coherence of E2E systems, with dense residual learning driving geometric representations rather than full reconstructions. Furthermore, VDPP's RGB-free architecture ensures true scalability, enabling immediate integration with any evolving image depth model. Our results demonstrate that VDPP provides a superior balance of speed, accuracy, and memory efficiency, making it the most practical solution for real-time edge deployment. Our project page is at https://github.com/injun-baek/VDPP.
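
The abstract describes a depth-only (RGB-free) post-processor that computes a residual in low-resolution space and adds it to the per-frame depth. The following is a minimal sketch of that general idea only, not the authors' implementation. The names `downsample`, `upsample`, `refine_video`, and the parameters `k` and `alpha` are hypothetical, and an exponential moving average stands in for the learned refinement network, which the abstract does not specify.

```python
from typing import List, Optional

Frame = List[List[float]]  # one depth map, stored as rows of floats


def downsample(frame: Frame, k: int) -> Frame:
    """Average-pool a depth map by an integer factor k (sizes must divide by k)."""
    h, w = len(frame), len(frame[0])
    return [
        [
            sum(frame[y * k + dy][x * k + dx] for dy in range(k) for dx in range(k))
            / (k * k)
            for x in range(w // k)
        ]
        for y in range(h // k)
    ]


def upsample(frame: Frame, k: int) -> Frame:
    """Nearest-neighbour upsampling by an integer factor k."""
    h, w = len(frame) * k, len(frame[0]) * k
    return [[frame[y // k][x // k] for x in range(w)] for y in range(h)]


def refine_video(depths: List[Frame], k: int = 2, alpha: float = 0.5) -> List[Frame]:
    """Add a low-resolution temporal residual to each per-frame depth map.

    Assumption: the moving average below is a placeholder for a learned
    refinement model. Only depth maps are consumed; no RGB frames are used.
    """
    refined: List[Frame] = []
    state: Optional[Frame] = None  # low-resolution temporal state
    for depth in depths:
        low = downsample(depth, k)
        if state is None:
            state = low
        else:
            # Blend the previous state with the current low-resolution depth.
            state = [
                [alpha * s + (1 - alpha) * v for s, v in zip(s_row, v_row)]
                for s_row, v_row in zip(state, low)
            ]
        # Residual computed at low resolution, then lifted to full resolution.
        residual_low = [
            [s - v for s, v in zip(s_row, v_row)] for s_row, v_row in zip(state, low)
        ]
        residual = upsample(residual_low, k)
        refined.append(
            [[d + r for d, r in zip(d_row, r_row)] for d_row, r_row in zip(depth, residual)]
        )
    return refined
```

Because the function takes only a sequence of depth maps, the output of any single-image depth estimator could in principle be passed in unchanged, which is the plug-and-play property the abstract emphasizes.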


Related papers