Fugu-MT Paper Translation (Abstract): PPMStereo: Pick-and-Play Memory Construction for Consistent Dynamic Stereo Matching

Paper summary: PPMStereo: Pick-and-Play Memory Construction for Consistent Dynamic Stereo Matching

arxiv url: http://arxiv.org/abs/2510.20178v1
Date: Thu, 23 Oct 2025 03:52:39 GMT
Status: Translation complete
Last updated in system: 2025-10-25 03:08:17.23697
Title: PPMStereo: Pick-and-Play Memory Construction for Consistent Dynamic Stereo Matching
Title (reference translation): PPMStereo: Building a Pick-and-Play Memory for Consistent Dynamic Stereo Matching
Authors: Yun Wang, Junjie Hu, Qiaole Dong, Yongjian Zhang, Yanwei Fu, Tin Lun Lam, Dapeng Wu
Abstract summary: We propose a Pick-and-Play Memory (PPM) construction module for dynamic stereo matching, called PPMStereo.
Reference score (independently computed attention score): 51.98089287914147
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Temporally consistent depth estimation from stereo video is critical for real-world applications such as augmented reality, where inconsistent depth estimates disrupt user immersion. Despite its importance, the task remains challenging because long-term temporal consistency is difficult to model in a computationally efficient manner. Previous methods attempt to address this by aggregating spatio-temporal information but face a fundamental trade-off: limited temporal modeling provides only modest gains, whereas capturing long-range dependencies significantly increases computational cost. To address this limitation, we introduce a memory buffer that models long-range spatio-temporal consistency while keeping dynamic stereo matching efficient. Inspired by the two-stage decision-making process in humans, we propose a Pick-and-Play Memory (PPM) construction module for dynamic stereo matching, dubbed PPMStereo. PPM consists of a "pick" process that identifies the most relevant frames and a "play" process that weights the selected frames adaptively for spatio-temporal aggregation. This two-stage collaborative process maintains a compact yet highly informative memory buffer while achieving temporally consistent information aggregation. Extensive experiments validate the effectiveness of PPMStereo, demonstrating state-of-the-art performance in both accuracy and temporal consistency. Notably, PPMStereo achieves 0.62/1.11 TEPE on Sintel clean/final (17.3% and 9.02% improvements over BiDAStereo) at lower computational cost. Code is available at https://github.com/cocowy1/PPMStereo.
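Abstract (reference translation): Temporally consistent depth estimation from stereo video is important for real-world applications such as augmented reality, where inconsistent depth estimates disrupt user immersion. Despite its importance, the task remains challenging because long-term temporal consistency is difficult to model in a computationally efficient way. Previous methods try to address this by aggregating spatio-temporal information, but they face a fundamental trade-off. To address this limitation, we introduce a memory buffer that models long-range spatio-temporal consistency while achieving fast dynamic stereo matching. Inspired by the two-stage decision-making process in humans, we propose a Pick-and-Play Memory (PPM) construction module for dynamic stereo matching, which we call PPMStereo. PPM consists of a "pick" process that identifies the most relevant frames and a "play" process that adaptively weights the selected frames for spatio-temporal aggregation. This two-stage collaborative process maintains a compact, highly informative memory buffer while achieving temporally consistent information aggregation. Extensive experiments validate the effectiveness of PPMStereo and show state-of-the-art performance in both accuracy and temporal consistency. PPMStereo achieves 0.62/1.11 TEPE on Sintel clean/final (17.3% and 9.02%) at lower computational cost. Code is available at https://github.com/cocowy1/PPMStereo.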
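
The two-stage idea in the abstract (pick the most relevant frames, then weight them adaptively) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how relevance is scored or how the weights are computed, so cosine similarity, top-k selection, and softmax weighting are assumptions made here purely for illustration. The function names `pick` and `play`, the `temperature` parameter, and the use of one small vector per frame are likewise hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick(query, memory, k):
    """Assumed 'pick' step: keep the k memory frames most similar to the query.

    Returns a list of (frame_index, similarity) pairs, best first.
    """
    scored = [(i, cosine(query, feat)) for i, feat in enumerate(memory)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def play(picked, memory, temperature=0.1):
    """Assumed 'play' step: softmax-weight the picked frames and average them.

    Returns (weights, fused_feature).
    """
    best = max(score for _, score in picked)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp((score - best) / temperature) for _, score in picked]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(memory[picked[0][0]])
    fused = [
        sum(w * memory[i][d] for w, (i, _) in zip(weights, picked))
        for d in range(dim)
    ]
    return weights, fused


# Toy data: one 2-D feature per past frame (hypothetical values).
query = [1.0, 0.0]
memory = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]

picked = pick(query, memory, k=2)
weights, fused = play(picked, memory)
print([i for i, _ in picked], weights, fused)
```

In this sketch the memory stays compact because only `k` frames survive the pick step, and the play step lets the more relevant of those frames contribute more to the fused feature. A real dynamic stereo model would operate on dense feature maps rather than single vectors.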

Related papers