Fugu-MT Paper Translation (Abstract): Zero-Shot Depth from Defocus

Paper overview: Zero-Shot Depth from Defocus

arxiv url: http://arxiv.org/abs/2603.26658v1
Date: Fri, 27 Mar 2026 17:56:26 GMT
Status: Translation complete
Last updated in system: 2026-03-30 21:49:48.631549
Title: Zero-Shot Depth from Defocus
Authors: Yiming Zuo, Hongyu Wen, Venkat Subramanian, Patrick Chen, Karhan Kayan, Mario Bijelic, Felix Heide, Jia Deng
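Abstract summary: Depth from Defocus (DfD) is the task of estimating a dense metric depth map from a focus stack. This paper focuses on the challenging and practical setting of zero-shot generalization. It first proposes ZEDD, a new real-world DfD benchmark that contains 8.3x more scenes and significantly higher-quality images and ground-truth depth maps than previous benchmarks.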
Reference score (independently computed attention score): 43.62160290661175
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Depth from Defocus (DfD) is the task of estimating a dense metric depth map from a focus stack. Unlike previous works overfitting to a certain dataset, this paper focuses on the challenging and practical setting of zero-shot generalization. We first propose a new real-world DfD benchmark ZEDD, which contains 8.3x more scenes and significantly higher quality images and ground-truth depth maps compared to previous benchmarks. We also design a novel network architecture named FOSSA. FOSSA is a Transformer-based architecture with novel designs tailored to the DfD task. The key contribution is a stack attention layer with a focus distance embedding, allowing efficient information exchange across the focus stack. Finally, we develop a new training data pipeline allowing us to utilize existing large-scale RGBD datasets to generate synthetic focus stacks. Experiment results on ZEDD and other benchmarks show a significant improvement over the baselines, reducing errors by up to 55.7%. The ZEDD benchmark is released at https://zedd.cs.princeton.edu. The code and checkpoints are released at https://github.com/princeton-vl/FOSSA.
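
The abstract does not spell out why a focus stack carries metric depth, so the following is a minimal sketch of the standard thin-lens circle-of-confusion model. It is not the paper's data pipeline or the FOSSA architecture. The function names (`coc_diameter_px`, `blur_maps`, `sharpest_focus`) and the lens parameters (50 mm focal length, f/2, 5 µm pixel pitch) are assumptions chosen for illustration only.

```python
def coc_diameter_px(depth_m, focus_m, focal_len_m=0.05, f_number=2.0,
                    pixel_pitch_m=5e-6):
    """Thin-lens circle-of-confusion diameter, in pixels, for one scene point.

    All lens parameters are illustrative defaults, not values from the paper.
    """
    if focus_m <= focal_len_m:
        raise ValueError("focus distance must exceed the focal length")
    aperture_m = focal_len_m / f_number  # aperture diameter
    coc_m = (aperture_m * focal_len_m * abs(depth_m - focus_m)
             / (depth_m * (focus_m - focal_len_m)))
    return coc_m / pixel_pitch_m


def blur_maps(depth_map, focus_distances):
    """Return one per-pixel blur-size map for each focus distance."""
    return [[[coc_diameter_px(d, s) for d in row] for row in depth_map]
            for s in focus_distances]


def sharpest_focus(maps, focus_distances):
    """Naive per-pixel readout: the focus distance with the smallest blur."""
    rows, cols = len(maps[0]), len(maps[0][0])
    result = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Pair each slice's blur with its focus distance; min picks the
            # pair with the smallest blur.
            blur, focus = min(zip((m[r][c] for m in maps), focus_distances))
            out_row.append(focus)
        result.append(out_row)
    return result


# Toy 2x2 depth map in metres, with one focus slice per depth value.
depth_map = [[0.5, 1.0], [2.0, 4.0]]
focus_distances = [0.5, 1.0, 2.0, 4.0]
maps = blur_maps(depth_map, focus_distances)
recovered = sharpest_focus(maps, focus_distances)
```

In this toy setup every pixel's depth coincides with one of the focus distances, so the readout recovers the depth map exactly. With real images the blur has to be inferred from appearance rather than read off, which is the estimation problem a learned DfD model addresses.
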

Related papers