Fugu-MT Paper Translation (Abstract): DART: Depth-Enhanced Accurate and Real-Time Background Matting

Paper overview: DART: Depth-Enhanced Accurate and Real-Time Background Matting

arxiv url: http://arxiv.org/abs/2402.15820v1
Date: Sat, 24 Feb 2024 14:10:17 GMT
Status: Translation complete
Last updated in system: 2024-02-27 16:55:06.236435
Title: DART: Depth-Enhanced Accurate and Real-Time Background Matting
Title (reference translation): DART: Depth-Enhanced Accurate and Real-Time Background Matting
Authors: Hanxi Li, Guofeng Li, Bo Li, Lin Wu and Yan Cheng
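Abstract summary: Matting with a static background, often referred to as background matting (BGM), has attracted significant attention within the computer vision community. We leverage the rich depth information provided by RGB-D cameras to improve real-time background matting performance.
Reference score (independently computed attention score): 11.78381754863757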
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Matting with a static background, often referred to as "Background Matting" (BGM), has garnered significant attention within the computer vision community due to its pivotal role in practical applications such as webcasting and photo editing. Nevertheless, achieving highly accurate background matting remains a formidable challenge, primarily owing to the limitations inherent in conventional RGB images. These limitations manifest as susceptibility to varying lighting conditions and unforeseen shadows. In this paper, we leverage the rich depth information provided by RGB-Depth (RGB-D) cameras to enhance background matting performance in real time, with a method dubbed DART. First, we adapt the original RGB-based BGM algorithm to incorporate depth information. The resulting model's output undergoes refinement through Bayesian inference, incorporating a background depth prior. The posterior prediction is then translated into a "trimap," which is subsequently fed into a state-of-the-art matting algorithm to generate more precise alpha mattes. To ensure real-time matting capabilities, a critical requirement for many real-world applications, we distill the backbone of our model from a larger and more versatile BGM network. Our experiments demonstrate the superior performance of the proposed method. Moreover, thanks to the distillation operation, our method achieves a remarkable processing speed of 33 frames per second (fps) on a mid-range edge-computing device. This high efficiency underscores DART's immense potential for deployment in mobile applications.
Abstract (reference translation): Matting with a static background, often referred to as "Background Matting" (BGM), has garnered significant attention within the computer vision community due to its pivotal role in practical applications such as webcasting and photo editing. Nevertheless, achieving highly accurate background matting remains a formidable challenge, primarily owing to the limitations inherent in conventional RGB images. These limitations manifest as susceptibility to varying lighting conditions and unforeseen shadows. In this paper, we leverage the rich depth information provided by RGB-Depth (RGB-D) cameras to enhance background matting performance in real time, with a method dubbed DART. First, we adapt the original RGB-based BGM algorithm to incorporate depth information. The resulting model's output undergoes refinement through Bayesian inference, incorporating a background depth prior. The posterior prediction is then translated into a "trimap," which is subsequently fed into a state-of-the-art matting algorithm to generate more precise alpha mattes. To ensure real-time matting capability, an important requirement for many real-world applications, we distill the backbone of our model from a larger and more versatile BGM network. Our experiments demonstrate the superior performance of the proposed method. Moreover, thanks to the distillation operation, the method achieves a notable processing speed of 33 frames per second (fps) on a mid-range edge-computing device. This high efficiency underscores DART's great potential for deployment in mobile applications.
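
The refinement step described in the abstract (a per-pixel network output updated by Bayesian inference with a background depth prior, then converted into a trimap) can be illustrated with a minimal sketch. This is not the authors' implementation. The Gaussian background-depth likelihood, the uniform foreground likelihood, the noise level `sigma`, the sensor range `depth_range`, the thresholds `lo` and `hi`, and all function names are assumptions chosen purely for illustration.

```python
import math


def foreground_posterior(p_fg, depth, bg_depth, sigma=0.05, depth_range=(0.3, 5.0)):
    """Posterior P(foreground | observed depth) for one pixel via Bayes' rule.

    p_fg     : foreground probability from the matting network (used as the prior)
    depth    : depth observed at this pixel, in meters
    bg_depth : depth of the pre-recorded static background at this pixel
    All modeling choices below are illustrative assumptions.
    """
    # Assumed background likelihood: the observed depth is the recorded
    # background depth plus Gaussian sensor noise.
    z = (depth - bg_depth) / sigma
    like_bg = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    # Assumed foreground likelihood: a foreground object may sit anywhere
    # in the sensor's working range, so use a uniform density.
    like_fg = 1.0 / (depth_range[1] - depth_range[0])
    numerator = p_fg * like_fg
    evidence = numerator + (1.0 - p_fg) * like_bg
    # Fall back to the prior if both terms vanish numerically.
    return numerator / evidence if evidence > 0.0 else p_fg


def to_trimap(posterior, lo=0.1, hi=0.9):
    """Map a posterior to a trimap label: 255 foreground, 0 background, 128 unknown."""
    if posterior >= hi:
        return 255
    if posterior <= lo:
        return 0
    return 128


def trimap_from_maps(p_map, depth_map, bg_depth_map):
    """Apply the per-pixel update to 2D maps given as nested lists."""
    return [
        [to_trimap(foreground_posterior(p, d, b)) for p, d, b in zip(p_row, d_row, b_row)]
        for p_row, d_row, b_row in zip(p_map, depth_map, bg_depth_map)
    ]


if __name__ == "__main__":
    # Three pixels with an uninformative network output (0.5): one at the
    # background depth, one far in front of it, one slightly in front of it.
    print(trimap_from_maps([[0.5, 0.5, 0.5]], [[3.0, 1.0, 2.87]], [[3.0, 3.0, 3.0]]))
```

In this sketch the "unknown" band (label 128) is the region that a downstream matting algorithm would be expected to resolve into fractional alpha values; widening or narrowing `lo` and `hi` trades trimap certainty against the amount of work left to that algorithm.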

Related papers

TransDiff: Diffusion-Based Method for Manipulating Transparent Objects Using a Single RGB-D Image [9.242427101416226]
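To achieve material-agnostic object grasping on a desktop, we propose TransDiff, a single-view RGB-D-based depth completion framework. We use features extracted from RGB images (such as segmentation, edge maps, and normal maps) as conditions for the depth map generation process. The method learns an iterative denoising process that transforms a random depth distribution into a depth map, guided by initially refined depth information.
Paper reference translation (metadata) (2025-03-17T03:29:37Z)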
Depth-based Privileged Information for Boosting 3D Human Pose Estimation on RGB [48.31210455404533]
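Heatmap-based 3D pose estimators can hallucinate depth information from the RGB frames given at inference time. Depth information is used only during training, by forcing an RGB-based hallucination network to learn features similar to those of a backbone pre-trained only on depth data.
Paper reference translation (metadata) (2024-09-17T11:59:34Z)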
Scene Prior Filtering for Depth Super-Resolution [97.30137398361823]
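We introduce the Scene Prior Filtering Network (SPFNet), which mitigates texture interference and edge inaccuracy. SPFNet is evaluated extensively on both real and synthetic data and achieves state-of-the-art performance.
Paper reference translation (metadata) (2024-02-21T15:35:59Z)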
AGG-Net: Attention Guided Gated-convolutional Network for Depth Image Completion [1.8820731605557168]
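We propose a new model for depth image completion based on the Attention Guided Gated-convolutional Network (AGG-Net). In the encoding stage, an AG-GConv module is proposed to fuse depth and color features at different scales. In the decoding stage, an Attention Guided Skip Connection (AG-SC) module is presented to avoid introducing too many depth-irrelevant features into the reconstruction.
Paper reference translation (metadata) (2023-09-04T14:16:08Z)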
Symmetric Uncertainty-Aware Feature Transmission for Depth Super-Resolution [52.582632746409665]
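We propose Symmetric Uncertainty-aware Feature Transmission (SUFT) for color-guided depth super-resolution (DSR). The method achieves superior performance compared with state-of-the-art methods.
Paper reference translation (metadata) (2023-06-01T06:35:59Z)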
Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized Photography [54.36608424943729]
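In a "long burst" of 12-megapixel RAW frames captured over two seconds, parallax information from natural hand tremor alone is shown to be sufficient to recover high-quality scene depth. We devise a test-time optimization approach that fits a neural RGB-D representation to long-burst data and simultaneously estimates scene depth and camera motion.
Paper reference translation (metadata) (2022-12-22T18:54:34Z)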
Consistent Depth Prediction under Various Illuminations using Dilated Cross Attention [1.332560004325655]
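We propose using internet 3D indoor scenes, with manually adjusted illumination, to render photo-realistic RGB images and their corresponding depth and BRDF maps. To keep depth predictions consistent under different illumination conditions, we apply cross attention to these dilated features. The method is evaluated against state-of-the-art methods on the Vari dataset and shows significant improvement in experiments.
Paper reference translation (metadata) (2021-12-15T10:02:46Z)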
Wild ToFu: Improving Range and Quality of Indirect Time-of-Flight Depth with RGB Fusion in Challenging Environments [56.306567220448684]
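This paper proposes a learning-based end-to-end depth prediction network that takes noisy raw I-ToF signals and RGB images as input. The final depth maps show an RMSE improvement of more than 40% over the baseline approach.
Paper reference translation (metadata) (2021-12-07T15:04:14Z)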
Real-Time High-Resolution Background Matting [19.140664310700107]
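We introduce a real-time, high-resolution background replacement technique that runs at 30 fps at 4K resolution and 60 fps at HD on a modern GPU. Compared with previous background matting techniques, the method yields higher quality while delivering dramatic gains in both speed and resolution.
Paper reference translation (metadata) (2020-12-14T18:43:32Z)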
A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection [89.88222217065858]
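We design a single-stream network that uses the depth map to guide early fusion and middle fusion between RGB and depth. The model is 55.5% lighter than the current lightest model and processes 384×384 images at a real-time speed of 32 FPS.
Paper reference translation (metadata) (2020-07-14T04:40:14Z)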

The related papers list is generated automatically from the titles and abstracts of papers on this site.

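This page shows information about the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from its use.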