Fugu-MT paper translation (abstract): Two-View Accumulation as the Primary Training Lever for Hybrid-Capture Gaussian Splatting: A Variance-Decomposition View of When Gradient Surgery Helps

Paper summary: Two-View Accumulation as the Primary Training Lever for Hybrid-Capture Gaussian Splatting: A Variance-Decomposition View of When Gradient Surgery Helps

arxiv url: http://arxiv.org/abs/2605.00052v1
Date: Wed, 29 Apr 2026 17:45:51 GMT
Status: Translation complete
Last updated in system: 2026-05-04 17:43:28.653886
Title: Two-View Accumulation as the Primary Training Lever for Hybrid-Capture Gaussian Splatting: A Variance-Decomposition View of When Gradient Surgery Helps
Authors: Sungjun Cho
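Abstract summary: Hybrid-capture novel view synthesis combines substantially different camera views. Standard 3DGS, trained for 30K iterations with one rendered view per optimizer step, under-fits the minority regime. The paper finds that rendering two views per step closes this gap and proposes a variance-decomposition framework that predicts and explains the finding.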
Reference score (independently computed attention score): 7.6889618752994595
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Hybrid-capture novel view synthesis combines images at substantially different camera distances (e.g., aerial drone and ground-level views). Standard 3D Gaussian Splatting (3DGS), trained for 30K iterations with one rendered view per optimizer step, under-fits the minority regime by 1-3 dB on five hybrid-capture benchmarks. We isolate the lever that closes this gap. Among compute-matched alternatives -- vanilla 60K iterations, magnitude corrections (GradNorm), direction-aware near/far gradient surgery, projective preconditioning, confidence-gated sample-level surgery, and a random two-view-per-step control -- the simplest structural change wins: rendering two views per optimizer step. The pairing rule (geometry-defined near/far, random, or active loss-disparity) does not change PSNR beyond seed variance on any of the five scenes; the structural change of having two views per step does. We propose a variance-decomposition framework that predicts and explains this finding: under bimodal camera regimes, between-regime gradient variance turns out to be small relative to within-regime variance in 3DGS, so structured and random pairings are variance-equivalent in expectation, and the variance halving from two-view accumulation itself is the dominant effect. We verify the framework on five scenes whose camera-altitude bimodality coefficients span [0.55, 1.00], and we report the negative result that direction-aware projection, magnitude correction, confidence gating, and an active loss-disparity pairing all fall within seed variance of random two-view pairing. The two-view structural lever transfers cleanly to the Scaffold-GS and Pixel-GS backbones. We position this work as an honest characterization of which training-side axes do and do not move PSNR for hybrid-capture 3DGS, together with the framework that explains why.
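
Illustration (not from the paper): the variance argument in the abstract can be sketched with a toy scalar model. Suppose each per-view gradient is drawn from one of two equally likely regimes (near, far) with within-regime variance W and between-regime variance B. By the law of total variance, one view per step has variance W + B, a random pair averaged over two views has (W + B) / 2, and a structured near/far pair has W / 2. When B is small relative to W, the two pairings are nearly indistinguishable and the halving dominates. The regime means, the noise level, the balanced 50/50 regime split, and all function names below are assumptions chosen for illustration, not values or code reported by the authors.

```python
import random
import statistics


def sample_grad(rng, regime, mu, sigma):
    """Draw one scalar toy 'per-view gradient' from the given regime."""
    return rng.gauss(mu[regime], sigma)


def simulate(n_steps=100_000, mu_near=0.1, mu_far=-0.1, sigma=1.0, seed=0):
    """Estimate the gradient-estimator variance of three sampling schemes.

    All parameters are hypothetical. With the defaults, the within-regime
    variance is W = sigma**2 = 1.0 and the between-regime variance is
    B = 0.25 * (mu_near - mu_far)**2 = 0.01, so B is small relative to W.
    """
    rng = random.Random(seed)
    mu = {"near": mu_near, "far": mu_far}
    regimes = ("near", "far")
    single, random_pair, structured_pair = [], [], []

    for _ in range(n_steps):
        # One view per step, regime chosen uniformly at random.
        single.append(sample_grad(rng, rng.choice(regimes), mu, sigma))

        # Two views per step, regimes drawn independently (random pairing).
        a = sample_grad(rng, rng.choice(regimes), mu, sigma)
        b = sample_grad(rng, rng.choice(regimes), mu, sigma)
        random_pair.append(0.5 * (a + b))

        # Two views per step, always one near and one far (structured pairing).
        n = sample_grad(rng, "near", mu, sigma)
        f = sample_grad(rng, "far", mu, sigma)
        structured_pair.append(0.5 * (n + f))

    return {
        "single": statistics.pvariance(single),
        "random_pair": statistics.pvariance(random_pair),
        "structured_pair": statistics.pvariance(structured_pair),
    }


if __name__ == "__main__":
    for name, var in simulate().items():
        print(f"{name:16s} variance ~ {var:.4f}")
```

In this toy setting the gap between random and structured pairing is B / 2, which only matters when the regime means differ by much more than the within-regime noise; raising `mu_near - mu_far` relative to `sigma` shows where structured pairing would start to help.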

Related papers