Fugu-MT Paper Translation (Abstract): DiffCamera: Arbitrary Refocusing on Images

Paper overview: DiffCamera: Arbitrary Refocusing on Images

arxiv url: http://arxiv.org/abs/2509.26599v1
Date: Tue, 30 Sep 2025 17:48:23 GMT
Status: Translation complete
Last updated in system: 2025-10-01 14:45:00.235039
Title: DiffCamera: Arbitrary Refocusing on Images
Authors: Yiyang Wang, Xi Chen, Xiaogang Xu, Yu Liu, Hengshuang Zhao
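Abstract summary: DiffCamera is a model that enables flexible refocusing of a created image, conditioned on an arbitrary new focus point and blur level. It supports stable refocusing across a wide range of scenes and provides unprecedented control over DoF adjustments for photography and generative AI applications.
Reference score (independently computed attention score): 55.948229011478304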
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The depth-of-field (DoF) effect, which introduces aesthetically pleasing blur, enhances photographic quality but is fixed and difficult to modify once the image has been created. This becomes problematic when the applied blur is undesirable (e.g., the subject is out of focus). To address this, we propose DiffCamera, a model that enables flexible refocusing of a created image conditioned on an arbitrary new focus point and a blur level. Specifically, we design a diffusion transformer framework for refocusing learning. However, the training requires pairs of data with different focus planes and bokeh levels in the same scene, which are hard to acquire. To overcome this limitation, we develop a simulation-based pipeline to generate large-scale image pairs with varying focus planes and bokeh levels. With the simulated data, we find that training with only a vanilla diffusion objective often leads to incorrect DoF behaviors due to the complexity of the task. This requires a stronger constraint during training. Inspired by the photographic principle that photos of different focus planes can be linearly blended into a multi-focus image, we propose a stacking constraint during training to enforce precise DoF manipulation. This constraint enhances model training by imposing physically grounded refocusing behavior that the focusing results should be faithfully aligned with the scene structure and the camera conditions so that they can be combined into the correct multi-focus image. We also construct a benchmark to evaluate the effectiveness of our refocusing model. Extensive experiments demonstrate that DiffCamera supports stable refocusing across a wide range of scenes, providing unprecedented control over DoF adjustments for photography and generative AI applications.
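Illustrative note: the stacking principle mentioned in the abstract (photos focused at different planes can be linearly blended into a multi-focus image) can be sketched in a few lines of NumPy. The abstract does not give the exact formulation, so everything below is an assumption made for illustration only: the function names `stack_blend` and `stacking_loss`, the use of per-pixel blending weights, and the mean-squared-error penalty are hypothetical and are not taken from the paper.

```python
import numpy as np


def stack_blend(focal_stack, weights):
    """Linearly blend K images into one multi-focus image.

    focal_stack: array of shape (K, H, W, C), one image per focus plane.
    weights:     array of shape (K, H, W), non-negative per-pixel weights
                 (assumed to have a positive sum at every pixel).
    """
    # Normalise so the K weights at each pixel sum to 1.
    weights = weights / weights.sum(axis=0, keepdims=True)
    # Weighted sum over the focus-plane axis.
    return (weights[..., None] * focal_stack).sum(axis=0)


def stacking_loss(predicted_stack, weights, all_in_focus):
    """Hypothetical stacking penalty: MSE between the blended
    predictions and a sharp (all-in-focus) reference image."""
    blended = stack_blend(predicted_stack, weights)
    return float(np.mean((blended - all_in_focus) ** 2))


# Toy scene: the left half is "near", the right half is "far".
rng = np.random.default_rng(0)
sharp = rng.random((4, 4, 3))
blurred = np.full_like(sharp, sharp.mean())  # crude stand-in for defocus blur

near_mask = np.zeros((4, 4))
near_mask[:, :2] = 1.0
far_mask = 1.0 - near_mask

# Each refocused image is sharp only inside its own focus region.
focus_near = np.where(near_mask[..., None] == 1.0, sharp, blurred)
focus_far = np.where(far_mask[..., None] == 1.0, sharp, blurred)

stack = np.stack([focus_near, focus_far])
masks = np.stack([near_mask, far_mask])

consistent = stacking_loss(stack, masks, sharp)         # predictions match the scene
inconsistent = stacking_loss(stack[::-1], masks, sharp)  # focus planes swapped
print(consistent, inconsistent)
```

In this toy setup, refocused images that agree with the scene layout blend back into the sharp reference, while swapping the two focus planes leaves a nonzero residual. That is the kind of signal a stacking-style constraint could provide during training, under the assumptions stated above.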
