Fugu-MT Paper Translation (Abstract): SmartPhotoCrafter: Unified Reasoning, Generation and Optimization for Automatic Photographic Image Editing

Paper overview: SmartPhotoCrafter: Unified Reasoning, Generation and Optimization for Automatic Photographic Image Editing

arxiv url: http://arxiv.org/abs/2604.19587v1
Date: Tue, 21 Apr 2026 15:38:49 GMT
Status: Translation complete
Last updated in system: 2026-04-22 22:41:49.849163
Title: SmartPhotoCrafter: Unified Reasoning, Generation and Optimization for Automatic Photographic Image Editing
Title (reference translation): SmartPhotoCrafter: Unified Reasoning, Generation, and Optimization for Automatic Photo Editing
Authors: Ying Zeng, Miaosen Luo, Guangyuan Li, Yang Yang, Ruiyang Fan, Linxiao Shi, Qirui Yang, Jian Zhang, Chengcheng Liu, Siming Zheng, Jinwei Chen, Bo Li, Peng-Tao Jiang
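Abstract summary: This paper proposes SmartPhotoCrafter, an automatic photographic image editing method that formulates image editing as a tightly coupled reasoning-to-generation process. Experiments show that SmartPhotoCrafter outperforms existing generative models on the task of automatic photographic enhancement.
Reference score (independently computed attention measure): 29.50529252612344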
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Traditional photographic image editing typically requires users to possess sufficient aesthetic understanding to provide appropriate instructions for adjusting image quality and camera parameters. However, this paradigm relies on explicit human instruction of aesthetic intent, which is often ambiguous, incomplete, or inaccessible to non-expert users. In this work, we propose SmartPhotoCrafter, an automatic photographic image editing method that formulates image editing as a tightly coupled reasoning-to-generation process. The proposed model first performs image quality comprehension and identifies deficiencies by the Image Critic module, and then the Photographic Artist module realizes targeted edits to enhance image appeal, eliminating the need for explicit human instructions. A multi-stage training pipeline is adopted: (i) Foundation pretraining to establish basic aesthetic understanding and editing capabilities, (ii) Adaptation with reasoning-guided multi-edit supervision to incorporate rich semantic guidance, and (iii) Coordinated reasoning-to-generation reinforcement learning to jointly optimize reasoning and generation. During training, SmartPhotoCrafter emphasizes photo-realistic image generation, while supporting both image restoration and retouching tasks with consistent adherence to color- and tone-related semantics. We also construct a stage-specific dataset, which progressively builds reasoning and controllable generation, effective cross-module collaboration, and ultimately high-quality photographic enhancement. Experiments demonstrate that SmartPhotoCrafter outperforms existing generative models on the task of automatic photographic enhancement, achieving photo-realistic results while exhibiting higher tonal sensitivity to retouching instructions. Project page: https://github.com/vivoCameraResearch/SmartPhotoCrafter.
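Abstract (reference translation): Conventional photographic image editing requires users to have enough aesthetic understanding to give appropriate instructions for adjusting image quality and camera parameters. However, this paradigm depends on explicit human instructions of aesthetic intent, which are often ambiguous, incomplete, or inaccessible to non-expert users. This work proposes SmartPhotoCrafter, an automatic photographic image editing method that formulates image editing as a tightly coupled reasoning-to-generation process. The proposed model first comprehends image quality and identifies deficiencies with the Image Critic module; the Photographic Artist module then carries out targeted edits to enhance the image's appeal, eliminating the need for explicit human instructions. A multi-stage training pipeline is adopted: (i) foundation pretraining to establish basic aesthetic understanding and editing capabilities, (ii) adaptation with reasoning-guided multi-edit supervision to incorporate rich semantic guidance, and (iii) coordinated reasoning-to-generation reinforcement learning to jointly optimize reasoning and generation. During training, SmartPhotoCrafter emphasizes photo-realistic image generation while supporting both image restoration and retouching tasks, with consistent adherence to color- and tone-related semantics. A stage-specific dataset is also constructed, which progressively builds reasoning and controllable generation, effective cross-module collaboration, and ultimately high-quality photographic enhancement. Experiments show that SmartPhotoCrafter outperforms existing generative models on automatic photographic enhancement, achieving photo-realistic results while exhibiting higher tonal sensitivity to retouching instructions. Project page: https://github.com/vivoCameraResearch/SmartPhotoCrafter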

List of related papers