IRPO: Boosting Image Restoration via Post-training GRPO
- URL: http://arxiv.org/abs/2512.00814v2
- Date: Tue, 09 Dec 2025 06:21:17 GMT
- Authors: Haoxuan Xu, Yi Liu, Boyuan Jiang, Jinlong Peng, Donghao Luo, Xiaobin Hu, Shuicheng Yan, Haoang Li,
- Abstract summary: We propose IRPO, a low-level GRPO-based post-training paradigm. We first explore a data formulation principle for the low-level post-training paradigm. We then model a reward-level criteria system that balances objective accuracy and human perceptual preference.
- Score: 59.588079259093035
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in post-training paradigms have achieved remarkable success in high-level generation tasks, yet their potential for low-level vision remains rarely explored. Existing image restoration (IR) methods rely on pixel-level hard-fitting to ground-truth images, struggling with over-smoothing and poor generalization. To address these limitations, we propose IRPO, a low-level GRPO-based post-training paradigm that systematically explores both data formulation and reward modeling. We first explore a data formulation principle for the low-level post-training paradigm, in which selecting underperforming samples from the pre-training stage yields optimal performance and improved efficiency. Furthermore, we model a reward-level criteria system that balances objective accuracy and human perceptual preference through three complementary components: a General Reward for structural fidelity, an Expert Reward leveraging Qwen-VL for perceptual alignment, and a Restoration Reward for task-specific low-level quality. Comprehensive experiments on six in-domain and five out-of-domain (OOD) low-level benchmarks demonstrate that IRPO achieves state-of-the-art results across diverse degradation types, surpassing the AdaIR baseline by 0.83 dB on in-domain tasks and 3.43 dB on OOD settings. Our code is available at https://github.com/HaoxuanXU1024/IRPO.
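The abstract's three-component reward and GRPO's group-relative advantage can be sketched as below; the blending weights, score values, and function names are illustrative assumptions, not values or APIs from the paper:

```python
import statistics

def combined_reward(general, expert, restoration,
                    w_general=0.4, w_expert=0.3, w_restoration=0.3):
    """Blend the three reward criteria (structural fidelity, perceptual
    alignment, task-specific quality) into one scalar. Weights are
    hypothetical; the paper does not publish these numbers here."""
    return w_general * general + w_expert * expert + w_restoration * restoration

def grpo_advantages(rewards):
    """Group-relative advantages: normalize each rollout's reward against
    the mean and (population) std of its sampled group, as in GRPO."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against a zero-variance group
    return [(r - mean) / std for r in rewards]

# Example: a group of 4 restored candidates sampled for one degraded input.
group = [combined_reward(0.8, 0.7, 0.9),
         combined_reward(0.5, 0.6, 0.4),
         combined_reward(0.9, 0.8, 0.85),
         combined_reward(0.4, 0.5, 0.45)]
adv = grpo_advantages(group)  # advantages sum to ~0 within the group
```

Because advantages are computed relative to the group, no learned value network is needed; candidates above the group mean get positive advantage and are reinforced.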
Related papers
- Simplicity Prevails: The Emergence of Generalizable AIGI Detection in Visual Foundation Models [15.709482146201283]
A simple linear classifier trained on the frozen features of modern Vision Foundation Models establishes a new state-of-the-art. We show that this baseline not only matches specialized detectors on standard benchmarks but also decisively outperforms them on in-the-wild datasets. We conclude by advocating for a paradigm shift in AI forensics, moving from overfitting on static benchmarks to harnessing the evolving world knowledge of foundation models for real-world reliability.
arXiv Detail & Related papers (2026-02-02T07:20:02Z) - Implicit Neural Representation-Based Continuous Single Image Super Resolution: An Empirical Study [50.15623093332659]
Implicit neural representation (INR) has become the standard approach for arbitrary-scale image super-resolution (ASSR). We compare existing techniques across diverse settings and present aggregated performance results on multiple image quality metrics. We examine a new loss function that penalizes intensity variations while preserving edges, textures, and finer details during training.
arXiv Detail & Related papers (2026-01-25T07:09:20Z) - RS-Prune: Training-Free Data Pruning at High Ratios for Efficient Remote Sensing Diffusion Foundation Models [14.093802378976315]
Diffusion-based remote sensing (RS) generative foundation models rely on large amounts of globally representative data. We propose a training-free, two-stage data pruning approach that quickly selects a high-quality subset under high pruning ratios. Experiments show that, even after pruning 85% of the training data, our method significantly improves convergence and generation quality.
arXiv Detail & Related papers (2025-12-29T06:44:06Z) - OmniQuality-R: Advancing Reward Models Through All-Encompassing Quality Assessment [55.59322229889159]
We propose OmniQuality-R, a unified reward modeling framework that transforms multi-task quality reasoning into continuous and interpretable reward signals. We use a reasoning-enhanced reward modeling dataset to form a reliable chain-of-thought dataset for supervised fine-tuning. We evaluate OmniQuality-R on three key IQA tasks: aesthetic quality assessment, technical quality evaluation, and text-image alignment.
arXiv Detail & Related papers (2025-10-12T13:46:28Z) - LRPO: Enhancing Blind Face Restoration through Online Reinforcement Learning [54.51101908523586]
Blind Face Restoration (BFR) encounters inherent challenges in exploring its large solution space. We propose a Likelihood-Regularized Policy Optimization (LRPO) framework, the first to apply online reinforcement learning (RL) to the BFR task. Our proposed LRPO significantly improves face restoration quality over baseline methods and achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-09-27T14:42:29Z) - BranchGRPO: Stable and Efficient GRPO with Structured Branching in Diffusion Models [57.304411396229035]
We present BranchGRPO, a method that restructures the rollout process into a branching tree. On HPDv2.1 image alignment, BranchGRPO improves alignment scores by up to 16% over DanceGRPO. A hybrid variant, BranchGRPO-Mix, further accelerates training to 4.7x faster than DanceGRPO without degrading alignment.
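The branching-tree rollout idea can be illustrated with a toy sketch; `rollout_tree`, the step counter, and the scalar "latent" are assumptions for illustration, not the BranchGRPO implementation:

```python
import random

def rollout_tree(step_fn, state, depth, branching):
    """Expand `branching` children per node for `depth` levels and return
    the leaf states (one per completed rollout). Sibling branches reuse
    the already-computed prefix instead of re-rolling it from the root."""
    if depth == 0:
        return [state]
    leaves = []
    for _ in range(branching):
        child = step_fn(state)  # one continuation step from the shared prefix
        leaves.extend(rollout_tree(step_fn, child, depth - 1, branching))
    return leaves

# Toy step: perturb a scalar "latent" and count how often it is called.
random.seed(0)
calls = {"n": 0}

def noisy_step(s):
    calls["n"] += 1
    return s + random.gauss(0, 1)

leaves = rollout_tree(noisy_step, 0.0, depth=3, branching=2)
```

With depth 3 and branching 2, the tree produces 8 rollouts with 2 + 4 + 8 = 14 step calls, versus 24 calls for 8 independent length-3 rollouts, which is the source of the efficiency gain the abstract claims.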
arXiv Detail & Related papers (2025-09-07T12:53:06Z) - NatureGAIA: Pushing the Frontiers of GUI Agents with a Challenging Benchmark and High-Quality Trajectory Dataset [16.676904484703]
We introduce NatureGAIA, a novel benchmark engineered on the principle of Causal Pathways. This paradigm structures complex tasks into a series of verifiable atomic steps, ensuring a rigorous, fully automated, and reproducible standard for assessment. We then utilize this dataset to perform Reinforcement Fine-Tuning (RFT) on the Q2.5-VL-7B model.
arXiv Detail & Related papers (2025-08-02T11:53:41Z) - IQPFR: An Image Quality Prior for Blind Face Restoration and Beyond [56.99331967165238]
Blind Face Restoration (BFR) addresses the challenge of reconstructing degraded low-quality (LQ) facial images into high-quality (HQ) outputs. We propose a novel framework that incorporates an Image Quality Prior (IQP) derived from No-Reference Image Quality Assessment (NR-IQA) models. Our method outperforms state-of-the-art techniques across multiple benchmarks.
arXiv Detail & Related papers (2025-03-12T11:39:51Z) - Pseudo-triplet Guided Few-shot Composed Image Retrieval [20.040511832864503]
Composed Image Retrieval (CIR) is a challenging task that aims to retrieve the target image with a multimodal query.
We propose a novel two-stage pseudo triplet guided few-shot CIR scheme, dubbed PTG-FSCIR.
In the first stage, we propose an attentive masking and captioning-based pseudo triplet generation method, to construct pseudo triplets from pure image data.
In the second stage, we propose a challenging triplet-based CIR fine-tuning method, where we design a pseudo modification text-based sample challenging score estimation strategy.
arXiv Detail & Related papers (2024-07-08T14:53:07Z) - Hyperspectral Benchmark: Bridging the Gap between HSI Applications through Comprehensive Dataset and Pretraining [11.935879491267634]
Hyperspectral Imaging (HSI) serves as a non-destructive spatial spectroscopy technique with a multitude of potential applications.
A recurring challenge lies in the limited size of the target datasets, impeding exhaustive architecture search.
This study introduces an innovative benchmark dataset encompassing three markedly distinct HSI applications.
arXiv Detail & Related papers (2023-09-20T08:08:34Z) - A Practical Contrastive Learning Framework for Single-Image Super-Resolution [51.422185656787285]
We investigate contrastive learning-based single image super-resolution from two perspectives.
We propose a practical contrastive learning framework for SISR, named PCL-SR.
We re-train existing benchmark methods with our proposed PCL-SR framework and achieve superior performance compared with their original training.
arXiv Detail & Related papers (2021-11-27T15:42:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.