HAODiff: Human-Aware One-Step Diffusion via Dual-Prompt Guidance
- URL: http://arxiv.org/abs/2505.19742v1
- Date: Mon, 26 May 2025 09:24:11 GMT
- Title: HAODiff: Human-Aware One-Step Diffusion via Dual-Prompt Guidance
- Authors: Jue Gong, Tingyu Yang, Jingkai Wang, Zheng Chen, Xing Liu, Hong Gu, Yulun Zhang, Xiaokang Yang
- Abstract summary: Human-centered images often suffer from severe generic degradation during transmission and are prone to human motion blur (HMB). We design a degradation pipeline that simulates the coexistence of HMB and generic noise, generating synthetic degraded data to train our proposed HAODiff. For fair evaluation, we introduce MPII-Test, a benchmark rich in combined noise and HMB cases.
- Score: 71.5820853722963
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human-centered images often suffer from severe generic degradation during transmission and are prone to human motion blur (HMB), making restoration challenging. Existing research lacks sufficient focus on these issues, even though both problems often coexist in practice. To address this, we design a degradation pipeline that simulates the coexistence of HMB and generic noise, generating synthetic degraded data to train our proposed HAODiff, a human-aware one-step diffusion. Specifically, we propose a triple-branch dual-prompt guidance (DPG), which leverages high-quality images, residual noise (LQ minus HQ), and HMB segmentation masks as training targets. It produces a positive-negative prompt pair for classifier-free guidance (CFG) in a single diffusion step. The resulting adaptive dual prompts let HAODiff exploit CFG more effectively, boosting robustness against diverse degradations. For fair evaluation, we introduce MPII-Test, a benchmark rich in combined noise and HMB cases. Extensive experiments show that our HAODiff surpasses existing state-of-the-art (SOTA) methods in terms of both quantitative metrics and visual quality on synthetic and real-world datasets, including our introduced MPII-Test. Code is available at: https://github.com/gobunu/HAODiff.
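The key mechanism the abstract describes is classifier-free guidance driven by an adaptive positive-negative prompt pair inside a single denoising step. Below is a minimal sketch of that idea; the names (`unet`, `c_pos`, `c_neg`, `guidance_scale`, `alpha_bar_t`) and the closed-form clean-image estimate are illustrative assumptions, not HAODiff's actual interface.

```python
def one_step_dual_prompt_restore(unet, x_t, c_pos, c_neg, alpha_bar_t,
                                 guidance_scale=4.0):
    """Hypothetical single-step restoration with a positive-negative
    prompt pair, in the spirit of HAODiff's use of CFG.

    unet(x, c) is assumed to predict the noise in latent x under prompt
    embedding c; alpha_bar_t is the cumulative signal level at the
    chosen timestep.
    """
    # One noise prediction per prompt in the pair.
    eps_neg = unet(x_t, c_neg)
    eps_pos = unet(x_t, c_pos)
    # Classifier-free guidance: extrapolate away from the negative
    # prediction toward the positive one.
    eps = eps_neg + guidance_scale * (eps_pos - eps_neg)
    # Standard DDPM identity x_t = sqrt(a)*x_0 + sqrt(1-a)*eps, solved
    # for x_0, so the clean estimate comes out in a single step.
    x0 = (x_t - (1.0 - alpha_bar_t) ** 0.5 * eps) / alpha_bar_t ** 0.5
    return x0
```

Per the abstract, the prompt pair itself is not fixed text: the triple-branch DPG module produces the positive and negative prompts adaptively from the degraded input, supervised by HQ images, the residual noise (LQ minus HQ), and HMB segmentation masks.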
Related papers
- OSCAR: One-Step Diffusion Codec Across Multiple Bit-rates [52.65036099944483]
Pretrained latent diffusion models have shown strong potential for lossy image compression. Most existing methods reconstruct images by iteratively denoising from random noise. We propose a one-step diffusion codec across multiple bit-rates, termed OSCAR.
arXiv Detail & Related papers (2025-05-22T00:14:12Z)
- DiffMAC: Diffusion Manifold Hallucination Correction for High Generalization Blind Face Restoration [62.44659039265439]
We propose a Diffusion-Information-Diffusion framework to tackle blind face restoration.
DiffMAC achieves high-generalization face restoration in diverse degraded scenes and heterogeneous domains.
Results demonstrate the superiority of DiffMAC over state-of-the-art methods.
arXiv Detail & Related papers (2024-03-15T08:44:15Z)
- PSDiff: Diffusion Model for Person Search with Iterative and Collaborative Refinement [59.6260680005195]
We present a novel Person Search framework based on the Diffusion model, PSDiff. PSDiff formulates person search as a dual denoising process from noisy boxes and ReID embeddings to ground truths. Following the new paradigm, we further design a new Collaborative Denoising Layer (CDL) to optimize detection and ReID sub-tasks in an iterative and collaborative way.
arXiv Detail & Related papers (2023-09-20T08:16:39Z)
- DuDGAN: Improving Class-Conditional GANs via Dual-Diffusion [2.458437232470188]
Class-conditional image generation using generative adversarial networks (GANs) has been investigated through various techniques.
We propose a novel approach for class-conditional image generation using GANs called DuDGAN, which incorporates a dual diffusion-based noise injection process.
Our method outperforms state-of-the-art conditional GAN models for image generation.
arXiv Detail & Related papers (2023-05-24T07:59:44Z)
- Diffusion Recommender Model [85.9640416600725]
We propose a novel Diffusion Recommender Model (named DiffRec) to learn the generative process in a denoising manner. To retain personalized information in user interactions, DiffRec reduces the added noises and avoids corrupting users' interactions into pure noise as in image synthesis.
arXiv Detail & Related papers (2023-04-11T04:31:00Z)
- DiffusionAD: Norm-guided One-step Denoising Diffusion for Anomaly Detection [80.20339155618612]
DiffusionAD is a novel anomaly detection pipeline comprising a reconstruction sub-network and a segmentation sub-network. A rapid one-step denoising paradigm achieves a speedup of hundreds of times while preserving comparable reconstruction quality. Considering the diversity in the manifestation of anomalies, we propose a norm-guided paradigm to integrate the benefits of multiple noise scales.
arXiv Detail & Related papers (2023-03-15T16:14:06Z)