REG: Rectified Gradient Guidance for Conditional Diffusion Models
- URL: http://arxiv.org/abs/2501.18865v1
- Date: Fri, 31 Jan 2025 03:16:18 GMT
- Title: REG: Rectified Gradient Guidance for Conditional Diffusion Models
- Authors: Zhengqi Gao, Kaiwen Zha, Tianyuan Zhang, Zihui Xue, Duane S. Boning,
- Abstract summary: We propose rectified gradient guidance (REG) to boost the performance of existing guidance methods. REG provides a better approximation to the optimal solution than prior guidance techniques. In experiments on class-conditional ImageNet and text-to-image generation tasks, REG consistently improves FID and Inception/CLIP scores.
- Score: 16.275782069986253
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Guidance techniques are simple yet effective for improving conditional generation in diffusion models. Despite their empirical success, the practical implementation of guidance diverges significantly from its theoretical motivation. In this paper, we reconcile this discrepancy by replacing the scaled marginal distribution target, which we prove theoretically invalid, with a valid scaled joint distribution objective. Additionally, we show that the established guidance implementations are approximations to the intractable optimal solution under a no-future-foresight constraint. Building on these theoretical insights, we propose rectified gradient guidance (REG), a versatile enhancement designed to boost the performance of existing guidance methods. Experiments in 1D and 2D settings demonstrate that REG provides a better approximation to the optimal solution than prior guidance techniques, validating the proposed theoretical framework. Extensive experiments on class-conditional ImageNet and text-to-image generation tasks show that incorporating REG consistently improves FID and Inception/CLIP scores across various settings.
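For readers unfamiliar with the guidance implementations the abstract refers to, the following is a minimal sketch of the standard classifier-free guidance combination rule, not of REG itself, whose correction term is not given on this page. The function name `cfg_combine`, the weight `w`, and the toy arrays are illustrative assumptions; in practice the two inputs would be a denoiser's unconditional and conditional noise predictions.

```python
import numpy as np


def cfg_combine(eps_uncond, eps_cond, w):
    """Standard classifier-free guidance (illustrative helper, not from the paper).

    Extrapolates from the unconditional prediction toward the conditional one.
    w = 0 returns the unconditional prediction, w = 1 the conditional one,
    and w > 1 pushes further in the conditional direction.
    """
    return eps_uncond + w * (eps_cond - eps_uncond)


# Toy stand-ins for the two noise predictions of a diffusion model.
eps_uncond = np.array([0.0, 1.0])
eps_cond = np.array([1.0, 3.0])

guided = cfg_combine(eps_uncond, eps_cond, w=2.0)
print(guided)
```

Some papers parameterize the same rule with a weight shifted by one, so a given numeric guidance scale is not always comparable across codebases.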
Related papers
- Global Variational Inference Enhanced Robust Domain Adaptation [7.414646586981638]
We propose a framework that learns continuous, class-conditional global priors via variational inference to enable structure-aware cross-domain alignment. GVI-DA minimizes domain gaps through latent feature reconstruction, and mitigates posterior collapse using global codebook learning with randomized sampling. It further improves robustness by discarding low-confidence pseudo-labels and generating reliable target-domain samples.
arXiv Detail & Related papers (2025-07-04T04:43:23Z) - Binarization-Aware Adjuster: Bridging Continuous Optimization and Binary Inference in Edge Detection [0.0]
Image edge detection (ED) faces a fundamental mismatch between training and inference. In this paper, we propose a theoretical method to design a Binarization-Aware Adjuster (BAA). BAA explicitly incorporates binarization behavior into gradient-based optimization.
arXiv Detail & Related papers (2025-06-14T11:56:44Z) - How Much To Guide: Revisiting Adaptive Guidance in Classifier-Free Guidance Text-to-Vision Diffusion Models [57.42800112251644]
We propose Step AG, which is a simple, universally applicable adaptive guidance strategy. Our evaluations focus on both image quality and image-text alignment.
arXiv Detail & Related papers (2025-06-10T02:09:48Z) - Feedback Guidance of Diffusion Models [0.0]
Classifier-Free Guidance (CFG) has become standard for improving sample fidelity in conditional diffusion models. We propose FeedBack Guidance (FBG), which uses a state-dependent coefficient to self-regulate guidance amounts based on need.
arXiv Detail & Related papers (2025-06-06T13:46:32Z) - LARES: Latent Reasoning for Sequential Recommendation [96.26996622771593]
We present LARES, a novel and scalable LAtent REasoning framework for Sequential recommendation. Our proposed approach employs a recurrent architecture that allows flexible expansion of reasoning depth without increasing parameter complexity. Our framework exhibits seamless compatibility with existing advanced models, further improving their recommendation performance.
arXiv Detail & Related papers (2025-05-22T16:22:54Z) - Model Steering: Learning with a Reference Model Improves Generalization Bounds and Scaling Laws [52.10468229008941]
This paper formalizes an emerging learning paradigm that uses a trained model as a reference to guide and enhance the training of a target model through strategic data selection or weighting. We provide theoretical insights into why this approach improves generalization and data efficiency compared to training without a reference model. Building on these insights, we introduce a novel method for Contrastive Language-Image Pretraining with a reference model, termed DRRho-CLIP.
arXiv Detail & Related papers (2025-05-10T16:55:03Z) - Domain Guidance: A Simple Transfer Approach for a Pre-trained Diffusion Model [62.11981915549919]
Domain Guidance is a transfer approach that leverages pre-trained knowledge to guide the sampling process toward the target domain.
We demonstrate its substantial effectiveness across various transfer benchmarks, achieving over a 19.6% improvement in FID and a 23.4% improvement in FD$_{\text{DINOv2}}$ compared to standard fine-tuning.
arXiv Detail & Related papers (2025-04-02T09:07:55Z) - Enhanced OoD Detection through Cross-Modal Alignment of Multi-Modal Representations [2.992602379681373]
We show that multi-modal fine-tuning can achieve notable OoDD performance.
We propose a training objective that enhances cross-modal alignment by regularizing the distances between image and text embeddings of ID data.
arXiv Detail & Related papers (2025-03-24T16:00:21Z) - Contextually Entangled Gradient Mapping for Optimized LLM Comprehension [0.0]
Contextually Entangled Gradient Mapping (CEGM) introduces a new approach to gradient optimization.
It treats gradients as dynamic carriers of contextual dependencies rather than isolated numerical entities.
The proposed methodology bridges critical gaps in existing optimization strategies.
arXiv Detail & Related papers (2025-01-28T11:50:35Z) - Optimally-Weighted Maximum Mean Discrepancy Framework for Continual Learning [10.142949909263846]
Continual learning allows models to persistently acquire and retain information. Catastrophic forgetting can severely impair model performance. We introduce a novel framework termed Optimally-Weighted Maximum Mean Discrepancy (OWMMD), which imposes penalties on representation alterations.
arXiv Detail & Related papers (2025-01-21T13:33:45Z) - Training Free Guided Flow Matching with Optimal Control [6.729886762762167]
We present OC-Flow, a training-free framework for guided flow matching using optimal control. We show that OC-Flow achieved superior performance in experiments on text-guided image manipulation, conditional molecule generation, and all-atom peptide design.
arXiv Detail & Related papers (2024-10-23T17:53:11Z) - COD: Learning Conditional Invariant Representation for Domain Adaptation Regression [20.676363400841495]
Domain Adaptation Regression is developed to generalize label knowledge from a source domain to an unlabeled target domain.
Existing conditional distribution alignment theory and methods with discrete prior are no longer applicable.
To minimize the discrepancy, a COD-based conditional invariant representation learning model is proposed.
arXiv Detail & Related papers (2024-08-13T05:08:13Z) - When Invariant Representation Learning Meets Label Shift: Insufficiency and Theoretical Insights [16.72787996847537]
Generalized label shift (GLS) is a recently developed formulation that shows great potential for handling the complex factors within the shift.
Main results show the insufficiency of invariant representation learning, and prove the sufficiency and necessity of GLS correction for generalization.
We propose a kernel embedding-based correction algorithm (KECA) to minimize the generalization error and achieve successful knowledge transfer.
arXiv Detail & Related papers (2024-06-24T12:47:21Z) - Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-Constraint [56.74058752955209]
This paper studies the alignment process of generative models with Reinforcement Learning from Human Feedback (RLHF).
We first identify the primary challenge of existing popular methods like offline PPO and offline DPO as a lack of strategic exploration of the environment.
We propose efficient algorithms with finite-sample theoretical guarantees.
arXiv Detail & Related papers (2023-12-18T18:58:42Z) - Making Linear MDPs Practical via Contrastive Representation Learning [101.75885788118131]
It is common to address the curse of dimensionality in Markov decision processes (MDPs) by exploiting low-rank representations.
We consider an alternative definition of linear MDPs that automatically ensures normalization while allowing efficient representation learning.
We demonstrate superior performance over existing state-of-the-art model-based and model-free algorithms on several benchmarks.
arXiv Detail & Related papers (2022-07-14T18:18:02Z) - Heterogeneous Calibration: A post-hoc model-agnostic framework for improved generalization [8.815439276597818]
We introduce the notion of heterogeneous calibration that applies a post-hoc model-agnostic transformation to model outputs for improving AUC performance on binary classification tasks.
We refer to simple patterns as heterogeneous partitions of the feature space and show theoretically that perfectly calibrating each partition separately optimizes AUC.
While the theoretical optimality of this framework holds for any model, we focus on deep neural networks (DNNs) and test the simplest instantiation of this paradigm on a variety of open-source datasets.
arXiv Detail & Related papers (2022-02-10T05:08:50Z) - Revisiting Consistency Regularization for Semi-Supervised Learning [80.28461584135967]
We propose an improved consistency regularization framework by a simple yet effective technique, FeatDistLoss.
Experimental results show that our model defines a new state of the art for various datasets and settings.
arXiv Detail & Related papers (2021-12-10T20:46:13Z) - Optimization-Inspired Learning with Architecture Augmentations and Control Mechanisms for Low-Level Vision [74.9260745577362]
This paper proposes a unified optimization-inspired learning framework to aggregate Generative, Discriminative, and Corrective (GDC) principles.
We construct three propagative modules to effectively solve the optimization models with flexible combinations.
Experiments across varied low-level vision tasks validate the efficacy and adaptability of GDC.
arXiv Detail & Related papers (2020-12-10T03:24:53Z) - Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z) - Target-Embedding Autoencoders for Supervised Representation Learning [111.07204912245841]
This paper analyzes a framework for improving generalization in a purely supervised setting, where the target space is high-dimensional.
We motivate and formalize the general framework of target-embedding autoencoders (TEA) for supervised prediction, learning intermediate latent representations jointly optimized to be both predictable from features as well as predictive of targets.
arXiv Detail & Related papers (2020-01-23T02:37:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.