FABRIC: Personalizing Diffusion Models with Iterative Feedback
- URL: http://arxiv.org/abs/2307.10159v1
- Date: Wed, 19 Jul 2023 17:39:39 GMT
- Title: FABRIC: Personalizing Diffusion Models with Iterative Feedback
- Authors: Dimitri von Rütte, Elisabetta Fedele, Jonathan Thomm, Lukas Wolf
- Abstract summary: In an era where visual content generation is increasingly driven by machine learning, the integration of human feedback into generative models presents significant opportunities for enhancing user experience and output quality.
We propose FABRIC, a training-free approach applicable to a wide range of popular diffusion models, which exploits the self-attention layer present in the most widely used architectures to condition the diffusion process on a set of feedback images.
Through exhaustive analysis, we show that generation results improve over multiple rounds of iterative feedback, implicitly optimizing arbitrary user preferences.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In an era where visual content generation is increasingly driven by machine
learning, the integration of human feedback into generative models presents
significant opportunities for enhancing user experience and output quality.
This study explores strategies for incorporating iterative human feedback into
the generative process of diffusion-based text-to-image models. We propose
FABRIC, a training-free approach applicable to a wide range of popular
diffusion models, which exploits the self-attention layer present in the most
widely used architectures to condition the diffusion process on a set of
feedback images. To ensure a rigorous assessment of our approach, we introduce
a comprehensive evaluation methodology, offering a robust mechanism to quantify
the performance of generative visual models that integrate human feedback. We
show, through exhaustive analysis, that generation results improve over
multiple rounds of iterative feedback, implicitly optimizing arbitrary user preferences.
The potential applications of these findings extend to fields such as
personalized content creation and customization.
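The feedback conditioning described above can be sketched in a few lines. The following is a minimal illustrative sketch, not the authors' code: it assumes keys/values for the feedback images have already been cached from a U-Net pass, and shows how they could be concatenated into a self-attention layer so that generation attends to the feedback. The `feedback_weight` parameter is a hypothetical knob (values above 1 bias attention toward liked images).

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_with_feedback(q, k, v, fb_k, fb_v, feedback_weight=1.0):
    """Self-attention extended with keys/values from feedback images.

    q, k, v:    (n, d) queries/keys/values from the current denoising pass
    fb_k, fb_v: (m, d) keys/values cached from a pass over feedback images
    feedback_weight: scales attention toward the appended feedback tokens
                     (illustrative name, not from the paper)
    """
    d = q.shape[-1]
    keys = np.concatenate([k, fb_k], axis=0)    # (n + m, d)
    values = np.concatenate([v, fb_v], axis=0)  # (n + m, d)
    logits = q @ keys.T / np.sqrt(d)            # (n, n + m)
    # Bias the logits of the feedback tokens; weight 1.0 leaves them unbiased.
    logits[:, k.shape[0]:] += np.log(feedback_weight)
    return softmax(logits) @ values             # (n, d)
```

With `feedback_weight=1.0` the feedback tokens simply compete with the image's own tokens; raising the weight pulls the denoising trajectory toward the feedback images, which is the intuition behind conditioning on liked examples.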
Related papers
- Interactive Visual Assessment for Text-to-Image Generation Models [28.526897072724662]
We propose DyEval, a dynamic interactive visual assessment framework for generative models.
DyEval features an intuitive visual interface that enables users to interactively explore and analyze model behaviors.
Our framework provides valuable insights for improving generative models and has broad implications for advancing the reliability and capabilities of visual generation systems.
arXiv Detail & Related papers (2024-11-23T10:06:18Z)
- A Survey on All-in-One Image Restoration: Taxonomy, Evaluation and Future Trends [67.43992456058541]
Image restoration (IR) refers to the process of improving visual quality of images while removing degradation, such as noise, blur, weather effects, and so on.
Traditional IR methods typically target specific types of degradation, which limits their effectiveness in real-world scenarios with complex distortions.
The all-in-one image restoration (AiOIR) paradigm has emerged, offering a unified framework that adeptly addresses multiple degradation types.
arXiv Detail & Related papers (2024-10-19T11:11:09Z)
- Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation [87.50120181861362]
VisionPrefer is a high-quality and fine-grained preference dataset that captures multiple preference aspects.
We train a reward model, VP-Score, on VisionPrefer to guide the training of text-to-image generative models; its preference prediction accuracy is comparable to that of human annotators.
arXiv Detail & Related papers (2024-04-23T14:53:15Z)
- YaART: Yet Another ART Rendering Technology [119.09155882164573]
This study introduces YaART, a novel production-grade text-to-image cascaded diffusion model aligned to human preferences.
We analyze how these choices affect both the efficiency of the training process and the quality of the generated images.
We demonstrate that models trained on smaller datasets of higher-quality images can successfully compete with those trained on larger datasets.
arXiv Detail & Related papers (2024-04-08T16:51:19Z)
- DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets.
We introduce perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability.
Our method customizes data augmentation by extracting and utilizing perception-aware attribute (P.A. Attr) during generation.
arXiv Detail & Related papers (2024-03-20T04:58:03Z)
- Enhancing Image Caption Generation Using Reinforcement Learning with Human Feedback [0.0]
We explore a potential method to amplify the performance of the Deep Neural Network Model to generate captions that are preferred by humans.
This was achieved by integrating Supervised Learning and Reinforcement Learning with Human Feedback.
We provide a sketch of our approach and results, hoping to contribute to the ongoing advances in the field of human-aligned generative AI models.
arXiv Detail & Related papers (2024-03-11T13:57:05Z)
- HuTuMotion: Human-Tuned Navigation of Latent Motion Diffusion Models with Minimal Feedback [46.744192144648764]
HuTuMotion is an innovative approach for generating natural human motions that navigates latent motion diffusion models by leveraging few-shot human feedback.
Our findings reveal that utilizing few-shot feedback can yield performance levels on par with those attained through extensive human feedback.
arXiv Detail & Related papers (2023-12-19T15:13:08Z)
- Diffusion Models for Image Restoration and Enhancement -- A Comprehensive Survey [96.99328714941657]
We present a comprehensive review of recent diffusion model-based methods on image restoration.
We classify and emphasize the innovative designs using diffusion models for both IR and blind/real-world IR.
We propose five potential and challenging directions for the future research of diffusion model-based IR.
arXiv Detail & Related papers (2023-08-18T08:40:38Z)
- Diffusion-based Visual Counterfactual Explanations -- Towards Systematic Quantitative Evaluation [64.0476282000118]
Latest methods for visual counterfactual explanations (VCE) harness the power of deep generative models to synthesize new examples of high-dimensional images of impressive quality.
It is currently difficult to compare the performance of these VCE methods as the evaluation procedures largely vary and often boil down to visual inspection of individual examples and small scale user studies.
We propose a framework for systematic, quantitative evaluation of the VCE methods and a minimal set of metrics to be used.
arXiv Detail & Related papers (2023-08-11T12:22:37Z)
- Denoising Diffusion Probabilistic Models for Generation of Realistic Fully-Annotated Microscopy Image Data Sets [1.07539359851877]
In this study, we demonstrate that diffusion models can effectively generate fully-annotated microscopy image data sets.
The proposed pipeline helps to reduce the reliance on manual annotations when training deep learning-based segmentation approaches.
arXiv Detail & Related papers (2023-01-02T14:17:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information provided and is not responsible for any consequences of its use.