Generative Visual Prompt: Unifying Distributional Control of Pre-Trained
Generative Models
- URL: http://arxiv.org/abs/2209.06970v1
- Date: Wed, 14 Sep 2022 22:55:18 GMT
- Title: Generative Visual Prompt: Unifying Distributional Control of Pre-Trained
Generative Models
- Authors: Chen Henry Wu, Saman Motamed, Shaunak Srivastava, Fernando De la Torre
- Abstract summary: Generative Visual Prompt (PromptGen) is a framework for distributional control over pre-trained generative models.
PromptGen approximates an energy-based model (EBM) with invertible neural networks and samples images in a feed-forward manner.
Code is available at https://github.com/ChenWu98/Generative-Visual-Prompt.
- Score: 77.47505141269035
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative models (e.g., GANs and diffusion models) learn the underlying data
distribution in an unsupervised manner. However, many applications of interest
require sampling from a specific region of the generative model's output space
or evenly over a range of characteristics. To allow efficient sampling in these
scenarios, we propose Generative Visual Prompt (PromptGen), a framework for
distributional control over pre-trained generative models by incorporating
knowledge of arbitrary off-the-shelf models. PromptGen defines control as an
energy-based model (EBM) and samples images in a feed-forward manner by
approximating the EBM with invertible neural networks, avoiding optimization at
inference. We demonstrate how PromptGen can control several generative models
(e.g., StyleGAN2, StyleNeRF, diffusion autoencoder, and NVAE) using various
off-the-shelf models: (1) with the CLIP model, PromptGen can sample images
guided by text, (2) with image classifiers, PromptGen can de-bias generative
models across a set of attributes, and (3) with inverse graphics models,
PromptGen can sample images of the same identity in different poses. (4)
Finally, PromptGen reveals that the CLIP model shows "reporting bias" when used
as control, and PromptGen can further de-bias this controlled distribution in
an iterative manner. Our code is available at
https://github.com/ChenWu98/Generative-Visual-Prompt.
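In outline, the method trains a small invertible network so that pushing Gaussian noise through it, and then through the frozen generator, approximates sampling from the control EBM. A minimal sketch of that recipe, assuming a frozen generator G over a 512-d Gaussian latent space and some differentiable energy (e.g., a CLIP-based score); the coupling-layer flow and the reverse-KL objective below are standard stand-ins, not the authors' exact architecture:

```python
# Sketch (not the authors' code): control is an EBM
# p(z) ∝ p0(z) exp(-E(G(z))) over the latent space of a frozen generator G.
# An invertible flow f is fit by reverse KL so that z = f(eps) samples the EBM.
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """RealNVP-style coupling layer; returns output and log|det J|."""
    def __init__(self, dim, hidden=256, flip=False):
        super().__init__()
        self.flip = flip
        half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * half),
        )

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=-1)
        if self.flip:
            x1, x2 = x2, x1
        s, t = self.net(x1).chunk(2, dim=-1)
        s = torch.tanh(s)                         # keep scales bounded
        y2 = x2 * torch.exp(s) + t
        out = torch.cat([y2, x1] if self.flip else [x1, y2], dim=-1)
        return out, s.sum(dim=-1)

class Flow(nn.Module):
    def __init__(self, dim, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList(
            [AffineCoupling(dim, flip=bool(i % 2)) for i in range(n_layers)]
        )

    def forward(self, eps):
        z, logdet = eps, eps.new_zeros(eps.shape[0])
        for layer in self.layers:
            z, ld = layer(z)
            logdet = logdet + ld
        return z, logdet

def reverse_kl_loss(flow, generator, energy, eps):
    z, logdet = flow(eps)
    log_q = -0.5 * eps.pow(2).sum(-1) - logdet    # change of variables
    log_p0 = -0.5 * z.pow(2).sum(-1)              # Gaussian latent prior of G
    return (log_q - log_p0 + energy(generator(z))).mean()

dim = 512
generator = lambda z: z                 # stand-in for a frozen G(z) -> image
energy = lambda x: x.pow(2).sum(-1)     # stand-in for e.g. a CLIP-based energy
flow = Flow(dim)
opt = torch.optim.Adam(flow.parameters(), lr=1e-4)
for step in range(1000):
    opt.zero_grad()
    loss = reverse_kl_loss(flow, generator, energy, torch.randn(64, dim))
    loss.backward()
    opt.step()
# Feed-forward sampling afterwards: z, _ = flow(torch.randn(1, dim)); x = generator(z)
```

Because only the flow is trained, sampling afterwards is a single feed-forward pass, which is the "avoiding optimization at inference" property the abstract emphasizes.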
Related papers
- CAR: Controllable Autoregressive Modeling for Visual Generation [100.33455832783416]
Controllable AutoRegressive Modeling (CAR) is a novel, plug-and-play framework that integrates conditional control into multi-scale latent variable modeling.
CAR captures and progressively refines control representations, which are injected into each autoregressive step of the pre-trained model to guide the generation process.
Our approach demonstrates excellent controllability across various types of conditions and delivers higher image quality compared to previous methods.
arXiv Detail & Related papers (2024-10-07T00:55:42Z)
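The CAR summary describes injecting control representations into each autoregressive step of a frozen base model. A hedged sketch of that general pattern; the ControlInjector name, gating, and per-block placement are assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class ControlInjector(nn.Module):
    """Hypothetical: fuse a control representation into the hidden states of
    one block of a frozen autoregressive model (not CAR's actual design)."""
    def __init__(self, d_model, d_control):
        super().__init__()
        self.proj = nn.Linear(d_control, d_model)
        self.gate = nn.Parameter(torch.zeros(1))  # zero-init: starts as identity

    def forward(self, hidden, control):
        # hidden:  (B, T, d_model) activations of one autoregressive block
        # control: (B, T, d_control) features from e.g. an edge or depth map
        return hidden + torch.tanh(self.gate) * self.proj(control)
```

One injector per block, trained while the base model stays frozen, would match the plug-and-play framing of the summary.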
- ControlVAR: Exploring Controllable Visual Autoregressive Modeling [48.66209303617063]
Conditional visual generation has witnessed remarkable progress with the advent of diffusion models (DMs).
Challenges such as expensive computational cost, high inference latency, and difficulties of integration with large language models (LLMs) have necessitated exploring alternatives to DMs.
This paper introduces ControlVAR, a novel framework that explores pixel-level controls in visual autoregressive modeling for flexible and efficient conditional generation.
arXiv Detail & Related papers (2024-06-14T06:35:33Z)
- Controlling the Output of a Generative Model by Latent Feature Vector Shifting [0.0]
We present a novel method of latent vector shifting for controlled modification of output images.
Our approach uses a pre-trained StyleGAN3 model that generates realistic human faces.
Our latent feature shifter is a neural network trained to shift the latent vectors of a generative model in a specified feature direction.
arXiv Detail & Related papers (2023-11-15T10:42:06Z)
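A minimal sketch of such a latent feature shifter, assuming a 512-d StyleGAN3-style latent; the residual MLP and the classifier-based training signal are illustrative assumptions:

```python
import torch
import torch.nn as nn

class LatentShifter(nn.Module):
    """Residual MLP that moves a latent toward a chosen feature direction
    (a sketch of the idea; the paper's exact model may differ)."""
    def __init__(self, dim=512, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, z):
        return z + self.net(z)  # shift, don't replace, the latent

# Hypothetical training signal: a frozen generator G and a frozen attribute
# classifier C on G's output, plus a proximity term to keep edits minimal:
# loss = bce(C(G(shifter(z))), target) + lam * (shifter(z) - z).pow(2).mean()
```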
- Don't be so negative! Score-based Generative Modeling with Oracle-assisted Guidance [12.039478020062608]
We develop a new denoising diffusion probabilistic modeling (DDPM) methodology, Gen-neG.
Our approach builds on generative adversarial networks (GANs) and discriminator guidance in diffusion models to steer the generation process.
We empirically establish the utility of Gen-neG in applications including collision avoidance in self-driving simulators and safety-guarded human motion generation.
arXiv Detail & Related papers (2023-07-31T07:52:00Z)
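The summary invokes discriminator guidance; the generic recipe adds the gradient of the discriminator's log density ratio to the model score at sampling time. A sketch under that assumption, with placeholder score_model and discriminator callables, not Gen-neG's exact objective:

```python
import torch

def guided_score(score_model, discriminator, x, t):
    """Generic discriminator guidance: add grad_x log(D / (1 - D)) to the
    model score, where D estimates whether x satisfies the oracle constraint
    (a sketch, not necessarily Gen-neG's exact formulation)."""
    x = x.detach().requires_grad_(True)
    d = discriminator(x, t).clamp(1e-6, 1 - 1e-6)
    log_ratio = (d.log() - (1 - d).log()).sum()
    grad = torch.autograd.grad(log_ratio, x)[0]
    return score_model(x, t) + grad
```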
- A Hybrid of Generative and Discriminative Models Based on the Gaussian-coupled Softmax Layer [5.33024001730262]
We propose a method to train a hybrid of discriminative and generative models in a single neural network.
We demonstrate that the proposed hybrid model can be applied to semi-supervised learning and confidence calibration.
arXiv Detail & Related papers (2023-05-10T05:48:22Z)
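The summary names a Gaussian-coupled softmax layer without defining it. One standard way to couple a generative model with a softmax classifier is to place class-conditional Gaussians over the feature space, whose Bayes posterior is exactly a softmax; the sketch below shows that construction and may differ from the paper's layer:

```python
import torch
import torch.nn as nn

class GaussianSoftmaxHead(nn.Module):
    """Class-conditional Gaussians N(mu_y, I) over features; the Bayes
    posterior p(y|x) is a softmax over -||f - mu_y||^2/2 + log p(y).
    A standard coupling; the paper's exact layer may differ."""
    def __init__(self, feat_dim, n_classes):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(n_classes, feat_dim))
        self.log_prior = nn.Parameter(torch.zeros(n_classes))

    def forward(self, feats):
        sq = torch.cdist(feats, self.mu).pow(2)   # (B, n_classes)
        return -0.5 * sq + self.log_prior         # logits for cross_entropy
```

Up to the shared Gaussian normalizer, the same logits are log p(f(x), y), which is what makes such a head usable for both semi-supervised learning and confidence calibration.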
- Understanding Diffusion Models: A Unified Perspective [0.0]
Diffusion models have shown remarkable capabilities as generative models.
We review, demystify, and unify the understanding of diffusion models across both variational and score-based perspectives.
arXiv Detail & Related papers (2022-08-25T09:55:25Z)
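The variational and score-based views meet in one standard identity: the optimal noise predictor of a DDPM is, up to scale, the score of the noised marginal (a paraphrase of the textbook result, not a quotation from the paper):

```latex
% Forward noising: x_t = \sqrt{\bar\alpha_t}\, x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon
% The optimal noise predictor is (up to scale) the score of the noised marginal:
\nabla_{x_t} \log q(x_t) = -\frac{\epsilon_\theta(x_t, t)}{\sqrt{1-\bar\alpha_t}}
% so noise prediction (the variational view) and denoising score matching
% (the score view) coincide up to a per-timestep weighting.
```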
- Controllable and Compositional Generation with Latent-Space Energy-Based Models [60.87740144816278]
Controllable generation is one of the key requirements for successful adoption of deep generative models in real-world applications.
In this work, we use energy-based models (EBMs) to handle compositional generation over a set of attributes.
By composing energy functions with logical operators, this work is the first to achieve such compositionality in generating photo-realistic images of resolution 1024x1024.
arXiv Detail & Related papers (2021-10-21T03:31:45Z)
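Composing attribute energies with logical operators usually follows the standard EBM algebra below, with p(x) ∝ exp(-E(x)); the paper's exact formulation may add temperatures or weights:

```python
import torch

def e_and(e1, e2):
    # conjunction = product of experts: p ∝ exp(-E1) · exp(-E2)
    return e1 + e2

def e_or(e1, e2):
    # disjunction = mixture (up to normalization): p ∝ exp(-E1) + exp(-E2)
    return -torch.logaddexp(-e1, -e2)

def e_not(e_base, e_concept, alpha=1.0):
    # negation = divide out a concept: p ∝ exp(-E_base) / exp(-E_concept)^alpha
    return e_base - alpha * e_concept
```

For instance, "smiling and not wearing glasses" would compose as e_not(e_and(e_prior, e_smile), e_glasses), with each term supplied by an attribute classifier's energy.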
- Conditional Generative Models for Counterfactual Explanations [0.0]
We propose a general framework to generate sparse, in-distribution counterfactual model explanations.
The framework is flexible with respect to the type of generative model used as well as the task of the underlying predictive model.
arXiv Detail & Related papers (2021-01-25T14:31:13Z)
- Unsupervised Controllable Generation with Self-Training [90.04287577605723]
Controllable generation with GANs remains a challenging research problem.
We propose an unsupervised framework to learn a distribution of latent codes that control the generator through self-training.
Our framework exhibits better disentanglement compared to other variants such as the variational autoencoder.
arXiv Detail & Related papers (2020-07-17T21:50:35Z)
- Towards a Neural Graphics Pipeline for Controllable Image Generation [96.11791992084551]
We present Neural Graphics Pipeline (NGP), a hybrid generative model that brings together neural and traditional image formation models.
NGP decomposes the image into a set of interpretable appearance feature maps, uncovering direct control handles for controllable image generation.
We demonstrate the effectiveness of our approach on controllable image generation of single-object scenes.
arXiv Detail & Related papers (2020-06-18T14:22:54Z)