FlexControl: Computation-Aware ControlNet with Differentiable Router for Text-to-Image Generation
- URL: http://arxiv.org/abs/2502.10451v2
- Date: Thu, 20 Feb 2025 13:29:45 GMT
- Title: FlexControl: Computation-Aware ControlNet with Differentiable Router for Text-to-Image Generation
- Authors: Zheng Fang, Lichuan Xiang, Xu Cai, Kaicheng Zhou, Hongkai Wen
- Abstract summary: ControlNet offers a powerful way to guide diffusion-based generative models.
Most implementations rely on ad-hoc heuristics to choose which network blocks to control, an approach that varies unpredictably across tasks.
We propose FlexControl, a framework that copies all diffusion blocks during training and employs a trainable gating mechanism.
- Score: 10.675687253961595
- Abstract: ControlNet offers a powerful way to guide diffusion-based generative models, yet most implementations rely on ad-hoc heuristics to choose which network blocks to control, an approach that varies unpredictably across tasks. To address this gap, we propose FlexControl, a novel framework that copies all diffusion blocks during training and employs a trainable gating mechanism to dynamically select which blocks to activate at each denoising step. By introducing a computation-aware loss, we encourage control blocks to activate only when they benefit generation quality. By eliminating manual block selection, FlexControl enhances adaptability across diverse tasks and streamlines the design pipeline, and it is trained end to end with the computation-aware loss. Through comprehensive experiments on both UNet (e.g., SD1.5) and DiT (e.g., SD3.0), we show that our method outperforms existing ControlNet variants in certain key aspects of interest. As evidenced by both quantitative and qualitative evaluations, FlexControl preserves or enhances image fidelity while also reducing computational overhead by selectively activating the most relevant blocks. These results underscore the potential of a flexible, data-driven approach for controlled diffusion and open new avenues for efficient generative model design. The code will soon be available at https://github.com/Anonymousuuser/FlexControl.
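The gating idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the sigmoid gate, the 0.5 threshold, the cost-weighted penalty, and every name below (`gate_probabilities`, `hard_gates`, `computation_aware_loss`, `block_costs`, `weight`) are assumptions chosen for illustration. It shows one plausible way a per-block gate and a compute penalty could be combined.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gate_probabilities(logits):
    """Soft activation probability for each control block (assumed sigmoid gate)."""
    return [sigmoid(z) for z in logits]


def hard_gates(logits, threshold=0.5):
    """Inference-time decision: run a block only if its gate exceeds the threshold."""
    return [p > threshold for p in gate_probabilities(logits)]


def computation_aware_loss(task_loss, logits, block_costs, weight=0.1):
    """Task loss plus a penalty on the expected fraction of compute that is active."""
    probs = gate_probabilities(logits)
    expected_cost = sum(p * c for p, c in zip(probs, block_costs))
    return task_loss + weight * expected_cost / sum(block_costs)


# Hypothetical router logits for four control blocks at one denoising step.
logits = [2.0, -1.5, 0.3, -3.0]
block_costs = [1.0, 1.0, 2.0, 2.0]  # made-up relative cost per block

active = hard_gates(logits)
loss = computation_aware_loss(task_loss=0.8, logits=logits, block_costs=block_costs)
print(active, loss)
```

Under these assumptions, a block whose gate stays low adds almost nothing to the penalty, so the optimizer is pushed to keep a block active only when doing so lowers the task loss by more than its compute cost.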
Related papers
- Enhancing Privacy in ControlNet and Stable Diffusion via Split Learning [0.10878040851638002]
We find conventional federated learning and split learning unsuitable for ControlNet training.
We propose a new distributed learning structure that eliminates the need for the server to send gradients back.
We propose a privacy-preserving activation function and a method to prevent private text prompts from leaving clients.
arXiv Detail & Related papers (2024-09-13T02:55:22Z)
- ControlNeXt: Powerful and Efficient Control for Image and Video Generation [59.62289489036722]
We propose ControlNeXt: a powerful and efficient method for controllable image and video generation.
We first design a more straightforward and efficient architecture, replacing heavy additional branches with minimal additional cost.
As for training, we reduce up to 90% of learnable parameters compared to the alternatives.
arXiv Detail & Related papers (2024-08-12T11:41:18Z)
- Adding Conditional Control to Diffusion Models with Reinforcement Learning [68.06591097066811]
Diffusion models are powerful generative models that allow for precise control over the characteristics of the generated samples.
While these diffusion models trained on large datasets have achieved success, there is often a need to introduce additional controls in downstream fine-tuning processes.
This work presents a novel method based on reinforcement learning (RL) to add such controls using an offline dataset.
arXiv Detail & Related papers (2024-06-17T22:00:26Z)
- FreeCtrl: Constructing Control Centers with Feedforward Layers for Learning-Free Controllable Text Generation [12.925771335213156]
Controllable text generation (CTG) seeks to craft texts adhering to specific attributes.
We propose FreeCtrl, a learning-free approach that dynamically adjusts the weights of selected feedforward neural network (FFN) vectors.
By identifying and adaptively adjusting the weights of attribute-related FFN vectors, FreeCtrl can control the output likelihood of attribute keywords in the generated content.
arXiv Detail & Related papers (2024-06-14T03:18:28Z)
- FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation [99.4649330193233]
Controllable text-to-image (T2I) diffusion models generate images conditioned on both text prompts and semantic inputs of other modalities like edge maps.
We propose a novel Flexible and Efficient method, FlexEControl, for controllable T2I generation.
arXiv Detail & Related papers (2024-05-08T06:09:11Z)
- DITTO: Diffusion Inference-Time T-Optimization for Music Generation [49.90109850026932]
Diffusion Inference-Time T-Optimization (DITTO) is a framework for controlling pre-trained text-to-music diffusion models at inference time.
We demonstrate a surprisingly wide range of applications for music generation, including inpainting, outpainting, and looping, as well as intensity, melody, and musical structure control.
arXiv Detail & Related papers (2024-01-22T18:10:10Z)
- UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild [166.25327094261038]
We introduce UniControl, a new generative foundation model for controllable condition-to-image (C2I) tasks.
UniControl consolidates a wide array of C2I tasks within a singular framework, while still allowing for arbitrary language prompts.
Trained on nine unique C2I tasks, UniControl demonstrates impressive zero-shot generation abilities.
arXiv Detail & Related papers (2023-05-18T17:41:34Z)
- DiffFacto: Controllable Part-Based 3D Point Cloud Generation with Cross Diffusion [68.39543754708124]
We introduce DiffFacto, a novel probabilistic generative model that learns the distribution of shapes with part-level control.
Experiments show that our method is able to generate novel shapes with multiple axes of control.
It achieves state-of-the-art part-level generation quality and generates plausible and coherent shapes.
arXiv Detail & Related papers (2023-05-03T06:38:35Z)
- Federated Learning with Flexible Control [30.65854375019346]
Federated learning (FL) enables distributed model training from local data collected by users.
In distributed systems with constrained resources and potentially high dynamics, e.g., mobile edge networks, the efficiency of FL is an important problem.
We propose FlexFL - an FL algorithm with multiple options that can be adjusted flexibly.
arXiv Detail & Related papers (2022-12-16T14:21:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.