Flow-Guided Controllable Line Drawing Generation
- URL: http://arxiv.org/abs/2307.07540v2
- Date: Thu, 24 Aug 2023 09:11:26 GMT
- Title: Flow-Guided Controllable Line Drawing Generation
- Authors: Chengyu Fang, Xianfeng Han
- Abstract summary: We present an Image-to-Flow network (I2FNet) to efficiently and robustly create the vector flow field in a learning-based manner.
We then introduce our well-designed Double Flow Generator (DFG) framework to fuse features from learned vector flow and input image flow.
In order to allow for controllable character line drawing generation, we integrate a Line Control Matrix into DFG and train a Line Control Regressor.
- Score: 6.200483285433661
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we investigate the problem of automatically controllable
artistic character line drawing generation from photographs by proposing a
Vector Flow Aware and Line Controllable Image-to-Image Translation
architecture, which can be viewed as an appealing intersection between
Artificial Intelligence and Arts. Specifically, we first present an
Image-to-Flow network (I2FNet) to efficiently and robustly create the vector
flow field in a learning-based manner, which can provide a direction guide for
drawing lines. Then, we introduce our well-designed Double Flow Generator (DFG)
framework to fuse features from learned vector flow and input image flow
guaranteeing the spatial coherence of lines. Meanwhile, in order to allow for
controllable character line drawing generation, we integrate a Line Control
Matrix (LCM) into DFG and train a Line Control Regressor (LCR) to synthesize
drawings in different styles by precisely controlling the level of detail of
lines, such as their thickness, smoothness, and continuity. Finally, we design a
Fourier Transformation Loss to further constrain the character line generation
from the frequency domain point of view. Quantitative and qualitative
experiments demonstrate that our approach can obtain superior performance in
producing high-resolution character line-drawing images with perceptually
realistic characteristics.
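The abstract does not spell out the exact form of the Fourier Transformation Loss. As a minimal sketch of one plausible reading, the PyTorch snippet below penalizes the L1 distance between the amplitude spectra of the generated and ground-truth drawings; the function name fourier_loss and the amplitude-only comparison are assumptions, not the paper's stated formulation.

```python
import torch

def fourier_loss(generated: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """L1 distance between amplitude spectra of two image batches.

    A hypothetical sketch of a frequency-domain constraint; the
    paper's exact loss is not given in the abstract.
    generated, target: (B, C, H, W) tensors in [0, 1].
    """
    # 2-D FFT over the spatial dimensions of each channel.
    fft_gen = torch.fft.fft2(generated, norm="ortho")
    fft_tgt = torch.fft.fft2(target, norm="ortho")
    # Compare amplitude spectra; a phase term could be added similarly.
    return torch.mean(torch.abs(fft_gen.abs() - fft_tgt.abs()))

# Usage: add to the usual pixel-space objectives with some weight.
gen = torch.rand(2, 1, 256, 256, requires_grad=True)
gt = torch.rand(2, 1, 256, 256)
fourier_loss(gen, gt).backward()
```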
Related papers
- Neural Image Abstraction Using Long Smoothing B-Splines [33.22485341851476]
We show how to generate smooth and arbitrarily long paths within image-based deep learning systems.
We take advantage of derivative-based smoothing costs for parametric control of fidelity vs. simplicity tradeoffs.
arXiv Detail & Related papers (2025-11-07T15:50:48Z)
- CurveFlow: Curvature-Guided Flow Matching for Image Generation [11.836900973675297]
Existing rectified flow models are based on linear trajectories between data and noise distributions.
This linearity enforces zero curvature, which can inadvertently force the image generation process through low-probability regions of the data manifold.
We introduce CurveFlow, a novel flow matching framework designed to learn smooth, non-linear trajectories by incorporating curvature guidance into the flow path.
arXiv Detail & Related papers (2025-08-20T22:06:13Z)
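To make the "linear trajectories" point concrete, here is a minimal sketch of a plain rectified-flow training step: samples are placed on the straight line between noise and data, so the target velocity is the constant x1 - x0 and the path has zero curvature. The toy model interface is assumed; this is not CurveFlow's curvature-guided objective.

```python
import torch

def rectified_flow_loss(model, x1: torch.Tensor) -> torch.Tensor:
    """One training step of plain rectified flow (linear trajectories).

    x1: a batch of data samples, shape (B, D); model(x_t, t) is any
    velocity-field predictor (interface assumed for illustration).
    """
    x0 = torch.randn_like(x1)             # noise endpoint
    t = torch.rand(x1.shape[0], 1)        # random time in (0, 1)
    x_t = (1.0 - t) * x0 + t * x1         # straight-line interpolant
    v_target = x1 - x0                    # constant velocity => zero curvature
    return torch.mean((model(x_t, t) - v_target) ** 2)

# Toy usage with a dummy velocity model:
loss = rectified_flow_loss(lambda x, t: torch.zeros_like(x), torch.randn(8, 2))
```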
- SwiftSketch: A Diffusion Model for Image-to-Vector Sketch Generation [57.47730473674261]
We introduce SwiftSketch, a model for image-conditioned vector sketch generation that can produce high-quality sketches in less than a second.
SwiftSketch operates by progressively denoising stroke control points sampled from a Gaussian distribution.
ControlSketch is a method that enhances SDS-based techniques by incorporating precise spatial control through a depth-aware ControlNet.
arXiv Detail & Related papers (2025-02-12T18:57:12Z)
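As a hedged illustration of "progressively denoising stroke control points", the sketch below runs a crude deterministic denoising loop over Bezier control points under a variance-exploding noise schedule; the denoiser interface, schedule, and stroke parameterization are all assumptions, not SwiftSketch's actual sampler.

```python
import torch

@torch.no_grad()
def sample_strokes(denoiser, n_strokes=16, steps=50):
    """Toy sampler: progressively denoise stroke control points.

    denoiser(x, t) is assumed to predict the clean control points;
    each stroke is 4 cubic-Bezier control points in 2-D.
    """
    x = torch.randn(n_strokes, 4, 2)          # start from Gaussian noise
    for k in reversed(range(1, steps + 1)):
        t = torch.full((1,), k / steps)
        x0_hat = denoiser(x, t)               # predicted clean points
        # Deterministic update for a sigma_t = t/steps noise schedule.
        x = x0_hat + ((k - 1) / k) * (x - x0_hat)
    return x

# Toy usage: a dummy denoiser that pulls points toward the origin.
points = sample_strokes(lambda x, t: 0.5 * x)
```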
- LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer [17.881925697226656]
LayerTracer is a diffusion transformer that bridges the gap by learning designers' layered creation processes from a novel dataset of sequential design operations.
For image vectorization, we introduce a conditional diffusion mechanism that encodes reference images into latent tokens.
Experiments demonstrate LayerTracer's superior performance against optimization-based and neural baselines in both generation quality and editability.
arXiv Detail & Related papers (2025-02-03T06:49:58Z)
- FluxSpace: Disentangled Semantic Editing in Rectified Flow Transformers [4.710921988115686]
We introduce FluxSpace, a domain-agnostic image editing method with the ability to control the semantics of images generated by rectified flow transformers.
By leveraging the representations learned by the transformer blocks within the rectified flow models, we propose a set of semantically interpretable representations that enable a wide range of image editing tasks.
arXiv Detail & Related papers (2024-12-12T18:59:40Z)
- Steering Rectified Flow Models in the Vector Field for Controlled Image Generation [53.965218831845995]
Diffusion models (DMs) excel in photorealism, image editing, and solving inverse problems, aided by classifier-free guidance and image inversion techniques.
Existing DM-based methods often require additional training, lack generalization to pretrained latent models, underperform, and demand significant computational resources due to extensive backpropagation through ODE solvers and inversion processes.
We propose FlowChef, which leverages the vector field to steer the denoising trajectory for controlled image generation tasks, facilitated by gradient skipping.
FlowChef significantly outperforms baselines in terms of performance, memory, and time requirements, achieving new state-of-the-art results.
arXiv Detail & Related papers (2024-11-27T19:04:40Z)
- TraDiffusion: Trajectory-Based Training-Free Image Generation [85.39724878576584]
We propose a training-free, trajectory-based controllable T2I approach, termed TraDiffusion.
This novel method allows users to effortlessly guide image generation via mouse trajectories.
arXiv Detail & Related papers (2024-08-19T07:01:43Z)
- Text-to-Vector Generation with Neural Path Representation [27.949704002538944]
We propose a novel neural path representation that learns the path latent space from both sequence and image modalities.
In the first stage, a pre-trained text-to-image diffusion model guides the initial generation of complex vector graphics.
In the second stage, we refine the graphics using a layer-wise image vectorization strategy to achieve clearer elements and structure.
arXiv Detail & Related papers (2024-05-16T17:59:22Z)
- D-Flow: Differentiating through Flows for Controlled Generation [37.80603174399585]
We introduce D-Flow, a framework for controlling the generation process by differentiating through the flow.
We motivate this framework with the key observation that, for Diffusion/FM models trained with Gaussian probability paths, differentiating through the generation process projects the gradient onto the data manifold.
We validate our framework on linear and non-linear controlled generation problems, including image and audio inverse problems and conditional molecule generation, reaching state-of-the-art performance across all of them.
arXiv Detail & Related papers (2024-02-21T18:56:03Z)
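A minimal sketch of the idea under simplifying assumptions: the source noise is optimized by backpropagating a downstream cost through a plain Euler integration of the learned velocity field. The solver, step counts, and cost are illustrative choices, not the paper's exact setup.

```python
import torch

def dflow_optimize(velocity, cost, dim=2, steps=20, iters=50, lr=0.1):
    """Optimize the initial noise by differentiating through the flow.

    velocity(x, t) is a learned velocity field and cost(x1) a scalar
    objective on the generated sample; both interfaces are assumed.
    """
    x0 = torch.randn(1, dim, requires_grad=True)
    opt = torch.optim.Adam([x0], lr=lr)
    for _ in range(iters):
        x = x0
        for k in range(steps):                # Euler ODE integration
            t = torch.full((1, 1), k / steps)
            x = x + velocity(x, t) / steps
        opt.zero_grad()
        cost(x).backward()                    # gradients flow through the solver
        opt.step()
    return x0.detach()

# Toy usage: steer the generated sample toward the origin.
x0_star = dflow_optimize(lambda x, t: -x, lambda x: (x ** 2).sum())
```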
- End-to-End Diffusion Latent Optimization Improves Classifier Guidance [81.27364542975235]
Direct Optimization of Diffusion Latents (DOODL) is a novel guidance method.
It enables plug-and-play guidance by optimizing diffusion latents.
It outperforms one-step classifier guidance on computational and human evaluation metrics.
arXiv Detail & Related papers (2023-03-23T22:43:52Z)
- Graph Decision Transformer [83.76329715043205]
Graph Decision Transformer (GDT) is a novel offline reinforcement learning approach.
GDT models the input sequence into a causal graph to capture potential dependencies between fundamentally different concepts.
Our experiments show that GDT matches or surpasses the performance of state-of-the-art offline RL methods on image-based Atari and OpenAI Gym.
arXiv Detail & Related papers (2023-03-07T09:10:34Z)
- Hierarchical Semantic Regularization of Latent Spaces in StyleGANs [53.98170188547775]
We propose a Hierarchical Semantic Regularizer (HSR) which aligns the hierarchical representations learnt by the generator to corresponding powerful features learnt by pretrained networks on large amounts of data.
HSR is shown to not only improve generator representations but also the linearity and smoothness of the latent style spaces, leading to the generation of more natural-looking style-edited images.
arXiv Detail & Related papers (2022-08-07T16:23:33Z)
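A minimal sketch of such a feature-alignment regularizer, with all interfaces assumed: generator features at several depths are projected into the space of a frozen pretrained network's features and matched with an MSE penalty. This illustrates the general idea, not HSR's exact formulation.

```python
import torch
import torch.nn.functional as F

def alignment_loss(gen_feats, pre_feats, projectors):
    """Toy hierarchical feature-alignment regularizer.

    gen_feats / pre_feats: lists of feature maps from matching depths
    of the generator and a frozen pretrained network; projectors:
    per-depth modules mapping generator features to the shape of the
    pretrained features. All three are assumed interfaces.
    """
    loss = torch.zeros(())
    for g, p, proj in zip(gen_feats, pre_feats, projectors):
        loss = loss + F.mse_loss(proj(g), p.detach())  # match frozen targets
    return loss
```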
- DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation [56.514462874501675]
We propose a dynamic sparse attention based Transformer model to achieve fine-level matching with favorable efficiency.
The heart of our approach is a novel dynamic-attention unit, dedicated to adapting to variation in the optimal number of tokens each position should attend to.
Experiments on three applications, pose-guided person image generation, edge-based face synthesis, and undistorted image style transfer, demonstrate that DynaST achieves superior performance in local details.
arXiv Detail & Related papers (2022-07-13T11:12:03Z)
- Differentiable Drawing and Sketching [0.0]
We present a differentiable relaxation of the process of drawing points, lines and curves into a pixel raster.
This relaxation allows end-to-end differentiable programs and deep networks to be learned and optimised.
arXiv Detail & Related papers (2021-03-30T09:25:55Z)
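One common way to relax line drawing into a differentiable operation, in the spirit of this paper, is to rasterize a segment as a smooth function of each pixel's distance to it; the sketch below uses a sigmoid falloff. The falloff, hardness, and width parameters are illustrative choices, not the paper's exact relaxation.

```python
import torch

def draw_segment(p0, p1, size=64, hardness=20.0, width=0.02):
    """Differentiable rasterization of one line segment.

    p0, p1: endpoints in [0, 1]^2, tensors of shape (2,); returns a
    (size, size) image with gradients flowing back to the endpoints.
    """
    ys, xs = torch.meshgrid(
        torch.linspace(0, 1, size), torch.linspace(0, 1, size), indexing="ij"
    )
    pix = torch.stack([xs, ys], dim=-1)       # (size, size, 2) pixel coords
    d = p1 - p0
    # Project each pixel onto the segment, clamped to the segment extent.
    t = ((pix - p0) * d).sum(-1) / (d * d).sum().clamp_min(1e-8)
    closest = p0 + t.clamp(0.0, 1.0).unsqueeze(-1) * d
    dist = torch.norm(pix - closest, dim=-1)
    # Smooth step: close to 1 inside the stroke, close to 0 outside.
    return torch.sigmoid(hardness * (width - dist) / width)

# The endpoints can now be optimized by gradient descent, e.g. to
# match a target raster image with an MSE loss.
img = draw_segment(torch.tensor([0.1, 0.1]), torch.tensor([0.9, 0.8]))
```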
- Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
An interpretable generation process is beneficial to various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
arXiv Detail & Related papers (2020-11-24T02:18:08Z)
- Controllable Continuous Gaze Redirection [47.15883248953411]
We present interpGaze, a novel framework for controllable gaze redirection.
Our goal is to redirect the eye gaze of one person into any gaze direction depicted in the reference image.
The proposed interpGaze outperforms state-of-the-art methods in terms of image quality and redirection precision.
arXiv Detail & Related papers (2020-10-09T11:50:06Z)