TraDiffusion: Trajectory-Based Training-Free Image Generation
- URL: http://arxiv.org/abs/2408.09739v1
- Date: Mon, 19 Aug 2024 07:01:43 GMT
- Title: TraDiffusion: Trajectory-Based Training-Free Image Generation
- Authors: Mingrui Wu, Oucheng Huang, Jiayi Ji, Jiale Li, Xinyue Cai, Huafeng Kuang, Jianzhuang Liu, Xiaoshuai Sun, Rongrong Ji
- Abstract summary: We propose a training-free, trajectory-based controllable T2I approach, termed TraDiffusion.
This novel method allows users to effortlessly guide image generation via mouse trajectories.
- Score: 85.39724878576584
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we propose a training-free, trajectory-based controllable T2I approach, termed TraDiffusion. This novel method allows users to effortlessly guide image generation via mouse trajectories. To achieve precise control, we design a distance awareness energy function to effectively guide latent variables, ensuring that the focus of generation is within the areas defined by the trajectory. The energy function encompasses a control function to draw the generation closer to the specified trajectory and a movement function to diminish activity in areas distant from the trajectory. Through extensive experiments and qualitative assessments on the COCO dataset, the results reveal that TraDiffusion facilitates simpler, more natural image control. Moreover, it showcases the ability to manipulate salient regions, attributes, and relationships within the generated images, alongside visual input based on arbitrary or enhanced trajectories.
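The abstract describes the energy function only at a high level: a control term that draws the generated content toward the user-drawn trajectory, and a movement term that suppresses activity far from it. The following is a minimal, hypothetical sketch of such a distance-aware energy, not the authors' implementation; the function names, the use of a generic per-region response map, and the Gaussian/threshold weighting are all assumptions made for illustration.

```python
# Hypothetical sketch of a distance-aware trajectory energy:
# a control term concentrates response near the trajectory,
# a movement term penalizes response far from it.
import numpy as np


def distance_map(trajectory_xy, height, width):
    """Euclidean distance from every pixel to the nearest trajectory point (assumed discretization)."""
    ys, xs = np.mgrid[0:height, 0:width]
    pts = np.asarray(trajectory_xy, dtype=np.float32)                 # (N, 2) as (x, y)
    d = np.sqrt((xs[..., None] - pts[:, 0]) ** 2 + (ys[..., None] - pts[:, 1]) ** 2)
    return d.min(axis=-1)                                             # (H, W)


def control_energy(response, dist, sigma=8.0):
    """Lower energy when the (normalized) response mass sits close to the trajectory."""
    weight = np.exp(-(dist ** 2) / (2.0 * sigma ** 2))                # high near the trajectory
    r = response / (response.sum() + 1e-8)
    return 1.0 - (r * weight).sum()


def movement_energy(response, dist, margin=16.0):
    """Penalize response that remains in regions farther than `margin` pixels from the trajectory."""
    far = (dist > margin).astype(np.float32)
    r = response / (response.sum() + 1e-8)
    return (r * far).sum()


def trajectory_energy(response, trajectory_xy, lam=1.0):
    """Total energy = control term + lam * movement term for one response map."""
    h, w = response.shape
    dist = distance_map(trajectory_xy, h, w)
    return control_energy(response, dist) + lam * movement_energy(response, dist)


if __name__ == "__main__":
    # Toy example: a 64x64 response map scored against a diagonal mouse stroke.
    rng = np.random.default_rng(0)
    response = rng.random((64, 64)).astype(np.float32)
    trajectory = [(i, i) for i in range(8, 56)]
    print("energy:", trajectory_energy(response, trajectory))
```

In a training-free guidance setting such as the one the abstract describes, an energy of this kind would be evaluated on intermediate model responses and its gradient used to nudge the latent variables during sampling; the sketch above only shows how the two terms could be scored.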
Related papers
- FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models [41.006754386910686]
We argue that the diffusion model itself allows decent control over the generated content without requiring any training.
We introduce a tuning-free framework to achieve trajectory-controllable video generation, by imposing guidance on both noise construction and attention computation.
arXiv Detail & Related papers (2024-06-24T17:59:56Z) - Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-aware Spatio-Temporal Sampling [70.34875558830241]
We present a way of learning a spatio-temporal (4D) semantic embedding, based on which we introduce the concept of gears to allow for stratified modeling of dynamic regions of the scene.
At the same time, almost for free, our tracking approach enables free-viewpoint tracking of objects of interest - a functionality not yet achieved by existing NeRF-based methods.
arXiv Detail & Related papers (2024-06-06T03:37:39Z) - FlowIE: Efficient Image Enhancement via Rectified Flow [71.6345505427213]
FlowIE is a flow-based framework that estimates straight-line paths from an elementary distribution to high-quality images.
Our contributions are rigorously validated through comprehensive experiments on synthetic and real-world datasets.
arXiv Detail & Related papers (2024-06-01T17:29:29Z) - D-Cubed: Latent Diffusion Trajectory Optimisation for Dexterous Deformable Manipulation [15.680133621889809]
D-Cubed is a novel trajectory optimisation method using a latent diffusion model (LDM) trained from a task-agnostic play dataset.
We demonstrate that D-Cubed outperforms traditional trajectory optimisation and competitive baseline approaches by a significant margin.
arXiv Detail & Related papers (2024-03-19T16:05:51Z) - Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion [83.88829943619656]
We introduce a method for generating realistic pedestrian trajectories and full-body animations that can be controlled to meet user-defined goals.
Our guided diffusion model allows users to constrain trajectories through target waypoints, speed, and specified social groups.
We propose utilizing the value function learned during RL training of the animation controller to guide diffusion to produce trajectories better suited for particular scenarios.
arXiv Detail & Related papers (2023-04-04T15:46:42Z) - Neural Motion Fields: Encoding Grasp Trajectories as Implicit Value Functions [65.84090965167535]
We present Neural Motion Fields, a novel object representation which encodes both object point clouds and the relative task trajectories as an implicit value function parameterized by a neural network.
This object-centric representation models a continuous distribution over the SE(3) space and allows us to perform grasping reactively by leveraging sampling-based MPC to optimize this value function.
arXiv Detail & Related papers (2022-06-29T18:47:05Z) - CUDA-GR: Controllable Unsupervised Domain Adaptation for Gaze Redirection [3.0141238193080295]
The aim of gaze redirection is to manipulate the gaze in an image to the desired direction.
Advances in generative adversarial networks have shown excellent results in generating photo-realistic images.
To enable such fine-tuned control, one needs to obtain ground truth annotations for the training data which can be very expensive.
arXiv Detail & Related papers (2021-06-21T04:39:42Z) - Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
An interpretable generation process is beneficial to various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
arXiv Detail & Related papers (2020-11-24T02:18:08Z) - Controllable Continuous Gaze Redirection [47.15883248953411]
We present interpGaze, a novel framework for controllable gaze redirection.
Our goal is to redirect the eye gaze of one person into any gaze direction depicted in the reference image.
The proposed interpGaze outperforms state-of-the-art methods in terms of image quality and redirection precision.
arXiv Detail & Related papers (2020-10-09T11:50:06Z)