Modelling Latent Dynamics of StyleGAN using Neural ODEs
- URL: http://arxiv.org/abs/2208.11197v2
- Date: Sat, 22 Apr 2023 20:18:14 GMT
- Title: Modelling Latent Dynamics of StyleGAN using Neural ODEs
- Authors: Weihao Xia and Yujiu Yang and Jing-Hao Xue
- Abstract summary: We learn the trajectory of independently inverted latent codes from GANs.
The learned continuous trajectory allows us to perform infinite frame interpolation and consistent video manipulation.
Our method achieves state-of-the-art performance but with much less computation.
- Score: 52.03496093312985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose to model the video dynamics by learning the
trajectory of independently inverted latent codes from GANs. The entire
sequence is seen as discrete-time observations of a continuous trajectory of
the initial latent code, by considering each latent code as a moving particle
and the latent space as a high-dimensional dynamic system. The latent codes
representing different frames are therefore reformulated as state transitions
of the initial frame, which can be modeled by neural ordinary differential
equations. The learned continuous trajectory allows us to perform infinite
frame interpolation and consistent video manipulation. The latter task is
reintroduced for video editing with the advantage of requiring the core
operations to be applied to the first frame only while maintaining temporal
consistency across all frames. Extensive experiments demonstrate that our
method achieves state-of-the-art performance but with much less computation.
Code is available at https://github.com/weihaox/dynode_released.
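To make the idea concrete, the following is a minimal sketch (not the authors' released code) of fitting a neural ODE to per-frame inverted latent codes so that every frame becomes a state transition of the first; the latent dimension, network architecture, training loop, and the use of the torchdiffeq package are illustrative assumptions.
```python
import torch
import torch.nn as nn
from torchdiffeq import odeint  # assumed solver package: pip install torchdiffeq

LATENT_DIM = 512  # StyleGAN W-space dimensionality (assumed)

class LatentDynamics(nn.Module):
    """Parameterises dw/dt = f_theta(t, w) in the GAN latent space."""
    def __init__(self, dim=LATENT_DIM, hidden=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.Tanh(),
            nn.Linear(hidden, dim),
        )

    def forward(self, t, w):
        # The vector field here is autonomous; odeint still passes t.
        return self.net(w)

def fit_trajectory(inverted_codes, timestamps, epochs=2000, lr=1e-3):
    """inverted_codes: (T, LATENT_DIM) latent codes from per-frame GAN inversion.
    timestamps: (T,) strictly increasing frame times.
    Fits the ODE so that integrating the first code forward matches the rest."""
    func = LatentDynamics()
    opt = torch.optim.Adam(func.parameters(), lr=lr)
    w0 = inverted_codes[0]
    for _ in range(epochs):
        opt.zero_grad()
        pred = odeint(func, w0, timestamps)           # (T, LATENT_DIM)
        loss = torch.mean((pred - inverted_codes) ** 2)
        loss.backward()
        opt.step()
    return func

def sample_trajectory(func, w0, query_times):
    """Evaluate the learned continuous trajectory at arbitrary times
    (e.g. between original frames), then decode each code with the frozen generator."""
    with torch.no_grad():
        return odeint(func, w0, query_times)
```
Under this view, evaluating the trajectory at arbitrary query times yields the infinite frame interpolation described above, and consistent editing reduces to modifying only the first latent code before re-integrating the trajectory.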
Related papers
- Unfolding Videos Dynamics via Taylor Expansion [5.723852805622308]
We present a new self-supervised dynamics learning strategy for videos: Video Time-Differentiation for Instance Discrimination (ViDiDi).
ViDiDi observes different aspects of a video through various orders of temporal derivatives of its frame sequence.
ViDiDi learns a single neural network that encodes a video and its temporal derivatives into consistent embeddings.
arXiv Detail & Related papers (2024-09-04T01:41:09Z)
- VDG: Vision-Only Dynamic Gaussian for Driving Simulation [112.6139608504842]
We introduce self-supervised visual odometry (VO) into our pose-free dynamic Gaussian method (VDG).
VDG can work with only RGB image input and construct dynamic scenes at a faster speed and larger scenes compared with the pose-free dynamic view-synthesis method.
Our results show favorable performance over the state-of-the-art dynamic view synthesis methods.
arXiv Detail & Related papers (2024-06-26T09:29:21Z)
- Continuous Learned Primal Dual [10.111901389604423]
We propose the idea that a sequence of layers in a neural network is just a discretisation of an ODE, and thus can be directly modelled by a parameterised ODE (a minimal sketch of this correspondence appears after this list).
In this work, we explore the use of Neural ODEs for learned inverse problems, in particular with the well-known Learned Primal Dual algorithm, and apply it to computed tomography (CT) reconstruction.
arXiv Detail & Related papers (2024-05-03T20:40:14Z)
- RIGID: Recurrent GAN Inversion and Editing of Real Face Videos [73.97520691413006]
GAN inversion is indispensable for applying the powerful editability of GAN to real images.
Existing methods invert video frames individually, often leading to undesired inconsistent results over time.
We propose a unified recurrent framework, named Recurrent vIdeo GAN Inversion and eDiting (RIGID).
Our framework learns the inherent coherence between input frames in an end-to-end manner.
arXiv Detail & Related papers (2023-08-11T12:17:24Z)
- Towards Smooth Video Composition [59.134911550142455]
Video generation requires consistent and persistent frames with dynamic content over time.
This work investigates modeling the temporal relations for composing videos of arbitrary length, from a few frames to even infinitely many, using generative adversarial networks (GANs).
We show that the alias-free operation for single image generation, together with adequately pre-learned knowledge, brings a smooth frame transition without compromising the per-frame quality.
arXiv Detail & Related papers (2022-12-14T18:54:13Z)
- Continuous-Time Video Generation via Learning Motion Dynamics with Neural ODE [26.13198266911874]
We propose a novel video generation approach that learns separate distributions for motion and appearance.
We employ a two-stage approach where the first stage converts a noise vector to a sequence of keypoints at arbitrary frame rates, and the second stage synthesizes videos based on the given keypoint sequence and the appearance noise vector.
arXiv Detail & Related papers (2021-12-21T03:30:38Z)
- Simple Video Generation using Neural ODEs [9.303957136142293]
We learn latent variable models that predict the future in latent space and project back to pixels.
We show that our approach yields promising results in the task of future frame prediction on the Moving MNIST dataset with 1 and 2 digits.
arXiv Detail & Related papers (2021-09-07T19:03:33Z)
- Dynamic View Synthesis from Dynamic Monocular Video [69.80425724448344]
We present an algorithm for generating views at arbitrary viewpoints and any input time step given a monocular video of a dynamic scene.
We show extensive quantitative and qualitative results of dynamic view synthesis from casually captured videos.
arXiv Detail & Related papers (2021-05-13T17:59:50Z)
- Efficient Semantic Video Segmentation with Per-frame Inference [117.97423110566963]
In this work, we perform efficient semantic video segmentation in a per-frame fashion during inference.
We employ compact models for real-time execution. To narrow the performance gap between compact models and large models, new knowledge distillation methods are designed.
arXiv Detail & Related papers (2020-02-26T12:24:32Z)
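As a side note on the Continuous Learned Primal Dual entry above, the claim that a stack of layers is a discretisation of an ODE can be made concrete in a few lines: a residual update x <- x + h*f(x) is exactly one explicit Euler step of dx/dt = f(x). The dimensions, step size, and vector field below are illustrative assumptions, not code from that paper.
```python
import torch
import torch.nn as nn

torch.manual_seed(0)
f = nn.Sequential(nn.Linear(8, 8), nn.Tanh(), nn.Linear(8, 8))  # shared vector field

def residual_stack(x, layers=10, h=0.1):
    # Discrete network view: `layers` residual updates x <- x + h * f(x).
    for _ in range(layers):
        x = x + h * f(x)
    return x

def euler_integrate(x, t1=1.0, steps=10):
    # Continuous view: explicit Euler integration of dx/dt = f(x) on [0, t1].
    h = t1 / steps
    for _ in range(steps):
        x = x + h * f(x)
    return x

x0 = torch.randn(1, 8)
# With matching step size the two computations coincide exactly.
print(torch.allclose(residual_stack(x0), euler_integrate(x0)))  # True
```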
This list is automatically generated from the titles and abstracts of the papers on this site.