StyleDyRF: Zero-shot 4D Style Transfer for Dynamic Neural Radiance Fields
- URL: http://arxiv.org/abs/2403.08310v1
- Date: Wed, 13 Mar 2024 07:42:21 GMT
- Title: StyleDyRF: Zero-shot 4D Style Transfer for Dynamic Neural Radiance Fields
- Authors: Hongbin Xu, Weitao Chen, Feng Xiao, Baigui Sun, Wenxiong Kang
- Abstract summary: Existing efforts on 3D style transfer can effectively combine the visual features of style images and neural radiance fields (NeRF), but fail to handle dynamic 4D scenes.
We introduce StyleDyRF, a method that represents the 4D feature space by deforming a canonical feature volume.
We show that our method not only renders 4D photorealistic style transfer results in a zero-shot manner but also outperforms existing methods in terms of visual quality and consistency.
- Score: 21.55426133036809
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 4D style transfer aims at transferring arbitrary visual style to the
synthesized novel views of a dynamic 4D scene with varying viewpoints and
times. Existing efforts on 3D style transfer can effectively combine the visual
features of style images and neural radiance fields (NeRF), but they fail to
handle 4D dynamic scenes because they rely on a static-scene assumption.
Consequently, we tackle the novel and challenging problem of 4D style transfer
for the first time, which further requires consistent stylized results on
dynamic objects. In this paper, we introduce StyleDyRF, a method that represents the 4D
feature space by deforming a canonical feature volume and learns a linear style
transformation matrix on the feature volume in a data-driven fashion. To obtain
the canonical feature volume, the rays at each time step are deformed with the
geometric prior of a pre-trained dynamic NeRF to render the feature map under
the supervision of pre-trained visual encoders. With the content and style cues
in the canonical feature volume and the style image, we can learn the style
transformation matrix from their covariance matrices with lightweight neural
networks. The learned style transformation matrix directly matches the feature
covariance of the content volume to that of the given style pattern, analogous
to Gram-matrix optimization in traditional 2D neural style transfer. The
experimental results show that our method not only renders 4D
photorealistic style transfer results in a zero-shot manner but also
outperforms existing methods in terms of visual quality and consistency.
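To make the covariance-based transformation concrete, here is a minimal PyTorch sketch of a data-driven linear style transfer step in the spirit described above. It assumes an LST-style formulation in which small networks map the content and style covariance matrices to the two factors of a C x C transformation; the class and function names, layer sizes, and feature shapes are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class CovPredictor(nn.Module):
    """Tiny network mapping a C x C covariance matrix to a C x C factor.

    Illustrative only: the plain-MLP design and hidden size are assumptions,
    not the architecture described in the paper.
    """

    def __init__(self, channels: int, hidden: int = 256):
        super().__init__()
        self.channels = channels
        self.net = nn.Sequential(
            nn.Linear(channels * channels, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, channels * channels),
        )

    def forward(self, cov: torch.Tensor) -> torch.Tensor:
        c = self.channels
        return self.net(cov.reshape(-1)).reshape(c, c)


def covariance(feat: torch.Tensor) -> torch.Tensor:
    """Channel-wise covariance of mean-centered features, feat: [N, C]."""
    centered = feat - feat.mean(dim=0, keepdim=True)
    return centered.t() @ centered / max(feat.shape[0] - 1, 1)


def linear_style_transfer(content_feat, style_feat, content_net, style_net):
    """Apply a data-driven linear style transformation.

    content_feat: [N, C] features sampled from the canonical feature volume.
    style_feat:   [M, C] features of the style image from a pre-trained encoder.
    The C x C transform is built from the two covariance matrices, echoing
    Gram-matrix matching in classic 2D neural style transfer.
    """
    c_mean = content_feat.mean(dim=0, keepdim=True)
    s_mean = style_feat.mean(dim=0, keepdim=True)
    t_c = content_net(covariance(content_feat))  # content-side factor
    t_s = style_net(covariance(style_feat))      # style-side factor
    transform = t_s @ t_c                        # combined C x C matrix
    return (content_feat - c_mean) @ transform.t() + s_mean


if __name__ == "__main__":
    C = 32
    content = torch.randn(1024, C)  # stand-in for canonical-volume features
    style = torch.randn(4096, C)    # stand-in for style-image features
    out = linear_style_transfer(content, style, CovPredictor(C), CovPredictor(C))
    print(out.shape)  # torch.Size([1024, 32])
```

In StyleDyRF, the content features would be sampled from the canonical feature volume (rendered by deforming rays with the geometric prior of a pre-trained dynamic NeRF), and the stylized features would then be decoded back to RGB; the sketch covers only the transformation step itself.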
Related papers
- 4DStyleGaussian: Zero-shot 4D Style Transfer with Gaussian Splatting [15.456479631131522]
We introduce 4DStyleGaussian, a novel 4D style transfer framework to achieve real-time stylization of arbitrary style references.
Our method can achieve high-quality and zero-shot stylization for 4D scenarios with enhanced efficiency and spatial-temporal consistency.
arXiv Detail & Related papers (2024-10-14T12:03:00Z)
- Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering [57.895846642868904]
We present a 3D generative model named DynaVol-S for dynamic scenes that enables object-centric learning.
Object-centric voxelization infers per-object occupancy probabilities at individual spatial locations.
Our approach integrates 2D semantic features to create 3D semantic grids, representing the scene through multiple disentangled voxel grids.
arXiv Detail & Related papers (2024-07-30T15:33:58Z)
- SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer [57.506654943449796]
We propose an efficient, sparse-controlled video-to-4D framework named SC4D that decouples motion and appearance.
Our method surpasses existing methods in both quality and efficiency.
We devise a novel application that seamlessly transfers motion onto a diverse array of 4D entities.
arXiv Detail & Related papers (2024-04-04T18:05:18Z)
- Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models [94.07744207257653]
We focus on the underexplored text-to-4D setting and synthesize dynamic, animated 3D objects.
We combine text-to-image, text-to-video, and 3D-aware multiview diffusion models to provide feedback during 4D object optimization.
arXiv Detail & Related papers (2023-12-21T11:41:02Z)
- A Unified Approach for Text- and Image-guided 4D Scene Generation [58.658768832653834]
We propose Dream-in-4D, which features a novel two-stage approach for text-to-4D synthesis.
We show that our approach significantly advances image and motion quality, 3D consistency and text fidelity for text-to-4D generation.
Our method offers, for the first time, a unified approach for text-to-4D, image-to-4D and personalized 4D generation tasks.
arXiv Detail & Related papers (2023-11-28T15:03:53Z)
- Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting [8.078460597825142]
Reconstructing dynamic 3D scenes from 2D images and generating diverse views over time is challenging due to scene complexity and temporal dynamics.
We propose to approximate the underlying spatio-temporal rendering volume of a dynamic scene by optimizing a collection of 4D primitives, with explicit geometry and appearance modeling.
Our model is conceptually simple, consisting of a 4D Gaussian parameterized by anisotropic ellipses that can rotate arbitrarily in space and time, as well as view-dependent and time-evolved appearance represented by coefficients of 4D spherindrical harmonics.
arXiv Detail & Related papers (2023-10-16T17:57:43Z)
- StyleRF: Zero-shot 3D Style Transfer of Neural Radiance Fields [52.19291190355375]
StyleRF (Style Radiance Fields) is an innovative 3D style transfer technique.
It employs an explicit grid of high-level features to represent 3D scenes, with which high-fidelity geometry can be reliably restored via volume rendering.
It transforms the grid features according to the reference style which directly leads to high-quality zero-shot style transfer.
arXiv Detail & Related papers (2023-03-19T08:26:06Z)
- NeRF-Art: Text-Driven Neural Radiance Fields Stylization [38.3724634394761]
We present NeRF-Art, a text-guided NeRF stylization approach that manipulates the style of a pre-trained NeRF model with a simple text prompt.
We show that our method is effective and robust regarding both single-view stylization quality and cross-view consistency.
arXiv Detail & Related papers (2022-12-15T18:59:58Z)
- Stylizing 3D Scene via Implicit Representation and HyperNetwork [34.22448260525455]
A straightforward solution is to combine existing novel view synthesis and image/video style transfer approaches.
Inspired by the high quality results of the neural radiance fields (NeRF) method, we propose a joint framework to directly render novel views with the desired style.
Our framework consists of two components: an implicit representation of the 3D scene with the neural radiance field model, and a hypernetwork to transfer the style information into the scene representation.
arXiv Detail & Related papers (2021-05-27T09:11:30Z)
- Neural Radiance Flow for 4D View Synthesis and Video Processing [59.9116932930108]
We present a method to learn a 4D spatial-temporal representation of a dynamic scene from a set of RGB images.
Key to our approach is the use of a neural implicit representation that learns to capture the 3D occupancy, radiance, and dynamics of the scene.
arXiv Detail & Related papers (2020-12-17T17:54:32Z)