CD-NGP: A Fast Scalable Continual Representation for Dynamic Scenes
- URL: http://arxiv.org/abs/2409.05166v4
- Date: Wed, 23 Oct 2024 02:10:29 GMT
- Title: CD-NGP: A Fast Scalable Continual Representation for Dynamic Scenes
- Authors: Zhenhuan Liu, Shuai Liu, Zhiwei Ning, Jie Yang, Wei Liu
- Abstract summary: We propose continual dynamic neural graphics primitives (CD-NGP) for view synthesis.
Our approach synergizes features from both temporal and spatial hash encodings to achieve high rendering quality.
We introduce a novel dataset comprising multi-view, exceptionally long video sequences with substantial rigid and non-rigid motion.
- Score: 9.217592165862762
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Current methodologies for novel view synthesis (NVS) in dynamic scenes encounter significant challenges in harmonizing memory consumption, model complexity, training efficiency, and rendering fidelity. Existing offline techniques, while delivering high-quality results, are often characterized by substantial memory demands and limited scalability. In contrast, online methods grapple with the challenge of balancing rapid convergence with model compactness. To address these issues, we propose continual dynamic neural graphics primitives (CD-NGP). Our approach synergizes features from both temporal and spatial hash encodings to achieve high rendering quality, employs parameter reuse to enhance scalability, and leverages a continual learning framework to mitigate memory overhead. Furthermore, we introduce a novel dataset comprising multi-view, exceptionally long video sequences with substantial rigid and non-rigid motion, thereby substantiating the scalability of our method.
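The abstract only sketches the encoding design. As a rough illustration of fusing spatial and temporal hash features, the numpy sketch below hashes (x, y, z) and (x, y, z, t) into separate feature tables and concatenates the results; the table sizes, feature dimensions, single-level lookup without interpolation, and fusion by concatenation are all our assumptions, not the paper's actual architecture:
```python
import numpy as np

# Hypothetical sizes -- the paper's real table sizes and feature dims are not given here.
TABLE_SIZE = 2 ** 14     # entries per hash table
FEAT_DIM = 2             # features per entry
PRIMES = np.array([1, 2654435761, 805459861, 3674653429], dtype=np.uint64)

rng = np.random.default_rng(0)
spatial_table = rng.normal(scale=1e-2, size=(TABLE_SIZE, FEAT_DIM))
temporal_table = rng.normal(scale=1e-2, size=(TABLE_SIZE, FEAT_DIM))

def hash_coords(coords):
    """Instant-NGP style spatial hash: XOR of coordinate*prime, mod table size."""
    coords = coords.astype(np.uint64)
    h = np.zeros(coords.shape[0], dtype=np.uint64)
    for d in range(coords.shape[1]):
        h ^= coords[:, d] * PRIMES[d]
    return (h % TABLE_SIZE).astype(np.int64)

def encode(xyz, t, resolution=128):
    """Concatenate spatial (x,y,z) and temporal (x,y,z,t) hash features."""
    grid = np.floor(xyz * resolution).astype(np.int64)                             # (N, 3)
    tgrid = np.concatenate([grid, np.floor(t * resolution).astype(np.int64)], axis=1)  # (N, 4)
    f_spatial = spatial_table[hash_coords(grid)]              # (N, FEAT_DIM)
    f_temporal = temporal_table[hash_coords(tgrid)]           # (N, FEAT_DIM)
    return np.concatenate([f_spatial, f_temporal], axis=1)    # (N, 2*FEAT_DIM)

xyz = rng.uniform(size=(4, 3))   # sample points in [0, 1)^3
t = rng.uniform(size=(4, 1))     # normalized time
print(encode(xyz, t).shape)      # (4, 4)
```
In the full method, such features would feed a small MLP predicting density and color; the multi-resolution levels and trilinear interpolation used by Instant-NGP-style grids are omitted here for brevity.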
Related papers
- Adaptive and Temporally Consistent Gaussian Surfels for Multi-view Dynamic Reconstruction [3.9363268745580426]
AT-GS is a novel method for reconstructing high-quality dynamic surfaces from multi-view videos through per-frame incremental optimization.
We reduce temporal jittering in dynamic surfaces by ensuring consistency in curvature maps across consecutive frames.
Our method achieves superior accuracy and temporal coherence in dynamic surface reconstruction, delivering high-fidelity space-time novel view synthesis.
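The AT-GS abstract does not state the exact form of its consistency term. One plausible reading, purely an assumption on our part, is a per-pixel penalty between consecutive frames' curvature maps:
```python
import numpy as np

def curvature_consistency_loss(curv_prev, curv_curr):
    """Hypothetical reading of the curvature-consistency idea: penalize
    per-pixel changes between consecutive frames' curvature maps.
    The actual AT-GS loss may differ (e.g., it may warp frames first)."""
    return np.mean((curv_curr - curv_prev) ** 2)
```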
arXiv Detail & Related papers (2024-11-10T21:30:16Z)
- ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation [83.62931466231898]
This paper presents ARLON, a framework that boosts diffusion Transformers with autoregressive models for long video generation.
A latent Vector Quantized Variational Autoencoder (VQ-VAE) compresses the input latent space of the DiT model into compact visual tokens.
An adaptive norm-based semantic injection module integrates the coarse discrete visual units from the AR model into the DiT model.
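"Adaptive norm-based" suggests AdaLN-style conditioning; a minimal sketch under that assumption (module shapes and the scale/shift parameterization are ours, not ARLON's) could look like:
```python
import numpy as np

rng = np.random.default_rng(0)

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

# Hypothetical dimensions; ARLON's actual module shapes are not given here.
D_MODEL, D_SEM = 64, 32
W = rng.normal(scale=0.02, size=(D_SEM, 2 * D_MODEL))  # predicts scale and shift

def adaptive_norm_inject(x, sem):
    """AdaLN-style injection (our assumption of 'adaptive norm-based'):
    an embedding `sem` of the AR model's coarse tokens predicts a per-token
    scale and shift applied after layer norm of the DiT activations `x`."""
    scale, shift = np.split(sem @ W, 2, axis=-1)
    return layer_norm(x) * (1 + scale) + shift

x = rng.normal(size=(4, D_MODEL))   # DiT token activations
sem = rng.normal(size=(4, D_SEM))   # coarse semantic embedding from the AR model
print(adaptive_norm_inject(x, sem).shape)  # (4, 64)
```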
arXiv Detail & Related papers (2024-10-27T16:28:28Z)
- Temporal Feature Matters: A Framework for Diffusion Model Quantization [105.3033493564844]
Diffusion models rely on the time-step for multi-round denoising.
We introduce a novel quantization framework that includes three strategies.
This framework preserves most of the temporal information and ensures high-quality end-to-end generation.
arXiv Detail & Related papers (2024-07-28T17:46:15Z)
- Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution [65.91317390645163]
Upscale-A-Video is a text-guided latent diffusion framework for video upscaling.
It ensures temporal coherence through two key mechanisms: locally, it integrates temporal layers into the U-Net and VAE-Decoder to maintain consistency within short sequences; globally, a training-free flow-guided recurrent latent propagation module propagates and fuses latents across the entire sequence to enhance overall video stability.
It also offers greater flexibility by allowing text prompts to guide texture creation and adjustable noise levels to balance restoration and generation.
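A minimal sketch of what such a temporal layer could look like, assuming plain temporal self-attention over a short clip (query/key/value projections and multi-head structure are omitted; this is our simplification, not the paper's exact module):
```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_self_attention(feats):
    """Each spatial position attends over the T frames of a short clip.
    feats: (T, H, W, C) features from a U-Net / VAE-Decoder block."""
    T, H, W, C = feats.shape
    x = feats.reshape(T, H * W, C).transpose(1, 0, 2)       # (HW, T, C)
    attn = softmax(x @ x.transpose(0, 2, 1) / np.sqrt(C))   # (HW, T, T)
    out = attn @ x                                          # (HW, T, C)
    return out.transpose(1, 0, 2).reshape(T, H, W, C)

clip = np.random.default_rng(0).normal(size=(8, 16, 16, 32))
print(temporal_self_attention(clip).shape)  # (8, 16, 16, 32)
```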
arXiv Detail & Related papers (2023-12-11T18:54:52Z)
- Alignment-free HDR Deghosting with Semantics Consistent Transformer [76.91669741684173]
High dynamic range (HDR) imaging aims to retrieve information from multiple low dynamic range (LDR) inputs to generate realistic output.
Existing methods often focus on the spatial misalignment across input frames caused by the foreground and/or camera motion.
We propose a novel alignment-free network with a Semantics Consistent Transformer (SCTNet) with both spatial and channel attention modules.
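The abstract does not detail SCTNet's attention modules; as a generic stand-in only, here are minimal channel and spatial attention gates (real designs such as SE blocks use learned projections, omitted here):
```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x):
    """Generic channel gate: reweight channels by their globally pooled
    response. x: (H, W, C). SCTNet's exact module is not reproduced."""
    return x * sigmoid(x.mean(axis=(0, 1)))          # gate shape (C,)

def spatial_attention(x):
    """Generic spatial gate: reweight each pixel by its pooled response."""
    return x * sigmoid(x.mean(axis=-1, keepdims=True))  # gate shape (H, W, 1)

x = np.random.default_rng(0).normal(size=(32, 32, 16))
print(channel_attention(spatial_attention(x)).shape)  # (32, 32, 16)
```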
arXiv Detail & Related papers (2023-05-29T15:03:23Z)
- Unbiased Scene Graph Generation in Videos [36.889659781604564]
We introduce TEMPURA: TEmporal consistency and Memory-guided UnceRtainty Attenuation for unbiased dynamic SGG.
TEMPURA enforces object-level temporal consistency via transformer-based sequence modeling and learns to synthesize unbiased relationship representations.
Our method achieves significant performance gains (up to 10% in some cases) over existing methods.
arXiv Detail & Related papers (2023-04-03T06:10:06Z)
- Evolve Smoothly, Fit Consistently: Learning Smooth Latent Dynamics For Advection-Dominated Systems [14.553972457854517]
We present a data-driven, space-time continuous framework to learn surrogate models for complex physical systems.
We leverage the expressive power of the network and a specially designed consistency-inducing regularization to obtain latent trajectories that are both low-dimensional and smooth.
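The abstract does not give the regularizer's form; one common choice, offered here only as an assumption, penalizes the discrete second derivative of the latent trajectory:
```python
import numpy as np

def smoothness_penalty(z):
    """Hypothetical consistency/smoothness regularizer (our assumption,
    not the paper's exact term): penalize the discrete second derivative
    of the latent trajectory z of shape (T, d)."""
    accel = z[2:] - 2 * z[1:-1] + z[:-2]
    return np.mean(accel ** 2)
```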
arXiv Detail & Related papers (2023-01-25T03:06:03Z)
- Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
Gait recognition in the wild is a more practical problem that has attracted attention from the multimedia and computer vision communities.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z)
- Multivariate Time Series Forecasting with Dynamic Graph Neural ODEs [65.18780403244178]
We propose a continuous model to forecast Multivariate Time series with dynamic Graph neural Ordinary Differential Equations (MTGODE).
Specifically, we first abstract multivariate time series into dynamic graphs with time-evolving node features and unknown graph structures.
Then, we design and solve a neural ODE to complement missing graph topologies and unify both spatial and temporal message passing.
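A toy version of the idea, assuming simple dynamics dz/dt = tanh(AzW) - z over a fixed adjacency and a fixed-step Euler solver (MTGODE itself learns the graph structure and would use a proper ODE solver):
```python
import numpy as np

rng = np.random.default_rng(0)
N, D = 5, 8                      # hypothetical: 5 series, 8-dim node state
A = rng.uniform(size=(N, N)); A /= A.sum(axis=1, keepdims=True)  # stand-in adjacency
W = rng.normal(scale=0.1, size=(D, D))

def f(z):
    """Graph ODE dynamics dz/dt (a generic stand-in for MTGODE's field):
    propagate node states over the adjacency A, with a decay term."""
    return np.tanh(A @ z @ W) - z

def odeint_euler(z0, t1=1.0, steps=20):
    """Fixed-step Euler solve of dz/dt = f(z); real implementations would
    use an adaptive solver (e.g., torchdiffeq) with adjoint gradients."""
    z, h = z0, t1 / steps
    for _ in range(steps):
        z = z + h * f(z)
    return z

z0 = rng.normal(size=(N, D))     # node features abstracted from the series
print(odeint_euler(z0).shape)    # (5, 8)
```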
arXiv Detail & Related papers (2022-02-17T02:17:31Z)
- Temporal-MPI: Enabling Multi-Plane Images for Dynamic Scene Modelling via Temporal Basis Learning [6.952039070065292]
We propose a novel Temporal-MPI representation which is able to encode the rich 3D and dynamic variation information throughout the entire video as compact temporal basis.
Our proposed Temporal-MPI framework is able to generate a time-instance MPI in only 0.002 seconds, up to 3000 times faster than other state-of-the-art dynamic scene modelling frameworks, with a 3 dB higher average view-synthesis PSNR.
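A hedged sketch of the temporal-basis idea: per-plane RGBA coefficients combined linearly with a temporal basis evaluated at a query time (all sizes here are illustrative; the paper's actual basis and rendering pipeline differ):
```python
import numpy as np

rng = np.random.default_rng(0)
P, H, W, K = 8, 16, 16, 6        # hypothetical: 8 planes, 6 basis functions

# Learned per-plane, per-pixel coefficients over K temporal basis functions
# (RGBA packed in the second-to-last axis); sizes are illustrative only.
coeffs = rng.normal(scale=0.1, size=(P, H, W, 4, K))
basis = rng.normal(size=(K, 100))  # temporal basis sampled at 100 time steps

def mpi_at_time(t_idx):
    """Reconstruct a time-instance MPI as a linear combination of the
    temporal basis -- our reading of 'compact temporal basis'."""
    return coeffs @ basis[:, t_idx]   # (P, H, W, 4)

print(mpi_at_time(42).shape)  # (8, 16, 16, 4)
```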
arXiv Detail & Related papers (2021-11-20T07:34:28Z)
- Enabling Continual Learning with Differentiable Hebbian Plasticity [18.12749708143404]
Continual learning is the problem of sequentially learning new tasks or knowledge while protecting previously acquired knowledge.
Catastrophic forgetting poses a grand challenge for neural networks performing such a learning process.
We propose a Differentiable Hebbian Consolidation model composed of a differentiable Hebbian plasticity component, which pairs slow, consolidated weights with a fast-learning plastic trace.
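Differentiable Hebbian plasticity in the general sense (Miconi et al.) pairs slow weights with a fast Hebbian trace; below is a minimal sketch of that generic mechanism, not this paper's exact consolidation model:
```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_OUT = 16, 8
W = rng.normal(scale=0.1, size=(D_IN, D_OUT))       # slow, consolidated weights
alpha = rng.normal(scale=0.01, size=(D_IN, D_OUT))  # learned plasticity gains
hebb = np.zeros((D_IN, D_OUT))                      # fast Hebbian trace

def plastic_forward(x, hebb, eta=0.1):
    """Generic differentiable Hebbian plasticity (Miconi et al. style):
    effective weights = slow weights + learned gain * Hebbian trace,
    with the trace updated by a decaying outer-product rule."""
    y = np.tanh(x @ (W + alpha * hebb))
    hebb = (1 - eta) * hebb + eta * np.outer(x, y)  # fast associative update
    return y, hebb

x = rng.normal(size=D_IN)
y, hebb = plastic_forward(x, hebb)
print(y.shape, hebb.shape)  # (8,) (16, 8)
```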
arXiv Detail & Related papers (2020-06-30T06:42:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.