AnyView: Synthesizing Any Novel View in Dynamic Scenes
- URL: http://arxiv.org/abs/2601.16982v1
- Date: Fri, 23 Jan 2026 18:59:58 GMT
- Title: AnyView: Synthesizing Any Novel View in Dynamic Scenes
- Authors: Basile Van Hoorick, Dian Chen, Shun Iwase, Pavel Tokmakov, Muhammad Zubair Irshad, Igor Vasiljevic, Swati Gupta, Fangzhou Cheng, Sergey Zakharov, Vitor Campagnolo Guizilini
- Abstract summary: We introduce AnyView, a diffusion-based video generation framework for dynamic view synthesis with minimal biases or geometric assumptions. We show competitive results with the current state of the art, and propose a new benchmark tailored towards extreme dynamic view synthesis in diverse real-world scenarios.
- Score: 23.16723540943151
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern generative video models excel at producing convincing, high-quality outputs, but struggle to maintain multi-view and spatiotemporal consistency in highly dynamic real-world environments. In this work, we introduce AnyView, a diffusion-based video generation framework for dynamic view synthesis with minimal inductive biases or geometric assumptions. We leverage multiple data sources with various levels of supervision, including monocular (2D), multi-view static (3D) and multi-view dynamic (4D) datasets, to train a generalist spatiotemporal implicit representation capable of producing zero-shot novel videos from arbitrary camera locations and trajectories. We evaluate AnyView on standard benchmarks, showing competitive results with the current state of the art, and propose AnyViewBench, a challenging new benchmark tailored towards extreme dynamic view synthesis in diverse real-world scenarios. In this more dramatic setting, we find that most baselines drastically degrade in performance, as they require significant overlap between viewpoints, while AnyView maintains the ability to produce realistic, plausible, and spatiotemporally consistent videos when prompted from any viewpoint. Results, data, code, and models can be viewed at: https://tri-ml.github.io/AnyView/
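As a rough illustration of the mixed-supervision recipe the abstract describes, the sketch below trains a camera-conditioned video denoiser on batches alternately drawn from 2D, 3D, and 4D sources. All names and interfaces here (sample_batch, the denoiser signature, the flow-matching-style objective) are assumptions for illustration, not the AnyView implementation.

```python
import random
import torch
import torch.nn.functional as F

def sample_batch(datasets, weights):
    """Draw one batch from monocular (2D), multi-view static (3D), or
    multi-view dynamic (4D) data; each value in `datasets` is assumed to be a
    callable returning a batch with 'frames' (B, T, C, H, W) and 'cameras'
    (target trajectory poses). 2D batches simply carry trivial trajectories."""
    source = random.choices(list(datasets), weights=weights, k=1)[0]
    return datasets[source]()

def training_step(denoiser, batch, optimizer):
    video, cameras = batch["frames"], batch["cameras"]
    b = video.shape[0]
    t = torch.rand(b, 1, 1, 1, 1, device=video.device)  # diffusion time in [0, 1)
    noise = torch.randn_like(video)
    noisy = (1.0 - t) * video + t * noise         # linear corruption path
    pred = denoiser(noisy, t.flatten(), cameras)  # condition on camera trajectory
    loss = F.mse_loss(pred, noise - video)        # flow-matching velocity target
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Hypothetical usage: alternate sources every step so one model learns from
# all supervision levels.
# loss = training_step(denoiser, sample_batch(datasets, weights=[2, 1, 1]), opt)
```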
Related papers
- Stable Virtual Camera: Generative View Synthesis with Diffusion Models [51.71244310522393]
We present Stable Virtual Camera (Seva), a generalist diffusion model that creates novel views of a scene. Our approach overcomes the limitations of prior work through a simple model design, an optimized training recipe, and a flexible sampling strategy. Our method can generate high-quality videos lasting up to half a minute with seamless loop closure.
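The summary leaves the sampling strategy abstract; one common pattern for stretching a fixed-length generator to half-minute videos is overlapping chunked sampling with anchor frames, sketched below against a hypothetical model.sample interface. This is not Seva's published sampler, and loop closure is only noted in the docstring.

```python
def sample_long_video(model, cameras, chunk=21, overlap=4):
    """Generate frames along a long camera path in overlapping chunks,
    conditioning each chunk on previously generated anchor frames. For a
    closed loop, the first frames can also be fed back as anchors at the end.
    `model.sample(cams, anchors)` is an assumed interface returning one frame
    per camera in `cams`."""
    frames = []
    step = chunk - overlap
    for start in range(0, len(cameras), step):
        cams = cameras[start:start + chunk]
        anchors = frames[-overlap:]          # context from the previous chunk
        new = model.sample(cams, anchors)
        frames.extend(new if not anchors else new[overlap:])
    return frames
```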
arXiv Detail & Related papers (2025-03-18T17:57:22Z)
- SimVS: Simulating World Inconsistencies for Robust View Synthesis [102.83898965828621]
We present an approach for leveraging generative video models to simulate the inconsistencies in the world that can occur during capture. We demonstrate that our world-simulation strategy significantly outperforms traditional augmentation methods in handling real-world scene variations.
arXiv Detail & Related papers (2024-12-10T17:35:12Z)
- Feed-Forward Bullet-Time Reconstruction of Dynamic Scenes from Monocular Videos [110.3924779333809]
We present BTimer, the first motion-aware feed-forward model for real-time reconstruction and novel view synthesis of dynamic scenes. Our approach reconstructs the full scene in a 3D Gaussian Splatting representation at a given target ('bullet') timestamp by aggregating information from all the context frames. Given a casual monocular dynamic video, BTimer reconstructs a bullet-time scene within 150ms while reaching state-of-the-art performance on both static and dynamic scene datasets.
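A minimal sketch of the bullet-time interface described above, assuming hypothetical model and rasterizer callables: one feed-forward pass maps the context frames plus a query timestamp to a set of 3D Gaussians, which can then be rendered from any novel camera. The field layout follows common 3D Gaussian Splatting conventions, not BTimer's actual code.

```python
from dataclasses import dataclass
import torch

@dataclass
class Gaussians:
    """A 3D Gaussian Splatting scene: one row per Gaussian."""
    means: torch.Tensor      # (N, 3) centers
    scales: torch.Tensor     # (N, 3) per-axis extents
    rotations: torch.Tensor  # (N, 4) quaternions
    colors: torch.Tensor     # (N, 3) RGB
    opacities: torch.Tensor  # (N,)

def bullet_time_render(model, rasterizer, context_frames, context_poses,
                       bullet_t, novel_camera):
    """One feed-forward pass 'freezes' the scene at `bullet_t`; the resulting
    Gaussians can then be rasterized from any novel viewpoint."""
    gaussians: Gaussians = model(context_frames, context_poses, bullet_t)
    return rasterizer(gaussians, novel_camera)
```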
arXiv Detail & Related papers (2024-12-04T18:15:06Z)
- D-NPC: Dynamic Neural Point Clouds for Non-Rigid View Synthesis from Monocular Video [53.83936023443193]
This paper introduces a new method for dynamic novel view synthesis from monocular video, such as smartphone captures. Our approach represents the scene as a dynamic neural point cloud, an implicit time-conditioned point cloud that encodes local geometry and appearance in separate hash-encoded neural feature grids.
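To make the representation concrete, here is a hedged sketch of a time-conditioned point decoder with separate hash-encoded grids for geometry and appearance. geo_grid and app_grid stand in for multiresolution hash encodings (Instant-NGP style); the decoder heads and the time conditioning are illustrative assumptions, not D-NPC's actual architecture.

```python
import torch
import torch.nn as nn

class DynamicNeuralPointCloud(nn.Module):
    """Time-conditioned point features: geometry and appearance are looked up
    in *separate* hash-encoded grids, then decoded together with the
    timestamp. `geo_grid`/`app_grid` are assumed modules mapping (N, 3)
    points to (N, dim) features."""
    def __init__(self, geo_grid: nn.Module, app_grid: nn.Module, dim: int = 32):
        super().__init__()
        self.geo_grid, self.app_grid = geo_grid, app_grid
        self.geo_head = nn.Linear(dim + 1, 4)  # -> xyz offset + opacity
        self.app_head = nn.Linear(dim + 1, 3)  # -> RGB

    def forward(self, points, t):
        """points: (N, 3) positions; t: scalar tensor timestamp."""
        time = t.expand(points.shape[0], 1)
        geo = self.geo_head(torch.cat([self.geo_grid(points), time], dim=-1))
        rgb = self.app_head(torch.cat([self.app_grid(points), time], dim=-1))
        offset, opacity = geo[:, :3], torch.sigmoid(geo[:, 3:])
        return offset, opacity, torch.sigmoid(rgb)
```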
arXiv Detail & Related papers (2024-06-14T14:35:44Z)
- Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis [43.02778060969546]
We propose a controllable monocular dynamic view synthesis pipeline.
Our model does not require depth as input, and does not explicitly model 3D scene geometry.
We believe our framework can potentially unlock powerful applications in rich dynamic scene understanding, perception for robotics, and interactive 3D video viewing experiences for virtual reality.
arXiv Detail & Related papers (2024-05-23T17:59:52Z)
- Fast View Synthesis of Casual Videos with Soup-of-Planes [24.35962788109883]
Novel view synthesis from an in-the-wild video is difficult due to challenges like scene dynamics and lack of parallax.
This paper revisits explicit video representations to synthesize high-quality novel views from a monocular video efficiently.
Our method can render high-quality novel views from an in-the-wild video with comparable quality to state-of-the-art methods while being 100x faster in training and enabling real-time rendering.
arXiv Detail & Related papers (2023-12-04T18:55:48Z)
- Dynamic View Synthesis from Dynamic Monocular Video [69.80425724448344]
We present an algorithm for generating views at arbitrary viewpoints and any input time step given a monocular video of a dynamic scene.
We show extensive quantitative and qualitative results of dynamic view synthesis from casually captured videos.
arXiv Detail & Related papers (2021-05-13T17:59:50Z)
- Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Dynamic Scene From Monocular Video [76.19076002661157]
Non-Rigid Neural Radiance Fields (NR-NeRF) is a reconstruction and novel view synthesis approach for general non-rigid dynamic scenes.
We show that even a single consumer-grade camera is sufficient to synthesize sophisticated renderings of a dynamic scene from novel virtual camera views.
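The core mechanism, bending sample points from each observed frame into a shared canonical volume, can be sketched in a few lines. The module interfaces below are assumed for illustration; the paper's full formulation also includes regularization (e.g. rigidity constraints) omitted here.

```python
import torch
import torch.nn as nn

class NonRigidNeRF(nn.Module):
    """Sample points from each observed frame are bent by a deformation field
    into a shared canonical volume, which a standard static NeRF then decodes.
    `canonical` maps (N, 3) points to (rgb, density); `deform` maps (N, 4)
    point-plus-time inputs to (N, 3) offsets."""
    def __init__(self, canonical: nn.Module, deform: nn.Module):
        super().__init__()
        self.canonical = canonical
        self.deform = deform

    def forward(self, x, t):
        """x: (N, 3) sample points; t: scalar tensor timestamp."""
        time = t.expand(x.shape[0], 1)
        x_canon = x + self.deform(torch.cat([x, time], dim=-1))
        return self.canonical(x_canon)
```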
arXiv Detail & Related papers (2020-12-22T18:46:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.