A new dataset and comparison for multi-camera frame synthesis
- URL: http://arxiv.org/abs/2508.09068v2
- Date: Thu, 18 Sep 2025 15:26:41 GMT
- Title: A new dataset and comparison for multi-camera frame synthesis
- Authors: Conall Daly, Anil Kokaram
- Abstract summary: We develop a novel multi-camera dataset using a custom-built dense linear camera array. We evaluate classical and deep learning frame interpolators against a view synthesis method for the task of view in-betweening.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Many methods exist for frame synthesis in image sequences but can be broadly categorised into frame interpolation and view synthesis techniques. Fundamentally, both frame interpolation and view synthesis tackle the same task, interpolating a frame given surrounding frames in time or space. However, most frame interpolation datasets focus on temporal aspects with single cameras moving through time and space, while view synthesis datasets are typically biased toward stereoscopic depth estimation use cases. This makes direct comparison between view synthesis and frame interpolation methods challenging. In this paper, we develop a novel multi-camera dataset using a custom-built dense linear camera array to enable fair comparison between these approaches. We evaluate classical and deep learning frame interpolators against a view synthesis method (3D Gaussian Splatting) for the task of view in-betweening. Our results reveal that deep learning methods do not significantly outperform classical methods on real image data, with 3D Gaussian Splatting actually underperforming frame interpolators by as much as 3.5 dB PSNR. However, in synthetic scenes, the situation reverses -- 3D Gaussian Splatting outperforms frame interpolation algorithms by almost 5 dB PSNR at a 95% confidence level.
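The abstract reports its comparisons as PSNR gaps in dB. As a rough illustration only (not the authors' evaluation code), the sketch below computes PSNR from the mean squared error and converts a dB gap into the equivalent MSE ratio. The function names `psnr` and `mse_ratio_for_gap`, the flat-list image representation, and the 8-bit peak value of 255 are assumptions made for this example.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Return PSNR in dB between two images given as flat sequences of pixel values.

    Assumes both sequences have the same non-zero length and that `peak`
    is the maximum possible pixel value (255 for 8-bit images).
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gap(gap_db):
    """Return the factor by which MSE differs for a given PSNR gap in dB."""
    return 10.0 ** (gap_db / 10.0)


if __name__ == "__main__":
    # Hypothetical pixel values, purely for illustration.
    truth = [10, 20, 30, 40]
    synthesized = [12, 18, 33, 40]
    print(f"PSNR: {psnr(truth, synthesized):.2f} dB")
    print(f"MSE ratio for a 3.5 dB gap: {mse_ratio_for_gap(3.5):.2f}")
```

Because PSNR is logarithmic in the MSE, a 3.5 dB gap corresponds to roughly a 2.2x difference in mean squared error, and a 5 dB gap to roughly 3.2x.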
Related papers
- PoseCrafter: Extreme Pose Estimation with Hybrid Video Synthesis [82.87579563469039]
Pairwise camera pose estimation from sparsely overlapping image pairs remains a critical and unsolved challenge in 3D vision. Recent approaches attempt to address this by synthesizing intermediate frames using video interpolation and selecting key frames via a self-consistency score. We propose Hybrid Video Generation (HVG) to synthesize clearer intermediate frames by coupling a video model with a pose-conditioned novel view model. We also propose a Feature Matching Selector (FMS) based on feature correspondence to select intermediate frames appropriate for pose estimation from the synthesized results.
arXiv Detail & Related papers (2025-10-22T12:32:37Z) - Enhancing Novel View Synthesis from extremely sparse views with SfM-free 3D Gaussian Splatting Framework [14.927184256861807]
We propose a novel SfM-free 3DGS-based method that jointly estimates camera poses and reconstructs 3D scenes from extremely sparse-view inputs. Our method significantly outperforms other state-of-the-art 3DGS-based approaches, achieving a remarkable 2.75 dB improvement in PSNR under extremely sparse-view conditions.
arXiv Detail & Related papers (2025-08-21T11:25:24Z) - Multi-View Object Pose Refinement With Differentiable Renderer [22.040014384283378]
This paper introduces a novel multi-view 6 DoF object pose refinement approach focusing on improving methods trained on synthetic data.
It is based on the DPOD detector, which produces dense 2D-3D correspondences between the model vertices and the image pixels in each frame.
We report excellent performance in comparison to the state-of-the-art methods trained on the synthetic and real data.
arXiv Detail & Related papers (2022-07-06T17:02:22Z) - FILM: Frame Interpolation for Large Motion [20.04001872133824]
We present a frame interpolation algorithm that synthesizes multiple intermediate frames from two input images with large in-between motion.
Our approach outperforms state-of-the-art methods on the Xiph large motion benchmark.
arXiv Detail & Related papers (2022-02-10T08:48:18Z) - Video Frame Interpolation without Temporal Priors [91.04877640089053]
Video frame interpolation aims to synthesize non-existent intermediate frames in a video sequence.
The temporal priors of videos, i.e. frames per second (FPS) and frame exposure time, may vary across camera sensors.
We devise a novel optical flow refinement strategy for better synthesizing results.
arXiv Detail & Related papers (2021-12-02T12:13:56Z) - Asymmetric Bilateral Motion Estimation for Video Frame Interpolation [50.44508853885882]
We propose a novel video frame interpolation algorithm based on asymmetric bilateral motion estimation (ABME).
First, we predict symmetric bilateral motion fields to interpolate an anchor frame.
Second, we estimate asymmetric bilateral motion fields from the anchor frame to the input frames.
Third, we use the asymmetric fields to warp the input frames backward and reconstruct the intermediate frame.
arXiv Detail & Related papers (2021-08-15T21:11:35Z) - TimeLens: Event-based Video Frame Interpolation [54.28139783383213]
We introduce Time Lens, a novel method that leverages the advantages of both synthesis-based and flow-based approaches.
We show an up to 5.21 dB improvement in terms of PSNR over state-of-the-art frame-based and event-based methods.
arXiv Detail & Related papers (2021-06-14T10:33:47Z) - ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring [92.40655035360729]
Video deblurring models exploit consecutive frames to remove blurs from camera shakes and object motions.
We propose a novel implicit method to learn spatial correspondence among blurry frames in the feature space.
Our proposed method is evaluated on the widely-adopted DVD dataset, along with a newly collected High-Frame-Rate (1000 fps) dataset for Video Deblurring.
arXiv Detail & Related papers (2021-03-07T04:33:13Z) - Street-view Panoramic Video Synthesis from a Single Satellite Image [92.26826861266784]
We present a novel method for synthesizing both temporally and geometrically consistent street-view panoramic video.
Existing cross-view synthesis approaches focus more on images, while video synthesis in such a case has not yet received enough attention.
arXiv Detail & Related papers (2020-12-11T20:22:38Z) - ALANET: Adaptive Latent Attention Network for Joint Video Deblurring and Interpolation [38.52446103418748]
We introduce a novel architecture, Adaptive Latent Attention Network (ALANET), which synthesizes sharp high frame-rate videos.
We employ a combination of self-attention and cross-attention modules between consecutive frames in the latent space to generate an optimized representation for each frame.
Our method performs favorably against various state-of-the-art approaches, even though we tackle a much more difficult problem.
arXiv Detail & Related papers (2020-08-31T21:11:53Z) - Efficient Semantic Video Segmentation with Per-frame Inference [117.97423110566963]
In this work, we perform efficient semantic video segmentation in a per-frame fashion at inference time.
We employ compact models for real-time execution. To narrow the performance gap between compact models and large models, new knowledge distillation methods are designed.
arXiv Detail & Related papers (2020-02-26T12:24:32Z)