AceVFI: A Comprehensive Survey of Advances in Video Frame Interpolation
- URL: http://arxiv.org/abs/2506.01061v1
- Date: Sun, 01 Jun 2025 16:01:24 GMT
- Title: AceVFI: A Comprehensive Survey of Advances in Video Frame Interpolation
- Authors: Dahyeon Kye, Changhyun Roh, Sukhun Ko, Chanho Eom, Jihyong Oh
- Abstract summary: Video Frame Interpolation (VFI) is a fundamental Low-Level Vision (LLV) task that synthesizes intermediate frames between existing ones. We introduce AceVFI, the most comprehensive survey on VFI to date, covering over 250 papers across these approaches. We categorize the learning paradigms of VFI methods into Center-Time Frame Interpolation (CTFI) and Arbitrary-Time Frame Interpolation (ATFI).
- Score: 8.563354084119062
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Video Frame Interpolation (VFI) is a fundamental Low-Level Vision (LLV) task that synthesizes intermediate frames between existing ones while maintaining spatial and temporal coherence. VFI techniques have evolved from classical motion compensation-based approaches to deep learning-based approaches, including kernel-, flow-, hybrid-, phase-, GAN-, Transformer-, Mamba-, and, more recently, diffusion model-based approaches. We introduce AceVFI, the most comprehensive survey on VFI to date, covering over 250 papers across these approaches. We systematically organize and describe VFI methodologies, detailing the core principles, design assumptions, and technical characteristics of each approach. We categorize the learning paradigms of VFI methods into Center-Time Frame Interpolation (CTFI) and Arbitrary-Time Frame Interpolation (ATFI). We analyze key challenges of VFI such as large motion, occlusion, lighting variation, and non-linear motion. In addition, we review standard datasets, loss functions, and evaluation metrics. We examine applications of VFI, including event-based, cartoon, and medical image VFI, as well as joint VFI with other LLV tasks. We conclude by outlining promising future research directions to support continued progress in the field. This survey aims to serve as a unified reference for both newcomers and experts seeking a deep understanding of the modern VFI landscape.
Related papers
- Continual Learning for VLMs: A Survey and Taxonomy Beyond Forgetting [70.83781268763215]
Vision-language models (VLMs) have achieved impressive performance across diverse multimodal tasks by leveraging large-scale pre-training. VLMs face unique challenges such as cross-modal feature drift, parameter interference due to shared architectures, and zero-shot capability erosion. This survey aims to serve as a comprehensive and diagnostic reference for researchers developing lifelong vision-language systems.
arXiv Detail & Related papers (2025-08-06T09:03:10Z)
- Set Pivot Learning: Redefining Generalized Segmentation with Vision Foundation Models [15.321114178936554]
We introduce the concept of Set Pivot Learning, a paradigm shift that redefines domain generalization (DG) based on Vision Foundation Models (VFMs). Traditional DG assumes that the target domain is inaccessible during training, but the emergence of VFMs renders this assumption unclear and obsolete. We propose Set Pivot Learning (SPL), a new definition of the domain migration task based on VFMs, which is more suitable for current research and application requirements.
arXiv Detail & Related papers (2025-08-03T04:20:35Z)
- A Comprehensive Survey on Video Scene Parsing: Advances, Challenges, and Prospects [53.15503034595476]
Video Scene Parsing (VSP) has emerged as a cornerstone in computer vision, facilitating the simultaneous segmentation, recognition, and tracking of diverse visual entities in dynamic scenes.
arXiv Detail & Related papers (2025-06-16T14:39:03Z)
- FedVLMBench: Benchmarking Federated Fine-Tuning of Vision-Language Models [15.102237976107645]
Vision-Language Models (VLMs) integrate visual and textual information. Recent efforts have introduced Federated Learning (FL) into VLM fine-tuning to address privacy concerns. We present FedVLMBench, the first systematic benchmark for federated fine-tuning of VLMs.
arXiv Detail & Related papers (2025-06-11T11:52:27Z)
- Vertical Federated Learning for Effectiveness, Security, Applicability: A Survey [67.48187503803847]
Vertical Federated Learning (VFL) is a privacy-preserving distributed learning paradigm.
Recent research has shown promising results addressing various challenges in VFL.
This survey offers a systematic overview of recent developments.
arXiv Detail & Related papers (2024-05-25T16:05:06Z)
- Motion-aware Latent Diffusion Models for Video Frame Interpolation [51.78737270917301]
Motion estimation between neighboring frames plays a crucial role in avoiding motion ambiguity.
We propose a novel diffusion framework, motion-aware latent diffusion models (MADiff).
Our method achieves state-of-the-art performance, significantly outperforming existing approaches.
arXiv Detail & Related papers (2024-04-21T05:09:56Z)
- A Multi-In-Single-Out Network for Video Frame Interpolation without Optical Flow [14.877766449009119]
Deep learning-based video frame interpolation (VFI) methods have predominantly focused on estimating motion between two input frames.
We propose a multi-in-single-out (MISO) based VFI method that does not rely on motion vector estimation.
We introduce a novel motion perceptual loss that enables MISO-VFI to better capture the spatio-temporal correlations within the video frames.
arXiv Detail & Related papers (2023-11-20T08:29:55Z)
- Boost Video Frame Interpolation via Motion Adaptation [73.42573856943923]
Video frame interpolation (VFI) is a challenging task that aims to generate intermediate frames between two consecutive frames in a video.
Existing learning-based VFI methods have achieved great success, but they still suffer from limited generalization ability.
We propose a novel optimization-based VFI method that can adapt to unseen motions at test time.
arXiv Detail & Related papers (2023-06-24T10:44:02Z)
- LDMVFI: Video Frame Interpolation with Latent Diffusion Models [3.884484241124158]
We propose latent diffusion model-based VFI, LDMVFI.
This approaches the VFI problem from a generative perspective by formulating it as a conditional generation problem.
Our experiments and user study indicate that LDMVFI is able to interpolate video content with favorable perceptual quality compared to the state of the art, even in the high-resolution regime.
arXiv Detail & Related papers (2023-03-16T17:24:41Z)
- Error-Aware Spatial Ensembles for Video Frame Interpolation [50.63021118973639]
Video frame interpolation (VFI) algorithms have improved considerably in recent years due to unprecedented progress in both data-driven algorithms and their implementations.
Recent research has introduced advanced motion estimation or novel warping methods as the means to address challenging VFI scenarios.
This work introduces such a solution. By closely examining the correlation between optical flow and interpolation error (IE), the paper proposes novel error prediction metrics that partition the middle frame into distinct regions corresponding to different IE levels.
arXiv Detail & Related papers (2022-07-25T16:15:38Z)