A Survey of Deep Learning Video Super-Resolution
- URL: http://arxiv.org/abs/2506.03216v1
- Date: Tue, 03 Jun 2025 05:42:19 GMT
- Title: A Survey of Deep Learning Video Super-Resolution
- Authors: Arbind Agrahari Baniya, Tsz-Kwan Lee, Peter Eklund, Sunil Aryal
- Abstract summary: Video super-resolution (VSR) is a prominent research topic in low-level computer vision. Deep learning technologies have played a significant role in VSR research.
- Score: 1.074960192271861
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Video super-resolution (VSR) is a prominent research topic in low-level computer vision, where deep learning technologies have played a significant role. The rapid progress in deep learning and its applications in VSR has led to a proliferation of tools and techniques in the literature. However, the usage of these methods is often not adequately explained, and decisions are primarily driven by quantitative improvements. Given the significance of VSR's potential influence across multiple domains, it is imperative to conduct a comprehensive analysis of the elements and deep learning methodologies employed in VSR research. This methodical analysis will facilitate the informed development of models tailored to specific application needs. In this paper, we present an overarching overview of deep learning-based video super-resolution models, investigating each component and discussing its implications. Furthermore, we provide a synopsis of key components and technologies employed by state-of-the-art and earlier VSR models. By elucidating the underlying methodologies and categorising them systematically, we identified trends, requirements, and challenges in the domain. As a first-of-its-kind survey of deep learning-based VSR models, this work also establishes a multi-level taxonomy to guide current and future VSR research, enhancing the maturation and interpretation of VSR practices for various practical applications.
Related papers
- Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities [62.05713042908654]
This paper provides a review of advances in Large Language Models (LLMs) alignment through the lens of inverse reinforcement learning (IRL). We highlight the necessity of constructing neural reward models from human data and discuss the formal and practical implications of this paradigm shift.
arXiv Detail & Related papers (2025-07-17T14:22:24Z)
- Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs [69.10441885629787]
Retrieval-Augmented Generation (RAG) lifts the factuality of Large Language Models (LLMs) by injecting external knowledge. It falls short on problems that demand multi-step inference; conversely, purely reasoning-oriented approaches often hallucinate or mis-ground facts. This survey synthesizes both strands under a unified reasoning-retrieval perspective.
arXiv Detail & Related papers (2025-07-13T03:29:41Z)
- Enhancing Retrieval-Augmented Generation: A Study of Best Practices [16.246719783032436]
We develop advanced RAG system designs that incorporate query expansion, various novel retrieval strategies, and a novel Contrastive In-Context Learning RAG. Our study systematically investigates key factors, including language model size, prompt design, document chunk size, knowledge base size, retrieval stride, query expansion techniques, and Focus Mode retrieving relevant context at sentence-level. Our findings offer actionable insights for developing RAG systems, striking a balance between contextual richness and retrieval-generation efficiency.
arXiv Detail & Related papers (2025-01-13T15:07:55Z)
- Deep Learning for Video Anomaly Detection: A Review [52.74513211976795]
Video anomaly detection (VAD) aims to discover behaviors or events deviating from the normality in videos.
In the era of deep learning, a great variety of deep learning based methods are constantly emerging for the VAD task.
This review covers the spectrum of five different categories, namely, semi-supervised, weakly supervised, fully supervised, unsupervised and open-set supervised VAD.
arXiv Detail & Related papers (2024-09-09T07:31:16Z)
- Deep Learning based Visually Rich Document Content Understanding: A Survey [8.788354139674789]
Visually Rich Documents (VRDs) are essential in academia, finance, medical fields, and marketing.
Deep learning has revolutionized this process, introducing models that leverage multimodal information: vision, text, and layout.
These models have achieved state-of-the-art performance across various downstream tasks.
arXiv Detail & Related papers (2024-08-02T14:19:34Z)
- The Efficiency Spectrum of Large Language Models: An Algorithmic Survey [54.19942426544731]
The rapid growth of Large Language Models (LLMs) has been a driving force in transforming various domains.
This paper examines the multi-faceted dimensions of efficiency essential for the end-to-end algorithmic development of LLMs.
arXiv Detail & Related papers (2023-12-01T16:00:25Z)
- Guided Depth Map Super-resolution: A Survey [88.54731860957804]
Guided depth map super-resolution (GDSR) aims to reconstruct a high-resolution (HR) depth map from a low-resolution (LR) observation with the help of a paired HR color image.
A myriad of novel and effective approaches have been proposed recently, especially with powerful deep learning techniques.
This survey is an effort to present a comprehensive survey of recent progress in GDSR.
arXiv Detail & Related papers (2023-02-19T15:43:54Z)
- A Comprehensive Survey of Data Augmentation in Visual Reinforcement Learning [53.35317176453194]
Data augmentation (DA) has become a widely used technique in visual RL for acquiring sample-efficient and generalizable policies.
We present a principled taxonomy of the existing augmentation techniques used in visual RL and conduct an in-depth discussion on how to better leverage augmented data.
As the first comprehensive survey of DA in visual RL, this work is expected to offer valuable guidance to this emerging field.
arXiv Detail & Related papers (2022-10-10T11:01:57Z)
- Deep Learning Methods for Abstract Visual Reasoning: A Survey on Raven's Progressive Matrices [0.0]
We focus on the most common type of task, the Raven's Progressive Matrices (RPMs), and provide a review of the learning methods and deep neural models applied to solve RPMs.
We conclude the paper by demonstrating how real-world problems can benefit from the discoveries of RPM studies.
arXiv Detail & Related papers (2022-01-28T19:24:30Z)
- Advances and Challenges in Deep Lip Reading [2.930266486910376]
This paper provides a comprehensive survey of the state-of-the-art deep learning based Visual Speech Recognition research.
We focus on data challenges, task-specific complications, and the corresponding solutions.
We also discuss the main modules of a VSR pipeline and the influential datasets.
arXiv Detail & Related papers (2021-10-15T06:18:26Z)
- Video Super Resolution Based on Deep Learning: A Comprehensive Survey [87.30395002197344]
We comprehensively investigate 33 state-of-the-art video super-resolution (VSR) methods based on deep learning.
We propose a taxonomy and classify the methods into six sub-categories according to the ways of utilizing inter-frame information.
We summarize and compare the performance of the representative VSR method on some benchmark datasets.
arXiv Detail & Related papers (2020-07-25T13:39:54Z)