Neural Enhancement in Content Delivery Systems: The State-of-the-Art and
Future Directions
- URL: http://arxiv.org/abs/2010.05838v2
- Date: Thu, 22 Oct 2020 12:42:00 GMT
- Title: Neural Enhancement in Content Delivery Systems: The State-of-the-Art and
Future Directions
- Authors: Royson Lee, Stylianos I. Venieris, Nicholas D. Lane
- Abstract summary: Deep learning has led to unprecedented performance in generating high-quality images from low-quality ones.
We present state-of-the-art content delivery systems that employ neural enhancement as a key component in achieving both fast response time and high visual quality.
- Score: 16.04084457087104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Internet-enabled smartphones and ultra-wide displays are transforming a
variety of visual apps, ranging from on-demand movies and 360-degree videos to
video-conferencing and live streaming. However, robustly delivering visual
content under fluctuating networking conditions on devices of diverse
capabilities remains an open problem. In recent years, advances in the field of
deep learning on tasks such as super-resolution and image enhancement have led
to unprecedented performance in generating high-quality images from low-quality
ones, a process we refer to as neural enhancement. In this paper, we survey
state-of-the-art content delivery systems that employ neural enhancement as a
key component in achieving both fast response time and high visual quality. We
first present the deployment challenges of neural enhancement models. We then
cover systems targeting diverse use-cases and analyze their design decisions in
overcoming technical challenges. Moreover, we present promising directions
based on the latest insights from deep learning research to further boost the
quality of experience of these systems.
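To make the enhancement step concrete, here is a minimal sketch, assuming PyTorch, of an SRCNN-style model that refines a bicubically upscaled low-quality frame; the layer sizes and scale factor are illustrative and not taken from any system surveyed in the paper.

```python
# Minimal sketch of neural enhancement: upscale a low-quality frame with
# bicubic interpolation, then refine it with a small CNN (SRCNN-style).
# Illustrative only; not the architecture of any surveyed system.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySR(nn.Module):
    def __init__(self, channels: int = 3, features: int = 64):
        super().__init__()
        self.extract = nn.Conv2d(channels, features, kernel_size=9, padding=4)
        self.shrink = nn.Conv2d(features, 32, kernel_size=1)
        self.reconstruct = nn.Conv2d(32, channels, kernel_size=5, padding=2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.relu(self.extract(x))   # patch feature extraction
        x = F.relu(self.shrink(x))    # non-linear mapping
        return self.reconstruct(x)    # high-quality reconstruction

model = TinySR().eval()
low_quality = torch.rand(1, 3, 90, 160)  # e.g. a heavily compressed frame
upscaled = F.interpolate(low_quality, scale_factor=4, mode="bicubic",
                         align_corners=False)
with torch.no_grad():
    enhanced = model(upscaled)           # shape: (1, 3, 360, 640)
```

In the delivery systems the paper surveys, a model of this kind typically runs on the receiving device, trading client-side compute for bandwidth: only the low-quality stream has to be transmitted.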
Related papers
- ESVQA: Perceptual Quality Assessment of Egocentric Spatial Videos [71.62145804686062]
We introduce the first Egocentric Spatial Video Quality Assessment Database (ESVQAD), which comprises 600 egocentric spatial videos and their mean opinion scores (MOSs).
We propose a novel multi-dimensional binocular feature fusion model, termed ESVQAnet, which integrates binocular spatial, motion, and semantic features to predict perceptual quality (a toy fusion sketch follows this entry).
Experimental results demonstrate that ESVQAnet outperforms 16 state-of-the-art VQA models on the embodied perceptual quality assessment task.
arXiv Detail & Related papers (2024-12-29T10:13:30Z)
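The summary names the inputs to ESVQAnet but not its architecture, so the following is a purely hypothetical sketch of multi-branch feature fusion for quality prediction; the feature dimensions and the concatenation-plus-MLP head are assumptions, not the paper's design.

```python
# Hypothetical sketch of fusing per-video feature branches into one
# mean-opinion-score (MOS) prediction; all dimensions are invented.
import torch
import torch.nn as nn

class ToyFusionQA(nn.Module):
    def __init__(self, spatial_dim=256, motion_dim=128, semantic_dim=512):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(spatial_dim + motion_dim + semantic_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 1),  # scalar quality score per video
        )

    def forward(self, spatial, motion, semantic):
        # Fuse the branches by simple concatenation before regression.
        return self.head(torch.cat([spatial, motion, semantic], dim=-1))

model = ToyFusionQA()
mos = model(torch.rand(4, 256), torch.rand(4, 128), torch.rand(4, 512))
print(mos.shape)  # torch.Size([4, 1])
```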
- UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics [74.10447111842504]
UniReal is a unified framework designed to address various image generation and editing tasks.
Inspired by recent video generation models, we propose a unifying approach that treats image-level tasks as discontinuous video generation.
Although designed for image-level tasks, UniReal leverages videos as a scalable source of universal supervision.
arXiv Detail & Related papers (2024-12-10T18:59:55Z)
- Video Quality Assessment: A Comprehensive Survey [55.734935003021576]
Video quality assessment (VQA) is an important processing task, aiming at predicting the quality of videos in a manner consistent with human judgments of perceived quality.
We present a survey of recent progress in the development of VQA algorithms and the benchmarking studies and databases that make them possible.
arXiv Detail & Related papers (2024-12-04T05:25:17Z)
- Perceptually Optimized Super Resolution [7.728090438152828]
We propose a perceptually inspired and architecture-agnostic approach for controlling the visual quality and efficiency of super-resolution techniques.
The core is a perceptual model that dynamically guides super-resolution methods according to human sensitivity to image details.
We demonstrate the application of our proposed model in combination with network branching and network complexity reduction to improve the computational efficiency of super-resolution methods without visible quality loss (a toy branching sketch follows this entry).
arXiv Detail & Related papers (2024-11-26T15:24:45Z)
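As a hypothetical illustration of the network-branching idea (not the paper's perceptual model), the sketch below sends a patch through an expensive super-resolution network only when a crude sensitivity proxy, here local luminance variance, suggests that detail loss would be visible; it assumes PyTorch and an `sr_model` that upscales patches by `scale`.

```python
# Hypothetical perceptually guided branching: expensive SR where the eye
# is assumed to notice detail, cheap interpolation elsewhere.
import torch
import torch.nn.functional as F

def upscale_with_branching(frame, sr_model, scale=2, patch=32, thresh=0.01):
    """frame: (1, 3, H, W) with H and W divisible by `patch`."""
    _, _, H, W = frame.shape
    out = []
    for y in range(0, H, patch):
        for x in range(0, W, patch):
            p = frame[:, :, y:y + patch, x:x + patch]
            # Crude stand-in for a perceptual sensitivity model.
            if p.mean(dim=1).var() > thresh:
                out.append(sr_model(p))              # heavy neural branch
            else:
                out.append(F.interpolate(p, scale_factor=scale,
                                         mode="bicubic",
                                         align_corners=False))  # cheap branch
    # Reassemble the upscaled patches row by row.
    cols = W // patch
    rows = [torch.cat(out[i:i + cols], dim=3) for i in range(0, len(out), cols)]
    return torch.cat(rows, dim=2)
```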
- VQA$^2$: Visual Question Answering for Video Quality Assessment [76.81110038738699]
Video Quality Assessment (VQA) is a classic field in low-level visual perception.
Recent studies in the image domain have demonstrated that Visual Question Answering (VQA) can markedly enhance low-level visual quality evaluation.
We introduce the VQA2 Instruction dataset - the first visual question answering instruction dataset that focuses on video quality assessment.
The VQA2 series models interleave visual and motion tokens to enhance the perception of spatiotemporal quality details in videos (a toy interleaving sketch follows this entry).
arXiv Detail & Related papers (2024-11-06T09:39:52Z)
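Purely as a toy illustration of interleaving two token streams (the summary does not describe the actual VQA2 tokenizers or layout), assuming PyTorch:

```python
# Toy interleaving of visual and motion tokens along the sequence axis:
# v0, m0, v1, m1, ... so the model sees them side by side per frame.
import torch

visual = torch.rand(2, 8, 64)   # (batch, frames, dim) visual tokens
motion = torch.rand(2, 8, 64)   # (batch, frames, dim) motion tokens

interleaved = torch.stack([visual, motion], dim=2).flatten(1, 2)
print(interleaved.shape)        # torch.Size([2, 16, 64])
```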
- Transformer-based Image and Video Inpainting: Current Challenges and Future Directions [5.2088618044533215]
Inpainting is a viable solution for various applications, including photographic restoration, video editing, and medical imaging.
CNNs and generative adversarial networks (GANs) have significantly enhanced the inpainting task.
Vision transformers have since been explored and offer further improvements to image and video inpainting.
arXiv Detail & Related papers (2024-06-28T20:42:36Z)
- Chain-of-Spot: Interactive Reasoning Improves Large Vision-Language Models [81.71651422951074]
The Chain-of-Spot (CoS) method is a novel approach that enhances feature extraction by focusing on key regions of interest.
This technique allows LVLMs to access more detailed visual information without altering the original image resolution (a toy input-construction sketch follows this entry).
Our empirical findings demonstrate a significant improvement in LVLMs' ability to understand and reason about visual content.
arXiv Detail & Related papers (2024-03-19T17:59:52Z)
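The summary suggests that the model revisits a key region at native resolution alongside a global view; the following is a hypothetical sketch of that input construction only (the ROI coordinates and how the two views are consumed are assumptions, not the CoS procedure).

```python
# Hypothetical sketch: pair a downscaled global view with a
# native-resolution crop of a region of interest (ROI), so fine detail
# is preserved without re-encoding the whole image at high resolution.
import torch
import torch.nn.functional as F

image = torch.rand(1, 3, 1024, 1024)                 # full-resolution input

global_view = F.interpolate(image, size=(224, 224), mode="bilinear",
                            align_corners=False)

y0, x0, h, w = 400, 300, 224, 224                    # hypothetical ROI
roi_view = image[:, :, y0:y0 + h, x0:x0 + w]         # crop, no resampling

views = torch.cat([global_view, roi_view], dim=0)    # both fed to the model
```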
- Reimagining Reality: A Comprehensive Survey of Video Inpainting Techniques [6.36998581871295]
Video inpainting is a process that restores or fills in missing or corrupted portions of video sequences with plausible content.
Our study deconstructs major techniques, their underpinning theories, and their effective applications.
We employ a human-centric approach to assess visual quality, enlisting a panel of annotators to evaluate the output of different video inpainting techniques.
arXiv Detail & Related papers (2024-01-31T14:41:40Z)
- A Survey on Super Resolution for video Enhancement Using GAN [0.0]
Recent developments in image and video super-resolution using deep learning algorithms such as Generative Adversarial Networks are covered.
These advancements, which aim to increase the visual clarity and quality of low-resolution video, have tremendous potential in a variety of sectors ranging from surveillance technology to medical imaging.
This collection delves into the wider field of Generative Adversarial Networks, exploring their principles, training approaches, and applications across a broad range of domains.
arXiv Detail & Related papers (2023-12-27T08:41:38Z)
- Deep Neural Network-based Enhancement for Image and Video Streaming Systems: A Survey and Future Directions [20.835654670825782]
Deep learning has led to unprecedented performance in generating high-quality images from low-quality ones.
We present state-of-the-art content delivery systems that employ neural enhancement as a key component in achieving both fast response time and high visual quality.
arXiv Detail & Related papers (2021-06-07T15:42:36Z)
- Transformers in Vision: A Survey [101.07348618962111]
Transformers enable modeling long dependencies between input sequence elements and support parallel processing of sequences.
Transformers require minimal inductive biases for their design and are naturally suited as set-functions.
This survey aims to provide a comprehensive overview of Transformer models in the computer vision discipline (a minimal self-attention sketch follows this entry).
arXiv Detail & Related papers (2021-01-04T18:57:24Z)
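To ground the two properties named above, here is a minimal single-head scaled dot-product self-attention block, assuming PyTorch; real vision Transformers add multi-head attention, patch embeddings, and positional encodings on top of this.

```python
# Minimal self-attention: every token attends to every other token in one
# parallel step, which is what enables long-range dependency modeling.
import math
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        self.dim = dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, k, v = self.qkv(x).chunk(3, dim=-1)        # (batch, tokens, dim) each
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.dim)
        return self.out(torch.softmax(scores, dim=-1) @ v)

tokens = torch.rand(2, 196, 64)   # e.g. 14x14 image patches as a token set
y = SelfAttention(64)(tokens)     # same shape: (2, 196, 64)
```

Because nothing in the computation depends on token order, the block behaves as a set-function unless positional information is added explicitly.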
This list is automatically generated from the titles and abstracts of the papers on this site.