Neural Enhancement in Content Delivery Systems: The State-of-the-Art and
Future Directions
- URL: http://arxiv.org/abs/2010.05838v2
- Date: Thu, 22 Oct 2020 12:42:00 GMT
- Title: Neural Enhancement in Content Delivery Systems: The State-of-the-Art and
Future Directions
- Authors: Royson Lee, Stylianos I. Venieris, Nicholas D. Lane
- Abstract summary: Deep learning has led to unprecedented performance in generating high-quality images from low-quality ones.
We present state-of-the-art content delivery systems that employ neural enhancement as a key component in achieving both fast response time and high visual quality.
- Score: 16.04084457087104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Internet-enabled smartphones and ultra-wide displays are transforming a
variety of visual apps spanning from on-demand movies and 360-degree videos to
video-conferencing and live streaming. However, robustly delivering visual
content under fluctuating networking conditions on devices of diverse
capabilities remains an open problem. In recent years, advances in the field of
deep learning on tasks such as super-resolution and image enhancement have led
to unprecedented performance in generating high-quality images from low-quality
ones, a process we refer to as neural enhancement. In this paper, we survey
state-of-the-art content delivery systems that employ neural enhancement as a
key component in achieving both fast response time and high visual quality. We
first present the deployment challenges of neural enhancement models. We then
cover systems targeting diverse use-cases and analyze their design decisions in
overcoming technical challenges. Moreover, we present promising directions
based on the latest insights from deep learning research to further boost the
quality of experience of these systems.
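The delivery pattern the survey describes can be sketched in miniature: a server transmits a downscaled frame to save bandwidth, and the client runs an enhancement step to recover resolution. This is a minimal illustrative sketch only; a real system would replace `enhance` with a trained super-resolution network, whereas nearest-neighbour upsampling is used here purely as a stand-in.

```python
# Minimal sketch of a neural-enhancement content delivery pipeline.
# `downscale` models the server-side bitrate reduction; `enhance` is a
# placeholder for the client-side neural enhancement model.

def downscale(frame, factor):
    """Average-pool a 2D frame (list of lists) by `factor` on each axis."""
    h, w = len(frame), len(frame[0])
    out = []
    for i in range(0, h, factor):
        row = []
        for j in range(0, w, factor):
            block = [frame[i + di][j + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

def enhance(frame, factor):
    """Placeholder enhancement: nearest-neighbour upsampling by `factor`.
    A deployed system would run a super-resolution network here."""
    out = []
    for row in frame:
        up_row = [v for v in row for _ in range(factor)]
        for _ in range(factor):
            out.append(list(up_row))
    return out

def deliver(frame, factor=2):
    """Server-side downscale followed by client-side enhancement."""
    low_res = downscale(frame, factor)   # transmitted over the network
    return enhance(low_res, factor)      # reconstructed on the device

frame = [[float(i * 4 + j) for j in range(4)] for i in range(4)]
restored = deliver(frame, factor=2)
print(len(restored), len(restored[0]))  # resolution is restored: 4 4
```

The sketch captures the core trade-off the surveyed systems navigate: less data on the wire in exchange for compute on the device.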
Related papers
- Perceptually Optimized Super Resolution [7.728090438152828]
We propose a perceptually inspired and architecture-agnostic approach for controlling the visual quality and efficiency of super-resolution techniques.
The core is a perceptual model that dynamically guides super-resolution methods according to the human's sensitivity to image details.
We demonstrate the application of our proposed model in combination with network branching, and network complexity reduction to improve the computational efficiency of super-resolution methods without visible quality loss.
arXiv Detail & Related papers (2024-11-26T15:24:45Z)
- VQA$^2$: Visual Question Answering for Video Quality Assessment [76.81110038738699]
Video Quality Assessment (VQA) is a classic field in low-level visual perception.
Recent studies in the image domain have demonstrated that Visual Question Answering (VQA) can markedly enhance low-level visual quality evaluation.
We introduce the VQA2 Instruction dataset - the first visual question answering instruction dataset that focuses on video quality assessment.
The VQA2 series models interleave visual and motion tokens to enhance the perception of spatial-temporal quality details in videos.
arXiv Detail & Related papers (2024-11-06T09:39:52Z)
- Transformer-based Image and Video Inpainting: Current Challenges and Future Directions [5.2088618044533215]
Inpainting is a viable solution for various applications, including photographic restoration, video editing, and medical imaging.
CNNs and generative adversarial networks (GANs) have significantly enhanced the inpainting task.
Visual transformers have been exploited and offer some improvements to image or video inpainting.
arXiv Detail & Related papers (2024-06-28T20:42:36Z)
- Chain-of-Spot: Interactive Reasoning Improves Large Vision-Language Models [81.71651422951074]
The Chain-of-Spot (CoS) method is a novel approach that enhances feature extraction by focusing on key regions of interest.
This technique allows LVLMs to access more detailed visual information without altering the original image resolution.
Our empirical findings demonstrate a significant improvement in LVLMs' ability to understand and reason about visual content.
arXiv Detail & Related papers (2024-03-19T17:59:52Z)
- Reimagining Reality: A Comprehensive Survey of Video Inpainting Techniques [6.36998581871295]
Video inpainting is a process that restores or fills in missing or corrupted portions of video sequences with plausible content.
Our study deconstructs major techniques, their underpinning theories, and their effective applications.
We employ a human-centric approach to assess visual quality, enlisting a panel of annotators to evaluate the output of different video inpainting techniques.
arXiv Detail & Related papers (2024-01-31T14:41:40Z)
- E2HQV: High-Quality Video Generation from Event Camera via Theory-Inspired Model-Aided Deep Learning [53.63364311738552]
Bio-inspired event cameras or dynamic vision sensors are capable of capturing per-pixel brightness changes (called event-streams) in high temporal resolution and high dynamic range.
It calls for events-to-video (E2V) solutions which take event-streams as input and generate high quality video frames for intuitive visualization.
We propose E2HQV, a novel E2V paradigm designed to produce high-quality video frames from events.
arXiv Detail & Related papers (2024-01-16T05:10:50Z)
- A Survey on Super Resolution for Video Enhancement Using GAN [0.0]
Recent developments in super-resolution image and video using deep learning algorithms such as Generative Adversarial Networks are covered.
These advancements, which aim to increase the visual clarity and quality of low-resolution video, have tremendous potential in a variety of sectors, ranging from surveillance technology to medical imaging.
This collection delves into the wider field of Generative Adversarial Networks, exploring their principles, training approaches, and applications across a broad range of domains.
arXiv Detail & Related papers (2023-12-27T08:41:38Z)
- Artificial intelligence optical hardware empowers high-resolution hyperspectral video understanding at 1.2 Tb/s [53.91923493664551]
This work introduces a hardware-accelerated integrated optoelectronic platform for multidimensional video understanding in real-time.
The technology platform combines artificial intelligence hardware, processing information optically, with state-of-the-art machine vision networks.
Such performance surpasses the speed of the closest technologies with similar spectral resolution by three to four orders of magnitude.
arXiv Detail & Related papers (2023-12-17T07:51:38Z)
- Deep Learning for Event-based Vision: A Comprehensive Survey and Benchmarks [55.81577205593956]
Event cameras are bio-inspired sensors that capture the per-pixel intensity changes asynchronously.
Deep learning (DL) has been brought to this emerging field and inspired active research endeavors in mining its potential.
arXiv Detail & Related papers (2023-02-17T14:19:28Z)
- Deep Neural Network-based Enhancement for Image and Video Streaming Systems: A Survey and Future Directions [20.835654670825782]
Deep learning has led to unprecedented performance in generating high-quality images from low-quality ones.
We present state-of-the-art content delivery systems that employ neural enhancement as a key component in achieving both fast response time and high visual quality.
arXiv Detail & Related papers (2021-06-07T15:42:36Z)
- Transformers in Vision: A Survey [101.07348618962111]
Transformers enable modeling long dependencies between input sequence elements and support parallel processing of sequence.
Transformers require minimal inductive biases for their design and are naturally suited as set-functions.
This survey aims to provide a comprehensive overview of the Transformer models in the computer vision discipline.
arXiv Detail & Related papers (2021-01-04T18:57:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.