Neural Enhancement in Content Delivery Systems: The State-of-the-Art and
Future Directions
- URL: http://arxiv.org/abs/2010.05838v2
- Date: Thu, 22 Oct 2020 12:42:00 GMT
- Title: Neural Enhancement in Content Delivery Systems: The State-of-the-Art and
Future Directions
- Authors: Royson Lee, Stylianos I. Venieris, Nicholas D. Lane
- Abstract summary: Deep learning has led to unprecedented performance in generating high-quality images from low-quality ones.
We present state-of-the-art content delivery systems that employ neural enhancement as a key component in achieving both fast response time and high visual quality.
- Score: 16.04084457087104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Internet-enabled smartphones and ultra-wide displays are transforming a
variety of visual apps, ranging from on-demand movies and 360-degree videos to
video-conferencing and live streaming. However, robustly delivering visual
content under fluctuating networking conditions on devices of diverse
capabilities remains an open problem. In recent years, advances in the field of
deep learning on tasks such as super-resolution and image enhancement have led
to unprecedented performance in generating high-quality images from low-quality
ones, a process we refer to as neural enhancement. In this paper, we survey
state-of-the-art content delivery systems that employ neural enhancement as a
key component in achieving both fast response time and high visual quality. We
first present the deployment challenges of neural enhancement models. We then
cover systems targeting diverse use-cases and analyze their design decisions in
overcoming technical challenges. Moreover, we present promising directions
based on the latest insights from deep learning research to further boost the
quality of experience of these systems.
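To make the enhancement step concrete, here is a minimal sketch, assuming PyTorch, of an SRCNN-style model that refines a bicubically upscaled low-quality frame; the layer sizes and scale factor are illustrative and not taken from any system surveyed in the paper.

```python
# Minimal sketch of neural enhancement: upscale a low-quality frame with
# bicubic interpolation, then refine it with a small CNN (SRCNN-style).
# Illustrative only; not the architecture of any surveyed system.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySR(nn.Module):
    def __init__(self, channels: int = 3, features: int = 64):
        super().__init__()
        self.extract = nn.Conv2d(channels, features, kernel_size=9, padding=4)
        self.shrink = nn.Conv2d(features, 32, kernel_size=1)
        self.reconstruct = nn.Conv2d(32, channels, kernel_size=5, padding=2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = F.relu(self.extract(x))   # patch feature extraction
        x = F.relu(self.shrink(x))    # non-linear mapping
        return self.reconstruct(x)    # high-quality reconstruction

model = TinySR().eval()
low_quality = torch.rand(1, 3, 90, 160)  # e.g. a heavily compressed frame
upscaled = F.interpolate(low_quality, scale_factor=4, mode="bicubic",
                         align_corners=False)
with torch.no_grad():
    enhanced = model(upscaled)           # shape: (1, 3, 360, 640)
```

In the delivery systems the paper surveys, a model of this kind typically runs on the receiving device, trading client-side compute for bandwidth: only the low-quality stream has to be transmitted.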
Related papers
- ESVQA: Perceptual Quality Assessment of Egocentric Spatial Videos [71.62145804686062]
We introduce the first Egocentric Spatial Video Quality Assessment Database (ESVQAD), which comprises 600 egocentric spatial videos and their mean opinion scores (MOSs).
We propose a novel multi-dimensional binocular feature fusion model, termed ESVQAnet, which integrates binocular spatial, motion, and semantic features to predict perceptual quality (a toy fusion sketch follows this entry).
Experimental results demonstrate that ESVQAnet outperforms 16 state-of-the-art VQA models on the embodied perceptual quality assessment task.
arXiv Detail & Related papers (2024-12-29T10:13:30Z)
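The summary names the inputs to ESVQAnet but not its architecture, so the following is a purely hypothetical sketch of multi-branch feature fusion for quality prediction; the feature dimensions and the concatenation-plus-MLP head are assumptions, not the paper's design.

```python
# Hypothetical sketch of fusing per-video feature branches into one
# mean-opinion-score (MOS) prediction; all dimensions are invented.
import torch
import torch.nn as nn

class ToyFusionQA(nn.Module):
    def __init__(self, spatial_dim=256, motion_dim=128, semantic_dim=512):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(spatial_dim + motion_dim + semantic_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 1),  # scalar quality score per video
        )

    def forward(self, spatial, motion, semantic):
        # Fuse the branches by simple concatenation before regression.
        return self.head(torch.cat([spatial, motion, semantic], dim=-1))

model = ToyFusionQA()
mos = model(torch.rand(4, 256), torch.rand(4, 128), torch.rand(4, 512))
print(mos.shape)  # torch.Size([4, 1])
```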
- UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics [74.10447111842504]
UniReal is a unified framework designed to address various image generation and editing tasks.
Inspired by recent video generation models, we propose a unifying approach that treats image-level tasks as discontinuous video generation.
Although designed for image-level tasks, UniReal leverages videos as a scalable source of universal supervision.
arXiv Detail & Related papers (2024-12-10T18:59:55Z)
- Video Quality Assessment: A Comprehensive Survey [55.734935003021576]
Video quality assessment (VQA) is an important processing task, aiming at predicting the quality of videos in a manner consistent with human judgments of perceived quality.
We present a survey of recent progress in the development of VQA algorithms and the benchmarking studies and databases that make them possible.
arXiv Detail & Related papers (2024-12-04T05:25:17Z)
- Perceptually Optimized Super Resolution [7.728090438152828]
We propose a perceptually inspired and architecture-agnostic approach for controlling the visual quality and efficiency of super-resolution techniques.
The core is a perceptual model that dynamically guides super-resolution methods according to human sensitivity to image details.
We demonstrate the application of our proposed model in combination with network branching and network complexity reduction to improve the computational efficiency of super-resolution methods without visible quality loss (a toy branching sketch follows this entry).
arXiv Detail & Related papers (2024-11-26T15:24:45Z)
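As a hypothetical illustration of the network-branching idea (not the paper's perceptual model), the sketch below sends a patch through an expensive super-resolution network only when a crude sensitivity proxy, here local luminance variance, suggests that detail loss would be visible; it assumes PyTorch and an `sr_model` that upscales patches by `scale`.

```python
# Hypothetical perceptually guided branching: expensive SR where the eye
# is assumed to notice detail, cheap interpolation elsewhere.
import torch
import torch.nn.functional as F

def upscale_with_branching(frame, sr_model, scale=2, patch=32, thresh=0.01):
    """frame: (1, 3, H, W) with H and W divisible by `patch`."""
    _, _, H, W = frame.shape
    out = []
    for y in range(0, H, patch):
        for x in range(0, W, patch):
            p = frame[:, :, y:y + patch, x:x + patch]
            # Crude stand-in for a perceptual sensitivity model.
            if p.mean(dim=1).var() > thresh:
                out.append(sr_model(p))              # heavy neural branch
            else:
                out.append(F.interpolate(p, scale_factor=scale,
                                         mode="bicubic",
                                         align_corners=False))  # cheap branch
    # Reassemble the upscaled patches row by row.
    cols = W // patch
    rows = [torch.cat(out[i:i + cols], dim=3) for i in range(0, len(out), cols)]
    return torch.cat(rows, dim=2)
```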
- VQA$^2$: Visual Question Answering for Video Quality Assessment [76.81110038738699]
Video Quality Assessment (VQA) is a classic field in low-level visual perception.
Recent studies in the image domain have demonstrated that Visual Question Answering (VQA) can markedly enhance low-level visual quality evaluation.
We introduce the VQA2 Instruction dataset - the first visual question answering instruction dataset that focuses on video quality assessment.
The VQA2 series models interleave visual and motion tokens to enhance the perception of spatiotemporal quality details in videos (a toy interleaving sketch follows this entry).
arXiv Detail & Related papers (2024-11-06T09:39:52Z)
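Purely as a toy illustration of interleaving two token streams (the summary does not describe the actual VQA2 tokenizers or layout), assuming PyTorch:

```python
# Toy interleaving of visual and motion tokens along the sequence axis:
# v0, m0, v1, m1, ... so the model sees them side by side per frame.
import torch

visual = torch.rand(2, 8, 64)   # (batch, frames, dim) visual tokens
motion = torch.rand(2, 8, 64)   # (batch, frames, dim) motion tokens

interleaved = torch.stack([visual, motion], dim=2).flatten(1, 2)
print(interleaved.shape)        # torch.Size([2, 16, 64])
```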
- Transformer-based Image and Video Inpainting: Current Challenges and Future Directions [5.2088618044533215]
Inpainting is a viable solution for various applications, including photographic restoration, video editing, and medical imaging.
CNNs and generative adversarial networks (GANs) have significantly enhanced the inpainting task.
Vision transformers have since been explored and offer further improvements to image and video inpainting.
arXiv Detail & Related papers (2024-06-28T20:42:36Z)
- Chain-of-Spot: Interactive Reasoning Improves Large Vision-Language Models [81.71651422951074]
The Chain-of-Spot (CoS) method is a novel approach that enhances feature extraction by focusing on key regions of interest.
This technique allows LVLMs to access more detailed visual information without altering the original image resolution (a toy input-construction sketch follows this entry).
Our empirical findings demonstrate a significant improvement in LVLMs' ability to understand and reason about visual content.
arXiv Detail & Related papers (2024-03-19T17:59:52Z)
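The summary suggests that the model revisits a key region at native resolution alongside a global view; the following is a hypothetical sketch of that input construction only (the ROI coordinates and how the two views are consumed are assumptions, not the CoS procedure).

```python
# Hypothetical sketch: pair a downscaled global view with a
# native-resolution crop of a region of interest (ROI), so fine detail
# is preserved without re-encoding the whole image at high resolution.
import torch
import torch.nn.functional as F

image = torch.rand(1, 3, 1024, 1024)                 # full-resolution input

global_view = F.interpolate(image, size=(224, 224), mode="bilinear",
                            align_corners=False)

y0, x0, h, w = 400, 300, 224, 224                    # hypothetical ROI
roi_view = image[:, :, y0:y0 + h, x0:x0 + w]         # crop, no resampling

views = torch.cat([global_view, roi_view], dim=0)    # both fed to the model
```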
- Reimagining Reality: A Comprehensive Survey of Video Inpainting Techniques [6.36998581871295]
Video inpainting is a process that restores or fills in missing or corrupted portions of video sequences with plausible content.
Our study deconstructs major techniques, their underpinning theories, and their effective applications.
We employ a human-centric approach to assess visual quality, enlisting a panel of annotators to evaluate the output of different video inpainting techniques.
arXiv Detail & Related papers (2024-01-31T14:41:40Z)
- A Survey on Super Resolution for video Enhancement Using GAN [0.0]
Recent developments in image and video super-resolution using deep learning algorithms such as Generative Adversarial Networks are covered.
These advancements, which aim to increase the visual clarity and quality of low-resolution video, have tremendous potential in a variety of sectors ranging from surveillance technology to medical imaging.
This collection delves into the wider field of Generative Adversarial Networks, exploring their principles, training approaches, and applications across a broad range of domains.
arXiv Detail & Related papers (2023-12-27T08:41:38Z)
- Deep Neural Network-based Enhancement for Image and Video Streaming Systems: A Survey and Future Directions [20.835654670825782]
Deep learning has led to unprecedented performance in generating high-quality images from low-quality ones.
We present state-of-the-art content delivery systems that employ neural enhancement as a key component in achieving both fast response time and high visual quality.
arXiv Detail & Related papers (2021-06-07T15:42:36Z)
- Transformers in Vision: A Survey [101.07348618962111]
Transformers enable modeling long dependencies between input sequence elements and support parallel processing of sequences.
Transformers require minimal inductive biases for their design and are naturally suited as set-functions.
This survey aims to provide a comprehensive overview of Transformer models in the computer vision discipline (a minimal self-attention sketch follows this entry).
arXiv Detail & Related papers (2021-01-04T18:57:24Z)
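To ground the two properties named above, here is a minimal single-head scaled dot-product self-attention block, assuming PyTorch; real vision Transformers add multi-head attention, patch embeddings, and positional encodings on top of this.

```python
# Minimal self-attention: every token attends to every other token in one
# parallel step, which is what enables long-range dependency modeling.
import math
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        self.dim = dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        q, k, v = self.qkv(x).chunk(3, dim=-1)        # (batch, tokens, dim) each
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.dim)
        return self.out(torch.softmax(scores, dim=-1) @ v)

tokens = torch.rand(2, 196, 64)   # e.g. 14x14 image patches as a token set
y = SelfAttention(64)(tokens)     # same shape: (2, 196, 64)
```

Because nothing in the computation depends on token order, the block behaves as a set-function unless positional information is added explicitly.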
This list is automatically generated from the titles and abstracts of the papers on this site.