Learnt Deep Hyperparameter selection in Adversarial Training for
compressed video enhancement with perceptual critic
- URL: http://arxiv.org/abs/2302.14516v1
- Date: Tue, 28 Feb 2023 12:10:55 GMT
- Title: Learnt Deep Hyperparameter selection in Adversarial Training for
compressed video enhancement with perceptual critic
- Authors: Darren Ramsook, Anil Kokaram
- Abstract summary: Deep Feature Quality Metrics (DFQMs) have been shown to correlate better with subjective perceptual scores than traditional metrics do.
We present a new method for selecting perceptually relevant layers from such a network, based on a neuroscience interpretation of layer behaviour.
Our results show that introducing these selected features into the critic yields up to a 10% (FID) and 15% (KID) performance increase.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image based Deep Feature Quality Metrics (DFQMs) have been shown to
correlate better with subjective perceptual scores than traditional metrics do. The
fundamental focus of these DFQMs is to exploit internal representations from a
large scale classification network as the metric feature space. Previously, no
attention has been given to the problem of identifying which layers are most
perceptually relevant. In this paper we present a new method for selecting
perceptually relevant layers from such a network, based on a neuroscience
interpretation of layer behaviour. The selected layers are treated as a
hyperparameter to the critic network in a W-GAN. The critic uses the output
from these layers in the preliminary stages to extract perceptual information.
A video enhancement network is trained adversarially with this critic. Our
results show that the introduction of these selected features into the critic
yields up to a 10% (FID) and 15% (KID) performance increase over other critic
networks that do not exploit the idea of optimised feature selection.
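
To make this concrete, here is a minimal PyTorch sketch of a critic in this spirit: intermediate activations from a frozen pretrained classification network feed the critic's preliminary stages, with the set of tapped layers exposed as a hyperparameter. The VGG-16 backbone, layer indices, pooling, and head sizes are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class PerceptualCritic(nn.Module):
    """W-GAN critic whose early stages consume selected deep features.

    Illustrative sketch: the backbone (VGG-16) and the tapped layer
    indices are assumptions; in the paper the selected layers are a
    hyperparameter chosen by the proposed selection method.
    """

    def __init__(self, selected_layers=(3, 8, 15)):  # hypothetical ReLU taps
        super().__init__()
        feats = vgg16(weights="IMAGENET1K_V1").features
        for p in feats.parameters():
            p.requires_grad_(False)            # frozen perceptual extractor
        self.feats = feats
        self.selected = set(selected_layers)
        self.last = max(selected_layers)
        self.pool = nn.AdaptiveAvgPool2d(8)    # fixed-size per-layer embedding
        channels = {3: 64, 8: 128, 15: 256}    # VGG-16 widths at these taps
        in_dim = sum(channels[i] for i in selected_layers) * 8 * 8
        self.head = nn.Sequential(             # trainable head; no sigmoid (W-GAN)
            nn.Linear(in_dim, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 1),
        )

    def forward(self, x):
        taps = []
        for i, layer in enumerate(self.feats):
            x = layer(x)
            if i in self.selected:
                taps.append(self.pool(x).flatten(1))
            if i == self.last:                 # no need to run deeper layers
                break
        return self.head(torch.cat(taps, dim=1))  # unbounded Wasserstein score
```

In a W-GAN setup this critic would be trained with the Wasserstein objective (plus weight clipping or a gradient penalty), while the enhancement network is updated adversarially against its score.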
Related papers
- TOPIQ: A Top-down Approach from Semantics to Distortions for Image Quality Assessment [53.72721476803585]
Image Quality Assessment (IQA) is a fundamental task in computer vision that has witnessed remarkable progress with deep neural networks.
We propose a top-down approach that uses high-level semantics to guide the IQA network to focus on semantically important local distortion regions.
A key component of the approach is a proposed cross-scale attention mechanism, which calculates attention maps for lower-level features guided by higher-level semantics (see the sketch after this entry).
arXiv Detail & Related papers (2023-08-06T09:08:37Z)
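
As a rough illustration of the cross-scale attention idea above, the sketch below lets upsampled high-level (semantic) features form queries over low-level (distortion-sensitive) features; the module structure and dimensions are assumptions, not the authors' exact TOPIQ design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossScaleAttention(nn.Module):
    """High-level semantics attend over low-level features (generic sketch)."""

    def __init__(self, c_low, c_high, c_attn=64):
        super().__init__()
        self.q = nn.Conv2d(c_high, c_attn, 1)  # queries from semantic features
        self.k = nn.Conv2d(c_low, c_attn, 1)   # keys from low-level features
        self.v = nn.Conv2d(c_low, c_low, 1)    # values keep low-level channels

    def forward(self, low, high):
        # upsample the high-level map to the low-level spatial size
        high = F.interpolate(high, size=low.shape[-2:],
                             mode="bilinear", align_corners=False)
        b, _, h, w = low.shape
        q = self.q(high).flatten(2).transpose(1, 2)          # (B, HW, C)
        k = self.k(low).flatten(2)                           # (B, C, HW)
        v = self.v(low).flatten(2).transpose(1, 2)           # (B, HW, C_low)
        # full HW x HW attention map: fine for small feature maps in a sketch
        attn = torch.softmax(q @ k / k.shape[1] ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return out + low                                     # residual refinement
```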
- A Deep Learning based No-reference Quality Assessment Model for UGC Videos [44.00578772367465]
Previous video quality assessment (VQA) studies use either image recognition models or image quality assessment (IQA) models to extract frame-level features of videos for quality regression.
We propose a very simple but effective VQA model, which trains an end-to-end spatial feature extraction network to learn the quality-aware spatial feature representation from raw pixels of the video frames.
With these better quality-aware features, only a simple multilayer perceptron (MLP) network is used to regress them into chunk-level quality scores, and a temporal average pooling strategy is then adopted to obtain the video-level quality score (see the sketch after this entry).
arXiv Detail & Related papers (2022-04-29T12:45:21Z)
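
A minimal sketch of the regression-and-pooling stage described above; the feature dimension is an assumption, and chunk_feats would come from the end-to-end trained spatial feature extractor.

```python
import torch
import torch.nn as nn

class SimpleVQAHead(nn.Module):
    """MLP regression per chunk, then temporal average pooling (sketch)."""

    def __init__(self, feat_dim=2048, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(                # simple MLP regressor
            nn.Linear(feat_dim, hidden),
            nn.GELU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, chunk_feats):              # (B, T, feat_dim): T chunks
        chunk_scores = self.mlp(chunk_feats).squeeze(-1)  # (B, T) per-chunk
        return chunk_scores.mean(dim=1)          # temporal average pool -> (B,)
```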
- Learning Target-aware Representation for Visual Tracking via Informative Interactions [49.552877881662475]
We introduce a novel backbone architecture to improve the target-perception ability of feature representations for tracking.
The proposed GIM module and InBN mechanism are general and applicable to different backbone types including CNN and Transformer.
arXiv Detail & Related papers (2022-01-07T16:22:27Z)
- No-Reference Point Cloud Quality Assessment via Domain Adaptation [31.280188860021248]
We present a novel no-reference quality assessment metric, the image transferred point cloud quality assessment (IT-PCQA) for 3D point clouds.
In particular, we treat natural images as the source domain and point clouds as the target domain, and infer point cloud quality via unsupervised adversarial domain adaptation.
Experimental results show that the proposed method achieves higher performance than traditional no-reference metrics, and even results comparable with full-reference metrics (see the sketch after this entry).
arXiv Detail & Related papers (2021-12-06T08:20:40Z)
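
The summary describes unsupervised adversarial domain adaptation from natural images to point clouds. A common way to realise this is a gradient-reversal domain discriminator; the sketch below is a generic DANN-style construction, not necessarily the IT-PCQA internals.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated gradient in the backward pass."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

class DomainDiscriminator(nn.Module):
    """Classifies whether a feature comes from the source or target domain."""

    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 2))

    def forward(self, feat, lam=1.0):
        # the shared feature extractor learns to fool this classifier,
        # pushing it toward domain-invariant features
        return self.net(GradReverse.apply(feat, lam))
```

Here the quality regressor would be trained on labelled source-domain images, while the reversed gradient aligns source and target feature distributions.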
- Weakly-supervised fire segmentation by visualizing intermediate CNN layers [82.75113406937194]
Fire localization in images and videos is an important step for an autonomous system to combat fire incidents.
We consider weakly supervised segmentation of fire in images, in which only image labels are used to train the network.
We show that for fire segmentation, which is a binary segmentation problem, the mean value of features in a mid-layer of a classification CNN can perform better than the conventional Class Activation Mapping (CAM) method (see the sketch after this entry).
arXiv Detail & Related papers (2021-11-16T11:56:28Z)
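
A sketch of the mid-layer idea above: take the channel-wise mean of a mid-layer activation and threshold it into a binary mask, instead of computing a CAM. The backbone layers, layer index, and threshold are assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def midlayer_mean_mask(backbone_layers, image, layer_idx=15, thresh=0.5):
    """Binary mask from the channel-mean of a mid-layer activation (sketch).

    backbone_layers: an iterable of layers, e.g. vgg16(...).features.
    """
    x = image
    for i, layer in enumerate(backbone_layers):
        x = layer(x)
        if i == layer_idx:                       # stop at the chosen mid-layer
            break
    amap = x.mean(dim=1, keepdim=True)           # channel-wise mean activation
    # min-max normalise (per batch, for simplicity in this sketch)
    amap = (amap - amap.amin()) / (amap.amax() - amap.amin() + 1e-8)
    mask = F.interpolate(amap, size=image.shape[-2:],
                         mode="bilinear", align_corners=False)
    return (mask > thresh).float()               # binary fire mask
```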
- (ASNA) An Attention-based Siamese-Difference Neural Network with Surrogate Ranking Loss function for Perceptual Image Quality Assessment [0.0]
Deep convolutional neural networks (DCNN) that leverage the adversarial training framework for image restoration and enhancement have significantly improved the processed images' sharpness.
It is necessary to develop a quantitative metric that reflects their performance and aligns well with the perceived quality of an image.
This paper proposes a convolutional neural network that extends the traditional Siamese architecture (see the sketch after this entry).
arXiv Detail & Related papers (2021-05-06T09:04:21Z)
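
A minimal sketch of the Siamese-difference idea: a shared-weight encoder processes reference and distorted images, and the feature difference drives the quality score. ASNA's attention blocks and surrogate ranking loss are omitted; the encoder and feature size are assumptions.

```python
import torch
import torch.nn as nn

class SiameseDifference(nn.Module):
    """Shared encoder on both images; their feature difference is scored."""

    def __init__(self, encoder, feat_dim=512):
        super().__init__()
        self.encoder = encoder                   # shared-weight branch
        self.score = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, 1)
        )

    def forward(self, ref, dist):                # (B, C, H, W) image pairs
        f_ref = self.encoder(ref)                # (B, feat_dim) assumed output
        f_dist = self.encoder(dist)
        # difference features capture the distortion between the two inputs
        return self.score(torch.abs(f_ref - f_dist)).squeeze(-1)
```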
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks (see the sketch after this entry).
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
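
A generic one-step sketch of adversarial augmentation applied to intermediate feature embeddings rather than input pixels; the paper's exact augmentation and normalization scheme may differ, and backbone_head is a hypothetical module mapping embeddings to logits.

```python
import torch
import torch.nn.functional as F

def adversarial_feature_augment(backbone_head, feats, labels, eps=0.1):
    """FGSM-style perturbation in feature space (generic sketch)."""
    feats = feats.detach().requires_grad_(True)
    loss = F.cross_entropy(backbone_head(feats), labels)
    grad, = torch.autograd.grad(loss, feats)
    # move the embeddings in the direction that increases the loss
    adv_feats = feats + eps * grad.sign()
    return adv_feats.detach()   # train on clean + augmented embeddings
```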
- Improving Action Quality Assessment using ResNets and Weighted Aggregation [0.0]
Action quality assessment (AQA) aims at automatically judging human action based on a video of the said action and assigning a performance score to it.
The majority of works in the existing literature on AQA transform RGB videos to higher-level representations using C3D networks.
Due to the relatively shallow nature of C3D, the quality of the extracted features is lower than what a deeper convolutional neural network could extract (a weighted-aggregation sketch follows this entry).
arXiv Detail & Related papers (2021-02-21T08:36:22Z)
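
A sketch of learnable weighted aggregation of per-clip features into a single action score; the softmax weighting here is an illustrative assumption, not necessarily the paper's scheme.

```python
import torch
import torch.nn as nn

class WeightedClipAggregation(nn.Module):
    """Learned per-clip weights pool clip features into one action score."""

    def __init__(self, feat_dim=2048):
        super().__init__()
        self.weight = nn.Linear(feat_dim, 1)     # per-clip importance
        self.regressor = nn.Linear(feat_dim, 1)  # final score head

    def forward(self, clip_feats):               # (B, T, feat_dim), e.g. ResNet
        w = torch.softmax(self.weight(clip_feats), dim=1)  # (B, T, 1) weights
        video_feat = (w * clip_feats).sum(dim=1)           # weighted pooling
        return self.regressor(video_feat).squeeze(-1)      # performance score
```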
- The FaceChannel: A Fast & Furious Deep Neural Network for Facial Expression Recognition [71.24825724518847]
Current state-of-the-art models for automatic Facial Expression Recognition (FER) are based on very deep neural networks that are effective but rather expensive to train.
We formalize the FaceChannel, a light-weight neural network with far fewer parameters than common deep neural networks.
We demonstrate that our model achieves performance comparable to, if not better than, the current state-of-the-art in FER.
arXiv Detail & Related papers (2020-09-15T09:25:37Z)
- Unsupervised Learning of Video Representations via Dense Trajectory Clustering [86.45054867170795]
This paper addresses the task of unsupervised learning of representations for action recognition in videos.
We first propose to adapt two top-performing objectives in this class: instance recognition and local aggregation (an instance-recognition sketch follows this entry).
We observe promising performance, but qualitative analysis shows that the learned representations fail to capture motion patterns.
arXiv Detail & Related papers (2020-06-28T22:23:03Z)
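
Instance recognition is commonly implemented as an InfoNCE-style contrastive objective; the sketch below is that generic formulation (two clips of the same video as positives), not necessarily the authors' exact loss.

```python
import torch
import torch.nn.functional as F

def instance_recognition_loss(z1, z2, temperature=0.1):
    """InfoNCE: each clip should match its paired clip, not other videos.

    z1, z2: (B, D) embeddings of two clips per video (generic sketch).
    """
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature           # (B, B) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)      # diagonal entries are positives
```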
This list is automatically generated from the titles and abstracts of the papers on this site.