On the Impact of Lossy Image and Video Compression on the Performance of
Deep Convolutional Neural Network Architectures
- URL: http://arxiv.org/abs/2007.14314v1
- Date: Tue, 28 Jul 2020 15:37:37 GMT
- Authors: Matt Poyser, Amir Atapour-Abarghouei, Toby P. Breckon
- Abstract summary: This study investigates the impact of commonplace image and video compression techniques on the performance of deep learning architectures.
We examine the impact on performance across five discrete tasks: human pose estimation, semantic segmentation, object detection, action recognition, and monocular depth estimation.
Results show a non-linear and non-uniform relationship between network performance and the level of lossy compression applied.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in generalized image understanding have seen a surge in the
use of deep convolutional neural networks (CNNs) across a broad range of
image-based detection, classification and prediction tasks. Whilst the reported
performance of these approaches is impressive, this study investigates the
hitherto unapproached question of the impact of commonplace image and video
compression techniques on the performance of such deep learning architectures.
Focusing on JPEG and H.264 (MPEG-4 AVC) as representative proxies for
contemporary lossy image/video compression techniques that are in common use
within network-connected image/video devices and infrastructure, we examine the
impact on performance across five discrete tasks: human pose estimation,
semantic segmentation, object detection, action recognition, and monocular
depth estimation. As such, within this study we include a variety of network
architectures and domains spanning end-to-end convolution, encoder-decoder,
region-based CNN (R-CNN), dual-stream, and generative adversarial networks
(GAN). Our results show a non-linear and non-uniform relationship between
network performance and the level of lossy compression applied. Notably,
performance decreases significantly below a JPEG quality (quantization) level
of 15% and an H.264 Constant Rate Factor (CRF) of 40. Conversely, retraining
these architectures on pre-compressed imagery recovers network performance
by up to 78.4% in some cases. Furthermore, there is a correlation between
architectures employing an encoder-decoder pipeline and those that demonstrate
resilience to lossy image compression. The characteristics of the relationship
between input compression and output task performance can be used to inform
design decisions within future image/video devices and infrastructure.
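The compression sweep described above can be sketched in a few lines. The following is a minimal illustration, not the authors' evaluation code: it round-trips a synthetic image through JPEG at decreasing quality levels (including the quality-15 threshold the study highlights) and reports the file size and mean pixel error; plugging a real dataset and model evaluation in place of the error metric is left to the reader. The helper `jpeg_roundtrip` is a hypothetical name introduced here.

```python
# Hedged sketch: sweep JPEG quality levels and measure file size and
# reconstruction error for a synthetic image. Pillow's `quality` parameter
# corresponds to the JPEG quality (quantization) level discussed in the
# abstract; model evaluation on the decoded images is not shown.
import io

import numpy as np
from PIL import Image


def jpeg_roundtrip(img: Image.Image, quality: int):
    """Compress `img` to JPEG at `quality` (1-95) and decode it back."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    nbytes = buf.tell()
    buf.seek(0)
    return Image.open(buf).convert("RGB"), nbytes


# Synthetic test image (random noise is a worst case for JPEG).
rng = np.random.default_rng(0)
img = Image.fromarray(rng.integers(0, 256, (64, 64, 3), dtype=np.uint8))

# Quality 15 is the level below which the paper reports sharp degradation.
for q in (95, 50, 15, 5):
    decoded, nbytes = jpeg_roundtrip(img, q)
    err = np.abs(np.asarray(decoded, dtype=float)
                 - np.asarray(img, dtype=float)).mean()
    print(f"quality={q:2d}  bytes={nbytes:5d}  mean abs error={err:.1f}")
```

An analogous H.264 sweep would vary the CRF value (e.g. via an ffmpeg `-crf` setting) on video clips before feeding the decoded frames to each task network.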
Related papers
- Releasing the Parameter Latency of Neural Representation for High-Efficiency Video Compression [18.769136361963472]
The implicit neural representation (INR) technique models entire videos as basic units, automatically capturing intra-frame and inter-frame correlations.
In this paper, we show that our method significantly enhances the rate-distortion performance of INR video compression.
arXiv Detail & Related papers (2024-10-02T15:19:31Z)
- Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression [58.618625678054826]
This study presents an enhanced neural compression method designed for optimal visual fidelity.
We have trained our model with a sophisticated semantic ensemble loss, integrating Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss.
Our empirical findings demonstrate that this approach significantly improves the statistical fidelity of neural image compression.
arXiv Detail & Related papers (2024-01-25T08:11:27Z)
- Transferable Learned Image Compression-Resistant Adversarial Perturbations [66.46470251521947]
Adversarial attacks can readily disrupt the image classification system, revealing the vulnerability of DNN-based recognition tasks.
We introduce a new pipeline that targets image classification models that utilize learned image compressors as pre-processing modules.
arXiv Detail & Related papers (2024-01-06T03:03:28Z)
- Analysis of the Effect of Low-Overhead Lossy Image Compression on the Performance of Visual Crowd Counting for Smart City Applications [78.55896581882595]
Lossy image compression techniques can reduce the quality of the images, leading to accuracy degradation.
In this paper, we analyze the effect of applying low-overhead lossy image compression methods on the accuracy of visual crowd counting.
arXiv Detail & Related papers (2022-07-20T19:20:03Z)
- Reducing Redundancy in the Bottleneck Representation of the Autoencoders [98.78384185493624]
Autoencoders are a type of unsupervised neural network that can be used to solve various tasks.
We propose a scheme to explicitly penalize feature redundancies in the bottleneck representation.
We tested our approach across different tasks: dimensionality reduction using three different datasets, image compression using the MNIST dataset, and image denoising using Fashion-MNIST.
arXiv Detail & Related papers (2022-02-09T18:48:02Z)
- Exploring Structural Sparsity in Neural Image Compression [14.106763725475469]
We propose a plug-in adaptive binary channel masking (ABCM) scheme to judge the importance of each convolution channel and introduce sparsity during training.
During inference, the unimportant channels are pruned to obtain a slimmer network and less computation.
Experiment results show that up to 7x computation reduction and 3x acceleration can be achieved with negligible performance drop.
arXiv Detail & Related papers (2022-02-09T17:46:49Z)
- Neural JPEG: End-to-End Image Compression Leveraging a Standard JPEG Encoder-Decoder [73.48927855855219]
We propose a system that learns to improve the encoding performance by enhancing its internal neural representations on both the encoder and decoder ends.
Experiments demonstrate that our approach successfully improves the rate-distortion performance over JPEG across various quality metrics.
arXiv Detail & Related papers (2022-01-27T20:20:03Z)
- NeighCNN: A CNN based SAR Speckle Reduction using Feature preserving Loss Function [1.7188280334580193]
NeighCNN is a deep learning-based speckle reduction algorithm that handles multiplicative noise.
Various synthetic as well as real SAR images are used to test the NeighCNN architecture.
arXiv Detail & Related papers (2021-08-26T04:20:07Z)
- Generic Perceptual Loss for Modeling Structured Output Dependencies [78.59700528239141]
We show that what matters is the network structure rather than the trained weights.
We demonstrate that a randomly-weighted deep CNN can be used to model the structured dependencies of outputs.
arXiv Detail & Related papers (2021-03-18T23:56:07Z)
- Efficient CNN-LSTM based Image Captioning using Neural Network Compression [0.0]
We present an unconventional end-to-end compression pipeline for a CNN-LSTM based image captioning model.
We then examine the effects of different compression architectures on the model and design a compression architecture that achieves a 73.1% reduction in model size.
arXiv Detail & Related papers (2020-12-17T16:25:09Z)
- End-to-End JPEG Decoding and Artifacts Suppression Using Heterogeneous Residual Convolutional Neural Network [0.0]
Existing deep learning models separate JPEG artifact suppression from the decoding protocol as an independent task.
We take one step forward to design a true end-to-end heterogeneous residual convolutional neural network (HR-CNN) with spectrum decomposition and heterogeneous reconstruction mechanism.
arXiv Detail & Related papers (2020-07-01T17:44:00Z)