A Modular System for Enhanced Robustness of Multimedia Understanding
Networks via Deep Parametric Estimation
- URL: http://arxiv.org/abs/2402.18402v2
- Date: Thu, 29 Feb 2024 09:14:17 GMT
- Title: A Modular System for Enhanced Robustness of Multimedia Understanding
Networks via Deep Parametric Estimation
- Authors: Francesco Barbato, Umberto Michieli, Mehmet Kerim Yucel, Pietro
Zanuttigh, Mete Ozay
- Abstract summary: In multimedia understanding tasks, corrupted samples pose a critical challenge, because when fed to machine learning models they lead to performance degradation.
We propose SyMPIE to enhance input data for robust downstream multimedia understanding with minimal computational cost.
Our key insight is that most input corruptions can be modeled through global operations on color channels of images or spatial filters with small kernels.
- Score: 30.904034138920057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In multimedia understanding tasks, corrupted samples pose a critical
challenge, because when fed to machine learning models they lead to performance
degradation. In the past, three groups of approaches have been proposed to
handle noisy data: i) enhancer and denoiser modules to improve the quality of
the noisy data, ii) data augmentation approaches, and iii) domain adaptation
strategies. All the aforementioned approaches come with drawbacks that limit
their applicability; the first has high computational costs and requires pairs
of clean-corrupted data for training, while the others only allow deployment of
the same task/network they were trained on (i.e., when upstream and downstream
task/network are the same). In this paper, we propose SyMPIE to solve these
shortcomings. To this end, we design a small, modular, and efficient (just
2 GFLOPs to process a Full HD image) system to enhance input data for robust
downstream multimedia understanding with minimal computational cost. Our SyMPIE
is pre-trained on an upstream task/network that need not match the downstream
ones and does not need paired clean-corrupted samples. Our key insight is that
most input corruptions found in real-world tasks can be modeled through global
operations on color channels of images or spatial filters with small kernels.
We validate our approach on multiple datasets and tasks, such as image
classification (on ImageNetC, ImageNetC-Bar, VizWiz, and a newly proposed mixed
corruption benchmark named ImageNetC-mixed) and semantic segmentation (on
Cityscapes, ACDC, and DarkZurich) with consistent improvements of about 5%
relative accuracy gain across the board. The code of our approach and the new
ImageNetC-mixed benchmark will be made available upon publication.
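
The two families of operations named in the abstract's key insight can be illustrated with a minimal sketch. This is not the SyMPIE implementation: the function names, the nested-list image layout, and the hand-picked parameters below are assumptions made only for illustration, whereas the paper's title indicates that the parameters are estimated by a deep network.

```python
# Illustrative sketch only, not the SyMPIE code.
# Assumed layout: image[y][x] = [r, g, b], with values in [0, 1].

def clamp(v, lo=0.0, hi=1.0):
    """Keep a channel value inside the valid range."""
    return max(lo, min(hi, v))

def apply_color_transform(image, matrix, bias):
    """Global color operation: the same 3x3 matrix and bias for every pixel."""
    out = []
    for row in image:
        out_row = []
        for pixel in row:
            out_row.append([
                clamp(sum(matrix[c][k] * pixel[k] for k in range(3)) + bias[c])
                for c in range(3)
            ])
        out.append(out_row)
    return out

def apply_spatial_filter(image, kernel):
    """Small-kernel spatial filter (cross-correlation, edges replicated)."""
    h, w = len(image), len(image[0])
    r = len(kernel) // 2  # kernel is assumed square with odd side length
    out = []
    for y in range(h):
        out_row = []
        for x in range(w):
            acc = [0.0, 0.0, 0.0]
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)  # replicate border rows
                    xx = min(max(x + dx, 0), w - 1)  # replicate border columns
                    weight = kernel[dy + r][dx + r]
                    for c in range(3):
                        acc[c] += weight * image[yy][xx][c]
            out_row.append([clamp(v) for v in acc])
        out.append(out_row)
    return out

# Hypothetical parameters: identity, uniform darkening, and a 3x3 box blur.
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
HALF_BRIGHTNESS = [[0.5, 0, 0], [0, 0.5, 0], [0, 0, 0.5]]
BOX_3X3 = [[1 / 9] * 3 for _ in range(3)]

image = [[[0.2, 0.4, 0.6] for _ in range(4)] for _ in range(4)]
darkened = apply_color_transform(image, HALF_BRIGHTNESS, [0.0, 0.0, 0.0])
blurred = apply_spatial_filter(image, BOX_3X3)
```

Under this model, a corruption such as a brightness change is a particular choice of matrix and bias, and a blur is a particular choice of kernel. An enhancement module in this spirit would apply parameters chosen to undo those effects rather than the fixed values used here.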
Related papers
- Filling Missing Values Matters for Range Image-Based Point Cloud Segmentation [12.62718910894575]
Point cloud segmentation (PCS) plays an essential role in robot perception and navigation tasks.
To efficiently understand large-scale outdoor point clouds, their range image representation is commonly adopted.
However, undesirable missing values in the range images damage the shapes and patterns of objects.
This problem creates difficulty for the models in learning coherent and complete geometric information from the objects.
arXiv (2024-05-16T15:13:42Z)
- Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration [100.54419875604721]
All-in-one image restoration tackles different types of degradations with a unified model instead of having task-specific, non-generic models for each degradation.
We propose DyNet, a dynamic family of networks designed in an encoder-decoder style for all-in-one image restoration tasks.
Our DyNet can seamlessly switch between its bulkier and lightweight variants, thereby offering flexibility for efficient model deployment.
arXiv (2024-04-02T17:58:49Z)
- DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image
Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv (2023-12-12T06:07:21Z)
- Learning to See Low-Light Images via Feature Domain Adaptation [17.033219611079165]
We propose a single-stage network empowered by Feature Domain Adaptation (FDA) to decouple the denoising and color mapping tasks in raw LLIE.
FDA can explore the global and local correlations with fewer line buffers.
Our method achieves state-of-the-art performance with fewer computing costs.
arXiv (2023-12-11T03:38:26Z)
- PixMIM: Rethinking Pixel Reconstruction in Masked Image Modeling [83.67628239775878]
Masked Image Modeling (MIM) has achieved promising progress with the advent of Masked Autoencoders (MAE) and BEiT.
This paper undertakes a fundamental analysis of MIM from the perspective of pixel reconstruction.
We propose a remarkably simple and effective method, PixMIM, that entails two strategies.
arXiv (2023-03-04T13:38:51Z)
- Multi-Stage Progressive Image Restoration [167.6852235432918]
We propose a novel synergistic design that can optimally balance these competing goals.
Our main proposal is a multi-stage architecture, that progressively learns restoration functions for the degraded inputs.
The resulting tightly interlinked multi-stage architecture, named MPRNet, delivers strong performance gains on ten datasets.
arXiv (2021-02-04T18:57:07Z)
- DeFlow: Learning Complex Image Degradations from Unpaired Data with
Conditional Flows [145.83812019515818]
We propose DeFlow, a method for learning image degradations from unpaired data.
We model the degradation process in the latent space of a shared flow-decoder network.
We validate our DeFlow formulation on the task of joint image restoration and super-resolution.
arXiv (2021-01-14T18:58:01Z)
- Mixed-Privacy Forgetting in Deep Networks [114.3840147070712]
We show that the influence of a subset of the training samples can be removed from the weights of a network trained on large-scale image classification tasks.
Inspired by real-world applications of forgetting techniques, we introduce a novel notion of forgetting in mixed-privacy setting.
We show that our method allows forgetting without having to trade off the model accuracy.
arXiv (2020-12-24T19:34:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.