PAC-FNO: Parallel-Structured All-Component Fourier Neural Operators for Recognizing Low-Quality Images
- URL: http://arxiv.org/abs/2402.12721v4
- Date: Thu, 14 Mar 2024 04:41:03 GMT
- Title: PAC-FNO: Parallel-Structured All-Component Fourier Neural Operators for Recognizing Low-Quality Images
- Authors: Jinsung Jeon, Hyundong Jin, Jonghyun Choi, Sanghyun Hong, Dongeun Lee, Kookjin Lee, Noseong Park,
- Abstract summary: We propose a novel neural network model, parallel-structured and all-component Fourier neural operator (PAC-FNO)
Unlike conventional feed-forward neural networks, PAC-FNO operates in the frequency domain, allowing it to handle images of varying resolutions within a single model.
We show that the proposed PAC-FNO improves the performance of existing baseline models on images with various resolutions by up to 77.1% and various types of natural variations in the images at inference.
- Score: 38.773390121161924
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: A standard practice in developing image recognition models is to train a model on a specific image resolution and then deploy it. However, in real-world inference, models often encounter images different from the training sets in resolution and/or subject to natural variations such as weather changes, noise types and compression artifacts. While traditional solutions involve training multiple models for different resolutions or input variations, these methods are computationally expensive and thus do not scale in practice. To this end, we propose a novel neural network model, parallel-structured and all-component Fourier neural operator (PAC-FNO), that addresses the problem. Unlike conventional feed-forward neural networks, PAC-FNO operates in the frequency domain, allowing it to handle images of varying resolutions within a single model. We also propose a two-stage algorithm for training PAC-FNO with a minimal modification to the original, downstream model. Moreover, the proposed PAC-FNO is ready to work with existing image recognition models. Extensively evaluating methods with seven image recognition benchmarks, we show that the proposed PAC-FNO improves the performance of existing baseline models on images with various resolutions by up to 77.1% and various types of natural variations in the images at inference.
Related papers
- A Simple Approach to Unifying Diffusion-based Conditional Generation [63.389616350290595]
We introduce a simple, unified framework to handle diverse conditional generation tasks.
Our approach enables versatile capabilities via different inference-time sampling schemes.
Our model supports additional capabilities like non-spatially aligned and coarse conditioning.
arXiv Detail & Related papers (2024-10-15T09:41:43Z) - Controlling Rate, Distortion, and Realism: Towards a Single Comprehensive Neural Image Compression Model [4.096453902709292]
We propose a variable-rate generative NIC model to compress images to different bit rates.
By incorporating the newly proposed multi-realism technique, our method allows the users to adjust the bit rate, distortion, and realism with a single model.
Our method matches or surpasses the performance of state-of-the-art single-rate generative NIC models.
arXiv Detail & Related papers (2024-05-27T04:22:25Z) - FAS-UNet: A Novel FAS-driven Unet to Learn Variational Image
Segmentation [3.741136641573471]
We propose a novel variational-model-informed network (FAS-Unet) that exploits the model and algorithm priors to extract the multi-scale features.
The proposed network integrates image data and mathematical models, and implements them through learning a few convolution kernels.
Experimental results show that the proposed FAS-Unet is very competitive with other state-of-the-art methods in qualitative, quantitative and model complexity evaluations.
arXiv Detail & Related papers (2022-10-27T04:15:16Z) - A training-free recursive multiresolution framework for diffeomorphic
deformable image registration [6.929709872589039]
We propose a novel diffeomorphic training-free approach for deformable image registration.
The proposed architecture is simple in design. The moving image is warped successively at each resolution and finally aligned to the fixed image.
The entire system is end-to-end and optimized for each pair of images from scratch.
arXiv Detail & Related papers (2022-02-01T15:17:17Z) - Dynamic Proximal Unrolling Network for Compressive Sensing Imaging [29.00266254916676]
We present a dynamic proximal unrolling network (dubbed DPUNet), which can handle a variety of measurement matrices via one single model without retraining.
Specifically, DPUNet can exploit both embedded physical model via gradient descent and imposing image prior with learned dynamic proximal mapping.
Experimental results demonstrate that the proposed DPUNet can effectively handle multiple CSI modalities under varying sampling ratios and noise levels with only one model.
arXiv Detail & Related papers (2021-07-23T03:04:44Z) - SIR: Self-supervised Image Rectification via Seeing the Same Scene from
Multiple Different Lenses [82.56853587380168]
We propose a novel self-supervised image rectification (SIR) method based on an important insight that the rectified results of distorted images of the same scene from different lens should be the same.
We leverage a differentiable warping module to generate the rectified images and re-distorted images from the distortion parameters.
Our method achieves comparable or even better performance than the supervised baseline method and representative state-of-the-art methods.
arXiv Detail & Related papers (2020-11-30T08:23:25Z) - Deep Variational Network Toward Blind Image Restoration [60.45350399661175]
Blind image restoration is a common yet challenging problem in computer vision.
We propose a novel blind image restoration method, aiming to integrate both the advantages of them.
Experiments on two typical blind IR tasks, namely image denoising and super-resolution, demonstrate that the proposed method achieves superior performance over current state-of-the-arts.
arXiv Detail & Related papers (2020-08-25T03:30:53Z) - Learning to Learn Parameterized Classification Networks for Scalable
Input Images [76.44375136492827]
Convolutional Neural Networks (CNNs) do not have a predictable recognition behavior with respect to the input resolution change.
We employ meta learners to generate convolutional weights of main networks for various input scales.
We further utilize knowledge distillation on the fly over model predictions based on different input resolutions.
arXiv Detail & Related papers (2020-07-13T04:27:25Z) - Deep Unfolding Network for Image Super-Resolution [159.50726840791697]
This paper proposes an end-to-end trainable unfolding network which leverages both learning-based methods and model-based methods.
The proposed network inherits the flexibility of model-based methods to super-resolve blurry, noisy images for different scale factors via a single model.
arXiv Detail & Related papers (2020-03-23T17:55:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.