Neural Networks with Divisive normalization for image segmentation with
application in cityscapes dataset
- URL: http://arxiv.org/abs/2203.13558v1
- Date: Fri, 25 Mar 2022 10:26:39 GMT
- Title: Neural Networks with Divisive normalization for image segmentation with
application in cityscapes dataset
- Authors: Pablo Hernández-Cámara, Valero Laparra, Jesús Malo (Image
Processing Lab., Universitat de València)
- Abstract summary: We show that including divisive normalization in current deep networks makes them more invariant to non-informative changes in the images.
Experiments show that the inclusion of divisive normalization in the U-Net architecture leads to better segmentation results than the conventional U-Net.
- Score: 2.960890352853005
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the key problems in computer vision is adaptation: models are too
rigid to follow the variability of the inputs. The canonical computation that
explains adaptation in sensory neuroscience is divisive normalization, and it
has appealing effects on image manifolds. In this work we show that including
divisive normalization in current deep networks makes them more invariant to
non-informative changes in the images. In particular, we focus on U-Net
architectures for image segmentation. Experiments show that the inclusion of
divisive normalization in the U-Net architecture leads to better segmentation
results than the conventional U-Net. The gain increases steadily when
dealing with images acquired in bad weather conditions. In addition to the
results on the Cityscapes and Foggy Cityscapes datasets, we explain these
advantages through visualization of the responses: the equalization induced by
the divisive normalization leads to features that are more invariant to local
changes in contrast and illumination.
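The operation can be sketched as a layer that divides each activation by a pooled measure of neighboring activity. The snippet below is a minimal illustrative sketch in PyTorch, not the authors' exact layer: the class name DivisiveNormalization2d, the spatial-only Gaussian pool, and the parameters kernel_size, sigma, b and g are assumptions made here for illustration; the formulation in the paper may additionally include exponents and cross-channel pooling.

```python
# Minimal sketch of a divisive-normalization (DN) layer, assuming a PyTorch
# setting. The normalization pool is a fixed depthwise Gaussian blur over
# space, with a learnable semisaturation constant b and gain g per channel.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DivisiveNormalization2d(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 5, sigma: float = 1.5):
        super().__init__()
        # Learnable semisaturation constant and output gain (one per channel).
        self.b = nn.Parameter(torch.ones(1, channels, 1, 1) * 0.1)
        self.g = nn.Parameter(torch.ones(1, channels, 1, 1))
        # Fixed Gaussian kernel used as the spatial normalization pool.
        coords = torch.arange(kernel_size) - kernel_size // 2
        gauss = torch.exp(-coords.float() ** 2 / (2 * sigma ** 2))
        kernel = torch.outer(gauss, gauss)
        kernel = kernel / kernel.sum()
        self.register_buffer("kernel", kernel.expand(channels, 1, -1, -1).clone())
        self.padding = kernel_size // 2

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pool the local rectified activity around each position, per channel.
        pooled = F.conv2d(x.abs(), self.kernel, padding=self.padding,
                          groups=x.shape[1])
        # Divide each response by its local pool: high-activity neighborhoods
        # are attenuated and low-activity ones amplified (local equalization).
        return self.g * x / (self.b.abs() + pooled)


# Usage sketch: place DN after a convolution, e.g. at the input of a U-Net block.
if __name__ == "__main__":
    block = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1),
                          DivisiveNormalization2d(16),
                          nn.ReLU())
    out = block(torch.randn(1, 3, 128, 128))
    print(out.shape)  # torch.Size([1, 16, 128, 128])
```

Dividing by the pooled activity is what equalizes local contrast and illumination, which is the invariance mechanism the abstract refers to.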
Related papers
- The Overfocusing Bias of Convolutional Neural Networks: A Saliency-Guided Regularization Approach [11.524573224123905]
CNNs make decisions based on narrow, specific regions of input images.
This behavior can severely compromise the model's generalization capabilities.
We introduce Saliency Guided Dropout (SGDrop) to address this specific issue.
arXiv Detail & Related papers (2024-09-25T21:30:16Z) - Image Segmentation via Divisive Normalization: dealing with environmental diversity [0.8796261172196743]
We put segmentation U-nets augmented with Divisive Normalization to work far from training conditions.
We categorize scenes according to their radiance level and dynamic range (day/night), and according to their achromatic/chromatic contrasts.
Results show that neural networks with Divisive Normalization get better results in all the scenarios.
arXiv Detail & Related papers (2024-07-25T07:38:27Z) - Deformable Convolution Based Road Scene Semantic Segmentation of Fisheye Images in Autonomous Driving [4.720434481945155]
This study investigates the effectiveness of modern Deformable Convolutional Neural Networks (DCNNs) for semantic segmentation tasks.
Our experiments focus on segmenting the WoodScape fisheye image dataset into ten distinct classes, assessing the Deformable Networks' ability to capture intricate spatial relationships.
The significant improvement in mIoU score resulting from integrating Deformable CNNs demonstrates their effectiveness in handling the geometric distortions present in fisheye imagery.
arXiv Detail & Related papers (2024-07-23T17:02:24Z) - Distractors-Immune Representation Learning with Cross-modal Contrastive Regularization for Change Captioning [71.14084801851381]
Change captioning aims to succinctly describe the semantic change between a pair of similar images.
Most existing methods directly capture the difference between them, which risks yielding error-prone difference features.
We propose a distractors-immune representation learning network that correlates the corresponding channels of two image representations.
arXiv Detail & Related papers (2024-07-16T13:00:33Z) - Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z) - Accurate Image Restoration with Attention Retractable Transformer [50.05204240159985]
We propose Attention Retractable Transformer (ART) for image restoration.
ART presents both dense and sparse attention modules in the network.
We conduct extensive experiments on image super-resolution, denoising, and JPEG compression artifact reduction tasks.
arXiv Detail & Related papers (2022-10-04T07:35:01Z) - Graph Reasoning Transformer for Image Parsing [67.76633142645284]
We propose a novel Graph Reasoning Transformer (GReaT) for image parsing to enable image patches to interact following a relation reasoning pattern.
Compared to the conventional transformer, GReaT has higher interaction efficiency and a more purposeful interaction pattern.
Results show that GReaT achieves consistent performance gains over state-of-the-art transformer baselines with only slight computational overhead.
arXiv Detail & Related papers (2022-09-20T08:21:37Z) - A Flexible Framework for Designing Trainable Priors with Adaptive
Smoothing and Game Encoding [57.1077544780653]
We introduce a general framework for designing and training neural network layers whose forward passes can be interpreted as solving non-smooth convex optimization problems.
We focus on convex games, solved by local agents represented by the nodes of a graph and interacting through regularization functions.
This approach is appealing for solving imaging problems, as it allows the use of classical image priors within deep models that are trainable end to end.
arXiv Detail & Related papers (2020-06-26T08:34:54Z) - Attentive Normalization for Conditional Image Generation [126.08247355367043]
We characterize long-range dependence with attentive normalization (AN), an extension of traditional instance normalization.
Compared with self-attention GAN, our attentive normalization does not need to measure the correlation of all locations.
Experiments on class-conditional image generation and semantic inpainting verify the efficacy of our proposed module.
arXiv Detail & Related papers (2020-04-08T06:12:25Z) - $\P$ILCRO: Making Importance Landscapes Flat Again [7.047473967702792]
This paper shows that most of the existing convolutional architectures define, at initialisation, a specific feature importance landscape.
We derive the P-objective, or PILCRO for Pixel-wise Landscape Curvature Regularised Objective.
We show that P-regularised versions of popular computer vision networks have a flat importance landscape, train faster, achieve better accuracy, and are more robust to noise at test time.
arXiv Detail & Related papers (2020-01-27T11:20:56Z)