On the Texture Bias for Few-Shot CNN Segmentation
- URL: http://arxiv.org/abs/2003.04052v3
- Date: Wed, 23 Dec 2020 22:37:09 GMT
- Title: On the Texture Bias for Few-Shot CNN Segmentation
- Authors: Reza Azad, Abdur R Fayjie, Claude Kauffman, Ismail Ben Ayed, Marco
Pedersoli, Jose Dolz
- Abstract summary: Contrary to the initial belief that Convolutional Neural Networks (CNNs) are driven by shapes to perform visual recognition tasks,
recent evidence suggests that texture bias in CNNs provides higher-performing models when learning on large labeled training datasets.
We propose a novel architecture that integrates a set of Difference of Gaussians (DoG) to attenuate high-frequency local components in the feature space.
- Score: 21.349705243254423
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the initial belief that Convolutional Neural Networks (CNNs) are
driven by shapes to perform visual recognition tasks, recent evidence suggests
that texture bias in CNNs provides higher performing models when learning on
large labeled training datasets. This contrasts with the perceptual bias in the
human visual cortex, which has a stronger preference towards shape components.
Perceptual differences may explain why CNNs achieve human-level performance
when large labeled datasets are available, but their performance significantly
degrades in low-labeled data scenarios, such as few-shot semantic segmentation.
To remove the texture bias in the context of few-shot learning, we propose a
novel architecture that integrates a set of Difference of Gaussians (DoG) to
attenuate high-frequency local components in the feature space. This produces a
set of modified feature maps, whose high-frequency components are diminished at
different standard deviation values of the Gaussian distribution in the spatial
domain. As this results in multiple feature maps for a single image, we employ
a bi-directional convolutional long short-term memory to efficiently merge the
multi-scale-space representations. We perform extensive experiments on three
well-known few-shot segmentation benchmarks -- PASCAL-5i, COCO-20i and FSS-1000
-- and demonstrate that our method outperforms state-of-the-art approaches in
two datasets under the same conditions. The code is available at:
https://github.com/rezazad68/fewshot-segmentation
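For intuition, below is a minimal sketch of the core idea described in the abstract: a depthwise Difference-of-Gaussians (DoG) filter bank applied to backbone feature maps, producing one modified map per standard deviation with its local high-frequency components attenuated. This is not the authors' released implementation; PyTorch, the sigma values, the kernel size, the 1.6 ratio between paired Gaussians, and all function names are illustrative assumptions, and the bidirectional convolutional LSTM that merges the resulting scale-space is omitted.

```python
import torch
import torch.nn.functional as F


def gaussian_kernel(sigma: float, size: int = 9) -> torch.Tensor:
    """Normalized 2D Gaussian kernel of shape (size, size)."""
    ax = torch.arange(size, dtype=torch.float32) - (size - 1) / 2.0
    yy, xx = torch.meshgrid(ax, ax, indexing="ij")
    k = torch.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def dog_scale_space(feats: torch.Tensor,
                    sigmas=(1.0, 2.0, 4.0),
                    ratio: float = 1.6,
                    size: int = 9) -> torch.Tensor:
    """Apply a bank of DoG filters depthwise to feature maps of shape
    (B, C, H, W). Each DoG = G(sigma) - G(ratio * sigma) attenuates local
    high-frequency (texture-like) content beyond its own scale, yielding
    one band-limited copy of the features per sigma."""
    b, c, h, w = feats.shape
    outputs = []
    for sigma in sigmas:
        dog = gaussian_kernel(sigma, size) - gaussian_kernel(ratio * sigma, size)
        weight = dog.to(feats).view(1, 1, size, size).repeat(c, 1, 1, 1)
        outputs.append(F.conv2d(feats, weight, padding=size // 2, groups=c))
    # (B, S, C, H, W): a scale-ordered sequence that a bidirectional ConvLSTM
    # could then merge into a single representation, per the abstract
    return torch.stack(outputs, dim=1)


# Toy usage on random backbone features
scale_space = dog_scale_space(torch.randn(2, 64, 32, 32))
print(scale_space.shape)  # torch.Size([2, 3, 64, 32, 32])
```

Stacking the filtered maps along a scale axis gives the multi-scale sequence that, in the paper, is consumed step by step by the bidirectional convolutional LSTM before the segmentation head.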
Related papers
- SIGMA: Sinkhorn-Guided Masked Video Modeling [69.31715194419091]
Sinkhorn-guided Masked Video Modeling (SIGMA) is a novel video pretraining method.
We distribute features of space-time tubes evenly across a limited number of learnable clusters.
Experimental results on ten datasets validate the effectiveness of SIGMA in learning more performant, temporally-aware, and robust video representations.
arXiv Detail & Related papers (2024-07-22T08:04:09Z)
- LiteNeXt: A Novel Lightweight ConvMixer-based Model with Self-embedding Representation Parallel for Medical Image Segmentation [2.0901574458380403]
We propose a new lightweight but efficient model, namely LiteNeXt, for medical image segmentation.
LiteNeXt is trained from scratch with a small number of parameters (0.71M) and only 0.42 GFLOPs.
arXiv Detail & Related papers (2024-04-04T01:59:19Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- Decoupled Mixup for Generalized Visual Recognition [71.13734761715472]
We propose a novel "Decoupled-Mixup" method to train CNN models for visual recognition.
Our method decouples each image into discriminative and noise-prone regions, and then heterogeneously combines these regions to train CNN models.
Experimental results show the high generalization performance of our method on test data composed of unseen contexts.
arXiv Detail & Related papers (2022-10-26T15:21:39Z)
- Large-Margin Representation Learning for Texture Classification [67.94823375350433]
This paper presents a novel approach combining convolutional layers (CLs) and large-margin metric learning for training supervised models on small datasets for texture classification.
The experimental results on texture and histopathologic image datasets have shown that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence when compared to equivalent CNNs.
arXiv Detail & Related papers (2022-06-17T04:07:45Z)
- Focal Sparse Convolutional Networks for 3D Object Detection [121.45950754511021]
We introduce two new modules to enhance the capability of Sparse CNNs:
focal sparse convolution (Focals Conv) and its multi-modal variant, focal sparse convolution with fusion.
For the first time, we show that spatially learnable sparsity in sparse convolution is essential for sophisticated 3D object detection.
arXiv Detail & Related papers (2022-04-26T17:34:10Z)
- A Novel Hand Gesture Detection and Recognition system based on ensemble-based Convolutional Neural Network [3.5665681694253903]
Detection of the hand region has become a challenging task in the computer vision and pattern recognition communities.
Deep learning algorithms such as the convolutional neural network (CNN) architecture have become a very popular choice for classification tasks.
In this paper, an ensemble of CNN-based approaches is presented to overcome problems such as high variance during prediction, overfitting, and prediction errors.
arXiv Detail & Related papers (2022-02-25T06:46:58Z)
- Deep ensembles in bioimage segmentation [74.01883650587321]
In this work, we propose an ensemble of convolutional neural networks (CNNs).
In ensemble methods, many different models are trained and then used for classification; the ensemble aggregates the outputs of the individual classifiers.
The proposed ensemble is implemented by combining different backbone networks using the DeepLabV3+ and HarDNet frameworks.
arXiv Detail & Related papers (2021-12-24T05:54:21Z)
- Learning from Small Samples: Transformation-Invariant SVMs with Composition and Locality at Multiple Scales [11.210266084524998]
This paper shows how to incorporate into support-vector machines (SVMs) those properties that have made convolutional neural networks (CNNs) successful.
arXiv Detail & Related papers (2021-09-27T04:02:43Z)
- Multi-scale Attention U-Net (MsAUNet): A Modified U-Net Architecture for Scene Segmentation [1.713291434132985]
We propose a novel multi-scale attention network for scene segmentation by using contextual information from an image.
This network can map local features with their global counterparts with improved accuracy and emphasize discriminative image regions.
We have evaluated our model on two standard datasets named PascalVOC2012 and ADE20k.
arXiv Detail & Related papers (2020-09-15T08:03:41Z)