Related papers: Foveated Retinotopy Improves Classification and Localization in CNNs

Foveated Retinotopy Improves Classification and Localization in CNNs

URL: http://arxiv.org/abs/2402.15480v3
Date: Sun, 29 Dec 2024 20:13:50 GMT
Title: Foveated Retinotopy Improves Classification and Localization in CNNs
Authors: Jean-Nicolas Jérémie, Emmanuel Daucé, Laurent U Perrinet,
Abstract summary: We show how incorporating foveated retinotopy may benefit deep convolutional neural networks (CNNs) in image classification tasks.<n>Our findings suggest that foveated retinotopic mapping encodes implicit knowledge about visual object geometry.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: From a falcon detecting prey to humans recognizing faces, many species exhibit extraordinary abilities in rapid visual localization and classification. These are made possible by a specialized retinal region called the fovea, which provides high acuity at the center of vision while maintaining lower resolution in the periphery. This distinctive spatial organization, preserved along the early visual pathway through retinotopic mapping, is fundamental to biological vision, yet remains largely unexplored in machine learning. Our study investigates how incorporating foveated retinotopy may benefit deep convolutional neural networks (CNNs) in image classification tasks. By implementing a foveated retinotopic transformation in the input layer of standard ResNet models and re-training them, we maintain comparable classification accuracy while enhancing the network's robustness to scale and rotational perturbations. Although this architectural modification introduces increased sensitivity to fixation point shifts, we demonstrate how this apparent limitation becomes advantageous: variations in classification probabilities across different gaze positions serve as effective indicators for object localization. Our findings suggest that foveated retinotopic mapping encodes implicit knowledge about visual object geometry, offering an efficient solution to the visual search problem - a capability crucial for many living species.

Related papers

Gaze-Guided Learning: Avoiding Shortcut Bias in Visual Classification [3.1208151315473622]
We introduce Gaze-CIFAR-10, a human gaze time-series dataset, along with a dual-sequence gaze encoder. In parallel, a Vision Transformer (ViT) is employed to learn the sequential representation of image content. Our framework integrates human gaze priors with machine-derived visual sequences, effectively correcting inaccurate localization in image feature representations.
arXiv Detail & Related papers (2025-04-08T00:40:46Z)
Progressive Retinal Image Registration via Global and Local Deformable Transformations [49.032894312826244]
We propose a hybrid registration framework called HybridRetina. We use a keypoint detector and a deformation network called GAMorph to estimate the global transformation and local deformable transformation. Experiments on two widely-used datasets, FIRE and FLoRI21, show that our proposed HybridRetina significantly outperforms some state-of-the-art methods.
arXiv Detail & Related papers (2024-09-02T08:43:50Z)
Region Guided Attention Network for Retinal Vessel Segmentation [19.587662416331682]
We present a lightweight retinal vessel segmentation network based on the encoder-decoder mechanism with region-guided attention. Dice loss penalises false positives and false negatives equally, encouraging the model to generate more accurate segmentation. Experiments on a benchmark dataset show better performance (0.8285, 0.8098, 0.9677, and 0.8166 recall, precision, accuracy and F1 score respectively) compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-07-22T00:08:18Z)
Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks. We show that the networks acquire strong, data-dependent features. Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters. Our approach enables a single model to encode neural computational graphs with diverse architectures. We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
Affine-Consistent Transformer for Multi-Class Cell Nuclei Detection [76.11864242047074]
We propose a novel Affine-Consistent Transformer (AC-Former), which directly yields a sequence of nucleus positions. We introduce an Adaptive Affine Transformer (AAT) module, which can automatically learn the key spatial transformations to warp original images for local network training. Experimental results demonstrate that the proposed method significantly outperforms existing state-of-the-art algorithms on various benchmarks.
arXiv Detail & Related papers (2023-10-22T02:27:02Z)
Unleashing the Power of Depth and Pose Estimation Neural Networks by Designing Compatible Endoscopic Images [12.412060445862842]
We conduct a detail analysis of the properties of endoscopic images and improve the compatibility of images and neural networks. First, we introcude the Mask Image Modelling (MIM) module, which inputs partial image information instead of complete image information. Second, we propose a lightweight neural network to enhance the endoscopic images, to explicitly improve the compatibility between images and neural networks.
arXiv Detail & Related papers (2023-09-14T02:19:38Z)
Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence. We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers. This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
Deep Angiogram: Trivializing Retinal Vessel Segmentation [1.8479315677380455]
We propose a contrastive variational auto-encoder that can filter out irrelevant features and synthesize a latent image, named deep angiogram. The generalizability of the synthetic network is improved by the contrastive loss that makes the model less sensitive to variations of image contrast and noisy features.
arXiv Detail & Related papers (2023-07-01T06:13:10Z)
Increasing the Accuracy of a Neural Network Using Frequency Selective Mesh-to-Grid Resampling [4.211128681972148]
We propose the use of keypoint frequency selective mesh-to-grid resampling (FSMR) for the processing of input data for neural networks. We show that depending on the network architecture and classification task the application of FSMR during training aids learning process. The classification accuracy can be increased by up to 4.31 percentage points for ResNet50 and the Oxflower17 dataset.
arXiv Detail & Related papers (2022-09-28T21:34:47Z)
Saccade Mechanisms for Image Classification, Object Detection and Tracking [12.751552698602744]
We examine how the saccade mechanism from biological vision can be used to make deep neural networks more efficient for classification and object detection problems. Our proposed approach is based on the ideas of attention-driven visual processing and saccades, miniature eye movements influenced by attention.
arXiv Detail & Related papers (2022-06-10T13:50:34Z)
Prune and distill: similar reformatting of image information along rat visual cortex and deep neural networks [61.60177890353585]
Deep convolutional neural networks (CNNs) have been shown to provide excellent models for its functional analogue in the brain, the ventral stream in visual cortex. Here we consider some prominent statistical patterns that are known to exist in the internal representations of either CNNs or the visual cortex. We show that CNNs and visual cortex share a similarly tight relationship between dimensionality expansion/reduction of object representations and reformatting of image information.
arXiv Detail & Related papers (2022-05-27T08:06:40Z)
Biologically inspired deep residual networks for computer vision applications [0.0]
We propose a biologically inspired deep residual neural network where the hexagonal convolutions are introduced along the skip connections. The proposed approach advances the baseline image classification accuracy of vanilla ResNet architectures.
arXiv Detail & Related papers (2022-05-05T10:23:43Z)
Self-Supervised Vision Transformers Learn Visual Concepts in Histopathology [5.164102666113966]
We conduct a search for good representations in pathology by training a variety of self-supervised models with validation on a variety of weakly-supervised and patch-level tasks. Our key finding is in discovering that Vision Transformers using DINO-based knowledge distillation are able to learn data-efficient and interpretable features in histology images.
arXiv Detail & Related papers (2022-03-01T16:14:41Z)
Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs. By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
Deep Spiking Convolutional Neural Network for Single Object Localization Based On Deep Continuous Local Learning [0.0]
We propose a deep convolutional spiking neural network for the localization of a single object in a grayscale image. Results reported on Oxford-IIIT-Pet validates the exploitation of spiking neural networks with a supervised learning approach.
arXiv Detail & Related papers (2021-05-12T12:02:05Z)
Ventral-Dorsal Neural Networks: Object Detection via Selective Attention [51.79577908317031]
We propose a new framework called Ventral-Dorsal Networks (VDNets) Inspired by the structure of the human visual system, we propose the integration of a "Ventral Network" and a "Dorsal Network" Our experimental results reveal that the proposed method outperforms state-of-the-art object detection approaches.
arXiv Detail & Related papers (2020-05-15T23:57:36Z)
Retinopathy of Prematurity Stage Diagnosis Using Object Segmentation and Convolutional Neural Networks [68.96150598294072]
Retinopathy of Prematurity (ROP) is an eye disorder primarily affecting premature infants with lower weights. It causes proliferation of vessels in the retina and could result in vision loss and, eventually, retinal detachment, leading to blindness. In recent years, there has been a significant effort to automate the diagnosis using deep learning. This paper builds upon the success of previous models and develops a novel architecture, which combines object segmentation and convolutional neural networks (CNN) Our proposed system first trains an object segmentation model to identify the demarcation line at a pixel level and adds the resulting mask as an additional "color" channel in
arXiv Detail & Related papers (2020-04-03T14:07:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.