On the use of Cortical Magnification and Saccades as Biological Proxies
for Data Augmentation
- URL: http://arxiv.org/abs/2112.07173v1
- Date: Tue, 14 Dec 2021 05:38:26 GMT
- Authors: Binxu Wang, David Mayo, Arturo Deza, Andrei Barbu, Colin Conwell
- Abstract summary: Most self-supervised methods encourage the system to learn an invariant representation of different transformations of the same image.
In this paper, we attempt to reverse-engineer these augmentations to be more biologically or perceptually plausible.
We find that random cropping can be substituted by cortical magnification, and that saccade-like sampling of the image can also assist representation learning.
- Score: 9.848635287149355
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised learning is a powerful way to learn useful representations
from natural data. It has also been suggested as one possible means of building
visual representation in humans, but the specific objective and algorithm are
unknown. Currently, most self-supervised methods encourage the system to learn
an invariant representation of different transformations of the same image in
contrast to those of other images. However, such transformations are generally
non-biologically plausible, and often consist of contrived perceptual schemes
such as random cropping and color jittering. In this paper, we attempt to
reverse-engineer these augmentations to be more biologically or perceptually
plausible while still conferring the same benefits for encouraging robust
representation. Critically, we find that random cropping can be substituted by
cortical magnification, and saccade-like sampling of the image could also
assist the representation learning. The feasibility of these transformations
suggests a potential way that biological visual systems could implement
self-supervision. Further, they break the widely accepted spatially-uniform
processing assumption used in many computer vision algorithms, suggesting a
role for spatially-adaptive computation in humans and machines alike. Our code
and demo can be found here.
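As an illustration of the idea, the cortical-magnification substitute for random cropping can be sketched as a foveated resampling of the image around a fixation point: output pixels near the fixation sample the input densely, while peripheral pixels sample it sparsely. The exponential eccentricity mapping, the `k` parameter, and the function below are illustrative assumptions for this sketch, not the paper's actual implementation:

```python
import numpy as np

def cortical_magnification(img, fixation, k=3.0):
    """Foveated resampling: a hypothetical stand-in for random cropping.

    Maps an output pixel at normalized eccentricity s (distance from the
    fixation point, divided by the maximum eccentricity R) to an input
    pixel at radius r_in = R * (e^{k s} - 1) / (e^k - 1). For small s the
    mapping is compressive, so the region around the fixation point is
    magnified; the periphery is correspondingly compressed.
    """
    h, w = img.shape[:2]
    fy, fx = fixation
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    dy, dx = ys - fy, xs - fx
    r_out = np.hypot(dy, dx)
    # Maximum eccentricity: distance from the fixation to the farthest corner.
    R = np.hypot(max(fy, h - 1 - fy), max(fx, w - 1 - fx))
    s = r_out / R
    r_in = R * (np.exp(k * s) - 1.0) / (np.exp(k) - 1.0)
    # Per-pixel radial scale factor (1.0 at the fixation, where r_out is 0).
    scale = np.ones_like(r_out)
    np.divide(r_in, r_out, out=scale, where=r_out > 0)
    src_y = np.clip(fy + dy * scale, 0, h - 1).astype(int)
    src_x = np.clip(fx + dx * scale, 0, w - 1).astype(int)
    return img[src_y, src_x]

# A "saccade" then corresponds to resampling the same image around a
# newly drawn fixation point for each augmented view.
img = np.arange(64, dtype=np.uint8).reshape(8, 8)
view = cortical_magnification(img, fixation=(4, 4))
```

Because the warp is identity at the fixation point and compresses only the periphery, two views with different fixations share foveal content, which is the property that makes it a plausible substitute for random cropping.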
Related papers
- Estimating the distribution of numerosity and non-numerical visual magnitudes in natural scenes using computer vision [0.08192907805418582]
We show that in natural visual scenes the frequency of appearance of different numerosities follows a power law distribution.
We show that the correlational structure for numerosity and continuous magnitudes is stable across datasets and scene types.
arXiv Detail & Related papers (2024-09-17T09:49:29Z)
- Unsupervised Learning of Invariance Transformations [105.54048699217668]
We develop an algorithmic framework for finding approximate graph automorphisms.
We discuss how this framework can be used to find approximate automorphisms in weighted graphs in general.
arXiv Detail & Related papers (2023-07-24T17:03:28Z)
- Multi-Domain Norm-referenced Encoding Enables Data Efficient Transfer Learning of Facial Expression Recognition [62.997667081978825]
We propose a biologically-inspired mechanism for transfer learning in facial expression recognition.
Our proposed architecture provides an explanation for how the human brain might innately recognize facial expressions on varying head shapes.
Our model achieves a classification accuracy of 92.15% on the FERG dataset with extreme data efficiency.
arXiv Detail & Related papers (2023-04-05T09:06:30Z)
- Perception Over Time: Temporal Dynamics for Robust Image Understanding [5.584060970507506]
Deep learning surpasses human-level performance in narrow and specific vision tasks.
Human visual perception is orders of magnitude more robust to changes in the input stimulus.
We introduce a novel method of incorporating temporal dynamics into static image understanding.
arXiv Detail & Related papers (2022-03-11T21:11:59Z)
- Improving Transferability of Representations via Augmentation-Aware Self-Supervision [117.15012005163322]
AugSelf is an auxiliary self-supervised loss that learns the difference of augmentation parameters between two randomly augmented samples.
Our intuition is that AugSelf encourages learned representations to preserve augmentation-aware information, which could be beneficial for their transferability.
AugSelf can easily be incorporated into recent state-of-the-art representation learning methods with a negligible additional training cost.
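The AugSelf idea can be sketched in a few lines: keep the parameters of the two random augmentations, and add an auxiliary head that regresses their difference from the two views' embeddings. The crop parameterization, the linear head, and the random embeddings below are illustrative assumptions standing in for a real encoder and augmentation pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_crop_params(rng):
    # Illustrative parameterization: (top, left, height, width) as
    # fractions of the image. A real pipeline would record the actual
    # parameters drawn by its augmentation ops.
    top, left = rng.uniform(0, 0.5, size=2)
    size = rng.uniform(0.5, 1.0)
    return np.array([top, left, size, size])

# Two randomly augmented views of the same image, with parameters kept.
p1, p2 = random_crop_params(rng), random_crop_params(rng)
target = p1 - p2  # AugSelf's regression target: the parameter difference

# Hypothetical embeddings of the two views (stand-ins for encoder output).
z1, z2 = rng.normal(size=16), rng.normal(size=16)

# A small head predicts the parameter difference from both embeddings;
# here a single linear layer on the concatenation.
W = rng.normal(scale=0.1, size=(4, 32))
pred = W @ np.concatenate([z1, z2])

# Auxiliary MSE loss, added to the main self-supervised objective.
aux_loss = np.mean((pred - target) ** 2)
```

Because the target depends only on quantities the augmentation pipeline already samples, the auxiliary loss adds negligible training cost, consistent with the summary above.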
arXiv Detail & Related papers (2021-11-18T10:43:50Z)
- Residual Relaxation for Multi-view Representation Learning [64.40142301026805]
Multi-view methods learn by aligning multiple views of the same image.
Some useful augmentations, such as image rotation, are harmful for multi-view methods because they cause a semantic shift.
We develop a generic approach, Pretext-aware Residual Relaxation (Prelax), that relaxes the exact alignment.
arXiv Detail & Related papers (2021-10-28T17:57:17Z)
- Focus on the Positives: Self-Supervised Learning for Biodiversity Monitoring [9.086207853136054]
We address the problem of learning self-supervised representations from unlabeled image collections.
We exploit readily available context data that encodes information such as the spatial and temporal relationships between the input images.
For the critical task of global biodiversity monitoring, this results in image features that can be adapted to challenging visual species classification tasks with limited human supervision.
arXiv Detail & Related papers (2021-08-14T01:12:41Z)
- This is not the Texture you are looking for! Introducing Novel Counterfactual Explanations for Non-Experts using Generative Adversarial Learning [59.17685450892182]
Counterfactual explanation systems try to enable counterfactual reasoning by modifying the input image.
We present a novel approach to generate such counterfactual image explanations based on adversarial image-to-image translation techniques.
Our results show that our approach leads to significantly better results regarding mental models, explanation satisfaction, trust, emotions, and self-efficacy than two state-of-the-art systems.
arXiv Detail & Related papers (2020-12-22T10:08:05Z)
- Category-Learning with Context-Augmented Autoencoder [63.05016513788047]
Finding an interpretable non-redundant representation of real-world data is one of the key problems in Machine Learning.
We propose a novel method of using data augmentations when training autoencoders.
We train a Variational Autoencoder in such a way that the transformation outcome is predictable by an auxiliary network.
arXiv Detail & Related papers (2020-10-10T14:04:44Z)
- Orientation-Disentangled Unsupervised Representation Learning for Computational Pathology [6.468635277309852]
We propose to extend the Variational Auto-Encoder framework by leveraging the group structure of rotation-equivariant convolutional networks.
We show that the trained models efficiently disentangle the inherent orientation information of single-cell images.
arXiv Detail & Related papers (2020-08-26T16:57:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.