Convolutional Neural Networks Can (Meta-)Learn the Same-Different Relation
- URL: http://arxiv.org/abs/2503.23212v2
- Date: Tue, 01 Apr 2025 00:57:54 GMT
- Title: Convolutional Neural Networks Can (Meta-)Learn the Same-Different Relation
- Authors: Max Gupta, Sunayana Rane, R. Thomas McCoy, Thomas L. Griffiths,
- Abstract summary: Humans remain vastly superior to CNNs in visual tasks involving relations.<n>We show that the same CNN architectures that fail to generalize the same-different relation with conventional training are able to succeed when trained via meta-learning.
- Score: 8.075796717801985
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While convolutional neural networks (CNNs) have come to match and exceed human performance in many settings, the tasks these models optimize for are largely constrained to the level of individual objects, such as classification and captioning. Humans remain vastly superior to CNNs in visual tasks involving relations, including the ability to identify two objects as `same' or `different'. A number of studies have shown that while CNNs can be coaxed into learning the same-different relation in some settings, they tend to generalize poorly to other instances of this relation. In this work we show that the same CNN architectures that fail to generalize the same-different relation with conventional training are able to succeed when trained via meta-learning, which explicitly encourages abstraction and generalization across tasks.
Related papers
- Deep Neural Networks Can Learn Generalizable Same-Different Visual
Relations [22.205838756057314]
We study whether deep neural networks can acquire and generalize same-different relations both within and out-of-distribution.
We find that certain pretrained transformers can learn a same-different relation that generalizes with near perfect accuracy to out-of-distribution stimuli.
arXiv Detail & Related papers (2023-10-14T16:28:57Z) - How Deep Neural Networks Learn Compositional Data: The Random Hierarchy Model [47.617093812158366]
We introduce the Random Hierarchy Model: a family of synthetic tasks inspired by the hierarchical structure of language and images.
We find that deep networks learn the task by developing internal representations invariant to exchanging equivalent groups.
Our results indicate how deep networks overcome the curse of dimensionality by building invariant representations.
arXiv Detail & Related papers (2023-07-05T09:11:09Z) - A novel feature-scrambling approach reveals the capacity of
convolutional neural networks to learn spatial relations [0.0]
Convolutional neural networks (CNNs) are one of the most successful computer vision systems to solve object recognition.
Yet it remains poorly understood how CNNs actually make their decisions, what the nature of their internal representations is, and how their recognition strategies differ from humans.
arXiv Detail & Related papers (2022-12-12T16:40:29Z) - Towards a General Purpose CNN for Long Range Dependencies in
$\mathrm{N}$D [49.57261544331683]
We propose a single CNN architecture equipped with continuous convolutional kernels for tasks on arbitrary resolution, dimensionality and length without structural changes.
We show the generality of our approach by applying the same CCNN to a wide set of tasks on sequential (1$mathrmD$) and visual data (2$mathrmD$)
Our CCNN performs competitively and often outperforms the current state-of-the-art across all tasks considered.
arXiv Detail & Related papers (2022-06-07T15:48:02Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
inputs to the model are routed through a sequence of functions in a way that is end-to-end learned.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferrable to a new task in a sample efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - Redundant representations help generalization in wide neural networks [71.38860635025907]
We study the last hidden layer representations of various state-of-the-art convolutional neural networks.
We find that if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information, and differ from each other only by statistically independent noise.
arXiv Detail & Related papers (2021-06-07T10:18:54Z) - CG-CNN: Self-Supervised Feature Extraction Through Contextual Guidance and Transfer Learning [0.0]
Contextually Guided Convolutional Neural Networks (CG-CNNs) employ self-supervision and contextual information to develop transferable features across diverse domains.
This work showcases the adaptability of CG-CNNs through applications to various datasets such as Caltech and Brodatz textures, the VibTac-12 tactile dataset, hyperspectral images, and challenges like the XOR problem and text analysis.
arXiv Detail & Related papers (2021-03-02T08:41:12Z) - Exploring the Interchangeability of CNN Embedding Spaces [0.5735035463793008]
We map between 10 image-classification CNNs and between 4 facial-recognition CNNs.
For CNNs trained to the same classes and sharing a common backend-logit architecture, a linear-mapping may always be calculated directly from the backend layer weights.
The implications are far-reaching, suggesting an underlying commonality between representations learned by networks designed and trained for a common task.
arXiv Detail & Related papers (2020-10-05T20:32:40Z) - Understanding the Role of Individual Units in a Deep Neural Network [85.23117441162772]
We present an analytic framework to systematically identify hidden units within image classification and image generation networks.
First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts.
Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
arXiv Detail & Related papers (2020-09-10T17:59:10Z) - Teaching CNNs to mimic Human Visual Cognitive Process & regularise
Texture-Shape bias [18.003188982585737]
Recent experiments in computer vision demonstrate texture bias as the primary reason for supreme results in models employing Convolutional Neural Networks (CNNs)
It is believed that the cost function forces the CNN to take a greedy approach and develop a proclivity for local information like texture to increase accuracy, thus failing to explore any global statistics.
We propose CognitiveCNN, a new intuitive architecture, inspired from feature integration theory in psychology to utilise human interpretable feature like shape, texture, edges etc. to reconstruct, and classify the image.
arXiv Detail & Related papers (2020-06-25T22:32:54Z) - Neural Additive Models: Interpretable Machine Learning with Neural Nets [77.66871378302774]
Deep neural networks (DNNs) are powerful black-box predictors that have achieved impressive performance on a wide variety of tasks.
We propose Neural Additive Models (NAMs) which combine some of the expressivity of DNNs with the inherent intelligibility of generalized additive models.
NAMs learn a linear combination of neural networks that each attend to a single input feature.
arXiv Detail & Related papers (2020-04-29T01:28:32Z) - Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose an elegant curriculum based scheme that smoothes the feature embedding of a CNN using anti-aliasing or low-pass filters.
As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data.
arXiv Detail & Related papers (2020-03-03T07:27:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.