Deep Neural Networks Can Learn Generalizable Same-Different Visual
Relations
- URL: http://arxiv.org/abs/2310.09612v1
- Date: Sat, 14 Oct 2023 16:28:57 GMT
- Title: Deep Neural Networks Can Learn Generalizable Same-Different Visual
Relations
- Authors: Alexa R. Tartaglini, Sheridan Feucht, Michael A. Lepori, Wai Keen
Vong, Charles Lovering, Brenden M. Lake, and Ellie Pavlick
- Abstract summary: We study whether deep neural networks can acquire and generalize same-different relations both within and out-of-distribution.
We find that certain pretrained transformers can learn a same-different relation that generalizes with near perfect accuracy to out-of-distribution stimuli.
- Score: 22.205838756057314
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although deep neural networks can achieve human-level performance on many
object recognition benchmarks, prior work suggests that these same models fail
to learn simple abstract relations, such as determining whether two objects are
the same or different. Much of this prior work focuses on training
convolutional neural networks to classify images containing two abstract shapes
that are either the same or different, testing generalization only on
within-distribution stimuli. In this
article, we comprehensively study whether deep neural networks can acquire and
generalize same-different relations both within and out-of-distribution using a
variety of architectures, forms of pretraining, and fine-tuning datasets. We
find that certain pretrained transformers can learn a same-different relation
that generalizes with near perfect accuracy to out-of-distribution stimuli.
Furthermore, we find that fine-tuning on abstract shapes that lack texture or
color provides the strongest out-of-distribution generalization. Our results
suggest that, with the right approach, deep neural networks can learn
generalizable same-different visual relations.
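To make the setup concrete, here is a minimal PyTorch sketch of the fine-tuning recipe the abstract describes. It is not the authors' code; the backbone (torchvision's ImageNet-pretrained ViT-B/16), head size, and hyperparameters are illustrative assumptions:

```python
import torch
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights

# Load a pretrained Vision Transformer and swap its classifier for a
# binary same/different head (assumed setup, not the paper's exact one).
model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
model.heads.head = nn.Linear(model.heads.head.in_features, 2)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss_fn = nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """images: (B, 3, 224, 224) renders of two shapes;
    labels: 1 if the two shapes are the same, 0 if they differ."""
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Out-of-distribution generalization would then be measured by evaluating the fine-tuned model on stimuli that are visually distinct from the fine-tuning set, e.g., shapes with unseen textures or colors.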
Related papers
- Expressivity of Neural Networks with Random Weights and Learned Biases [44.02417750529102]
Recent work has pushed the bounds of universal approximation by showing that arbitrary functions can be learned by tuning only small subsets of a network's parameters.
We provide theoretical and numerical evidence demonstrating that feedforward neural networks with fixed random weights can be trained to perform multiple tasks by learning biases only.
Our results are relevant to neuroscience, where they demonstrate the potential for behaviourally relevant changes in dynamics without modifying synaptic weights.
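As a minimal sketch of the bias-only training regime described above (layer sizes and optimizer are illustrative assumptions, not the paper's setup):

```python
import torch
import torch.nn as nn

# Feedforward network whose weights stay frozen at their random
# initialization; only the bias terms receive gradient updates.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

for name, param in model.named_parameters():
    param.requires_grad = name.endswith("bias")

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
```

Training then proceeds as usual; the optimizer only ever sees the bias parameters.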
arXiv Detail & Related papers (2024-07-01T04:25:49Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
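One plausible reading of this construction, sketched in PyTorch (the paper's exact graph encoding may differ; `mlp_to_graph` is a hypothetical helper):

```python
import torch.nn as nn

def mlp_to_graph(mlp: nn.Sequential):
    """Encode an MLP as a graph: nodes are neurons, and each edge
    (i, j, w) carries the connecting weight as its feature."""
    edges, offset = [], 0
    for layer in mlp:
        if isinstance(layer, nn.Linear):
            out_features, in_features = layer.weight.shape
            for i in range(in_features):
                for j in range(out_features):
                    edges.append((offset + i, offset + in_features + j,
                                  layer.weight[j, i].item()))
            offset += in_features
    return edges

graph = mlp_to_graph(nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2)))
```

A graph neural network operating on such edge lists is equivariant to neuron permutations, which is what lets a single model handle diverse architectures.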
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- On Privileged and Convergent Bases in Neural Network Representations [7.888192939262696]
We show that even wide networks such as WideResNets do not converge to a unique basis.
We also analyze Linear Mode Connectivity, which has been studied as a measure of basis correlation.
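A standard linear mode connectivity probe looks like the following sketch (a common construction, not necessarily the paper's exact protocol):

```python
import copy
import torch

@torch.no_grad()
def loss_along_path(model_a, model_b, loss_fn, batch, num_points=11):
    """Evaluate the loss on the straight line between two trained
    parameter vectors; a flat profile indicates connectivity."""
    probe = copy.deepcopy(model_a)
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    x, y = batch
    losses = []
    for alpha in torch.linspace(0.0, 1.0, num_points).tolist():
        mixed = {k: (1 - alpha) * sd_a[k] + alpha * sd_b[k] for k in sd_a}
        probe.load_state_dict(mixed)
        losses.append(loss_fn(probe(x), y).item())
    return losses
```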
arXiv Detail & Related papers (2023-07-24T17:11:39Z)
- Quasi-orthogonality and intrinsic dimensions as measures of learning and generalisation [55.80128181112308]
We show that the dimensionality and quasi-orthogonality of a neural network's feature space may jointly serve as discriminants of the network's performance.
Our findings suggest important relationships between the networks' final performance and properties of their randomly initialised feature spaces.
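Two simple probes in this spirit can be sketched as follows (common formulations; the authors' precise measures may differ):

```python
import torch
import torch.nn.functional as F

def quasi_orthogonality(features: torch.Tensor) -> torch.Tensor:
    """Mean absolute off-diagonal cosine similarity of (N, D) features;
    values near 0 indicate a quasi-orthogonal feature space."""
    f = F.normalize(features, dim=1)
    gram = f @ f.T
    n = f.shape[0]
    return (gram - torch.eye(n)).abs().sum() / (n * (n - 1))

def participation_ratio(features: torch.Tensor) -> torch.Tensor:
    """Intrinsic-dimension proxy from the covariance spectrum:
    (sum of eigenvalues)^2 / sum of squared eigenvalues."""
    centered = features - features.mean(dim=0)
    eig = torch.linalg.eigvalsh(centered.T @ centered / features.shape[0])
    return eig.sum() ** 2 / (eig ** 2).sum()
```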
arXiv Detail & Related papers (2022-03-30T21:47:32Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Similarity and Matching of Neural Network Representations [0.0]
We employ a toolset -- dubbed Dr. Frankenstein -- to analyse the similarity of representations in deep neural networks.
We aim to match the activations on given layers of two trained neural networks by joining them with a stitching layer.
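A minimal sketch of such a stitching layer (a common construction for this kind of analysis; the channel counts are assumptions):

```python
import torch.nn as nn

class StitchedModel(nn.Module):
    """Bottom of trained network A, a learnable 1x1 convolution, then
    the top of trained network B; only the stitch is trained."""

    def __init__(self, bottom_of_a, top_of_b, channels_a, channels_b):
        super().__init__()
        self.bottom, self.top = bottom_of_a, top_of_b
        self.stitch = nn.Conv2d(channels_a, channels_b, kernel_size=1)
        for module in (self.bottom, self.top):
            for p in module.parameters():
                p.requires_grad = False  # freeze both trained networks

    def forward(self, x):
        return self.top(self.stitch(self.bottom(x)))
```

If the stitched model matches network B's accuracy, the two representations are functionally interchangeable at that layer.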
arXiv Detail & Related papers (2021-10-27T17:59:46Z)
- Creating Powerful and Interpretable Models with Regression Networks [2.2049183478692584]
We propose a novel architecture, Regression Networks, which combines the power of neural networks with the understandability of regression analysis.
We demonstrate that the models exceed the state-of-the-art performance of interpretable models on several benchmark datasets.
arXiv Detail & Related papers (2021-07-30T03:37:00Z)
- What can linearized neural networks actually say about generalization? [67.83999394554621]
In certain infinitely-wide neural networks, the neural tangent kernel (NTK) theory fully characterizes generalization.
We show that the linear approximations can indeed rank the learning complexity of certain tasks for neural networks.
Our work provides concrete examples of novel deep learning phenomena which can inspire future theoretical research.
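The linearization in question is the first-order Taylor expansion of the network around initialization, f_lin(x; w) = f(x; w0) + J_w f(x; w0)(w - w0), which can be sketched with forward-mode autodiff (a standard construction, not the paper's code):

```python
import torch
import torch.nn as nn
from torch.func import functional_call, jvp

model = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))
w0 = {k: v.detach().clone() for k, v in model.named_parameters()}
w = {k: v.clone().requires_grad_(True) for k, v in w0.items()}

def f_lin(x: torch.Tensor) -> torch.Tensor:
    """Linearized network output; train w on this to study the
    linear model's learning dynamics."""
    def f(params):
        return functional_call(model, params, (x,))
    delta = {k: w[k] - w0[k] for k in w0}
    out, tangent = jvp(f, (w0,), (delta,))
    return out + tangent
```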
arXiv Detail & Related papers (2021-06-12T13:05:11Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges that reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
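A minimal sketch of this idea (my own simplification; the paper's operations and aggregation may differ):

```python
import torch
import torch.nn as nn

class LearnableDAG(nn.Module):
    """Nodes of a complete DAG each apply a small operation; every edge
    (i, j) carries a learnable scalar gate, so connectivity is learned
    by gradient descent along with the weights."""

    def __init__(self, num_nodes: int, dim: int):
        super().__init__()
        self.ops = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
            for _ in range(num_nodes)
        )
        self.edge_logits = nn.Parameter(torch.zeros(num_nodes, num_nodes))

    def forward(self, x):
        outputs = [self.ops[0](x)]
        for j in range(1, len(self.ops)):
            gates = torch.sigmoid(self.edge_logits[:j, j])
            agg = sum(g * h for g, h in zip(gates, outputs))
            outputs.append(self.ops[j](agg))
        return outputs[-1]
```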
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- Neural Rule Ensembles: Encoding Sparse Feature Interactions into Neural Networks [3.7277730514654555]
We use decision trees to capture relevant features and their interactions, and define a mapping that encodes the extracted relationships into a neural network.
At the same time, through feature selection, it enables learning of representations that are more compact than those of state-of-the-art tree-based approaches.
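A simplified sketch of such a tree-to-network mapping (my own illustration; the paper's exact encoding differs in its details):

```python
import torch
import torch.nn as nn
from sklearn.tree import DecisionTreeClassifier

def extract_paths(tree: DecisionTreeClassifier):
    """Collect each root-to-leaf path as a list of (feature, threshold,
    direction) conditions, where direction -1.0 means 'goes left' (<=)."""
    t = tree.tree_
    paths, stack = [], [(0, [])]
    while stack:
        node, conds = stack.pop()
        if t.children_left[node] == -1:  # leaf node
            paths.append(conds)
        else:
            f, thr = int(t.feature[node]), float(t.threshold[node])
            stack.append((t.children_left[node], conds + [(f, thr, -1.0)]))
            stack.append((t.children_right[node], conds + [(f, thr, 1.0)]))
    return paths

class SoftRuleLayer(nn.Module):
    """Each extracted path becomes a 'soft rule' neuron: a product of
    sigmoid-relaxed threshold tests, followed by a trainable linear map."""

    def __init__(self, paths, temperature=10.0):
        super().__init__()
        self.paths, self.tau = paths, temperature
        self.out = nn.Linear(len(paths), 1)

    def forward(self, x):
        acts = []
        for conds in self.paths:
            rule = torch.ones_like(x[:, 0])
            for f, thr, sign in conds:
                rule = rule * torch.sigmoid(self.tau * sign * (x[:, f] - thr))
            acts.append(rule)
        return self.out(torch.stack(acts, dim=1))
```

The soft rules preserve the tree's feature interactions while remaining differentiable end to end.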
arXiv Detail & Related papers (2020-02-11T11:22:20Z)