Unraveling the geometry of visual relational reasoning
- URL: http://arxiv.org/abs/2502.17382v1
- Date: Mon, 24 Feb 2025 18:07:54 GMT
- Title: Unraveling the geometry of visual relational reasoning
- Authors: Jiaqi Shang, Gabriel Kreiman, Haim Sompolinsky
- Abstract summary: Humans and other animals readily generalize abstract relations, such as recognizing constancy in shape or color, whereas neural networks struggle. Building on a geometric theory of neural representations, we show that representational geometries predict generalization. Our findings offer geometric insights into how neural networks generalize abstract relations, paving the way for more human-like visual reasoning in AI.
- Score: 11.82509693248749
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Humans and other animals readily generalize abstract relations, such as recognizing constancy in shape or color, whereas neural networks struggle. To investigate how neural networks generalize abstract relations, we introduce SimplifiedRPM, a novel benchmark for systematic evaluation. In parallel, we conduct human experiments to benchmark relational difficulty, enabling direct model-human comparisons. Testing four architectures--ResNet-50, Vision Transformer, Wild Relation Network, and Scattering Compositional Learner (SCL)--we find that SCL best aligns with human behavior and generalizes best. Building on a geometric theory of neural representations, we show that representational geometries predict generalization. Layer-wise analysis reveals distinct relational reasoning strategies across models and suggests a trade-off where unseen rule representations compress into training-shaped subspaces. Guided by our geometric perspective, we propose and evaluate SNRloss, a novel objective balancing representation geometry. Our findings offer geometric insights into how neural networks generalize abstract relations, paving the way for more human-like visual reasoning in AI.
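The geometric theory referenced here relates generalization to a signal-to-noise ratio computed from the geometry of rule-conditioned representations: centroid separation (signal) versus within-rule spread and effective dimension (noise). The sketch below is a minimal illustration of such an SNR-style readout, not the paper's SNRloss; the `pairwise_snr` and `participation_ratio` helpers and their exact normalization are assumptions made for illustration.

```python
import numpy as np

def participation_ratio(X):
    """Effective dimensionality of a feature cloud X of shape (n_samples, n_features)."""
    lam = np.clip(np.linalg.eigvalsh(np.cov(X, rowvar=False)), 0.0, None)
    return lam.sum() ** 2 / (np.square(lam).sum() + 1e-12)

def pairwise_snr(Xa, Xb):
    """Simplified signal-to-noise ratio between two rule-conditioned representation
    clouds: squared centroid separation divided by within-rule spread scaled by
    effective dimension."""
    mu_a, mu_b = Xa.mean(0), Xb.mean(0)
    signal = np.sum((mu_a - mu_b) ** 2)
    radius_sq = 0.5 * (np.mean(np.sum((Xa - mu_a) ** 2, axis=1)) +
                       np.mean(np.sum((Xb - mu_b) ** 2, axis=1)))
    dim = 0.5 * (participation_ratio(Xa) + participation_ratio(Xb))
    return signal / (radius_sq / np.sqrt(dim) + 1e-12)

# Toy usage: well-separated Gaussian "rule" clouds score a much higher SNR
# than overlapping ones.
rng = np.random.default_rng(0)
far = pairwise_snr(rng.normal(0, 1, (200, 64)), rng.normal(3, 1, (200, 64)))
near = pairwise_snr(rng.normal(0, 1, (200, 64)), rng.normal(0.2, 1, (200, 64)))
print(f"separated rules SNR={far:.2f}, overlapping rules SNR={near:.2f}")
```

An objective in the spirit of SNRloss would encourage high SNR between rule representations while constraining their geometry; the exact balancing term is defined in the paper.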
Related papers
- Revealing Bias Formation in Deep Neural Networks Through the Geometric Mechanisms of Human Visual Decoupling [9.609083308026786]
Deep neural networks (DNNs) often exhibit biases toward certain categories during object recognition. We propose a geometric analysis framework linking the geometric complexity of class-specific perceptual manifolds to model bias. We also present the Perceptual-Manifold-Geometry library, designed for calculating the geometric properties of perceptual manifolds.
arXiv Detail & Related papers (2025-02-17T13:54:02Z)
- Exploring Geometric Representational Alignment through Ollivier-Ricci Curvature and Ricci Flow [0.0]
We use Ollivier-Ricci curvature and Ricci flow as tools to study the alignment of representations between humans and artificial neural systems. As a proof-of-principle study, we compared the representations of face stimuli between VGG-Face, a human-aligned version of VGG-Face, and corresponding human similarity judgments from a large online study.
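For readers unfamiliar with the tool, the Ollivier-Ricci curvature of a graph edge compares the Wasserstein distance between the neighborhood measures of its endpoints to the graph distance between them. The snippet below is a from-scratch sketch of that edge-level quantity, not the authors' pipeline; it assumes networkx and the POT (`pot`) optimal-transport package, and `ollivier_ricci_edge` is an illustrative helper name.

```python
import networkx as nx
import numpy as np
import ot  # POT: Python Optimal Transport

def ollivier_ricci_edge(G, x, y, alpha=0.5):
    """Ollivier-Ricci curvature of edge (x, y) on an unweighted graph:
    kappa = 1 - W1(m_x, m_y) / d(x, y), where m_v places mass alpha on v
    and spreads (1 - alpha) uniformly over its neighbors."""
    def measure(v):
        nbrs = list(G.neighbors(v))
        return [v] + nbrs, np.array([alpha] + [(1 - alpha) / len(nbrs)] * len(nbrs))

    support_x, mass_x = measure(x)
    support_y, mass_y = measure(y)
    # Ground cost: shortest-path distances between the two supports.
    dist = dict(nx.all_pairs_shortest_path_length(G))
    M = np.array([[dist[u][v] for v in support_y] for u in support_x], dtype=float)
    w1 = ot.emd2(mass_x, mass_y, M)  # 1-Wasserstein distance between the measures
    return 1.0 - w1 / dist[x][y]

# Toy usage: edges on a long cycle are flat (curvature ~0), while edges
# inside a clique are positively curved.
print(ollivier_ricci_edge(nx.cycle_graph(6), 0, 1))     # ~0.0
print(ollivier_ricci_edge(nx.complete_graph(5), 0, 1))  # > 0
```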
arXiv Detail & Related papers (2025-01-01T18:33:48Z)
- Graph Neural Networks Uncover Geometric Neural Representations in Reinforcement-Based Motor Learning [3.379988469252273]
Graph Neural Networks (GNNs) can capture the geometric properties of neural representations in EEG data.
We study how reinforcement-based motor learning affects neural activity patterns during motor planning.
arXiv Detail & Related papers (2024-10-31T10:54:50Z)
- Deep Model Merging: The Sister of Neural Network Interpretability -- A Survey [4.013324399289249]
We survey the model merging literature through the lens of loss landscape geometry, connecting empirical observations from studies of model merging and loss landscape analysis to phenomena that govern neural network training and the emergence of their inner representations.
We distill repeated empirical observations from the literature in these fields into descriptions of four major characteristics of loss landscape geometry: mode convexity, determinism, directedness, and connectivity.
arXiv Detail & Related papers (2024-10-16T18:14:05Z)
- Human-Like Geometric Abstraction in Large Pre-trained Neural Networks [6.650735854030166]
We revisit empirical results in cognitive science on geometric visual processing.
We identify three key biases in geometric visual processing.
We test tasks from the literature that probe these biases in humans and find that large pre-trained neural network models used in AI demonstrate more human-like abstract geometric processing.
arXiv Detail & Related papers (2024-02-06T17:59:46Z)
- Riemannian Residual Neural Networks [58.925132597945634]
We show how to extend the residual neural network (ResNet) to general Riemannian manifolds.
ResNets have become ubiquitous in machine learning due to their beneficial learning properties, excellent empirical results, and easy-to-incorporate nature when building varied neural networks.
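The core construction replaces the Euclidean residual update x + f(x) with a tangent-vector update applied through the manifold's exponential map. Below is a minimal sketch of that idea on the unit sphere only (the paper treats general Riemannian manifolds); the toy map and helper names are assumptions made for illustration.

```python
import numpy as np

def sphere_exp(x, v, eps=1e-12):
    """Exponential map on the unit sphere: move from point x along tangent vector v."""
    norm_v = np.linalg.norm(v)
    if norm_v < eps:
        return x
    return np.cos(norm_v) * x + np.sin(norm_v) * (v / norm_v)

def riemannian_residual_step(x, weight):
    """One residual block on the sphere: a learned map is projected onto the
    tangent space at x, then applied via the exponential map instead of the
    usual additive update x + f(x)."""
    f = np.tanh(weight @ x)          # toy learned map
    v = f - np.dot(f, x) * x         # project onto the tangent space at x
    return sphere_exp(x, v)

rng = np.random.default_rng(0)
x = rng.normal(size=8); x /= np.linalg.norm(x)   # point on the sphere S^7
W = rng.normal(scale=0.1, size=(8, 8))
x_next = riemannian_residual_step(x, W)
print(np.linalg.norm(x_next))  # stays on the sphere (norm ~ 1.0)
```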
arXiv Detail & Related papers (2023-10-16T02:12:32Z)
- LOGICSEG: Parsing Visual Semantics with Neural Logic Learning and Reasoning [73.98142349171552]
LOGICSEG is a holistic visual semantic parser that integrates neural inductive learning and logic reasoning with both rich data and symbolic knowledge.
During fuzzy logic-based continuous relaxation, logical formulae are grounded onto data and neural computational graphs, hence enabling logic-induced network training.
These designs together make LOGICSEG a general and compact neural-logic machine that is readily integrated into existing segmentation models.
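As a rough illustration of how a logical formula can be relaxed into a differentiable training signal (LOGICSEG defines its own relaxation and rule set; the Łukasiewicz relaxation and function names below are assumptions), consider a class-hierarchy rule such as "cat implies animal" applied to per-pixel predictions:

```python
import numpy as np

def lukasiewicz_implies(p, q):
    """Fuzzy truth value of p => q under the Lukasiewicz relaxation."""
    return np.minimum(1.0, 1.0 - p + q)

def hierarchy_rule_loss(child_prob, parent_prob):
    """Differentiable penalty for violating 'child class => parent class'
    (e.g. cat => animal), averaged over pixels: 1 - truth(child => parent)."""
    return np.mean(1.0 - lukasiewicz_implies(child_prob, parent_prob))

# Toy usage: per-pixel scores for a child class and its parent class.
cat = np.array([0.9, 0.2, 0.7])
animal_consistent = np.array([0.95, 0.6, 0.8])   # parent >= child: no violation
animal_violating = np.array([0.3, 0.6, 0.1])     # parent << child: penalized
print(hierarchy_rule_loss(cat, animal_consistent))  # ~0.0
print(hierarchy_rule_loss(cat, animal_violating))   # > 0
```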
arXiv Detail & Related papers (2023-09-24T05:43:19Z)
- Evaluating alignment between humans and neural network representations in image-based learning tasks [5.657101730705275]
We tested how well the representations of 86 pretrained neural network models mapped to human learning trajectories. We found that while training dataset size was a core determinant of alignment with human choices, contrastive training with multi-modal data (text and imagery) was a common feature of currently publicly available models that predicted human generalisation. In conclusion, pretrained neural networks can serve to extract representations for cognitive models, as they appear to capture some fundamental aspects of cognition that are transferable across tasks.
arXiv Detail & Related papers (2023-06-15T08:18:29Z)
- Language Knowledge-Assisted Representation Learning for Skeleton-Based Action Recognition [71.35205097460124]
How humans understand and recognize the actions of others is a complex neuroscientific problem.
LA-GCN is a graph convolutional network that uses knowledge assistance from large-scale language models (LLMs).
arXiv Detail & Related papers (2023-05-21T08:29:16Z)
- GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images [79.39247661907397]
We introduce Generalizable Model-based Neural Radiance Fields (GM-NeRF), an effective framework for synthesizing free-viewpoint images.
Specifically, we propose a geometry-guided attention mechanism to register the appearance code from multi-view 2D images to a geometry proxy.
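A generic way to realize such a mechanism is cross-attention in which geometry-proxy points act as queries over per-view appearance features, with the attention logits biased by a geometric prior such as visibility. The sketch below is a simplified NumPy illustration under those assumptions, not GM-NeRF's actual module.

```python
import numpy as np

def geometry_guided_attention(proxy_feats, view_feats, view_prior):
    """Cross-attention sketch: each geometry-proxy point (query) aggregates
    appearance features from multiple views (keys/values), with attention
    logits biased by a per-(point, view) geometric prior.
    proxy_feats: (P, d) queries; view_feats: (V, d) keys/values;
    view_prior:  (P, V) log-space geometric bias (e.g. log visibility)."""
    logits = proxy_feats @ view_feats.T / np.sqrt(proxy_feats.shape[1])
    logits = logits + view_prior                    # geometry guidance
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # softmax over views
    return weights @ view_feats                     # registered appearance code

rng = np.random.default_rng(0)
P, V, d = 4, 3, 16
appearance = geometry_guided_attention(
    rng.normal(size=(P, d)),                     # features at geometry-proxy points
    rng.normal(size=(V, d)),                     # per-view appearance features
    np.log(np.array([[0.9, 0.05, 0.05]] * P)),   # view 0 mostly visible
)
print(appearance.shape)  # (4, 16)
```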
arXiv Detail & Related papers (2023-03-24T03:32:02Z)
- pix2rule: End-to-end Neuro-symbolic Rule Learning [84.76439511271711]
This paper presents a complete neuro-symbolic method for processing images into objects, learning relations and logical rules.
The main contribution is a differentiable layer in a deep learning architecture from which symbolic relations and rules can be extracted.
We demonstrate that our model scales beyond state-of-the-art symbolic learners and outperforms deep relational neural network architectures.
arXiv Detail & Related papers (2021-06-14T15:19:06Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to addressing the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)