Unraveling the geometry of visual relational reasoning
- URL: http://arxiv.org/abs/2502.17382v1
- Date: Mon, 24 Feb 2025 18:07:54 GMT
- Title: Unraveling the geometry of visual relational reasoning
- Authors: Jiaqi Shang, Gabriel Kreiman, Haim Sompolinsky
- Abstract summary: Humans and other animals readily generalize abstract relations, such as recognizing constancy in shape or color, whereas neural networks struggle. Building on a geometric theory of neural representations, we identify representational geometries that predict generalization. Our findings offer geometric insights into how neural networks generalize abstract relations, paving the way for more human-like visual reasoning in AI.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Humans and other animals readily generalize abstract relations, such as recognizing constancy in shape or color, whereas neural networks struggle. To investigate how neural networks generalize abstract relations, we introduce SimplifiedRPM, a novel benchmark for systematic evaluation. In parallel, we conduct human experiments to benchmark relational difficulty, enabling direct model-human comparisons. Testing four architectures--ResNet-50, Vision Transformer, Wild Relation Network, and Scattering Compositional Learner (SCL)--we find that SCL best aligns with human behavior and generalizes best. Building on a geometric theory of neural representations, we identify representational geometries that predict generalization. Layer-wise analysis reveals distinct relational reasoning strategies across models and suggests a trade-off in which unseen rule representations are compressed into training-shaped subspaces. Guided by this geometric perspective, we propose and evaluate SNRloss, a novel objective that balances representation geometry. Our findings offer geometric insights into how neural networks generalize abstract relations, paving the way for more human-like visual reasoning in AI.
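The geometric theory invoked in the abstract relates generalization to the geometry of rule representations. As a minimal illustrative sketch (our own simplification, not the paper's exact formulation), a signal-to-noise ratio between two rule manifolds can be estimated from sampled feature vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical representations: samples of two abstract rules
# embedded in a 128-dimensional feature space.
rule_a = rng.normal(loc=0.0, scale=1.0, size=(200, 128))
rule_b = rng.normal(loc=0.5, scale=1.0, size=(200, 128))

def snr(x, y):
    """Signal-to-noise ratio between two representation clouds:
    squared distance between centroids over mean within-class variance."""
    signal = np.sum((x.mean(axis=0) - y.mean(axis=0)) ** 2)
    noise = 0.5 * (x.var(axis=0).sum() + y.var(axis=0).sum())
    return signal / noise

print(f"SNR(rule_a, rule_b) = {snr(rule_a, rule_b):.3f}")
```

A higher ratio means rule pairs are easier to discriminate; the paper's SNRloss is described as directly optimizing this kind of geometric quantity.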
Related papers
- Concept-Guided Interpretability via Neural Chunking [54.73787666584143]
We show that neural networks exhibit patterns in their raw population activity that mirror regularities in the training data. We propose three methods to extract these emerging entities, complementing each other based on label availability and dimensionality. Our work points to a new direction for interpretability, one that harnesses both cognitive principles and the structure of naturalistic data.
arXiv Detail & Related papers (2025-05-16T13:49:43Z) - Scalable Geometric Learning with Correlation-Based Functional Brain Networks [0.0]
The correlation matrix is a central representation of functional brain networks in neuroimaging. Traditional analyses often treat pairwise interactions independently in a Euclidean setting. This paper presents a novel geometric framework that embeds correlation matrices into a Euclidean space.
arXiv Detail & Related papers (2025-03-31T01:35:50Z) - Revealing Bias Formation in Deep Neural Networks Through the Geometric Mechanisms of Human Visual Decoupling [9.609083308026786]
Deep neural networks (DNNs) often exhibit biases toward certain categories during object recognition. We propose a geometric analysis framework linking the geometric complexity of class-specific perceptual manifolds to model bias. We present the Perceptual-Manifold-Geometry library, designed for calculating the geometric properties of perceptual manifolds.
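As a hedged illustration of the kind of geometric property such a library might compute (the participation ratio is one standard measure of effective manifold dimensionality; the function below is our own sketch, not the library's API):

```python
import numpy as np

def participation_ratio(points):
    """Effective dimensionality of a point cloud:
    (sum of covariance eigenvalues)^2 / (sum of squared eigenvalues)."""
    cov = np.cov(points, rowvar=False)
    eig = np.linalg.eigvalsh(cov)
    return eig.sum() ** 2 / (eig ** 2).sum()

rng = np.random.default_rng(0)
# A nearly 2-D manifold embedded in 50 dimensions vs. an isotropic cloud.
flat = rng.normal(size=(500, 2)) @ rng.normal(size=(2, 50))
iso = rng.normal(size=(500, 50))
print(participation_ratio(flat), participation_ratio(iso))
```

The flat cloud scores near 2 while the isotropic cloud scores near the ambient dimension, which is the sense in which geometric complexity can be linked to class representations.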
arXiv Detail & Related papers (2025-02-17T13:54:02Z) - Exploring Geometric Representational Alignment through Ollivier-Ricci Curvature and Ricci Flow [0.0]
We use Ollivier-Ricci curvature and Ricci flow as tools to study the alignment of representations between humans and artificial neural systems. As a proof-of-principle study, we compared the representations of face stimuli between VGG-Face, a human-aligned version of VGG-Face, and corresponding human similarity judgments from a large online study.
arXiv Detail & Related papers (2025-01-01T18:33:48Z) - Graph Neural Networks Uncover Geometric Neural Representations in Reinforcement-Based Motor Learning [3.379988469252273]
Graph Neural Networks (GNNs) can capture the geometric properties of neural representations in EEG data.
We study how reinforcement-based motor learning affects neural activity patterns during motor planning.
arXiv Detail & Related papers (2024-10-31T10:54:50Z) - Deep Model Merging: The Sister of Neural Network Interpretability -- A Survey [4.013324399289249]
We survey the model merging literature through the lens of loss landscape geometry, connecting empirical observations on model merging and loss landscape analysis to the phenomena that govern neural network training and the emergence of their inner representations.
We distill repeated empirical observations from the literature in these fields into descriptions of four major characteristics of loss landscape geometry: mode convexity, determinism, directedness, and connectivity.
arXiv Detail & Related papers (2024-10-16T18:14:05Z) - Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation [59.138470433237615]
We introduce statistical metrics that quantify both the linguistic and visual skew of a dataset for relational learning.
We show that systematically controlled metrics are strongly predictive of generalization performance.
This work informs an important direction: improving data quality through diversity and balance, complementary to scaling up absolute dataset size.
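One way to make the notion of skew concrete (an illustrative sketch of our own, not the metrics defined in the paper) is to score the imbalance of a dataset's relation distribution with normalized entropy:

```python
import math
from collections import Counter

def skew_score(relations):
    """Skew of a categorical distribution: 1 - normalized Shannon entropy.
    0.0 = perfectly balanced; values near 1.0 = highly skewed."""
    counts = Counter(relations)
    n = sum(counts.values())
    probs = [c / n for c in counts.values()]
    if len(probs) < 2:
        return 1.0  # a single category is maximally skewed
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(len(probs))

# Hypothetical relation labels for a relational-learning dataset.
balanced = ["left-of", "right-of", "above", "below"] * 25
skewed = ["left-of"] * 97 + ["above", "below", "right-of"]
print(skew_score(balanced), skew_score(skewed))
```

Under the paper's claim, datasets with higher skew on such metrics should predict worse relational generalization.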
arXiv Detail & Related papers (2024-03-25T03:18:39Z) - A Relational Inductive Bias for Dimensional Abstraction in Neural Networks [3.5063551678446494]
This paper investigates the impact of the relational bottleneck on the learning of factorized representations conducive to compositional coding.
We demonstrate that such a bottleneck not only improves generalization and learning efficiency, but also aligns network performance with human-like behavioral biases.
arXiv Detail & Related papers (2024-02-28T15:51:05Z) - Human-Like Geometric Abstraction in Large Pre-trained Neural Networks [6.650735854030166]
We revisit empirical results in cognitive science on geometric visual processing.
We identify three key biases in geometric visual processing.
We test tasks from the literature that probe these biases in humans and find that large pre-trained neural network models used in AI demonstrate more human-like abstract geometric processing.
arXiv Detail & Related papers (2024-02-06T17:59:46Z) - Neural Causal Abstractions [63.21695740637627]
We develop a new family of causal abstractions by clustering variables and their domains.
We show that such abstractions are learnable in practical settings through Neural Causal Models.
Our experiments support the theory and illustrate how to scale causal inferences to high-dimensional settings involving image data.
arXiv Detail & Related papers (2024-01-05T02:00:27Z) - Riemannian Residual Neural Networks [58.925132597945634]
We show how to extend the residual neural network (ResNet) to general Riemannian manifolds.
ResNets have become ubiquitous in machine learning due to their beneficial learning properties, excellent empirical results, and easy-to-incorporate nature when building varied neural networks.
arXiv Detail & Related papers (2023-10-16T02:12:32Z) - LOGICSEG: Parsing Visual Semantics with Neural Logic Learning and Reasoning [73.98142349171552]
LOGICSEG is a holistic visual semantic parser that integrates neural inductive learning and logic reasoning with both rich data and symbolic knowledge.
During fuzzy logic-based continuous relaxation, logical formulae are grounded onto data and neural computational graphs, hence enabling logic-induced network training.
These designs together make LOGICSEG a general and compact neural-logic machine that is readily integrated into existing segmentation models.
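The fuzzy relaxation step can be illustrated with standard product t-norm operators; this is a generic sketch of logic grounding (with hypothetical class names), not LOGICSEG's exact formulation:

```python
# Soft truth values in [0, 1] (e.g. predicted class probabilities).
def f_not(a): return 1.0 - a
def f_and(a, b): return a * b                  # product t-norm
def f_or(a, b): return a + b - a * b           # probabilistic sum
def f_implies(a, b): return f_or(f_not(a), b)  # a -> b == (not a) or b

# Ground the hypothetical rule "wheel -> vehicle_part" on one pixel's
# predictions; the rule's truth value is now a differentiable function.
p_wheel, p_vehicle_part = 0.9, 0.4
truth = f_implies(p_wheel, p_vehicle_part)
# (1 - truth) can serve as a logic-violation penalty during training.
print(round(truth, 3))
```

Because each connective is smooth in its arguments, gradients of the violation penalty flow back into the network's soft predictions, which is what makes logic-induced training possible.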
arXiv Detail & Related papers (2023-09-24T05:43:19Z) - A Cognitively-Inspired Neural Architecture for Visual Abstract Reasoning Using Contrastive Perceptual and Conceptual Processing [14.201935774784632]
We introduce CPCNet, a new neural architecture for solving visual abstract reasoning tasks, inspired by human cognition.
Our architecture models visual abstract reasoning as an iterative, self-contrasting learning process.
Experiments on the machine learning dataset RAVEN show that CPCNet achieves higher accuracy than all previously published models.
arXiv Detail & Related papers (2023-09-19T11:18:01Z) - Evaluating alignment between humans and neural network representations in image-based learning tasks [5.657101730705275]
We tested how well the representations of $86$ pretrained neural network models mapped to human learning trajectories. We found that while training dataset size was a core determinant of alignment with human choices, contrastive training with multi-modal data (text and imagery) was a common feature of currently publicly available models that predicted human generalisation. In conclusion, pretrained neural networks can serve to extract representations for cognitive models, as they appear to capture some fundamental aspects of cognition that are transferable across tasks.
arXiv Detail & Related papers (2023-06-15T08:18:29Z) - Language Knowledge-Assisted Representation Learning for Skeleton-Based Action Recognition [71.35205097460124]
How humans understand and recognize the actions of others is a complex neuroscientific problem.
LA-GCN is a graph convolution network that uses knowledge assistance from large-scale language models (LLMs).
arXiv Detail & Related papers (2023-05-21T08:29:16Z) - GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images [79.39247661907397]
We introduce an effective framework, Generalizable Model-based Neural Radiance Fields (GM-NeRF), to synthesize free-viewpoint images.
Specifically, we propose a geometry-guided attention mechanism to register the appearance code from multi-view 2D images to a geometry proxy.
arXiv Detail & Related papers (2023-03-24T03:32:02Z) - On Neural Architecture Inductive Biases for Relational Tasks [76.18938462270503]
We introduce a simple architecture based on similarity-distribution scores which we name Compositional Network generalization (CoRelNet).
We find that simple architectural choices can outperform existing models in out-of-distribution generalizations.
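A minimal sketch of the similarity-distribution idea (our illustrative reading, not the authors' exact architecture): pairwise similarities between object embeddings are normalized into distributions, so downstream processing sees only relations between objects, not absolute features:

```python
import numpy as np

rng = np.random.default_rng(1)
objects = rng.normal(size=(4, 16))  # 4 hypothetical object embeddings

# Pairwise inner-product similarities between objects.
sim = objects @ objects.T

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Each row becomes a distribution over "how similar is object i to
# every object", a relational code invariant to the raw feature scale.
rel = softmax(sim)
print(rel.shape)
```

Restricting the downstream network to this relational code is the inductive bias that the entry credits for improved out-of-distribution generalization.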
arXiv Detail & Related papers (2022-06-09T16:24:01Z) - pix2rule: End-to-end Neuro-symbolic Rule Learning [84.76439511271711]
This paper presents a complete neuro-symbolic method for processing images into objects, learning relations and logical rules.
The main contribution is a differentiable layer in a deep learning architecture from which symbolic relations and rules can be extracted.
We demonstrate that our model scales beyond state-of-the-art symbolic learners and outperforms deep relational neural network architectures.
arXiv Detail & Related papers (2021-06-14T15:19:06Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z) - Proactive Pseudo-Intervention: Causally Informed Contrastive Learning For Interpretable Vision Models [103.64435911083432]
We present a novel contrastive learning strategy called Proactive Pseudo-Intervention (PPI).
PPI leverages proactive interventions to guard against image features with no causal relevance.
We also devise a novel causally informed salience mapping module to identify key image pixels to intervene, and show it greatly facilitates model interpretability.
arXiv Detail & Related papers (2020-12-06T20:30:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.