Understanding Cross-Model Perceptual Invariances Through Ensemble Metamers
- URL: http://arxiv.org/abs/2504.01739v2
- Date: Fri, 04 Apr 2025 09:10:25 GMT
- Title: Understanding Cross-Model Perceptual Invariances Through Ensemble Metamers
- Authors: Lukas Boehm, Jonas Leo Mueller, Christoffer Loeffler, Leo Schwinn, Bjoern Eskofier, Dario Zanca
- Abstract summary: We introduce a novel approach to metamer generation by leveraging ensembles of artificial neural networks. We employ a suite of image-based metrics that assess factors such as semantic fidelity and naturalness. Our findings show that convolutional neural networks generate more recognizable and human-like metamers, while vision transformers produce realistic but less transferable metamers.
- Score: 2.9687381456164004
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding the perceptual invariances of artificial neural networks is essential for improving explainability and aligning models with human vision. Metamers - stimuli that are physically distinct yet produce identical neural activations - serve as a valuable tool for investigating these invariances. We introduce a novel approach to metamer generation by leveraging ensembles of artificial neural networks, capturing shared representational subspaces across diverse architectures, including convolutional neural networks and vision transformers. To characterize the properties of the generated metamers, we employ a suite of image-based metrics that assess factors such as semantic fidelity and naturalness. Our findings show that convolutional neural networks generate more recognizable and human-like metamers, while vision transformers produce realistic but less transferable metamers, highlighting the impact of architectural biases on representational invariances.
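The core recipe is activation matching under a joint objective: start from noise and optimize the pixels so that every model in the ensemble responds to the synthetic image as it does to a reference image. Below is a minimal sketch of that idea, assuming torchvision models and matching output activations; the paper's layer choices, ensemble composition, and optimization details may differ.

```python
# Minimal sketch of ensemble metamer generation: optimize an image so that
# its activations match those of a reference image across several models.
# Model choices, matched layer, and hyperparameters are illustrative assumptions.
import torch
import torchvision.models as tvm

device = "cuda" if torch.cuda.is_available() else "cpu"

# An ensemble mixing a CNN and a vision transformer, as in the paper's setup.
ensemble = [
    tvm.resnet50(weights=tvm.ResNet50_Weights.DEFAULT),
    tvm.vit_b_16(weights=tvm.ViT_B_16_Weights.DEFAULT),
]
for m in ensemble:
    m.eval().to(device)
    for p in m.parameters():
        p.requires_grad_(False)

def features(model, x):
    # For simplicity we match the model's output; the paper targets
    # internal activations.
    return model(x)

reference = torch.rand(1, 3, 224, 224, device=device)  # stand-in reference image
metamer = torch.rand(1, 3, 224, 224, device=device, requires_grad=True)
opt = torch.optim.Adam([metamer], lr=0.05)

for step in range(500):
    opt.zero_grad()
    # Sum of activation-matching losses over the ensemble: the metamer must
    # reproduce the reference's representation in every model at once.
    loss = sum(
        torch.nn.functional.mse_loss(features(m, metamer), features(m, reference))
        for m in ensemble
    )
    loss.backward()
    opt.step()
    metamer.data.clamp_(0, 1)  # keep the image in a valid pixel range
```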
Related papers
- Improving VisNet for Object Recognition [0.40048696135519796]
This study investigates a biologically inspired neural network model and several enhanced variants incorporating radial-basis-function neurons, Mahalanobis-distance-based learning, and retina-like preprocessing for both general object recognition and symmetry classification. Experimental results across multiple datasets, including MNIST, CIFAR10, and custom symmetric object sets, show that these enhanced VisNet variants substantially improve recognition accuracy compared with the baseline model. These findings underscore the adaptability and biological relevance of VisNet-inspired architectures, offering a powerful and interpretable framework for visual recognition in both neuroscience and artificial intelligence.
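As a rough illustration of the kind of unit the summary describes, here is a hedged sketch of a radial-basis-function layer whose distance is Mahalanobis rather than Euclidean; the class name, parameterization, and sizes are assumptions, not the paper's implementation.

```python
# Hypothetical RBF unit with a learned per-unit Mahalanobis metric.
import torch
import torch.nn as nn

class MahalanobisRBF(nn.Module):
    def __init__(self, in_dim: int, n_units: int):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(n_units, in_dim))
        # One learned factor L per unit so the precision matrix L @ L.T
        # stays positive semi-definite.
        self.L = nn.Parameter(torch.eye(in_dim).repeat(n_units, 1, 1))

    def forward(self, x):                        # x: (batch, in_dim)
        diff = x.unsqueeze(1) - self.centers     # (batch, n_units, in_dim)
        z = torch.einsum("bui,uij->buj", diff, self.L)
        dist2 = (z ** 2).sum(-1)                 # squared Mahalanobis distance
        return torch.exp(-dist2)                 # Gaussian RBF response

rbf = MahalanobisRBF(in_dim=64, n_units=10)
out = rbf(torch.randn(8, 64))                    # (8, 10) unit responses
```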
arXiv Detail & Related papers (2025-11-12T02:15:02Z)
- Model Metamers Reveal Invariances in Graph Neural Networks [8.901234530419387]
In recent years, deep neural networks have been extensively employed in perceptual systems to learn representations endowed with invariances. These networks aim to emulate the invariance mechanisms observed in the human brain. However, studies in the visual and auditory domains have confirmed that significant gaps remain between the invariance properties of artificial neural networks and those of humans.
arXiv Detail & Related papers (2025-10-20T10:13:55Z)
- Concept-Guided Interpretability via Neural Chunking [54.73787666584143]
We show that neural networks exhibit patterns in their raw population activity that mirror regularities in the training data. We propose three methods to extract these emerging entities, complementing each other based on label availability and dimensionality. Our work points to a new direction for interpretability, one that harnesses both cognitive principles and the structure of naturalistic data.
arXiv Detail & Related papers (2025-05-16T13:49:43Z)
- MAME: Multidimensional Adaptive Metamer Exploration with Human Perceptual Feedback [1.1317941257922182]
A widely adopted approach to explore functional alignment is to identify metamers for both humans and models.
We propose the Multidimensional Adaptive Metamer Exploration framework, enabling direct high-dimensional exploration of human metameric space.
Our framework has the potential to contribute to developing interpretable AI and understanding of brain function in neuroscience.
arXiv Detail & Related papers (2025-03-17T14:23:04Z)
- Discovering Chunks in Neural Embeddings for Interpretability [53.80157905839065]
We propose leveraging the principle of chunking to interpret artificial neural population activities. We first demonstrate this concept in recurrent neural networks (RNNs) trained on artificial sequences with imposed regularities. We identify similar recurring embedding states corresponding to concepts in the input, with perturbations to these states activating or inhibiting the associated concepts.
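One simple way to operationalize "recurring embedding states" is to cluster the hidden states an RNN visits and read the cluster labels over time as candidate chunks. The sketch below assumes a GRU and k-means; the paper's extraction methods may differ.

```python
# Hedged sketch: collect the hidden states of an RNN run over a regular
# sequence and cluster them, treating recurring clusters as candidate chunks.
import torch
from sklearn.cluster import KMeans

rnn = torch.nn.GRU(input_size=8, hidden_size=32, batch_first=True)
seq = torch.randn(1, 200, 8)   # stand-in for a sequence with imposed regularities

with torch.no_grad():
    states, _ = rnn(seq)       # (1, 200, 32): one hidden state per time step

# Recurring states land in the same cluster; the label sequence over time
# then segments the input into repeated chunks.
labels = KMeans(n_clusters=5, n_init=10).fit_predict(states.squeeze(0).numpy())
print(labels[:20])
```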
arXiv Detail & Related papers (2025-02-03T20:30:46Z)
- Inverting Transformer-based Vision Models [0.8124699127636158]
We apply a modular approach of training inverse models to reconstruct input images from intermediate layers within a Detection Transformer and a Vision Transformer. Our analysis illustrates how these properties emerge within the models, contributing to a deeper understanding of transformer-based vision models.
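The modular inversion setup can be sketched as: freeze the vision model, read out an intermediate representation, and fit a lightweight decoder that maps it back to pixels. The example below assumes a torchvision ViT-B/16 with a per-token linear decoder; the paper's inverse models are more elaborate.

```python
# Minimal sketch of training an inverse model on frozen intermediate features.
import torch
import torch.nn as nn
import torchvision.models as tvm
from torchvision.models.feature_extraction import create_feature_extractor

vit = tvm.vit_b_16(weights=tvm.ViT_B_16_Weights.DEFAULT).eval()
extractor = create_feature_extractor(vit, return_nodes={"encoder.ln": "feat"})
for p in extractor.parameters():
    p.requires_grad_(False)

decoder = nn.Linear(768, 3 * 16 * 16)       # one 16x16 pixel patch per token
opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)

x = torch.rand(4, 3, 224, 224)              # stand-in image batch
feats = extractor(x)["feat"][:, 1:]         # (4, 196, 768), class token dropped
patches = decoder(feats)                    # (4, 196, 768) predicted patches
recon = (patches.view(4, 14, 14, 3, 16, 16) # reassemble the 14x14 patch grid
                .permute(0, 3, 1, 4, 2, 5)
                .reshape(4, 3, 224, 224))
loss = nn.functional.mse_loss(recon, x)     # train the decoder to invert features
loss.backward()
opt.step()
```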
arXiv Detail & Related papers (2024-12-09T14:43:06Z)
- Artificial Kuramoto Oscillatory Neurons [65.16453738828672]
It has long been known in both neuroscience and AI that "binding" between neurons leads to a form of competitive learning.
We introduce Artificial Kuramoto Oscillatory Neurons, which can be combined with arbitrary connectivity designs such as fully connected, convolutional, or attentive mechanisms.
We show that this idea provides performance improvements across a wide spectrum of tasks such as unsupervised object discovery, adversarial robustness, uncertainty, and reasoning.
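The dynamics behind such oscillatory units follow the classic Kuramoto model, in which coupled phases synchronize over time. The following is a hedged sketch of that update rule, not the paper's exact AKOrN formulation; the coupling matrix stands in for the fully connected, convolutional, or attentive connectivity mentioned above.

```python
# Classic Kuramoto update: coupled oscillators synchronize ("bind") over time.
import torch

def kuramoto_step(theta, omega, coupling, dt=0.1):
    # theta: (N,) phases; omega: (N,) natural frequencies;
    # coupling: (N, N) connectivity between units.
    phase_diff = theta.unsqueeze(0) - theta.unsqueeze(1)   # theta_j - theta_i
    interaction = (coupling * torch.sin(phase_diff)).sum(dim=1)
    return theta + dt * (omega + interaction)

N = 16
theta = 2 * torch.pi * torch.rand(N)
omega = torch.randn(N)
K = torch.ones(N, N) / N                 # uniform all-to-all coupling
for _ in range(200):
    theta = kuramoto_step(theta, omega, K)

# Synchrony (order parameter r): r approaches 1 as phases bind together.
r = torch.exp(1j * theta.to(torch.cfloat)).mean().abs()
print(f"order parameter r = {r.item():.3f}")
```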
arXiv Detail & Related papers (2024-10-17T17:47:54Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
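The underlying encoding can be pictured as follows: neurons become graph nodes (carrying, say, their biases) and weights become edge features, so that a GNN can process networks of different shapes uniformly. The sketch below assumes a small MLP and a PyG-style edge-index layout; the paper's exact graph construction differs in its details.

```python
# Hedged sketch: turn an MLP's parameters into a computational graph.
import torch
import torch.nn as nn

mlp = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

nodes, edges, edge_feats = [], [], []
offset = 0
for layer in [m for m in mlp if isinstance(m, nn.Linear)]:
    n_in, n_out = layer.in_features, layer.out_features
    if not nodes:
        nodes.extend(torch.zeros(n_in).tolist())           # input neurons (no bias)
    nodes.extend(layer.bias.tolist())                      # node feature = bias
    for j in range(n_out):
        for i in range(n_in):
            edges.append((offset + i, offset + n_in + j))  # edge neuron i -> j
            edge_feats.append(layer.weight[j, i].item())   # edge feature = weight
    offset += n_in

node_feats = torch.tensor(nodes)           # (num_neurons,)
edge_index = torch.tensor(edges).T         # (2, num_edges), PyG-style layout
edge_attr = torch.tensor(edge_feats)       # (num_edges,)
print(node_feats.shape, edge_index.shape)  # graph ready for a GNN encoder
```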
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Unsupervised Learning of Invariance Transformations [105.54048699217668]
We develop an algorithmic framework for finding approximate graph automorphisms.
We discuss how this framework can be used to find approximate automorphisms in weighted graphs in general.
arXiv Detail & Related papers (2023-07-24T17:03:28Z)
- Connecting metrics for shape-texture knowledge in computer vision [1.7785095623975342]
Deep neural networks remain brittle, misclassifying images under many changes that do not cause humans to misclassify them.
Part of this different behavior may be explained by the type of features humans and deep neural networks use in vision tasks.
arXiv Detail & Related papers (2023-01-25T14:37:42Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Drop, Swap, and Generate: A Self-Supervised Approach for Generating Neural Activity [33.06823702945747]
We introduce a novel unsupervised approach for learning disentangled representations of neural activity called Swap-VAE.
Our approach combines a generative modeling framework with an instance-specific alignment loss.
We show that it is possible to build representations that disentangle neural datasets along relevant latent dimensions linked to behavior.
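The swap operation at the heart of this approach can be sketched as: split each latent code into a content part and a style part, exchange the content parts between two views of the same trial, and penalize content codes that disagree. The encoder, decoder, and split sizes below are assumptions for illustration, not the paper's exact networks.

```python
# Hedged sketch of the swap-and-align idea behind Swap-VAE.
import torch
import torch.nn as nn

enc = nn.Sequential(nn.Linear(100, 64), nn.ReLU(), nn.Linear(64, 32))
dec = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 100))
C = 16                                       # content dims; the rest is style

x1 = torch.randn(8, 100)                     # two augmented views of the same
x2 = x1 + 0.1 * torch.randn(8, 100)          # neural activity (stand-in data)

z1, z2 = enc(x1), enc(x2)
# Swap the content halves between the two views before decoding.
z1_swap = torch.cat([z2[:, :C], z1[:, C:]], dim=1)
z2_swap = torch.cat([z1[:, :C], z2[:, C:]], dim=1)

recon = nn.functional.mse_loss(dec(z1_swap), x1) + \
        nn.functional.mse_loss(dec(z2_swap), x2)
align = nn.functional.mse_loss(z1[:, :C], z2[:, :C])  # instance-specific alignment
loss = recon + align
loss.backward()
```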
arXiv Detail & Related papers (2021-11-03T16:39:43Z)
- Discrete-Valued Neural Communication [85.3675647398994]
We show that restricting the transmitted information among components to discrete representations is a beneficial bottleneck.
Even though individuals have different understandings of what a "cat" is based on their specific experiences, the shared discrete token makes it possible for communication among individuals to be unimpeded by individual differences in internal representation.
We extend the quantization mechanism from the Vector-Quantized Variational Autoencoder to multi-headed discretization with shared codebooks and use it for discrete-valued neural communication.
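A minimal sketch of multi-headed discretization with a shared codebook, under assumed sizes: the vector is split into heads, each head is snapped to its nearest codebook entry, and a straight-through estimator keeps the operation differentiable.

```python
# Hedged sketch: multi-headed vector quantization with one shared codebook.
import torch

def discretize(x, codebook, n_heads):
    # x: (batch, dim); codebook: (n_codes, dim // n_heads), shared by all heads.
    b, d = x.shape
    heads = x.view(b * n_heads, d // n_heads)     # split the vector into heads
    dists = torch.cdist(heads, codebook)          # distance of each head to codes
    idx = dists.argmin(dim=1)                     # nearest code per head
    quantized = codebook[idx].view(b, d)          # reassemble the full vector
    # Straight-through estimator so gradients flow through the quantization.
    return x + (quantized - x).detach(), idx.view(b, n_heads)

codebook = torch.randn(32, 16)              # 32 shared codes of size 16
x = torch.randn(4, 64, requires_grad=True)  # 4 vectors, 4 heads of 16 dims each
q, tokens = discretize(x, codebook, n_heads=4)
print(tokens)                               # discrete messages sent downstream
```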
arXiv Detail & Related papers (2021-07-06T03:09:25Z)