Related papers: Deciphering 'What' and 'Where' Visual Pathways from Spectral Clustering of Layer-Distributed Neural Representations

Deciphering 'What' and 'Where' Visual Pathways from Spectral Clustering of Layer-Distributed Neural Representations

URL: http://arxiv.org/abs/2312.06716v2
Date: Thu, 20 Jun 2024 15:57:36 GMT
Title: Deciphering 'What' and 'Where' Visual Pathways from Spectral Clustering of Layer-Distributed Neural Representations
Authors: Xiao Zhang, David Yunis, Michael Maire,
Abstract summary: We present an approach for analyzing grouping information contained within a neural network's activations. We exploit features from all layers and obviating the need to guess which part of the model contains relevant information.
Score: 15.59251297818324
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present an approach for analyzing grouping information contained within a neural network's activations, permitting extraction of spatial layout and semantic segmentation from the behavior of large pre-trained vision models. Unlike prior work, our method conducts a holistic analysis of a network's activation state, leveraging features from all layers and obviating the need to guess which part of the model contains relevant information. Motivated by classic spectral clustering, we formulate this analysis in terms of an optimization objective involving a set of affinity matrices, each formed by comparing features within a different layer. Solving this optimization problem using gradient descent allows our technique to scale from single images to dataset-level analysis, including, in the latter, both intra- and inter-image relationships. Analyzing a pre-trained generative transformer provides insight into the computational strategy learned by such models. Equating affinity with key-query similarity across attention layers yields eigenvectors encoding scene spatial layout, whereas defining affinity by value vector similarity yields eigenvectors encoding object identity. This result suggests that key and query vectors coordinate attentional information flow according to spatial proximity (a `where' pathway), while value vectors refine a semantic category representation (a `what' pathway).

Related papers

Navigating the Latent Space Dynamics of Neural Models [35.39092540369244]
We present an alternative interpretation of neural models as dynamical systems acting on the latent manifold.<n>We show that autoencoder models implicitly define a latent vector field on the manifold, derived by iteratively applying the encoding-decoding map.<n>We propose to leverage the vector field as a representation for the network, providing a novel tool to analyze the properties of the model and the data.
arXiv Detail & Related papers (2025-05-28T18:57:41Z)
Neural Clustering based Visual Representation Learning [61.72646814537163]
Clustering is one of the most classic approaches in machine learning and data analysis. We propose feature extraction with clustering (FEC), which views feature extraction as a process of selecting representatives from data. FEC alternates between grouping pixels into individual clusters to abstract representatives and updating the deep features of pixels with current representatives.
arXiv Detail & Related papers (2024-03-26T06:04:50Z)
Self-supervised Learning of Contextualized Local Visual Embeddings [0.0]
Contextualized Local Visual Embeddings (CLoVE) is a self-supervised convolutional-based method that learns representations suited for dense prediction tasks. We benchmark CLoVE's pre-trained representations on multiple datasets. CLoVE reaches state-of-the-art performance for CNN-based architectures in 4 dense prediction downstream tasks.
arXiv Detail & Related papers (2023-10-01T00:13:06Z)
Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner. We design a semantic-guided self-supervised learning model to extract high-level semantic features from images. We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
DeepCut: Unsupervised Segmentation using Graph Neural Networks Clustering [6.447863458841379]
This study introduces a lightweight Graph Neural Network (GNN) to replace classical clustering methods. Unlike existing methods, our GNN takes both the pair-wise affinities between local image features and the raw features as input. We demonstrate how classical clustering objectives can be formulated as self-supervised loss functions for training an image segmentation GNN.
arXiv Detail & Related papers (2022-12-12T12:31:46Z)
The SVD of Convolutional Weights: A CNN Interpretability Framework [3.5783190448496343]
We propose a framework against which interpretability methods might be applied using hypergraphs to model class separation. Rather than looking to the activations to explain the network, we use the singular vectors with the greatest corresponding singular values for each linear layer to identify those features most important to the network.
arXiv Detail & Related papers (2022-08-14T18:23:02Z)
flow-based clustering and spectral clustering: a comparison [0.688204255655161]
We study a novel graph clustering method for data with an intrinsic network structure. We exploit an intrinsic network structure of data to construct Euclidean feature vectors. Our results indicate that our clustering methods can cope with certain graph structures.
arXiv Detail & Related papers (2022-06-20T21:49:52Z)
SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation [94.11915008006483]
We propose SemAffiNet for point cloud semantic segmentation. We conduct extensive experiments on the ScanNetV2 and NYUv2 datasets.
arXiv Detail & Related papers (2022-05-26T17:00:23Z)
Deep Equilibrium Assisted Block Sparse Coding of Inter-dependent Signals: Application to Hyperspectral Imaging [71.57324258813675]
A dataset of inter-dependent signals is defined as a matrix whose columns demonstrate strong dependencies. A neural network is employed to act as structure prior and reveal the underlying signal interdependencies. Deep unrolling and Deep equilibrium based algorithms are developed, forming highly interpretable and concise deep-learning-based architectures.
arXiv Detail & Related papers (2022-03-29T21:00:39Z)
Towards Efficient Scene Understanding via Squeeze Reasoning [71.1139549949694]
We propose a novel framework called Squeeze Reasoning. Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector. We show that our approach can be modularized as an end-to-end trained block and can be easily plugged into existing networks.
arXiv Detail & Related papers (2020-11-06T12:17:01Z)
MetaSDF: Meta-learning Signed Distance Functions [85.81290552559817]
Generalizing across shapes with neural implicit representations amounts to learning priors over the respective function space. We formalize learning of a shape space as a meta-learning problem and leverage gradient-based meta-learning algorithms to solve this task.
arXiv Detail & Related papers (2020-06-17T05:14:53Z)
Similarity of Neural Networks with Gradients [8.804507286438781]
We propose to leverage both feature vectors and gradient ones into designing the representation of a neural network. We show that the proposed approach provides a state-of-the-art method for computing similarity of neural networks.
arXiv Detail & Related papers (2020-03-25T17:04:10Z)

This list is automatically generated from the titles and abstracts of the papers in this site.