Related papers: Training objective drives the consistency of representational similarity across datasets

URL: http://arxiv.org/abs/2411.05561v1
Date: Fri, 08 Nov 2024 13:35:45 GMT
Title: Training objective drives the consistency of representational similarity across datasets
Authors: Laure Ciernik, Lorenz Linhardt, Marco Morik, Jonas Dippel, Simon Kornblith, Lukas Muttenthaler
Abstract summary: The Platonic Representation Hypothesis claims that recent foundation models are converging to a shared representation space as a function of their downstream task performance. Here, we propose a systematic way to measure how representational similarity between models varies with the set of stimuli used to construct the representations. We find that the objective function is the most crucial factor in determining the consistency of representational similarities across datasets.
Score: 19.99817888941361
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The Platonic Representation Hypothesis claims that recent foundation models are converging to a shared representation space as a function of their downstream task performance, irrespective of the objectives and data modalities used to train these models. Representational similarity is generally measured for individual datasets and is not necessarily consistent across datasets. Thus, one may wonder whether this convergence of model representations is confounded by the datasets commonly used in machine learning. Here, we propose a systematic way to measure how representational similarity between models varies with the set of stimuli used to construct the representations. We find that the objective function is the most crucial factor in determining the consistency of representational similarities across datasets. Specifically, self-supervised vision models learn representations whose relative pairwise similarities generalize better from one dataset to another compared to those of image classification or image-text models. Moreover, the correspondence between representational similarities and the models' task behavior is dataset-dependent, being most strongly pronounced for single-domain datasets. Our work provides a framework for systematically measuring similarities of model representations across datasets and linking those similarities to differences in task behavior.
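The measurement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes linear CKA as the similarity measure, Pearson correlation as the consistency score, and synthetic "models" built from random linear projections. The names `linear_cka`, `pairwise_similarities`, `similarity_consistency`, and `embed` are hypothetical and chosen only for this example.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two (n_stimuli, n_features) representation matrices."""
    # Center each feature so the measure ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(x.T @ y, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)


def pairwise_similarities(reps):
    """Similarity for every model pair; `reps` holds one matrix per model,
    all computed on the same stimuli in the same order."""
    sims = []
    for i in range(len(reps)):
        for j in range(i + 1, len(reps)):
            sims.append(linear_cka(reps[i], reps[j]))
    return np.array(sims)


def similarity_consistency(reps_a, reps_b):
    """Pearson correlation between the pairwise model similarities measured
    on dataset A and those measured on dataset B (an assumed consistency score)."""
    sims_a = pairwise_similarities(reps_a)
    sims_b = pairwise_similarities(reps_b)
    return np.corrcoef(sims_a, sims_b)[0, 1]


# Synthetic stand-ins for models: each "model" is a fixed random linear map.
rng = np.random.default_rng(0)
projections = [rng.normal(size=(16, 32)) for _ in range(4)]


def embed(stimuli):
    """Return one representation matrix per synthetic model."""
    return [stimuli @ p for p in projections]


# Two synthetic "datasets" with different feature statistics.
dataset_a = rng.normal(size=(200, 16))
dataset_b = rng.normal(size=(200, 16)) * np.linspace(0.1, 2.0, 16)

print(similarity_consistency(embed(dataset_a), embed(dataset_b)))
```

A score near 1 would mean that model pairs that look similar on one dataset also look similar on the other; a low score would mean the similarity structure depends on the stimuli used.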

Related papers

Connecting Neural Models Latent Geometries with Relative Geodesic Representations [21.71782603770616]
We show that when a latent structure is shared between distinct latent spaces, relative distances between representations can be preserved, up to distortions. We assume that distinct neural models parametrize approximately the same underlying manifold, and introduce a representation based on the pullback metric. We validate our method on model stitching and retrieval tasks, covering autoencoders and vision foundation discriminative models.
arXiv Detail & Related papers (2025-06-02T12:34:55Z)
Investigating Fine- and Coarse-grained Structural Correspondences Between Deep Neural Networks and Human Object Image Similarity Judgments Using Unsupervised Alignment [0.14999444543328289]
We employ an unsupervised alignment method based on Gromov-Wasserstein Optimal Transport to compare human and model object representations. We find that models trained with CLIP consistently achieve strong fine- and coarse-grained matching with human object representations. Our results offer new insights into the role of linguistic information in acquiring precise object representations.
arXiv Detail & Related papers (2025-05-22T09:06:06Z)
Exploring Causes of Representational Similarity in Machine Learning Models [5.976564391252357]
This work explores how two possible causal factors -- dataset overlap and task overlap -- influence downstream model similarity. The exploration of dataset overlap is motivated by the reality that large-scale generative AI models are often trained on overlapping datasets of scraped internet data. We find that both positively correlate with higher representational similarity and that combining them provides the strongest effect.
arXiv Detail & Related papers (2025-05-20T04:03:11Z)
Generalized Visual Relation Detection with Diffusion Models [94.62313788626128]
Visual relation detection (VRD) aims to identify relationships (or interactions) between object pairs in an image. We propose to model visual relations as continuous embeddings, and design diffusion models to achieve generalized VRD in a conditional generative manner. Our Diff-VRD is able to generate visual relations beyond the pre-defined category labels of datasets.
arXiv Detail & Related papers (2025-04-16T14:03:24Z)
Enhancing Model Fairness and Accuracy with Similarity Networks: A Methodological Approach [0.20718016474717196]
We use different techniques to map instances into a similarity feature space. Our method's ability to adjust the resolution of pairwise similarity provides clear insights into the relationship between the dataset classification complexity and model fairness.
arXiv Detail & Related papers (2024-11-08T15:43:01Z)
Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization. We introduce a benchmark comprising eight different synthetic and real-world datasets. We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z)
Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach [25.927323251675386]
We leverage the meta-features associated with each entity as a source of worldly knowledge and employ entity representations from the models. We propose using the consistency between these representations and the meta-features as a metric for evaluating pre-trained models. Our method's effectiveness is demonstrated across various domains, including models with relational datasets, large language models and image models.
arXiv Detail & Related papers (2024-01-02T17:08:26Z)
Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models. In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
arXiv Detail & Related papers (2023-04-04T17:54:32Z)
Through a fair looking-glass: mitigating bias in image datasets [1.0323063834827415]
We present a fast and effective model to de-bias an image dataset through reconstruction and minimizing the statistical dependence between intended variables. We evaluate our proposed model on the CelebA dataset, compare the results with a state-of-the-art de-biasing method, and show that the model achieves a promising fairness-accuracy combination.
arXiv Detail & Related papers (2022-09-18T20:28:36Z)
ACTIVE: Augmentation-Free Graph Contrastive Learning for Partial Multi-View Clustering [52.491074276133325]
We propose an augmentation-free graph contrastive learning framework to solve the problem of partial multi-view clustering. The proposed approach elevates instance-level contrastive learning and missing data inference to the cluster-level, effectively mitigating the impact of individual missing data on clustering.
arXiv Detail & Related papers (2022-03-01T02:32:25Z)
IMACS: Image Model Attribution Comparison Summaries [16.80986701058596]
We introduce IMACS, a method that combines gradient-based model attributions with aggregation and visualization techniques. IMACS extracts salient input features from an evaluation dataset, clusters them based on similarity, then visualizes differences in model attributions for similar input features. We show how our technique can uncover behavioral differences caused by domain shift between two models trained on satellite images.
arXiv Detail & Related papers (2022-01-26T21:35:14Z)
Image Synthesis via Semantic Composition [74.68191130898805]
We present a novel approach to synthesize realistic images based on their semantic layouts. It hypothesizes that objects with similar appearance share similar representations. Our method establishes dependencies between regions according to their appearance correlation, yielding both spatially variant and associated representations.
arXiv Detail & Related papers (2021-09-15T02:26:07Z)
Unified Graph Structured Models for Video Understanding [93.72081456202672]
We propose a message passing graph neural network that explicitly models relational-temporal relations. We show how our method is able to more effectively model relationships between relevant entities in the scene.
arXiv Detail & Related papers (2021-03-29T14:37:35Z)
Few-shot Visual Reasoning with Meta-analogical Contrastive Learning [141.2562447971]
We propose to solve a few-shot (or low-shot) visual reasoning problem by resorting to analogical reasoning. We extract structural relationships between elements in both domains, and enforce them to be as similar as possible with analogical learning. We validate our method on the RAVEN dataset, on which it outperforms the state-of-the-art method, with larger gains when the training data is scarce.
arXiv Detail & Related papers (2020-07-23T14:00:34Z)
Similarity of Neural Networks with Gradients [8.804507286438781]
We propose to leverage both feature vectors and gradients in designing the representation of a neural network. We show that the proposed approach provides a state-of-the-art method for computing similarity of neural networks.
arXiv Detail & Related papers (2020-03-25T17:04:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.