Reliability of CKA as a Similarity Measure in Deep Learning
- URL: http://arxiv.org/abs/2210.16156v1
- Date: Fri, 28 Oct 2022 14:32:52 GMT
- Title: Reliability of CKA as a Similarity Measure in Deep Learning
- Authors: MohammadReza Davari, Stefan Horoi, Amine Natik, Guillaume Lajoie, Guy Wolf, Eugene Belilovsky
- Abstract summary: We present an analysis that characterizes CKA sensitivity to a large class of simple transformations.
We investigate several weaknesses of the CKA similarity metric, demonstrating situations in which it gives unexpected or counter-intuitive results.
Our results illustrate that, in many cases, the CKA value can be easily manipulated without substantial changes to the functional behaviour of the models.
- Score: 17.555458413538233
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Comparing learned neural representations in neural networks is a challenging
but important problem, which has been approached in different ways. The
Centered Kernel Alignment (CKA) similarity metric, particularly its linear
variant, has recently become a popular approach and has been widely used to
compare representations of a network's different layers, of architecturally
similar networks trained differently, or of models with different architectures
trained on the same data. A wide variety of conclusions about similarity and
dissimilarity of these various representations have been made using CKA. In
this work we present an analysis that formally characterizes CKA sensitivity to a
large class of simple transformations, which can naturally occur in the context
of modern machine learning. This provides a concrete explanation of CKA
sensitivity to outliers, which has been observed in past works, and to
transformations that preserve the linear separability of the data, an important
generalization attribute. We empirically investigate several weaknesses of the
CKA similarity metric, demonstrating situations in which it gives unexpected or
counter-intuitive results. Finally we study approaches for modifying
representations to maintain functional behaviour while changing the CKA value.
Our results illustrate that, in many cases, the CKA value can be easily
manipulated without substantial changes to the functional behaviour of the
models, and call for caution when leveraging activation alignment metrics.
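For concreteness, the linear CKA variant discussed above is CKA(X, Y) = ||Y^T X||_F^2 / (||X^T X||_F ||Y^T Y||_F) for activation matrices X and Y whose columns have been mean-centered (Kornblith et al., 2019). The following is a minimal numpy sketch, with a toy illustration (random data, not from the paper) of the outlier sensitivity the abstract refers to:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between activation matrices X (n, p1) and Y (n, p2),
    whose rows index the same n examples."""
    X = X - X.mean(axis=0, keepdims=True)  # center each feature dimension
    Y = Y - Y.mean(axis=0, keepdims=True)
    num = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    den = (np.linalg.norm(X.T @ X, ord="fro")
           * np.linalg.norm(Y.T @ Y, ord="fro"))
    return num / den

# Toy demo of outlier sensitivity: translating a single example far from
# the rest leaves the other 99 representations untouched but sharply
# changes the similarity score.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 64))
Y = X + 0.01 * rng.normal(size=(100, 64))
print(linear_cka(X, Y))   # ~1.0 for near-identical representations

Y[0] += 100.0             # move one example far away
print(linear_cka(X, Y))   # drops substantially although 99/100 rows are unchanged
```

This mirrors the abstract's point that changes confined to a small subset of points (outliers) can dominate the CKA value while leaving most of the representation, and hence the model's functional behaviour, essentially intact.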
Related papers
- Differentiable Optimization of Similarity Scores Between Models and Brains [1.5391321019692434]
Similarity measures such as linear regression, Centered Kernel Alignment (CKA), Normalized Bures Similarity (NBS), and angular Procrustes distance are often used to quantify the similarity between model and brain representations.
Here, we introduce a novel tool to investigate what drives high similarity scores and what constitutes a "good" score.
Surprisingly, we find that high similarity scores do not guarantee encoding task-relevant information in a manner consistent with neural data.
arXiv Detail & Related papers (2024-07-09T17:31:47Z)
- Correcting Biased Centered Kernel Alignment Measures in Biological and Artificial Neural Networks [4.437949196235149]
Centered Kernel Alignment (CKA) has recently emerged as a popular metric to compare activations from biological and artificial neural networks (ANNs).
In this paper we highlight issues that the community should take into account if using CKA as an alignment metric with neural data (a standard unbiased-HSIC debiasing construction is sketched after this list).
arXiv Detail & Related papers (2024-05-02T05:27:12Z)
- Towards out-of-distribution generalization in large-scale astronomical surveys: robust networks learn similar representations [3.653721769378018]
We use Centered Kernel Alignment (CKA), a similarity metric for neural network representations, to examine the relationship between representation similarity and performance.
We find that when models are robust to a distribution shift, they produce substantially similar representations across their layers on OOD data.
We discuss the potential application of representation similarity in guiding model design and training strategy, and in mitigating the OOD problem by incorporating CKA as an inductive bias during training.
arXiv Detail & Related papers (2023-11-29T19:00:05Z)
- Equivariance with Learned Canonicalization Functions [77.32483958400282]
We show that learning a small neural network to perform canonicalization is better than using predefined heuristics.
Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks.
arXiv Detail & Related papers (2022-11-11T21:58:15Z) - On the Versatile Uses of Partial Distance Correlation in Deep Learning [47.11577420740119]
This paper revisits a (less widely known) measure from statistics, called distance correlation (and its partial variant), designed to evaluate correlation between feature spaces of different dimensions.
We describe the steps necessary to deploy it for large-scale models.
This opens the door to a surprising array of applications, ranging from conditioning one deep model w.r.t. another and learning disentangled representations, to optimizing diverse models that are directly more robust to adversarial attacks (a minimal distance correlation sketch also follows this list).
arXiv Detail & Related papers (2022-07-20T06:36:11Z)
- Invariant Causal Mechanisms through Distribution Matching [86.07327840293894]
In this work we provide a causal perspective and a new algorithm for learning invariant representations.
Empirically we show that this algorithm works well on a diverse set of tasks and in particular we observe state-of-the-art performance on domain generalization.
arXiv Detail & Related papers (2022-06-23T12:06:54Z)
- Dynamically-Scaled Deep Canonical Correlation Analysis [77.34726150561087]
Canonical Correlation Analysis (CCA) is a method for extracting features from two views by finding maximally correlated linear projections of them.
We introduce a novel dynamic scaling method for training an input-dependent canonical correlation model.
arXiv Detail & Related papers (2022-03-23T12:52:49Z)
- Weak Augmentation Guided Relational Self-Supervised Learning [80.0680103295137]
We introduce a novel relational self-supervised learning (ReSSL) framework that learns representations by modeling the relationship between different instances.
Our proposed method employs a sharpened distribution of pairwise similarities among different instances as a relation metric.
Experimental results show that our proposed ReSSL substantially outperforms the state-of-the-art methods across different network architectures.
arXiv Detail & Related papers (2022-03-16T16:14:19Z)
- Grounding Representation Similarity with Statistical Testing [8.296135566684065]
We argue that measures should have sensitivity to changes that affect functional behavior, and specificity against changes that do not.
We quantify this through a variety of functional behaviors including probing accuracy and robustness to distribution shift.
We find that current metrics exhibit different weaknesses, note that a classical baseline performs surprisingly well, and highlight settings where all metrics appear to fail.
arXiv Detail & Related papers (2021-08-03T17:58:16Z)
- ECINN: Efficient Counterfactuals from Invertible Neural Networks [80.94500245955591]
We propose a method, ECINN, that utilizes the generative capacities of invertible neural networks for image classification to generate counterfactual examples efficiently.
ECINN has a closed-form expression and generates a counterfactual in the time of only two evaluations.
Our experiments demonstrate how ECINN alters class-dependent image regions to change the perceptual and predicted class of the counterfactuals.
arXiv Detail & Related papers (2021-03-25T09:23:24Z)
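On the estimator bias raised in the "Correcting Biased Centered Kernel Alignment Measures" entry above: a standard debiasing route, sketched here for the linear kernel, replaces the plug-in HSIC inside CKA with the unbiased HSIC estimator of Song et al. (2012). This is a common construction, not necessarily the exact correction that paper proposes:

```python
import numpy as np

def unbiased_hsic(K, L):
    """Unbiased HSIC estimator (Song et al., 2012); requires n >= 4 samples."""
    n = K.shape[0]
    K = K - np.diag(np.diag(K))  # the estimator works with zero-diagonal kernels
    L = L - np.diag(np.diag(L))
    KL = K @ L
    return (np.trace(KL)
            + K.sum() * L.sum() / ((n - 1) * (n - 2))
            - 2.0 * KL.sum() / (n - 2)) / (n * (n - 3))

def debiased_linear_cka(X, Y):
    """Debiased linear CKA: unbiased cross-HSIC normalized by the self terms."""
    K, L = X @ X.T, Y @ Y.T  # linear (Gram) kernel matrices
    return unbiased_hsic(K, L) / np.sqrt(unbiased_hsic(K, K) * unbiased_hsic(L, L))
```

The individual HSIC terms here are unbiased in the number of samples, which matters most at the small sample sizes typical of neural recordings.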
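Likewise, for the partial distance correlation entry: the plain (biased) sample distance correlation compares feature spaces of different dimension directly from pairwise distances. A minimal sketch with illustrative names; the partial variant and the large-scale deployment steps that paper describes are not shown:

```python
import numpy as np

def _double_centered_dists(X):
    """Pairwise Euclidean distances with row, column and grand means removed."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return D - D.mean(axis=0) - D.mean(axis=1, keepdims=True) + D.mean()

def distance_correlation(X, Y):
    """Biased sample distance correlation; X (n, p1), Y (n, p2) may differ in width."""
    A, B = _double_centered_dists(X), _double_centered_dists(Y)
    dcov2_xy = (A * B).mean()          # squared distance covariance
    dvar_x, dvar_y = (A * A).mean(), (B * B).mean()
    return np.sqrt(dcov2_xy / np.sqrt(dvar_x * dvar_y))
```

Because only pairwise distances within each space enter the computation, the widths p1 and p2 never need to match, which is what makes the measure attractive for comparing layers or models of different dimensions.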