LiDAR: Sensing Linear Probing Performance in Joint Embedding SSL
Architectures
- URL: http://arxiv.org/abs/2312.04000v1
- Date: Thu, 7 Dec 2023 02:31:28 GMT
- Title: LiDAR: Sensing Linear Probing Performance in Joint Embedding SSL
Architectures
- Authors: Vimal Thilak and Chen Huang and Omid Saremi and Laurent Dinh and
Hanlin Goh and Preetum Nakkiran and Joshua M. Susskind and Etai Littwin
- Abstract summary: LiDAR is a metric designed to measure the quality of representations within joint embedding (JE) architectures.
Our proposed criterion presents a more robust and intuitive means of assessing the quality of representations within JE architectures.
- Score: 24.40012454562582
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Joint embedding (JE) architectures have emerged as a promising avenue for
acquiring transferable data representations. A key obstacle to using JE
methods, however, is the inherent challenge of evaluating learned
representations without access to a downstream task and an annotated dataset.
Without efficient and reliable evaluation, it is difficult to iterate on
architectural and training choices for JE methods. In this paper, we introduce
LiDAR (Linear Discriminant Analysis Rank), a metric designed to measure the
quality of representations within JE architectures. Our metric addresses
several shortcomings of recent approaches based on feature covariance rank by
discriminating between informative and uninformative features. In essence,
LiDAR quantifies the rank of the Linear Discriminant Analysis (LDA) matrix
associated with the surrogate SSL task -- a measure that intuitively captures
the information content as it pertains to solving the SSL task. We empirically
demonstrate that LiDAR significantly surpasses naive rank-based approaches in
its ability to predict optimal hyperparameters. Our proposed criterion
presents a more robust and intuitive means of assessing the quality of
representations within JE architectures, which we hope facilitates broader
adoption of these powerful techniques in various domains.
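To make the idea concrete, below is a rough NumPy sketch of a LiDAR-style metric. It treats groups of augmented views of the same clean sample as the surrogate-task classes, forms a regularized LDA matrix from the between- and within-class scatter, and reports a smooth effective rank via the entropy of its eigenvalue spectrum. The regularization constant and the entropy-based smooth-rank estimator are illustrative assumptions; the exact formulation is specified in the paper.

```python
# A rough NumPy sketch of a LiDAR-style metric: the effective rank of a
# regularized LDA matrix built from surrogate-task "classes" (e.g., groups of
# augmented views of the same clean sample). The regularizer `delta` and the
# entropy-based smooth rank are illustrative choices, not the paper's exact
# estimator.
import numpy as np

def lidar_effective_rank(embeddings, labels, delta=1e-4):
    """embeddings: (N, D) array of representations; labels: (N,) class ids."""
    X = np.asarray(embeddings, dtype=np.float64)
    y = np.asarray(labels)
    mu = X.mean(axis=0)
    d = X.shape[1]

    # Between-class and within-class scatter matrices.
    S_b = np.zeros((d, d))
    S_w = np.zeros((d, d))
    for c in np.unique(y):
        X_c = X[y == c]
        mu_c = X_c.mean(axis=0)
        diff = (mu_c - mu)[:, None]
        S_b += len(X_c) * (diff @ diff.T)
        S_w += (X_c - mu_c).T @ (X_c - mu_c)
    S_b /= len(X)
    S_w = S_w / len(X) + delta * np.eye(d)   # regularize for invertibility

    # LDA matrix: S_w^{-1/2} S_b S_w^{-1/2} (symmetric, PSD).
    w_vals, w_vecs = np.linalg.eigh(S_w)
    S_w_inv_sqrt = w_vecs @ np.diag(w_vals ** -0.5) @ w_vecs.T
    lda = S_w_inv_sqrt @ S_b @ S_w_inv_sqrt

    # Smooth rank: exponential of the entropy of the normalized spectrum.
    eigvals = np.clip(np.linalg.eigvalsh(lda), 0.0, None)
    p = eigvals / (eigvals.sum() + 1e-12) + 1e-12
    return float(np.exp(-(p * np.log(p)).sum()))

# Toy usage: 100 surrogate classes with 8 views each, 64-dim embeddings.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(100), 8)
embeddings = rng.normal(size=(800, 64)) + rng.normal(size=(100, 64)).repeat(8, axis=0)
print(lidar_effective_rank(embeddings, labels))
```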
Related papers
- NormXLogit: The Head-on-Top Never Lies [15.215985417763472]
The Transformer architecture has emerged as the dominant choice for building large language models.
We propose a novel technique, called NormXLogit, for assessing the significance of individual input tokens.
We show that our approach consistently outperforms existing gradient-based methods in terms of faithfulness.
arXiv Detail & Related papers (2024-11-25T10:12:27Z)
- CODES: Benchmarking Coupled ODE Surrogates [0.0]
CODES is a benchmark for comprehensive evaluation of surrogate architectures for coupled ODE systems.
It emphasizes usability through features such as integrated parallel training, a web-based configuration generator, and pre-implemented baseline models and datasets.
arXiv Detail & Related papers (2024-10-28T10:12:06Z)
- Reward-Augmented Data Enhances Direct Preference Alignment of LLMs [56.24431208419858]
We introduce reward-conditioned Large Language Models (LLMs) that learn from the entire spectrum of response quality within the dataset.
We propose an effective yet simple data relabeling method that conditions the preference pairs on quality scores to construct a reward-augmented dataset.
arXiv Detail & Related papers (2024-10-10T16:01:51Z)
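The relabeling idea in the entry above can be sketched in a few lines. The reading below is an assumption based only on the summary: each scored preference pair is turned into reward-conditioned pairs by prefixing the prompt with a target score, so that the originally rejected response becomes the preferred one when conditioned on its own (lower) score. The field names and prompt format are hypothetical.

```python
# Hypothetical sketch of reward-conditioned relabeling of preference pairs.
# Conditioning the prompt on a target score lets both responses in a pair act
# as the "correct" behaviour for their respective quality levels.
def reward_augment(pairs):
    augmented = []
    for p in pairs:
        # Conditioned on the higher target score, keep the original preference.
        augmented.append({
            "prompt": f"[target reward: {p['score_chosen']:.1f}] {p['prompt']}",
            "chosen": p["chosen"],
            "rejected": p["rejected"],
        })
        # Conditioned on the lower target score, the originally rejected
        # response matches the stated target, so the pair flips.
        augmented.append({
            "prompt": f"[target reward: {p['score_rejected']:.1f}] {p['prompt']}",
            "chosen": p["rejected"],
            "rejected": p["chosen"],
        })
    return augmented

pairs = [{
    "prompt": "Explain overfitting in one sentence.",
    "chosen": "Overfitting is when a model memorizes training noise and fails to generalize.",
    "rejected": "Overfitting means the model is too small.",
    "score_chosen": 8.5,
    "score_rejected": 2.0,
}]
print(reward_augment(pairs))
```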
- T-JEPA: Augmentation-Free Self-Supervised Learning for Tabular Data [0.0]
Self-supervised learning (SSL) generally involves generating different views of the same sample and thus requires data augmentations.
In the present work, we propose a novel augmentation-free SSL method for structured data.
Our approach, T-JEPA, relies on a Joint Embedding Predictive Architecture (JEPA) and is akin to mask reconstruction in the latent space.
arXiv Detail & Related papers (2024-10-07T13:15:07Z)
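Since the entry above describes the objective only at a high level, here is a compact PyTorch sketch of a JEPA-style training step for tabular rows: a context encoder sees a masked row, an EMA target encoder sees the full row, and a predictor regresses the missing information in latent space. The MLP encoders, masking ratio, predictor input, and momentum value are illustrative assumptions, not the architecture from the paper.

```python
# Minimal sketch of a JEPA-style objective for tabular rows, in the spirit of
# T-JEPA as summarized above. The encoders, masking scheme, and EMA update are
# assumptions for illustration; the paper specifies the actual design.
import copy
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, num_features, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, dim), nn.ReLU(), nn.Linear(dim, dim)
        )
    def forward(self, x):
        return self.net(x)

num_features, batch = 16, 32
context_enc = Encoder(num_features)
target_enc = copy.deepcopy(context_enc)          # EMA copy, not trained by SGD
for p in target_enc.parameters():
    p.requires_grad_(False)
predictor = nn.Sequential(nn.Linear(128 + num_features, 128), nn.ReLU(),
                          nn.Linear(128, 128))
opt = torch.optim.Adam(
    list(context_enc.parameters()) + list(predictor.parameters()), lr=1e-3
)

x = torch.randn(batch, num_features)             # a batch of tabular rows
mask = torch.rand(batch, num_features) < 0.3     # features hidden from the context

# The context encoder only sees unmasked features; the mask itself tells the
# predictor which features it must reconstruct in latent space.
z_ctx = context_enc(x.masked_fill(mask, 0.0))
with torch.no_grad():
    z_tgt = target_enc(x)                        # latent target of the full row
pred = predictor(torch.cat([z_ctx, mask.float()], dim=1))

loss = nn.functional.mse_loss(pred, z_tgt)       # reconstruction in latent space
loss.backward()
opt.step()

# EMA update of the target encoder (momentum value is an assumption).
with torch.no_grad():
    for p_t, p_c in zip(target_enc.parameters(), context_enc.parameters()):
        p_t.mul_(0.99).add_(0.01 * p_c)
```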
- Position: LLM Unlearning Benchmarks are Weak Measures of Progress [31.957968729934745]
We find that existing benchmarks provide an overly optimistic and potentially misleading view on the effectiveness of candidate unlearning methods.
We identify that existing benchmarks are particularly vulnerable to modifications that introduce even loose dependencies between the forget and retain information.
arXiv Detail & Related papers (2024-10-03T18:07:25Z)
- Discriminant Distance-Aware Representation on Deterministic Uncertainty Quantification Methods [2.309984352134254]
We introduce a novel and efficient method for deterministic uncertainty estimation called Discriminant Distance-Awareness Representation (DDAR).
By leveraging a distinction layer over optimal trainable prototypes, DDAR can learn a discriminant distance-awareness representation.
Our experiments show that DDAR is a flexible and architecture-agnostic method that can be easily integrated as a pluggable layer with distance-sensitive metrics.
arXiv Detail & Related papers (2024-02-20T02:26:48Z)
- Synergistic eigenanalysis of covariance and Hessian matrices for enhanced binary classification [72.77513633290056]
We present a novel approach that combines the eigenanalysis of a covariance matrix evaluated on a training set with a Hessian matrix evaluated on a deep learning model.
Our method captures intricate patterns and relationships, enhancing classification performance.
arXiv Detail & Related papers (2024-02-14T16:10:42Z)
- A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
arXiv Detail & Related papers (2024-02-02T13:31:17Z)
- Uncertainty Estimation by Fisher Information-based Evidential Deep Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).
In particular, we introduce Fisher Information Matrix (FIM) to measure the informativeness of evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network more focused on the representation learning of uncertain classes.
arXiv Detail & Related papers (2023-03-03T16:12:59Z)
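The reweighting idea in the entry above can be illustrated with a short sketch: take the diagonal of the Fisher Information Matrix of the predicted Dirichlet distribution and use it to weight the per-class error terms of an evidential loss, so that classes with little accumulated evidence receive more gradient. This is only the gist as stated in the summary; the paper's full objective contains additional terms and differs in detail.

```python
# Hedged sketch of FIM-based reweighting for an evidential classifier: weight
# the per-class error terms by the diagonal of the Dirichlet Fisher
# Information Matrix. Only the gist of the summary above, not the full method.
import torch

def fim_weighted_evidential_mse(logits, targets_onehot):
    alpha = torch.nn.functional.softplus(logits) + 1.0   # Dirichlet parameters
    alpha0 = alpha.sum(dim=1, keepdim=True)
    prob = alpha / alpha0

    # Diagonal of the Dirichlet Fisher Information Matrix:
    #   I(alpha)_kk = psi_1(alpha_k) - psi_1(alpha_0)   (trigamma function)
    fim_diag = torch.polygamma(1, alpha) - torch.polygamma(1, alpha0)

    # Classes with little evidence (small alpha_k) get larger trigamma values,
    # so their error terms are up-weighted.
    per_class_err = (targets_onehot - prob) ** 2
    return (fim_diag * per_class_err).sum(dim=1).mean()

logits = torch.randn(8, 10, requires_grad=True)
targets = torch.nn.functional.one_hot(torch.randint(0, 10, (8,)), 10).float()
loss = fim_weighted_evidential_mse(logits, targets)
loss.backward()
```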
- A Survey of Learning on Small Data: Generalization, Optimization, and Challenge [101.27154181792567]
Learning on small data that approximates the generalization ability of big data is one of the ultimate purposes of AI.
This survey follows the active sampling theory under a PAC framework to analyze the generalization error and label complexity of learning on small data.
Multiple data applications that may benefit from efficient small data representation are surveyed.
arXiv Detail & Related papers (2022-07-29T02:34:19Z)
- Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
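To ground the PCL entry above, the sketch below shows a prototype-level contrastive term: embeddings are pulled toward the prototype of their assigned cluster and pushed away from the others. The full method's EM formulation, multiple clustering granularities, and per-prototype concentration estimates are omitted; the temperature and shapes here are placeholders.

```python
# Minimal sketch of a prototype-level contrastive term in the spirit of PCL:
# cluster embeddings into prototypes, then contrast each embedding against all
# prototypes with its own cluster as the positive.
import torch
import torch.nn.functional as F

def proto_nce(embeddings, prototypes, assignments, temperature=0.1):
    """embeddings: (N, D), prototypes: (K, D), assignments: (N,) cluster ids."""
    z = F.normalize(embeddings, dim=1)
    c = F.normalize(prototypes, dim=1)
    logits = z @ c.T / temperature           # similarity to every prototype
    return F.cross_entropy(logits, assignments)

# Toy usage with random data; the clustering step (e.g., k-means) is left abstract.
z = torch.randn(64, 128)
protos = torch.randn(10, 128)
assign = torch.randint(0, 10, (64,))
print(proto_nce(z, protos, assign))
```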
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.