LiDAR: Sensing Linear Probing Performance in Joint Embedding SSL Architectures
- URL: http://arxiv.org/abs/2312.04000v1
- Date: Thu, 7 Dec 2023 02:31:28 GMT
- Title: LiDAR: Sensing Linear Probing Performance in Joint Embedding SSL Architectures
- Authors: Vimal Thilak and Chen Huang and Omid Saremi and Laurent Dinh and
Hanlin Goh and Preetum Nakkiran and Joshua M. Susskind and Etai Littwin
- Abstract summary: LiDAR is a metric designed to measure the quality of representations within joint embedding (JE) architectures.
Our proposed criterion presents a more robust and intuitive means of assessing the quality of representations within JE architectures.
- Score: 24.40012454562582
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Joint embedding (JE) architectures have emerged as a promising avenue for
acquiring transferable data representations. A key obstacle to using JE
methods, however, is the inherent challenge of evaluating learned
representations without access to a downstream task and an annotated dataset.
Without efficient and reliable evaluation, it is difficult to iterate on
architectural and training choices for JE methods. In this paper, we introduce
LiDAR (Linear Discriminant Analysis Rank), a metric designed to measure the
quality of representations within JE architectures. Our metric addresses
several shortcomings of recent approaches based on feature covariance rank by
discriminating between informative and uninformative features. In essence,
LiDAR quantifies the rank of the Linear Discriminant Analysis (LDA) matrix
associated with the surrogate SSL task -- a measure that intuitively captures
the information content as it pertains to solving the SSL task. We empirically
demonstrate that LiDAR significantly surpasses naive rank based approaches in
its predictive power of optimal hyperparameters. Our proposed criterion
presents a more robust and intuitive means of assessing the quality of
representations within JE architectures, which we hope facilitates broader
adoption of these powerful techniques in various domains.
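The abstract describes LiDAR as the (smooth effective) rank of the LDA matrix associated with the surrogate SSL task. A minimal sketch of that idea follows, assuming the common setup in which each surrogate class is one clean sample and its members are that sample's augmented views; the function name, the regularization constants, and the exact scatter-matrix normalization are illustrative choices, not the paper's reference implementation.

```python
import numpy as np

def lidar_effective_rank(embeddings, labels, delta=1e-4, eps=1e-8):
    """Sketch of a LiDAR-style score: the smooth effective rank of an
    LDA matrix built from embeddings grouped by surrogate-task class.

    embeddings: (n, d) array of representations.
    labels:     (n,) surrogate-class label per embedding (e.g. one class
                per clean sample, members are its augmented views).
    """
    classes = np.unique(labels)
    n, d = embeddings.shape
    mu = embeddings.mean(axis=0)

    s_b = np.zeros((d, d))  # between-class scatter
    s_w = np.zeros((d, d))  # within-class scatter
    for c in classes:
        x = embeddings[labels == c]
        mu_c = x.mean(axis=0)
        diff = (mu_c - mu)[:, None]
        s_b += diff @ diff.T
        s_w += (x - mu_c).T @ (x - mu_c)
    s_b /= len(classes)
    s_w = s_w / n + delta * np.eye(d)  # regularize so S_w is invertible

    # Whitening transform S_w^{-1/2} via the symmetric eigendecomposition.
    evals, evecs = np.linalg.eigh(s_w)
    w_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T

    # Eigenvalues of the (symmetric) LDA matrix S_w^{-1/2} S_b S_w^{-1/2}.
    lda = w_inv_sqrt @ s_b @ w_inv_sqrt
    lam = np.clip(np.linalg.eigvalsh(lda), 0.0, None) + eps

    # Smooth effective rank: exponential of the eigenvalue-distribution entropy.
    p = lam / lam.sum()
    return float(np.exp(-(p * np.log(p)).sum()))
```

Because the between-class scatter of k classes has rank at most k - 1, the score is bounded by the number of surrogate classes as well as by the embedding dimension; informative representations spread discriminative variance across many directions and score higher, while directions of pure within-class (augmentation) noise are suppressed by the whitening step.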
Related papers
- CODES: Benchmarking Coupled ODE Surrogates [0.0]
CODES is a benchmark for comprehensive evaluation of surrogate architectures for coupled ODE systems.
It emphasizes usability through features such as integrated parallel training, a web-based configuration generator, and pre-implemented baseline models and datasets.
arXiv Detail & Related papers (2024-10-28T10:12:06Z)
- Reward-Augmented Data Enhances Direct Preference Alignment of LLMs [63.32585910975191]
We introduce reward-conditioned Large Language Models (LLMs) that learn from the entire spectrum of response quality within the dataset.
We propose an effective yet simple data relabeling method that conditions the preference pairs on quality scores to construct a reward-augmented dataset.
arXiv Detail & Related papers (2024-10-10T16:01:51Z)
- T-JEPA: Augmentation-Free Self-Supervised Learning for Tabular Data [0.0]
Self-supervised learning (SSL) generally involves generating different views of the same sample and thus requires data augmentations.
In the present work, we propose a novel augmentation-free SSL method for structured data.
Our approach, T-JEPA, relies on a Joint Embedding Predictive Architecture (JEPA) and is akin to mask reconstruction in the latent space.
arXiv Detail & Related papers (2024-10-07T13:15:07Z)
- Position: LLM Unlearning Benchmarks are Weak Measures of Progress [31.957968729934745]
We find that existing benchmarks provide an overly optimistic and potentially misleading view of the effectiveness of candidate unlearning methods.
We identify that existing benchmarks are particularly vulnerable to modifications that introduce even loose dependencies between the forget and retain information.
arXiv Detail & Related papers (2024-10-03T18:07:25Z)
- Discriminant Distance-Aware Representation on Deterministic Uncertainty Quantification Methods [2.309984352134254]
We introduce a novel and efficient method for deterministic uncertainty estimation called Discriminant Distance-Awareness Representation (DDAR).
By leveraging a distinction layer over optimal trainable prototypes, DDAR can learn a discriminant distance-awareness representation.
Our experiments show that DDAR is a flexible and architecture-agnostic method that can be easily integrated as a pluggable layer with distance-sensitive metrics.
arXiv Detail & Related papers (2024-02-20T02:26:48Z)
- Synergistic eigenanalysis of covariance and Hessian matrices for enhanced binary classification [72.77513633290056]
We present a novel approach that combines the eigenanalysis of a covariance matrix evaluated on a training set with a Hessian matrix evaluated on a deep learning model.
Our method captures intricate patterns and relationships, enhancing classification performance.
arXiv Detail & Related papers (2024-02-14T16:10:42Z)
- A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
arXiv Detail & Related papers (2024-02-02T13:31:17Z)
- Uncertainty Estimation by Fisher Information-based Evidential Deep Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications.
We propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL).
In particular, we introduce Fisher Information Matrix (FIM) to measure the informativeness of evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network more focused on the representation learning of uncertain classes.
arXiv Detail & Related papers (2023-03-03T16:12:59Z)
- A Survey of Learning on Small Data: Generalization, Optimization, and Challenge [101.27154181792567]
Learning on small data that approximates the generalization ability of big data is one of the ultimate purposes of AI.
This survey follows the active sampling theory under a PAC framework to analyze the generalization error and label complexity of learning on small data.
Multiple data applications that may benefit from efficient small data representation are surveyed.
arXiv Detail & Related papers (2022-07-29T02:34:19Z)
- Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
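The PCL entry above describes encoding semantic structure into the embedding space via prototypes. A minimal sketch of that general mechanism, assuming a plain k-means clustering step and a ProtoNCE-style contrastive loss over L2-normalized embeddings; PCL itself additionally uses a momentum encoder, multiple clustering granularities, and per-cluster concentration estimates, none of which are shown here, and all function names are illustrative.

```python
import numpy as np

def kmeans(z, k, iters=20, seed=0):
    """Plain k-means over unit-norm embeddings (prototype discovery).
    Uses cosine similarity, which equals the dot product on unit vectors."""
    rng = np.random.default_rng(seed)
    centers = z[rng.choice(len(z), size=k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(z @ centers.T, axis=1)
        for j in range(k):
            members = z[assign == j]
            if len(members):  # keep old center if a cluster empties out
                c = members.mean(axis=0)
                centers[j] = c / np.linalg.norm(c)
    assign = np.argmax(z @ centers.T, axis=1)  # final assignments
    return centers, assign

def proto_nce(z, centers, assign, tau=0.1):
    """ProtoNCE-style loss: pull each embedding toward its assigned
    prototype, contrasted against all other prototypes."""
    logits = z @ centers.T / tau
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_p = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_p[np.arange(len(z)), assign].mean()
```

In a full training loop the two steps alternate in an EM-like fashion: prototypes are re-clustered from the current embeddings, then the encoder is updated to minimize the prototype-contrastive loss.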
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.