Unconstrained Stochastic CCA: Unifying Multiview and Self-Supervised Learning
- URL: http://arxiv.org/abs/2310.01012v4
- Date: Wed, 1 May 2024 16:02:30 GMT
- Title: Unconstrained Stochastic CCA: Unifying Multiview and Self-Supervised Learning
- Authors: James Chapman, Lennie Wells, Ana Lawry Aguila
- Abstract summary: We present a family of fast algorithms for stochastic PLS, stochastic CCA, and Deep CCA, obtained by applying stochastic gradient descent to novel unconstrained objectives.
Our algorithms show far faster convergence and recover higher correlations than the previous state-of-the-art on all standard CCA and Deep CCA benchmarks.
These improvements allow us to perform a first-of-its-kind PLS analysis of an extremely large biomedical dataset.
- Score: 0.13654846342364307
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Canonical Correlation Analysis (CCA) family of methods is foundational in multiview learning. Regularised linear CCA methods can be seen to generalise Partial Least Squares (PLS) and can be unified within a Generalized Eigenvalue Problem (GEP) framework. However, classical algorithms for these linear methods are computationally infeasible for large-scale data. Extensions to Deep CCA show great promise, but current training procedures are slow and complicated. First, we propose a novel unconstrained objective that characterizes the top subspace of GEPs. Our core contribution is a family of fast algorithms for stochastic PLS, stochastic CCA, and Deep CCA, simply obtained by applying stochastic gradient descent (SGD) to the corresponding CCA objectives. Our algorithms show far faster convergence and recover higher correlations than the previous state-of-the-art on all standard CCA and Deep CCA benchmarks. These improvements allow us to perform a first-of-its-kind PLS analysis of an extremely large biomedical dataset from the UK Biobank, with over 33,000 individuals and 500,000 features. Finally, we apply our algorithms to match the performance of `CCA-family' Self-Supervised Learning (SSL) methods on CIFAR-10 and CIFAR-100 with minimal hyper-parameter tuning, and also present theory to clarify the links between these methods and classical CCA, laying the groundwork for future insights.
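The core idea of the abstract — replacing the usual orthogonality constraints of CCA/PLS with an unconstrained objective and then simply running minibatch SGD — can be illustrated with a small NumPy sketch. The Eckhart-Young-style loss below (cross-covariance reward plus a quartic penalty on the average within-view covariance of the projections), the synthetic two-view data, and all hyper-parameters are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-view data sharing a low-rank latent signal (illustrative only).
n, p, q, k = 2000, 20, 15, 3
latent = rng.normal(size=(n, k))
X = latent @ rng.normal(size=(k, p)) + 0.5 * rng.normal(size=(n, p))
Y = latent @ rng.normal(size=(k, q)) + 0.5 * rng.normal(size=(n, q))
X = (X - X.mean(0)) / X.std(0)
Y = (Y - Y.mean(0)) / Y.std(0)

def ey_loss_and_grads(Xb, Yb, Wx, Wy):
    """Eckhart-Young-style unconstrained PLS loss on a (mini)batch.

    A sketch of trading the usual orthogonality constraints for a
    quartic penalty; not the paper's exact objective.
    """
    m = Xb.shape[0]
    Zx, Zy = Xb @ Wx, Yb @ Wy
    Cxy = Zx.T @ Zy / m                    # cross-covariance of projections
    M = (Zx.T @ Zx + Zy.T @ Zy) / (2 * m)  # average within-view covariance
    loss = -2.0 * np.trace(Cxy) + np.sum(M * M)
    gZx = (-2.0 * Zy + 2.0 * Zx @ M) / m   # analytic gradients w.r.t. Zx, Zy
    gZy = (-2.0 * Zx + 2.0 * Zy @ M) / m
    return loss, Xb.T @ gZx, Yb.T @ gZy    # chain rule back to Wx, Wy

# Plain minibatch SGD on the unconstrained objective -- no whitening,
# no orthogonalisation step, which is what makes the approach stochastic-friendly.
Wx = 0.01 * rng.normal(size=(p, k))
Wy = 0.01 * rng.normal(size=(q, k))
lr, batch = 0.005, 256
loss_start = ey_loss_and_grads(X, Y, Wx, Wy)[0]
for _ in range(2000):
    idx = rng.choice(n, batch, replace=False)
    _, gx, gy = ey_loss_and_grads(X[idx], Y[idx], Wx, Wy)
    Wx -= lr * gx
    Wy -= lr * gy
loss_end = ey_loss_and_grads(X, Y, Wx, Wy)[0]
print(f"full-data loss: {loss_start:.4f} -> {loss_end:.4f}")
```

Because the objective is unconstrained, each step touches only one minibatch, which is what lets the same recipe scale to the Deep CCA setting by swapping the linear maps for neural networks.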
Related papers
- Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning [62.81324245896717]
We introduce an exploration-agnostic algorithm, called C-PG, which exhibits global last-iterate convergence guarantees under (weak) gradient domination assumptions.
We numerically validate our algorithms on constrained control problems, and compare them with state-of-the-art baselines.
arXiv Detail & Related papers (2024-07-15T14:54:57Z) - A Weighted K-Center Algorithm for Data Subset Selection [70.49696246526199]
Subset selection is a fundamental problem that can play a key role in identifying smaller portions of the training data.
We develop a novel factor 3-approximation algorithm to compute subsets based on the weighted sum of both k-center and uncertainty sampling objective functions.
arXiv Detail & Related papers (2023-12-17T04:41:07Z) - Provably Efficient UCB-type Algorithms For Learning Predictive State Representations [55.00359893021461]
The sequential decision-making problem is statistically learnable if it admits a low-rank structure modeled by predictive state representations (PSRs).
This paper proposes the first known UCB-type approach for PSRs, featuring a novel bonus term that upper bounds the total variation distance between the estimated and true models.
In contrast to existing approaches for PSRs, our UCB-type algorithms enjoy computational tractability, last-iterate guaranteed near-optimal policy, and guaranteed model accuracy.
arXiv Detail & Related papers (2023-07-01T18:35:21Z) - SLCA: Slow Learner with Classifier Alignment for Continual Learning on a Pre-trained Model [73.80068155830708]
We present an extensive analysis of continual learning on a pre-trained model (CLPM).
We propose a simple but extremely effective approach named Slow Learner with Classifier Alignment (SLCA).
Across a variety of scenarios, our proposal provides substantial improvements for CLPM.
arXiv Detail & Related papers (2023-03-09T08:57:01Z) - Self-Certifying Classification by Linearized Deep Assignment [65.0100925582087]
We propose a novel class of deep predictors for classifying metric data on graphs within the PAC-Bayes risk certification paradigm.
Building on the recent PAC-Bayes literature and data-dependent priors, this approach enables learning posterior distributions on the hypothesis space.
arXiv Detail & Related papers (2022-01-26T19:59:14Z) - Communication-Efficient Federated Linear and Deep Generalized Canonical Correlation Analysis [13.04301271535511]
This work puts forth a communication-efficient federated learning framework for both linear and deep GCCA.
Compared to the unquantized version, our empirical study shows that the proposed algorithm enjoys a substantial reduction of communication overheads with virtually no loss in accuracy and convergence speed.
arXiv Detail & Related papers (2021-09-25T16:43:10Z) - Spike and slab Bayesian sparse principal component analysis [0.6599344783327054]
We propose a novel parameter-expanded coordinate ascent variational inference (PX-CAVI) algorithm.
We demonstrate that the PX-CAVI algorithm outperforms two popular SPCA approaches.
The algorithm is then applied to study a lung cancer gene expression dataset.
arXiv Detail & Related papers (2021-01-30T20:28:30Z) - Approximation Algorithms for Sparse Principal Component Analysis [57.5357874512594]
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and statistics.
Various approaches to obtain sparse principal direction loadings have been proposed, which are termed Sparse Principal Component Analysis.
We present thresholding as a provably accurate, polynomial-time approximation algorithm for the SPCA problem.
arXiv Detail & Related papers (2020-06-23T04:25:36Z) - Adversarial Canonical Correlation Analysis [0.0]
Canonical Correlation Analysis (CCA) is a technique used to extract common information from multiple data sources or views.
Recent work has given CCA probabilistic footing in a deep learning context.
Separately, adversarial techniques have emerged as a powerful alternative to variational Bayesian methods in autoencoders.
arXiv Detail & Related papers (2020-05-20T20:46:35Z) - Probabilistic Canonical Correlation Analysis for Sparse Count Data [3.1753001245931323]
Canonical correlation analysis is an important technique for exploring the relationship between two sets of continuous variables.
We propose a model-based probabilistic approach for correlation and canonical correlation estimation for two sparse count data sets.
arXiv Detail & Related papers (2020-05-11T02:19:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.