Target-Embedding Autoencoders for Supervised Representation Learning
- URL: http://arxiv.org/abs/2001.08345v1
- Date: Thu, 23 Jan 2020 02:37:10 GMT
- Title: Target-Embedding Autoencoders for Supervised Representation Learning
- Authors: Daniel Jarrett, Mihaela van der Schaar
- Abstract summary: This paper analyzes a framework for improving generalization in a purely supervised setting, where the target space is high-dimensional.
We motivate and formalize the general framework of target-embedding autoencoders (TEA) for supervised prediction, learning intermediate latent representations jointly optimized to be both predictable from features as well as predictive of targets.
- Score: 111.07204912245841
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autoencoder-based learning has emerged as a staple for disciplining
representations in unsupervised and semi-supervised settings. This paper
analyzes a framework for improving generalization in a purely supervised
setting, where the target space is high-dimensional. We motivate and formalize
the general framework of target-embedding autoencoders (TEA) for supervised
prediction, learning intermediate latent representations jointly optimized to
be both predictable from features as well as predictive of targets---encoding
the prior that variations in targets are driven by a compact set of underlying
factors. As our theoretical contribution, we provide a guarantee of
generalization for linear TEAs by demonstrating uniform stability, interpreting
the benefit of the auxiliary reconstruction task as a form of regularization.
As our empirical contribution, we extend validation of this approach beyond
existing static classification applications to multivariate sequence
forecasting, verifying their advantage on both linear and nonlinear recurrent
architectures---thereby underscoring the further generality of this framework
beyond feedforward instantiations.
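The joint objective described in the abstract can be sketched for the linear case as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the squared-error losses, the weights `alpha` and `beta`, and the names `tea_loss`, `tea_predict`, `U`, `E`, and `D` are all hypothetical choices made here. The synthetic data encodes the stated prior that targets are driven by a compact set of latent factors.

```python
import numpy as np

def tea_loss(X, Y, U, E, D, alpha=1.0, beta=1.0):
    """Joint objective for a linear target-embedding autoencoder (sketch).

    X: (n, d_x) features, Y: (n, d_y) targets.
    U: (d_x, d_z) predictor, E: (d_y, d_z) target encoder, D: (d_z, d_y) decoder.
    alpha, beta: hypothetical weights on the auxiliary terms (an assumption).
    """
    Z_from_y = Y @ E  # embedding of the targets
    Z_from_x = X @ U  # embedding predicted from the features
    # Auxiliary reconstruction task: the latent must be predictive of targets.
    recon = np.mean(np.sum((Z_from_y @ D - Y) ** 2, axis=1))
    # Latent matching: the latent must be predictable from features.
    latent = np.mean(np.sum((Z_from_x - Z_from_y) ** 2, axis=1))
    # Primary supervised task: features -> latent -> targets.
    pred = np.mean(np.sum((Z_from_x @ D - Y) ** 2, axis=1))
    return pred + alpha * recon + beta * latent

def tea_predict(X, U, D):
    """Predict targets by decoding the latent predicted from features."""
    return X @ U @ D

# Synthetic data: 8-dimensional targets generated from 2 latent factors.
rng = np.random.default_rng(0)
n, d_x, d_z, d_y = 50, 5, 2, 8
X = rng.normal(size=(n, d_x))
U_true = rng.normal(size=(d_x, d_z))
Q, _ = np.linalg.qr(rng.normal(size=(d_y, d_z)))  # Q has orthonormal columns
D_true = Q.T                                      # so D_true @ D_true.T = I
Y = X @ U_true @ D_true

# With E = D_true.T the targets are embedded and decoded without error,
# so every term of the objective vanishes up to floating-point error.
exact_loss = tea_loss(X, Y, U_true, D_true.T, D_true)

# Randomly initialised parameters give a strictly larger objective.
U0 = rng.normal(size=(d_x, d_z))
E0 = rng.normal(size=(d_y, d_z))
D0 = rng.normal(size=(d_z, d_y))
random_loss = tea_loss(X, Y, U0, E0, D0)
```

In practice the three matrices would be fitted by minimising this objective on training data; the sketch only evaluates it, to show how the reconstruction and latent-matching terms sit alongside the primary prediction loss.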
Related papers
- Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization [77.62516752323207]
We introduce an orthogonal fine-tuning method for efficiently fine-tuning pretrained weights and enabling enhanced robustness and generalization.
A self-regularization strategy is further exploited to maintain the stability in terms of zero-shot generalization of VLMs, dubbed OrthSR.
For the first time, we revisit CLIP and CoOp with our method to effectively improve the model in the few-shot image classification scenario.
arXiv Detail & Related papers (2024-07-11T10:35:53Z)
- Revisiting the Robust Generalization of Adversarial Prompt Tuning [4.033827046965844]
We propose an adaptive Consistency-guided Adversarial Prompt Tuning (i.e., CAPT) framework to enhance the alignment of image and text features for adversarial examples.
We conduct experiments across 14 datasets and 4 data sparsity schemes to show the superiority of CAPT over other state-of-the-art adaptation methods.
arXiv Detail & Related papers (2024-05-18T02:54:41Z)
- On the Generalization Ability of Unsupervised Pretraining [53.06175754026037]
Recent advances in unsupervised learning have shown that unsupervised pre-training, followed by fine-tuning, can improve model generalization.
This paper introduces a novel theoretical framework that illuminates the critical factor influencing the transferability of knowledge acquired during unsupervised pre-training to the subsequent fine-tuning phase.
Our results contribute to a better understanding of the unsupervised pre-training and fine-tuning paradigm, and can shed light on the design of more effective pre-training algorithms.
arXiv Detail & Related papers (2024-03-11T16:23:42Z)
- On the Optimization and Generalization of Multi-head Attention [28.33164313549433]
We investigate the potential optimization and generalization advantages of using multiple attention heads.
We derive convergence and generalization guarantees for gradient-descent training of a single-layer multi-head self-attention model.
arXiv Detail & Related papers (2023-10-19T12:18:24Z)
- Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods only assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z)
- Predicting Deep Neural Network Generalization with Perturbation Response Curves [58.8755389068888]
We propose a new framework for evaluating the generalization capabilities of trained networks.
Specifically, we introduce two new measures for accurately predicting generalization gaps.
We attain better predictive scores than the current state-of-the-art measures on a majority of tasks in the Predicting Generalization in Deep Learning (PGDL) NeurIPS 2020 competition.
arXiv Detail & Related papers (2021-06-09T01:37:36Z)
- Towards Uncovering the Intrinsic Data Structures for Unsupervised Domain Adaptation using Structurally Regularized Deep Clustering [119.88565565454378]
Unsupervised domain adaptation (UDA) aims to learn classification models that make predictions for unlabeled data on a target domain.
We propose a hybrid model of Structurally Regularized Deep Clustering, which integrates the regularized discriminative clustering of target data with a generative one.
Our proposed H-SRDC outperforms all the existing methods under both the inductive and transductive settings.
arXiv Detail & Related papers (2020-12-08T08:52:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.