Low-Rank Plus Sparse Matrix Transfer Learning under Growing Representations and Ambient Dimensions
- URL: http://arxiv.org/abs/2601.21873v1
- Date: Thu, 29 Jan 2026 15:40:05 GMT
- Title: Low-Rank Plus Sparse Matrix Transfer Learning under Growing Representations and Ambient Dimensions
- Authors: Jinhang Chai, Xuyuan Liu, Elynn Chen, Yujun Yan
- Abstract summary: We study transfer learning for structured matrix estimation under simultaneous growth of the ambient dimension and the intrinsic representation. We propose a general transfer framework in which the target parameter decomposes into an embedded source component, low-dimensional low-rank innovations, and sparse edits. We establish deterministic error bounds that separate target noise, representation growth, and source estimation error, yielding strictly improved rates when rank and sparsity increments are small.
- Score: 6.949116398973296
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning systems often expand their ambient features or latent representations over time, embedding earlier representations into larger spaces with limited new latent structure. We study transfer learning for structured matrix estimation under simultaneous growth of the ambient dimension and the intrinsic representation, where a well-estimated source task is embedded as a subspace of a higher-dimensional target task. We propose a general transfer framework in which the target parameter decomposes into an embedded source component, low-dimensional low-rank innovations, and sparse edits, and develop an anchored alternating projection estimator that preserves transferred subspaces while estimating only low-dimensional innovations and sparse modifications. We establish deterministic error bounds that separate target noise, representation growth, and source estimation error, yielding strictly improved rates when rank and sparsity increments are small. We demonstrate the generality of the framework by applying it to two canonical problems. For Markov transition matrix estimation from a single trajectory, we derive end-to-end theoretical guarantees under dependent noise. For structured covariance estimation under enlarged dimensions, we provide complementary theoretical analysis in the appendix and empirically validate consistent transfer gains.
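The decomposition described in the abstract (embedded source component plus low-rank innovation plus sparse edits) can be illustrated with a small NumPy sketch. This is not the authors' estimator: it is a generic alternating projection on the residual after subtracting a fixed, zero-padded source estimate. The function names, the zero-padding embedding, the hard-thresholding rule, and all simulation parameters are assumptions made for illustration only.

```python
import numpy as np


def embed_source(theta_src, d):
    """Zero-pad a d0 x d0 source matrix into the top-left block of a d x d matrix.
    (Hypothetical embedding; the paper's embedding may differ.)"""
    d0 = theta_src.shape[0]
    out = np.zeros((d, d))
    out[:d0, :d0] = theta_src
    return out


def project_rank(M, r):
    """Best rank-r approximation of M via truncated SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]


def project_sparse(M, k):
    """Keep the k largest-magnitude entries of M and zero out the rest."""
    out = np.zeros_like(M)
    if k > 0:
        idx = np.argpartition(np.abs(M).ravel(), -k)[-k:]
        out.flat[idx] = M.flat[idx]
    return out


def anchored_alt_proj(Y, anchor, r_new, k, n_iter=50):
    """Alternate low-rank and sparse projections on the residual Y - anchor.
    The anchor (embedded source estimate) is held fixed throughout."""
    R = Y - anchor
    L = np.zeros_like(R)
    S = np.zeros_like(R)
    for _ in range(n_iter):
        L = project_rank(R - S, r_new)   # low-rank innovation
        S = project_sparse(R - L, k)     # sparse edits
    return anchor + L + S, L, S


# Synthetic example (all sizes and noise levels are arbitrary assumptions).
rng = np.random.default_rng(0)
d0, d, r_new, k = 20, 30, 1, 5

B = rng.normal(size=(d0, 2))
theta_src = B @ B.T / d0                      # rank-2 source parameter

u, v = rng.normal(size=d), rng.normal(size=d)
L_true = 10.0 * np.outer(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

S_true = np.zeros((d, d))
S_true.flat[rng.choice(d * d, size=k, replace=False)] = 3.0

anchor = embed_source(theta_src, d)           # assumes a perfectly estimated source
theta = anchor + L_true + S_true              # target parameter
Y = theta + 0.01 * rng.normal(size=(d, d))    # noisy target observation

theta_hat, L_hat, S_hat = anchored_alt_proj(Y, anchor, r_new, k)
print("error of raw observation:  ", np.linalg.norm(Y - theta))
print("error of anchored estimate:", np.linalg.norm(theta_hat - theta))
```

Because the anchor is never re-estimated, only a rank-`r_new` matrix and `k` sparse entries are fitted from the target data, which is the intuition behind the improved rates claimed when the rank and sparsity increments are small.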
Related papers
- The geometry of invariant learning: an information-theoretic analysis of data augmentation and generalization [2.496574213989531]
We propose an information-theoretic framework that systematically accounts for the effect of augmentation on generalization and invariance learning. Our approach builds upon mutual information-based bounds, which relate the generalization gap to the amount of information a learning algorithm retains about its training data. Under mild sub-Gaussian assumptions on the loss function and the augmentation process, we derive a new generalization bound that decomposes the expected generalization gap into three interpretable terms.
arXiv Detail & Related papers (2026-02-16T03:18:39Z) - Shift Before You Learn: Enabling Low-Rank Representations in Reinforcement Learning [56.87989363424]
We show that a low-rank structure naturally emerges in the shifted successor measure. We quantify the amount of shift needed for effective low-rank approximation and estimation.
arXiv Detail & Related papers (2025-09-05T15:48:20Z) - Beyond Progress Measures: Theoretical Insights into the Mechanism of Grokking [50.465604300990904]
Grokking refers to the abrupt improvement in test accuracy after extended overfitting. In this work, we investigate the grokking mechanism underlying the Transformer in the task of prime number operations.
arXiv Detail & Related papers (2025-04-04T04:42:38Z) - Partial Transportability for Domain Generalization [56.37032680901525]
Building on the theory of partial identification and transportability, this paper introduces new results for bounding the value of a functional of the target distribution. Our contribution is to provide the first general estimation technique for transportability problems. We propose a gradient-based optimization scheme for making scalable inferences in practice.
arXiv Detail & Related papers (2025-03-30T22:06:37Z) - Disentangled Interleaving Variational Encoding [1.132458063021286]
We propose a principled approach to disentangle the original input into marginal and conditional probability distributions in the latent space of a variational autoencoder. Our proposed model, Deep Disentangled Interleaving Variational Encoder (DeepDIVE), learns disentangled features from the original input to form clusters in the embedding space. Experiments on two public datasets show that DeepDIVE disentangles the original input and yields forecast accuracies better than the original VAE.
arXiv Detail & Related papers (2025-01-15T10:50:54Z) - Measuring Orthogonality in Representations of Generative Models [81.13466637365553]
In unsupervised representation learning, models aim to distill essential features from high-dimensional data into lower-dimensional learned representations.
Disentanglement of independent generative processes has long been credited with producing high-quality representations.
We propose two novel metrics: Importance-Weighted Orthogonality (IWO) and Importance-Weighted Rank (IWR).
arXiv Detail & Related papers (2024-07-04T08:21:54Z) - Uniform Transformation: Refining Latent Representation in Variational Autoencoders [7.4316292428754105]
We introduce a novel adaptable three-stage Uniform Transformation (UT) module to address irregular latent distributions.
By reconfiguring irregular distributions into a uniform distribution in the latent space, our approach significantly enhances the disentanglement and interpretability of latent representations.
Empirical evaluations demonstrated the efficacy of our proposed UT module in improving disentanglement metrics across benchmark datasets.
arXiv Detail & Related papers (2024-07-02T21:46:23Z) - UGMAE: A Unified Framework for Graph Masked Autoencoders [67.75493040186859]
We propose UGMAE, a unified framework for graph masked autoencoders.
We first develop an adaptive feature mask generator to account for the unique significance of nodes.
We then design a ranking-based structure reconstruction objective joint with feature reconstruction to capture holistic graph information.
arXiv Detail & Related papers (2024-02-12T19:39:26Z) - Subspace-Guided Feature Reconstruction for Unsupervised Anomaly Localization [5.085309164633571]
Unsupervised anomaly localization plays a critical role in industrial manufacturing.
Most recent methods perform feature matching or reconstruction for the target sample with pre-trained deep neural networks.
We propose a novel subspace-guided feature reconstruction framework to pursue adaptive feature approximation for anomaly localization.
arXiv Detail & Related papers (2023-09-25T06:58:57Z) - Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both basic data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z) - A Class of Geometric Structures in Transfer Learning: Minimax Bounds and Optimality [5.2172436904905535]
We exploit the geometric structure of the source and target domains for transfer learning.
Our proposed estimator outperforms state-of-the-art transfer learning methods in both moderate- and high-dimensional settings.
arXiv Detail & Related papers (2022-02-23T18:47:53Z) - Distributionally Robust Fair Principal Components via Geodesic Descents [16.440434996206623]
In consequential domains such as college admission, healthcare and credit approval, it is imperative to take into account emerging criteria such as the fairness and the robustness of the learned projection.
We propose a distributionally robust optimization problem for principal component analysis which internalizes a fairness criterion in the objective function.
Our experimental results on real-world datasets show the merits of our proposed method over state-of-the-art baselines.
arXiv Detail & Related papers (2022-02-07T11:08:13Z) - Which Invariance Should We Transfer? A Causal Minimax Learning Approach [18.71316951734806]
We present a comprehensive minimax analysis from a causal perspective.
We propose an efficient algorithm to search for the subset with minimal worst-case risk.
The effectiveness and efficiency of our methods are demonstrated on synthetic data and the diagnosis of Alzheimer's disease.
arXiv Detail & Related papers (2021-07-05T09:07:29Z) - Target-Embedding Autoencoders for Supervised Representation Learning [111.07204912245841]
This paper analyzes a framework for improving generalization in a purely supervised setting, where the target space is high-dimensional.
We motivate and formalize the general framework of target-embedding autoencoders (TEA) for supervised prediction, learning intermediate latent representations jointly optimized to be both predictable from features as well as predictive of targets.
arXiv Detail & Related papers (2020-01-23T02:37:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.