Spectral Tensor Train Parameterization of Deep Learning Layers
- URL: http://arxiv.org/abs/2103.04217v1
- Date: Sun, 7 Mar 2021 00:15:44 GMT
- Title: Spectral Tensor Train Parameterization of Deep Learning Layers
- Authors: Anton Obukhov, Maxim Rakhuba, Alexander Liniger, Zhiwu Huang,
Stamatios Georgoulis, Dengxin Dai, Luc Van Gool
- Abstract summary: We study low-rank parameterizations of weight matrices with embedded spectral properties in the Deep Learning context.
We show the effects of neural network compression in the image classification setting and both compression and improved training stability in the generative adversarial training setting.
- Score: 136.4761580842396
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We study low-rank parameterizations of weight matrices with embedded spectral
properties in the Deep Learning context. The low-rank property leads to
parameter efficiency and permits taking computational shortcuts when computing
mappings. Spectral properties are often subject to constraints in optimization
problems, leading to better models and stability of optimization. We start by
looking at the compact SVD parameterization of weight matrices and identifying
redundancy sources in the parameterization. We further apply the Tensor Train
(TT) decomposition to the compact SVD components, and propose a non-redundant
differentiable parameterization of fixed TT-rank tensor manifolds, termed the
Spectral Tensor Train Parameterization (STTP). We demonstrate the effects of
neural network compression in the image classification setting and both
compression and improved training stability in the generative adversarial
training setting.
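To make the starting point concrete, the following is a minimal, hypothetical PyTorch sketch of a compact-SVD-parameterized linear layer (an illustration only, not the authors' STTP code, which further factorizes the orthogonal components with Tensor Train cores and removes redundancy from this parameterization; the class name and shapes are assumptions):
```python
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import orthogonal

class CompactSVDLinear(nn.Module):
    """Rank-r weight W = U diag(s) V^T with orthonormal U and V."""
    def __init__(self, in_features, out_features, rank):
        super().__init__()
        # orthogonal() constrains each tall weight to have orthonormal columns.
        self.u = orthogonal(nn.Linear(rank, out_features, bias=False))
        self.v = orthogonal(nn.Linear(rank, in_features, bias=False))
        self.log_s = nn.Parameter(torch.zeros(rank))  # s = exp(log_s) > 0

    def forward(self, x):                    # x: (batch, in_features)
        s = self.log_s.exp()                 # with orthonormal U, V: ||W||_2 = max(s)
        z = x @ self.v.weight                # project onto V: (batch, rank)
        return (z * s) @ self.u.weight.t()   # scale by s, lift with U
```
The low-rank structure is what permits the computational shortcut mentioned in the abstract: the mapping costs O(r(m+n)) instead of O(mn), and the explicit singular values s expose the spectrum for direct control.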
Related papers
- Low Tensor-Rank Adaptation of Kolmogorov--Arnold Networks [70.06682043272377]
Kolmogorov--Arnold networks (KANs) have demonstrated their potential as an alternative to multi-layer perceptrons (MLPs) in various domains.
We develop low tensor-rank adaptation (LoTRA) for fine-tuning KANs.
We explore the application of LoTRA for efficiently solving various partial differential equations (PDEs) by fine-tuning KANs.
arXiv Detail & Related papers (2025-02-10T04:57:07Z)
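As a hedged illustration of the low tensor-rank idea behind LoTRA, the sketch below parameterizes an additive update to a frozen 3-way parameter tensor with a Tucker-style factorization; the names and exact factorization are assumptions and may differ from the paper's construction for KANs:
```python
import torch
import torch.nn as nn

class TuckerDelta(nn.Module):
    """Additive update dW = G x1 A x2 B x3 C for a (d1, d2, d3) tensor."""
    def __init__(self, d1, d2, d3, r1, r2, r3):
        super().__init__()
        self.core = nn.Parameter(torch.zeros(r1, r2, r3))  # zero init: dW = 0
        self.A = nn.Parameter(torch.randn(d1, r1) / d1**0.5)
        self.B = nn.Parameter(torch.randn(d2, r2) / d2**0.5)
        self.C = nn.Parameter(torch.randn(d3, r3) / d3**0.5)

    def forward(self):
        g = torch.einsum('abc,ia->ibc', self.core, self.A)  # mode-1 product
        g = torch.einsum('ibc,jb->ijc', g, self.B)          # mode-2 product
        return torch.einsum('ijc,kc->ijk', g, self.C)       # mode-3: (d1, d2, d3)
```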
- tCURLoRA: Tensor CUR Decomposition Based Low-Rank Parameter Adaptation and Its Application in Medical Image Segmentation [1.3281936946796913]
Transfer learning, by leveraging knowledge from pre-trained models, has significantly enhanced the performance of target tasks.
As deep neural networks scale up, full fine-tuning introduces substantial computational and storage challenges.
We propose tCURLoRA, a novel fine-tuning method based on tensor CUR decomposition.
arXiv Detail & Related papers (2025-01-04T08:25:32Z)
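For context, a generic matrix CUR approximation can be sketched as follows (illustrative NumPy only; tCURLoRA builds its fine-tuning scheme on a tensor CUR decomposition, which is more involved):
```python
import numpy as np

def cur_approx(a, k, rng=np.random.default_rng(0)):
    """Approximate a ~= C @ U @ R from k sampled columns C and rows R."""
    col_p = (a**2).sum(axis=0); col_p /= col_p.sum()   # column sampling probs
    row_p = (a**2).sum(axis=1); row_p /= row_p.sum()   # row sampling probs
    cols = rng.choice(a.shape[1], size=k, replace=False, p=col_p)
    rows = rng.choice(a.shape[0], size=k, replace=False, p=row_p)
    c, r = a[:, cols], a[rows, :]
    u = np.linalg.pinv(c) @ a @ np.linalg.pinv(r)      # core that glues C and R
    return c, u, r

a = np.random.default_rng(1).standard_normal((60, 40))
c, u, r = cur_approx(a, k=20)
print(np.linalg.norm(a - c @ u @ r) / np.linalg.norm(a))  # relative error
```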
- ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel parameter-efficient transfer learning (PETL) method that reuses the hypercomplex parameterized space constructed by Kronecker product to Aggregate Low Rank Experts.
Owing to this design, ALoRE adds negligible extra parameters and can be effortlessly merged into the frozen backbone.
arXiv Detail & Related papers (2024-12-11T12:31:30Z)
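A hedged sketch of the Kronecker-product idea follows: a sum of Kronecker products of small mixing matrices with low-rank expert matrices. This generic construction is an assumption, not ALoRE's released code:
```python
import torch
import torch.nn as nn

class KronExperts(nn.Module):
    """dW = sum_e kron(S_e, A_e @ B_e): tiny mixing matrices S_e times
    low-rank experts A_e B_e, in a hypercomplex-product style."""
    def __init__(self, d_out, d_in, n_experts=4, rank=4):
        super().__init__()
        assert d_out % n_experts == 0 and d_in % n_experts == 0
        m, n = d_out // n_experts, d_in // n_experts
        self.S = nn.Parameter(torch.randn(n_experts, n_experts, n_experts) * 0.02)
        self.A = nn.Parameter(torch.randn(n_experts, m, rank) * 0.02)
        self.B = nn.Parameter(torch.zeros(n_experts, rank, n))  # zero init: dW = 0

    def forward(self):
        experts = self.A @ self.B                      # (E, m, n), each rank <= r
        return sum(torch.kron(self.S[e], experts[e])   # (d_out, d_in)
                   for e in range(self.S.shape[0]))
```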
- LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low Rank Adaptation (LoRA) is a popular Parameter-Efficient Fine-Tuning (PEFT) method.
We propose using a higher-order CANDECOMP/PARAFAC (CP) decomposition, enabling a more compact and flexible representation.
Our method can achieve a reduction in the number of parameters while maintaining comparable performance.
arXiv Detail & Related papers (2024-10-05T06:59:50Z)
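The compactness gain comes from sharing CP factors across layers; the module below is an assumption-laden sketch of that idea (not LoRTA's actual implementation), with per-layer updates read out as slices of a CP-factorized (layers, out, in) tensor:
```python
import torch
import torch.nn as nn

class CPAdapter(nn.Module):
    """dW[l] = sum_r layer_f[l, r] * outer(out_f[:, r], in_f[:, r])."""
    def __init__(self, n_layers, d_out, d_in, rank=8):
        super().__init__()
        self.layer_f = nn.Parameter(torch.randn(n_layers, rank) * 0.02)
        self.out_f = nn.Parameter(torch.randn(d_out, rank) * 0.02)
        self.in_f = nn.Parameter(torch.zeros(d_in, rank))  # zero init: dW = 0

    def delta(self, layer):
        w = self.out_f * self.layer_f[layer]  # scale each rank-1 component
        return w @ self.in_f.t()              # (d_out, d_in) slice for this layer
```
Compared with storing an independent rank-r factor pair per layer, the CP factors amortize the output/input factors across all layers, which is where the parameter reduction comes from.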
- Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation [12.07880147193174]
We show that by leveraging the inherent low-dimensional structures of data and compressible dynamics within the model parameters, we can reap the benefits of overparameterization without the computational burdens.
We demonstrate the effectiveness of this approach for deep low-rank matrix completion as well as fine-tuning language models.
arXiv Detail & Related papers (2024-06-06T14:29:49Z)
- Spectral Adapter: Fine-Tuning in Spectral Space [45.72323731094864]
We study the enhancement of current PEFT methods by incorporating the spectral information of pretrained weight matrices into the fine-tuning procedure.
We show through extensive experiments that the proposed fine-tuning model enables better parameter efficiency and tuning performance as well as benefits multi-adapter fusion.
arXiv Detail & Related papers (2024-05-22T19:36:55Z)
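One generic way to fine-tune "in spectral space" is to SVD the frozen weight once and train small additive updates on the top singular subspace; the sketch below illustrates that idea and is not necessarily the paper's Spectral Adapter construction:
```python
import torch
import torch.nn as nn

class SpectralTune(nn.Module):
    """Train updates to the top-k singular vectors of a frozen weight."""
    def __init__(self, weight, k=16):
        super().__init__()
        u, s, vh = torch.linalg.svd(weight, full_matrices=False)
        self.register_buffer('u', u)
        self.register_buffer('s', s)
        self.register_buffer('vh', vh)
        self.k = k
        self.du = nn.Parameter(torch.zeros(u.shape[0], k))   # update to top-k U
        self.dv = nn.Parameter(torch.zeros(vh.shape[1], k))  # update to top-k V

    def weight(self):
        u, vh = self.u.clone(), self.vh.clone()
        u[:, :self.k] = u[:, :self.k] + self.du
        vh[:self.k] = vh[:self.k] + self.dv.t()
        return (u * self.s) @ vh    # reassemble U diag(s) V^T
```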
- Optimizing Training Trajectories in Variational Autoencoders via Latent Bayesian Optimization Approach [0.0]
Unsupervised and semi-supervised ML methods have become widely adopted across multiple areas of physics, chemistry, and materials sciences.
We propose a latent Bayesian optimization (zBO) approach for hyperparameter trajectory optimization in unsupervised and semi-supervised ML.
We demonstrate an application of this method for finding joint discrete and continuous rotationally invariant representations for MNIST and experimental data of a plasmonic nanoparticles material system.
arXiv Detail & Related papers (2022-06-30T23:41:47Z)
- Multi-View Spectral Clustering Tailored Tensor Low-Rank Representation [105.33409035876691]
This paper explores the problem of multi-view spectral clustering (MVSC) based on tensor low-rank modeling.
We design a novel structured tensor low-rank norm tailored to MVSC.
We show that the proposed method significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-04-30T11:52:12Z)
- Supervised Learning for Non-Sequential Data: A Canonical Polyadic Decomposition Approach [85.12934750565971]
Efficient modelling of feature interactions underpins supervised learning for non-sequential tasks, but explicitly parameterizing all interactions scales poorly with their order.
To alleviate this issue, it has been proposed to implicitly represent the model parameters as a tensor.
For enhanced expressiveness, we generalize the framework to allow feature mapping to arbitrarily high-dimensional feature vectors.
arXiv Detail & Related papers (2020-01-27T22:38:40Z)
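To illustrate "implicitly representing the model parameters as a tensor", the sketch below is a generic degree-D multilinear model whose order-D interaction tensor is never materialized but is CP-factorized through per-degree projections (the paper's feature-mapping generalization goes further):
```python
import torch
import torch.nn as nn

class CPInteraction(nn.Module):
    """f(x) = sum_r prod_d <w[d, :, r], x>: a CP-parameterized order-D
    feature-interaction model; the full interaction tensor
    sum_r w[1,:,r] x ... x w[D,:,r] is never formed explicitly."""
    def __init__(self, n_features, degree=3, rank=16):
        super().__init__()
        self.w = nn.Parameter(torch.randn(degree, n_features, rank) * 0.1)

    def forward(self, x):                              # x: (batch, n_features)
        proj = torch.einsum('bf,dfr->bdr', x, self.w)  # per-degree projections
        return proj.prod(dim=1).sum(dim=-1)            # (batch,) scores
```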
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented (including any of the listed content) and is not responsible for any consequences of its use.