Efficient Adaptation of Pre-trained Vision Transformer via Householder Transformation
- URL: http://arxiv.org/abs/2410.22952v1
- Date: Wed, 30 Oct 2024 12:08:30 GMT
- Title: Efficient Adaptation of Pre-trained Vision Transformer via Householder Transformation
- Authors: Wei Dong, Yuan Sun, Yiting Yang, Xing Zhang, Zhijun Lin, Qingsen Yan, Haokui Zhang, Peng Wang, Yang Yang, Hengtao Shen
- Abstract summary: A common strategy for Parameter-Efficient Fine-Tuning (PEFT) of pre-trained Vision Transformers (ViTs) involves adapting the model to downstream tasks.
We propose a novel PEFT approach inspired by Singular Value Decomposition (SVD) for representing the adaptation matrix.
SVD decomposes a matrix into the product of a left unitary matrix, a diagonal matrix of scaling values, and a right unitary matrix.
- Score: 53.88562288388169
- Abstract: A common strategy for Parameter-Efficient Fine-Tuning (PEFT) of pre-trained Vision Transformers (ViTs) involves adapting the model to downstream tasks by learning a low-rank adaptation matrix. This matrix is decomposed into a product of down-projection and up-projection matrices, with the bottleneck dimensionality being crucial for reducing the number of learnable parameters, as exemplified by prevalent methods like LoRA and Adapter. However, these low-rank strategies typically employ a fixed bottleneck dimensionality, which limits their flexibility in handling layer-wise variations. To address this limitation, we propose a novel PEFT approach inspired by Singular Value Decomposition (SVD) for representing the adaptation matrix. SVD decomposes a matrix into the product of a left unitary matrix, a diagonal matrix of scaling values, and a right unitary matrix. We utilize Householder transformations to construct orthogonal matrices that efficiently mimic the unitary matrices, requiring only a vector. The diagonal values are learned in a layer-wise manner, allowing them to flexibly capture the unique properties of each layer. This approach enables the generation of adaptation matrices with varying ranks across different layers, providing greater flexibility in adapting pre-trained models. Experiments on standard downstream vision tasks demonstrate that our method achieves promising fine-tuning performance.
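Since the abstract specifies the method only at a high level, the following is a minimal PyTorch sketch of the idea as described: two Householder reflections, each parameterized by a single learnable vector, stand in for the SVD's unitary factors, and a learnable layer-wise diagonal supplies the scaling values. Class and variable names are illustrative, not from the paper, and details such as initialization and the number of reflections per side are assumptions.

```python
import torch
import torch.nn as nn

def householder(v: torch.Tensor) -> torch.Tensor:
    # Householder reflection H = I - 2 v v^T / (v^T v): orthogonal, and
    # parameterized by a single vector rather than a full matrix.
    v = v / v.norm()
    return torch.eye(v.numel(), device=v.device) - 2.0 * torch.outer(v, v)

class HouseholderAdapter(nn.Module):
    """Illustrative SVD-style adaptation: Delta_W = H_left @ diag(s) @ H_right."""

    def __init__(self, dim: int):
        super().__init__()
        self.v_left = nn.Parameter(torch.randn(dim))
        self.v_right = nn.Parameter(torch.randn(dim))
        # Layer-wise scaling values; zero init so Delta_W starts at zero
        # (assumption). Entries that stay near zero lower the effective rank.
        self.s = nn.Parameter(torch.zeros(dim))

    def delta_w(self) -> torch.Tensor:
        return householder(self.v_left) @ torch.diag(self.s) @ householder(self.v_right)

    def forward(self, x: torch.Tensor, w_frozen: torch.Tensor) -> torch.Tensor:
        # Adapted layer: y = x (W + Delta_W)^T, with the pre-trained W frozen.
        return x @ (w_frozen + self.delta_w()).T
```

In this sketch each adapted layer learns only two vectors and one diagonal, roughly 3d parameters for a d-dimensional projection, and the rank of the adaptation matrix is determined per layer by how many diagonal entries remain non-zero.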
Related papers
- SVFit: Parameter-Efficient Fine-Tuning of Large Pre-Trained Models Using Singular Values [12.137869917556415]
Large pre-trained models (LPMs) have demonstrated exceptional performance in diverse natural language processing and computer vision tasks.
However, fully fine-tuning these models poses substantial memory challenges, particularly in resource-constrained environments.
We propose SVFit, a novel PEFT approach that applies singular value decomposition (SVD) to pre-trained weight matrices and uses the most critical singular values as trainable parameters.
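As a rough illustration of that one-line description, and not the authors' implementation, one can decompose a pre-trained weight, freeze the singular vectors, and expose only the leading singular values as trainable. The function below is a hypothetical sketch under that reading.

```python
import torch

def svfit_style_split(w: torch.Tensor, k: int):
    # Hypothetical sketch: SVD the pre-trained weight and make only the
    # top-k singular values trainable; all singular vectors stay frozen.
    u, s, vh = torch.linalg.svd(w, full_matrices=False)
    s_top = torch.nn.Parameter(s[:k].clone())  # trainable (k scalars)
    s_rest = s[k:].detach()                    # frozen remainder

    def reconstruct() -> torch.Tensor:
        # (u * s) @ vh equals u @ diag(s) @ vh without forming the diagonal.
        return (u * torch.cat([s_top, s_rest])) @ vh

    return s_top, reconstruct
```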
arXiv Detail & Related papers (2024-09-09T08:44:53Z)
- Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models [73.88009808326387]
We propose a novel spectrum-aware adaptation framework for generative models.
Our method adjusts both singular values and their basis vectors of pretrained weights.
We introduce Spectral Ortho Decomposition Adaptation (SODA), which balances computational efficiency and representation capacity.
arXiv Detail & Related papers (2024-05-31T17:43:35Z)
- Matrix-Transformation Based Low-Rank Adaptation (MTLoRA): A Brain-Inspired Method for Parameter-Efficient Fine-Tuning [11.037221461758806]
Matrix-Transformation based Low-Rank Adaptation (MTLoRA) is inspired by the idea that the functions of the brain are shaped by its geometric structure.
MTLoRA achieves an overall performance increase of about 1.0% across eight tasks.
arXiv Detail & Related papers (2024-03-12T09:32:25Z)
- Vision Transformer Pruning Via Matrix Decomposition [0.0]
The purpose of Vision Transformer Pruning is to prune the dimensions of the linear projections in the network by learning their associated importance scores.
In this paper, we further reduce the dimension and complexity of the linear projection by implementing and comparing several matrix decomposition methods.
arXiv Detail & Related papers (2023-08-21T16:40:51Z)
- Singular value decomposition based matrix surgery [0.0]
We develop a procedure to reduce and control the condition number of random matrices.
We investigate the effect on the persistent homology (PH) of point clouds of well- and ill-conditioned matrices.
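The entry does not spell out the procedure, but a generic SVD-based way to cap a matrix's condition number, in the spirit described, is to floor the small singular values. A NumPy sketch under that assumption:

```python
import numpy as np

def cap_condition_number(a: np.ndarray, kappa_max: float) -> np.ndarray:
    # Generic sketch (not necessarily the paper's exact procedure):
    # floor singular values at sigma_max / kappa_max so that
    # cond(result) = sigma_max / sigma_min <= kappa_max.
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    s = np.maximum(s, s[0] / kappa_max)  # s is sorted in descending order
    return (u * s) @ vt
```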
arXiv Detail & Related papers (2023-02-22T15:30:08Z)
- Numerical Optimizations for Weighted Low-rank Estimation on Language Model [73.12941276331316]
Singular value decomposition (SVD) is one of the most popular compression methods that approximates a target matrix with smaller matrices.
Standard SVD treats the parameters within the matrix with equal importance, which is a simple but unrealistic assumption.
We show that our method can perform better than current SOTA methods in neural-based language models.
arXiv Detail & Related papers (2022-11-02T00:58:02Z)
- Manifold Gaussian Variational Bayes on the Precision Matrix [70.44024861252554]
We propose an optimization algorithm for Variational Inference (VI) in complex models.
We develop an efficient algorithm, Manifold Gaussian Variational Bayes on the Precision matrix (MGVBP), whose updates satisfy the positive definite constraint on the variational covariance matrix.
Due to its black-box nature, MGVBP stands as a ready-to-use solution for VI in complex models.
arXiv Detail & Related papers (2022-10-26T10:12:31Z)
- Memory-Efficient Backpropagation through Large Linear Layers [107.20037639738433]
In modern neural networks like Transformers, linear layers require significant memory to store activations during the backward pass.
This study proposes a memory reduction approach to perform backpropagation through linear layers.
arXiv Detail & Related papers (2022-01-31T13:02:41Z)
- Multi-Objective Matrix Normalization for Fine-grained Visual Recognition [153.49014114484424]
Bilinear pooling achieves great success in fine-grained visual recognition (FGVC).
Recent methods have shown that the matrix power normalization can stabilize the second-order information in bilinear features.
We propose an efficient Multi-Objective Matrix Normalization (MOMN) method that can simultaneously normalize a bilinear representation in terms of square-root, low-rank, and sparsity.
arXiv Detail & Related papers (2020-03-30T08:40:35Z)