Maintaining Structural Integrity in Parameter Spaces for Parameter Efficient Fine-tuning
- URL: http://arxiv.org/abs/2405.14739v2
- Date: Thu, 06 Feb 2025 05:57:37 GMT
- Title: Maintaining Structural Integrity in Parameter Spaces for Parameter Efficient Fine-tuning
- Authors: Chongjie Si, Xuehui Wang, Xue Yang, Zhengqin Xu, Qingyun Li, Jifeng Dai, Yu Qiao, Xiaokang Yang, Wei Shen
- Abstract summary: Adapting pre-trained foundation models for various downstream tasks has been prevalent in artificial intelligence.
To mitigate this, several fine-tuning techniques have been developed to update the pre-trained model weights in a more resource-efficient manner.
This paper introduces a generalized parameter-efficient fine-tuning framework designed for parameter spaces of various dimensions.
- Score: 78.39310274926535
- License:
- Abstract: Adapting pre-trained foundation models for various downstream tasks has become prevalent in artificial intelligence. Due to the vast number of tasks and high costs, adjusting all parameters becomes unfeasible. To mitigate this, several fine-tuning techniques have been developed to update the pre-trained model weights in a more resource-efficient manner, such as through low-rank adjustments. Yet, almost all of these methods focus on linear weights, neglecting the intricacies of parameter spaces in higher dimensions such as 4D. Alternatively, some methods can be adapted to high-dimensional parameter spaces by compressing the changes in the original space into two dimensions and then employing low-rank matrix adaptations. However, these approaches destroy the structural integrity of the involved high-dimensional spaces. To tackle the diversity of dimensional spaces across different foundation models and to represent the changes within these spaces more precisely, this paper introduces a generalized parameter-efficient fine-tuning framework designed for parameter spaces of various dimensions. Specifically, our method asserts that the changes in each dimensional parameter space are based on a low-rank core space that maintains a topological structure consistent with the original space. It then models the changes through this core space, alongside corresponding weights, to reconstruct the alterations in the original space. This effectively preserves the structural integrity of the change to the original N-dimensional parameter space while modeling it via low-rank tensor adaptation. Extensive experiments on computer vision, natural language processing and multi-modal tasks validate the effectiveness of our method.
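The core idea described in the abstract, expressing the change of an N-dimensional weight tensor through a small core tensor and per-mode factors rather than flattening it into a 2D matrix, can be illustrated with a short sketch. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation: the 4D convolution shape, the per-mode ranks r1-r4, and the zero initialization of the core are assumed values chosen for the example.

```python
# Minimal sketch (not the paper's released code): a Tucker-style low-rank
# tensor update for a 4D convolution weight, modeling the weight change
# through a small core tensor plus one factor matrix per mode instead of
# flattening the 4D parameter space into a 2D matrix.
import numpy as np

# Pre-trained 4D convolution weight: (out_channels, in_channels, kH, kW)
# -- shape chosen only for illustration.
W = np.random.randn(64, 32, 3, 3)

# Assumed low ranks per mode (hyper-parameters of the sketch).
r1, r2, r3, r4 = 8, 8, 2, 2

# Trainable quantities: a shared low-rank core and one factor per mode.
core = np.zeros((r1, r2, r3, r4))      # core space; zero init => zero initial update
U1 = 0.01 * np.random.randn(64, r1)    # mode-1 (out_channels) factor
U2 = 0.01 * np.random.randn(32, r2)    # mode-2 (in_channels) factor
U3 = 0.01 * np.random.randn(3, r3)     # mode-3 (kernel height) factor
U4 = 0.01 * np.random.randn(3, r4)     # mode-4 (kernel width) factor

# Reconstruct the full 4D update without ever flattening to 2D:
# delta[a,b,c,d] = sum_{i,j,k,l} core[i,j,k,l] * U1[a,i] * U2[b,j] * U3[c,k] * U4[d,l]
delta = np.einsum('ijkl,ai,bj,ck,dl->abcd', core, U1, U2, U3, U4)

# The fine-tuned weight keeps the original 4D structure of the parameter space.
W_adapted = W + delta
print(W_adapted.shape)  # (64, 32, 3, 3)
```

With these assumed ranks, the trainable update has 8*8*2*2 + 64*8 + 32*8 + 3*2 + 3*2 = 1,036 parameters versus 18,432 for the full 64x32x3x3 weight, while the update is still expressed natively in four dimensions rather than as a flattened matrix.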
Related papers
- Modeling All Response Surfaces in One for Conditional Search Spaces [69.90317997694218]
This paper proposes a novel approach that models the response surfaces of all subspaces within a single model.
We introduce an attention-based deep feature extractor, capable of projecting configurations with different structures from various subspaces into a unified feature space.
arXiv Detail & Related papers (2025-01-08T03:56:06Z)
- Scaling Exponents Across Parameterizations and Optimizers [94.54718325264218]
We propose a new perspective on parameterization by investigating a key assumption in prior work.
Our empirical investigation includes tens of thousands of models trained across all combinations of the optimizers, parameterizations, and model sizes considered.
We find that the best learning rate scaling prescription would often have been excluded by the assumptions in prior work.
arXiv Detail & Related papers (2024-07-08T12:32:51Z)
- MCNC: Manifold Constrained Network Compression [21.70510507535041]
We present MCNC, a novel model compression method that constrains the parameter space to a low-dimensional, pre-defined, and frozen nonlinear manifold.
We show that our method, MCNC, significantly outperforms state-of-the-art baselines in terms of compression, accuracy, and/or model reconstruction time.
arXiv Detail & Related papers (2024-06-27T16:17:26Z)
- Data-free Weight Compress and Denoise for Large Language Models [101.53420111286952]
We propose a novel approach termed Data-free Joint Rank-k Approximation for compressing the parameter matrices.
We prune 80% of the parameters while retaining 93.43% of the original performance, without any calibration data.
arXiv Detail & Related papers (2024-02-26T05:51:47Z)
- Parameter Efficient Fine-tuning via Cross Block Orchestration for Segment Anything Model [81.55141188169621]
We equip PEFT with a cross-block orchestration mechanism to enable the adaptation of the Segment Anything Model (SAM) to various downstream scenarios.
We propose an intra-block enhancement module, which introduces a linear projection head whose weights are generated from a hyper-complex layer.
Our proposed approach consistently and significantly improves segmentation performance on novel scenarios with only around 1K additional parameters.
arXiv Detail & Related papers (2023-11-28T11:23:34Z)
- A local approach to parameter space reduction for regression and classification tasks [0.0]
We propose a new method called local active subspaces (LAS), which explores the synergies of active subspaces with supervised clustering techniques.
LAS is particularly useful for the community working on surrogate modelling.
arXiv Detail & Related papers (2021-07-22T18:06:04Z)
- Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning [52.624194343095304]
We argue that analyzing fine-tuning through the lens of intrinsic dimension provides us with empirical and theoretical intuitions.
We empirically show that common pre-trained models have a very low intrinsic dimension.
arXiv Detail & Related papers (2020-12-22T07:42:30Z)
- Dimensionality Reduction of Movement Primitives in Parameter Space [34.16700176918835]
Movement primitives are an important policy class for real-world robotics.
The high dimensionality of their parametrization makes the policy optimization expensive both in terms of samples and computation.
We propose the application of dimensionality reduction in the parameter space, identifying principal movements.
arXiv Detail & Related papers (2020-02-26T16:38:39Z)