Mixture of Physical Priors Adapter for Parameter-Efficient Fine-Tuning
- URL: http://arxiv.org/abs/2412.02759v1
- Date: Tue, 03 Dec 2024 19:00:34 GMT
- Title: Mixture of Physical Priors Adapter for Parameter-Efficient Fine-Tuning
- Authors: Zhaozhi Wang, Conghu Li, Qixiang Ye, Tong Zhang,
- Abstract summary: We propose a novel approach that models network weights by leveraging a combination of physical priors.
We use three foundational equations -- heat diffusion, wave propagation, and Poisson's steady-state equation -- each contributing distinctive modeling properties.
MoPPA improves PEFT accuracy by up to 2.1% on VTAB-1K image classification with a comparable number of trainable parameters.
- Score: 41.19870454097444
- License:
- Abstract: Most parameter-efficient fine-tuning (PEFT) methods rely on low-rank representations to adapt models. However, these approaches often oversimplify representations, particularly when the underlying data has high-rank or high-frequency components. This limitation hinders the model's ability to capture complex data interactions effectively. In this paper, we propose a novel approach that models network weights by leveraging a combination of physical priors, enabling more accurate approximations. We use three foundational equations -- heat diffusion, wave propagation, and Poisson's steady-state equation -- each contributing distinctive modeling properties: heat diffusion enforces local smoothness, wave propagation facilitates long-range interactions, and Poisson's equation captures global equilibrium. To combine these priors effectively, we introduce the Mixture of Physical Priors Adapter (MoPPA), using an efficient Discrete Cosine Transform (DCT) implementation. To dynamically balance these priors, a route regularization mechanism is designed to adaptively tune their contributions. MoPPA serves as a lightweight, plug-and-play module that seamlessly integrates into transformer architectures, with adaptable complexity depending on the local context. Specifically, using MAE pre-trained ViT-B, MoPPA improves PEFT accuracy by up to 2.1% on VTAB-1K image classification with a comparable number of trainable parameters, and advantages are further validated through experiments across various vision backbones, showcasing MoPPA's effectiveness and adaptability. The code will be made public available.
Related papers
- Hyper Compressed Fine-Tuning of Large Foundation Models with Quantum Inspired Adapters [0.0]
emphQuantum-Inspired Adapters, a PEFT approach inspired by Hamming-weight quantum circuits from quantum machine learning literature.
We test our proposed adapters by adapting large language models and large vision transformers on benchmark datasets.
arXiv Detail & Related papers (2025-02-10T13:06:56Z) - ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel PETL method that reuses the hypercomplex parameterized space constructed by Kronecker product to Aggregate Low Rank Experts.
Thanks to the artful design, ALoRE maintains negligible extra parameters and can be effortlessly merged into the frozen backbone.
arXiv Detail & Related papers (2024-12-11T12:31:30Z) - Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models [73.88009808326387]
We propose a novel spectrum-aware adaptation framework for generative models.
Our method adjusts both singular values and their basis vectors of pretrained weights.
We introduce Spectral Ortho Decomposition Adaptation (SODA), which balances computational efficiency and representation capacity.
arXiv Detail & Related papers (2024-05-31T17:43:35Z) - ETHER: Efficient Finetuning of Large-Scale Models with Hyperplane Reflections [59.839926875976225]
We propose the ETHER transformation family, which performs Efficient fineTuning via HypErplane Reflections.
In particular, we introduce ETHER and its relaxation ETHER+, which match or outperform existing PEFT methods with significantly fewer parameters.
arXiv Detail & Related papers (2024-05-30T17:26:02Z) - Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo
Matching [77.133400999703]
Correlation based stereo matching has achieved outstanding performance.
Current methods with a fixed model do not work uniformly well across various datasets.
This paper proposes a new perspective to dynamically calculate correlation for robust stereo matching.
arXiv Detail & Related papers (2023-07-26T09:47:37Z) - Asynchronous Multi-Model Dynamic Federated Learning over Wireless
Networks: Theory, Modeling, and Optimization [20.741776617129208]
Federated learning (FL) has emerged as a key technique for distributed machine learning (ML)
We first formulate rectangular scheduling steps and functions to capture the impact of system parameters on learning performance.
Our analysis sheds light on the joint impact of device training variables and asynchronous scheduling decisions.
arXiv Detail & Related papers (2023-05-22T21:39:38Z) - FAMLP: A Frequency-Aware MLP-Like Architecture For Domain Generalization [73.41395947275473]
We propose a novel frequency-aware architecture, in which the domain-specific features are filtered out in the transformed frequency domain.
Experiments on three benchmarks demonstrate significant performance, outperforming the state-of-the-art methods by a margin of 3%, 4% and 9%, respectively.
arXiv Detail & Related papers (2022-03-24T07:26:29Z) - Residual Pathway Priors for Soft Equivariance Constraints [44.19582621065543]
We introduce Residual Pathway Priors (RPPs) as a method for converting hard architectural constraints into soft priors.
RPPs are resilient to approximate or misspecified symmetries, and are as effective as fully constrained models even when symmetries are exact.
arXiv Detail & Related papers (2021-12-02T16:18:17Z) - A Variational Infinite Mixture for Probabilistic Inverse Dynamics
Learning [34.90240171916858]
We develop an efficient variational Bayes inference technique for infinite mixtures of probabilistic local models.
We highlight the model's power in combining data-driven adaptation, fast prediction and the ability to deal with discontinuous functions and heteroscedastic noise.
We use the learned models for online dynamics control of a Barrett-WAM manipulator, significantly improving the trajectory tracking performance.
arXiv Detail & Related papers (2020-11-10T16:15:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.