Related papers: UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory

URL: http://arxiv.org/abs/2308.14316v2
Date: Mon, 11 Mar 2024 10:28:41 GMT
Title: UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory
Authors: Haiwen Diao, Bo Wan, Ying Zhang, Xu Jia, Huchuan Lu, Long Chen
Abstract summary: PETL is an effective strategy for adapting pre-trained models to downstream domains. Recent PETL works focus on the more valuable memory-efficient characteristic. We propose a new memory-efficient PETL strategy, Universal Parallel Tuning (UniPT).
Score: 69.33445217944029
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Parameter-efficient transfer learning (PETL), i.e., fine-tuning a small portion of parameters, is an effective strategy for adapting pre-trained models to downstream domains. To further reduce the memory demand, recent PETL works focus on the more valuable memory-efficient characteristic. In this paper, we argue that the scalability, adaptability, and generalizability of state-of-the-art methods are hindered by structural dependency and pertinency on specific pre-trained backbones. To this end, we propose a new memory-efficient PETL strategy, Universal Parallel Tuning (UniPT), to mitigate these weaknesses. Specifically, we facilitate the transfer process via a lightweight and learnable parallel network, which consists of: 1) A parallel interaction module that decouples the sequential connections and processes the intermediate activations detachedly from the pre-trained network. 2) A confidence aggregation module that learns optimal strategies adaptively for integrating cross-layer features. We evaluate UniPT with different backbones (e.g., T5, VSE$\infty$, CLIP4Clip, Clip-ViL, and MDETR) on various vision-and-language and pure NLP tasks. Extensive ablations on 18 datasets have validated that UniPT can not only dramatically reduce memory consumption and outperform the best competitor, but also achieve competitive performance over other plain PETL methods with lower training memory overhead. Our code is publicly available at: https://github.com/Paranioar/UniPT.
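The two modules named in the abstract can be illustrated with a rough forward-pass sketch. This is not the authors' implementation (see the linked repository for that); it only assumes that the frozen backbone exposes one hidden-state array per layer, that a small shared projection reduces their width, and that a learned scoring vector weights the layers. The names `parallel_tuning_forward`, `down_proj`, and `conf_proj` are hypothetical, and the attention-style interaction is a stand-in chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def parallel_tuning_forward(hidden_states, down_proj, conf_proj):
    """Illustrative side-network forward pass (assumed design, not the paper's code).

    hidden_states: list of (tokens, d) arrays, one per frozen backbone layer.
                   They are treated as constants; in a training framework they
                   would be detached so no gradient reaches the backbone.
    down_proj:     (d, r) trainable projection shared by all layers (hypothetical).
    conf_proj:     (r,) trainable vector that scores each layer (hypothetical).
    """
    # Parallel interaction: every layer is reduced independently, so no layer
    # waits on the side output of the layer before it.
    reduced = [h @ down_proj for h in hidden_states]  # each (tokens, r)
    query = reduced[-1]  # the last layer's features guide the interaction

    interacted = []
    for feat in reduced:
        # Scaled dot-product attention from last-layer tokens to this layer.
        attn = softmax(query @ feat.T / np.sqrt(feat.shape[-1]), axis=-1)
        interacted.append(attn @ feat)  # (tokens, r)
    stacked = np.stack(interacted)  # (layers, tokens, r)

    # Confidence aggregation: per-token weights over layers, summing to 1.
    scores = stacked @ conf_proj  # (layers, tokens)
    weights = softmax(scores, axis=0)
    fused = (weights[..., None] * stacked).sum(axis=0)  # (tokens, r)
    return fused, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layers = [rng.normal(size=(5, 16)) for _ in range(4)]  # 4 layers, 5 tokens
    fused, weights = parallel_tuning_forward(
        layers, rng.normal(size=(16, 4)), rng.normal(size=4)
    )
    print(fused.shape, weights.shape)
```

Because only `down_proj` and `conf_proj` would be trained in this sketch, the backbone's activations never need to be stored for backpropagation, which is the source of the memory saving the abstract describes.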

Related papers

ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts [71.91042186338163]
ALoRE is a novel PETL method that reuses the hypercomplex parameterized space constructed by Kronecker product to Aggregate Low Rank Experts. Thanks to the artful design, ALoRE maintains negligible extra parameters and can be effortlessly merged into the frozen backbone.
arXiv Detail & Related papers (2024-12-11T12:31:30Z)
FPT+: A Parameter and Memory Efficient Transfer Learning Method for High-resolution Medical Image Classification [1.5791081894226173]
Fine-grained Prompt Tuning plus (FPT+) is a PETL method designed for high-resolution medical image classification. FPT+ performs transfer learning by training a lightweight side network and accessing pre-trained knowledge from a large pre-trained model. Experimental results demonstrate that FPT+ outperforms other PETL methods, using only 1.03% of the learnable parameters and 3.18% of the memory required for fine-tuning an entire ViT-B model.
arXiv Detail & Related papers (2024-08-05T12:33:07Z)
Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences [49.14535254003683]
PaLoRA is a novel parameter-efficient method that augments the original model with task-specific low-rank adapters. Our experimental results show that PaLoRA outperforms MTL and PFL baselines across various datasets.
arXiv Detail & Related papers (2024-07-10T21:25:51Z)
SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios. In the early route, intermediate outputs are consolidated via an anti-redundancy operation. In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z)
SpaFL: Communication-Efficient Federated Learning with Sparse Models and Low computational Overhead [75.87007729801304]
SpaFL, a communication-efficient FL framework, is proposed to optimize sparse model structures with low computational overhead. Experiments show that SpaFL improves accuracy while requiring much less communication and computing resources compared to sparse baselines.
arXiv Detail & Related papers (2024-06-01T13:10:35Z)
Time-, Memory- and Parameter-Efficient Visual Adaptation [75.28557015773217]
We propose an adaptation method which does not backpropagate gradients through the backbone. We achieve this by designing a lightweight network in parallel that operates on features from the frozen, pretrained backbone.
arXiv Detail & Related papers (2024-02-05T10:55:47Z)
Make Pre-trained Model Reversible: From Parameter to Memory Efficient Fine-Tuning [6.451743797015637]
We propose memory-efficient fine-tuning (MEFT) for pre-trained language models. MEFT inserts adapters into a PLM, preserving the PLM's starting point and making it reversible without additional pre-training. MEFT reduces activation memory by up to 84% relative to full fine-tuning, with a negligible amount of trainable parameters.
arXiv Detail & Related papers (2023-06-01T09:26:17Z)
LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning [82.93130407930762]
It is costly to update the entire parameter set of large pre-trained models. PETL techniques allow updating a small subset of parameters inside a pre-trained backbone network for a new task. We propose Ladder Side-Tuning (LST), a new PETL technique that reduces training memory requirements by more substantial amounts.
arXiv Detail & Related papers (2022-06-13T23:51:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.