Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences
- URL: http://arxiv.org/abs/2407.08056v1
- Date: Wed, 10 Jul 2024 21:25:51 GMT
- Title: Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences
- Authors: Nikolaos Dimitriadis, Pascal Frossard, Francois Fleuret
- Abstract summary: PaLoRA is a novel parameter-efficient method that augments the original model with task-specific low-rank adapters.
Our experimental results show that PaLoRA outperforms MTL and PFL baselines across various datasets.
- Score: 49.14535254003683
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dealing with multi-task trade-offs during inference can be addressed via Pareto Front Learning (PFL) methods that parameterize the Pareto Front with a single model, contrary to traditional Multi-Task Learning (MTL) approaches that optimize for a single trade-off which has to be decided prior to training. However, recent PFL methodologies suffer from limited scalability, slow convergence and excessive memory requirements compared to MTL approaches, while exhibiting inconsistent mappings from preference space to objective space. In this paper, we introduce PaLoRA, a novel parameter-efficient method that augments the original model with task-specific low-rank adapters and continuously parameterizes the Pareto Front in their convex hull. Our approach dedicates the original model and the adapters towards learning general and task-specific features, respectively. Additionally, we propose a deterministic sampling schedule of preference vectors that reinforces this division of labor, enabling faster convergence and scalability to real-world networks. Our experimental results show that PaLoRA outperforms MTL and PFL baselines across various datasets, scales to large networks and provides a continuous parameterization of the Pareto Front, reducing the memory overhead by a factor of $23.8$ to $31.7$ compared with competing PFL baselines in scene understanding benchmarks.
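The abstract describes a shared model augmented with one low-rank adapter per task, with the Pareto Front parameterized in the adapters' convex hull. A minimal sketch of one way to read that for a single weight matrix follows. It assumes the effective weight is $W_0 + \sum_t \lambda_t B_t A_t$ for a preference vector $\lambda$ on the simplex; this formula, the helper names `matmul` and `palora_weight`, and the toy numbers are illustrative assumptions, not taken from the paper or its code.

```python
# Hypothetical sketch: one weight matrix combined with task-specific low-rank adapters.
# Assumption (not confirmed by the abstract): W(lam) = W0 + sum_t lam[t] * (B_t @ A_t).

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def palora_weight(W0, adapters, lam):
    """Combine the shared weight W0 with per-task adapters (B_t, A_t) under preference lam."""
    if len(adapters) != len(lam):
        raise ValueError("one preference weight per task adapter is required")
    if any(l < 0 for l in lam) or abs(sum(lam) - 1.0) > 1e-9:
        raise ValueError("lam must be non-negative and sum to 1")
    W = [row[:] for row in W0]  # copy so the shared weight is left untouched
    for (B, A), l in zip(adapters, lam):
        delta = matmul(B, A)  # low-rank update; its rank is at most the number of columns of B
        for i, row in enumerate(delta):
            for j, v in enumerate(row):
                W[i][j] += l * v
    return W

# Toy example: two tasks, a 2x2 shared weight, rank-1 adapters (B is 2x1, A is 1x2).
W0 = [[1.0, 0.0], [0.0, 1.0]]
adapters = [
    ([[1.0], [0.0]], [[1.0, 0.0]]),  # task 1 adapter: changes only W[0][0]
    ([[0.0], [1.0]], [[0.0, 2.0]]),  # task 2 adapter: changes only W[1][1]
]
W_task1 = palora_weight(W0, adapters, [1.0, 0.0])  # preference fully on task 1
W_mid = palora_weight(W0, adapters, [0.5, 0.5])    # equal preference for both tasks
```

Under this reading, changing the preference vector at inference time only re-weights the small adapter matrices, which is consistent with the memory savings the abstract reports, though the actual architecture may differ in its details.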
Related papers
- SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z)
- Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion [53.33473557562837]
Solving multi-objective optimization problems for large deep neural networks is a challenging task due to the complexity of the loss landscape and the expensive computational cost.
We propose a practical and scalable approach to solve this problem via mixture of experts (MoE) based model fusion.
By ensembling the weights of specialized single-task models, the MoE module can effectively capture the trade-offs between multiple objectives.
arXiv Detail & Related papers (2024-06-14T07:16:18Z)
- Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models [79.34513906324727]
In this paper, we aim at parameter and computation efficient transfer learning (PCETL) for vision-language pre-trained models.
We propose a novel dynamic architecture skipping (DAS) approach towards effective PCETL.
arXiv Detail & Related papers (2023-09-04T09:34:33Z)
- Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks.
We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z)
- Improving Pareto Front Learning via Multi-Sample Hypernetworks [4.129225533930966]
We propose a novel PFL framework, namely PHN-HVI, which employs a hypernetwork to generate multiple solutions from a set of diverse trade-off preferences.
The experimental results on several MOO machine learning tasks show that the proposed framework significantly outperforms the baselines.
arXiv Detail & Related papers (2022-12-02T12:19:12Z)
- Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models [50.33956216274694]
In Multi-Task Learning (MTL), tasks may compete and limit the performance achieved on each other, rather than guiding the optimization to a solution.
We propose Pareto Manifold Learning, an ensembling method in weight space.
arXiv Detail & Related papers (2022-10-18T11:20:54Z)
- Self-Evolutionary Optimization for Pareto Front Learning [34.17125297176668]
Multi-objective optimization (MOO) approaches have been proposed for multitasking problems.
Recent MOO methods approximate multiple optimal solutions (Pareto front) with a single unified model.
We show that PFL can be re-formulated into another MOO problem with multiple objectives, each of which corresponds to different preference weights for the tasks.
arXiv Detail & Related papers (2021-10-07T13:38:57Z)
- Learning the Pareto Front with Hypernetworks [44.72371822514582]
Multi-objective optimization (MOO) problems are prevalent in machine learning.
These problems have a set of optimal solutions, called the Pareto front, where each point on the front represents a different trade-off between possibly conflicting objectives.
Recent MOO methods can target a specific desired ray in loss space; however, most approaches still face two grave limitations.
arXiv Detail & Related papers (2020-10-08T16:39:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.