Controllable Prompt Tuning For Balancing Group Distributional Robustness
- URL: http://arxiv.org/abs/2403.02695v2
- Date: Tue, 4 Jun 2024 21:25:20 GMT
- Title: Controllable Prompt Tuning For Balancing Group Distributional Robustness
- Authors: Hoang Phan, Andrew Gordon Wilson, Qi Lei,
- Abstract summary: We introduce an optimization scheme to achieve good performance across groups and find a good solution for all without severely sacrificing performance on any of them.
We propose Controllable Prompt Tuning (CPT), which couples our approach with prompt-tuning techniques.
On spurious correlation benchmarks, our procedures achieve state-of-the-art results across both transformer and non-transformer architectures, as well as unimodal and multimodal data.
- Score: 53.336515056479705
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Models trained on data composed of different groups or domains can suffer from severe performance degradation under distribution shifts. While recent methods have largely focused on optimizing the worst-group objective, this often comes at the expense of good performance on other groups. To address this problem, we introduce an optimization scheme to achieve good performance across groups and find a good solution for all without severely sacrificing performance on any of them. However, directly applying such optimization involves updating the parameters of the entire network, making it both computationally expensive and challenging. Thus, we introduce Controllable Prompt Tuning (CPT), which couples our approach with prompt-tuning techniques. On spurious correlation benchmarks, our procedures achieve state-of-the-art results across both transformer and non-transformer architectures, as well as unimodal and multimodal data, while requiring only 0.4% tunable parameters.
Related papers
- Transformer-based Model Predictive Control: Trajectory Optimization via Sequence Modeling [16.112708478263745]
We present a unified framework combine the main strengths of optimization-based methods for learning.
Our approach entails embedding high-capacity, transformer-based neural network models within optimization process.
Compared to purely optimization-based approaches, results show that our approach can improve performance by up to 75%.
arXiv Detail & Related papers (2024-10-31T13:23:10Z) - Federated Learning of Large Language Models with Parameter-Efficient
Prompt Tuning and Adaptive Optimization [71.87335804334616]
Federated learning (FL) is a promising paradigm to enable collaborative model training with decentralized data.
The training process of Large Language Models (LLMs) generally incurs the update of significant parameters.
This paper proposes an efficient partial prompt tuning approach to improve performance and efficiency simultaneously.
arXiv Detail & Related papers (2023-10-23T16:37:59Z) - Towards General and Efficient Online Tuning for Spark [55.30868031221838]
We present a general and efficient Spark tuning framework that can deal with the three issues simultaneously.
We have implemented this framework as an independent cloud service, and applied it to the data platform in Tencent.
arXiv Detail & Related papers (2023-09-05T02:16:45Z) - Robust Prompt Optimization for Large Language Models Against
Distribution Shifts [80.6757997074956]
Large Language Model (LLM) has demonstrated significant ability in various Natural Language Processing tasks.
We propose a new problem of robust prompt optimization for LLMs against distribution shifts.
This problem requires the prompt optimized over the labeled source group can simultaneously generalize to an unlabeled target group.
arXiv Detail & Related papers (2023-05-23T11:30:43Z) - Parameter-efficient Tuning of Large-scale Multimodal Foundation Model [68.24510810095802]
We propose A graceful prompt framework for cross-modal transfer (Aurora) to overcome these challenges.
Considering the redundancy in existing architectures, we first utilize the mode approximation to generate 0.1M trainable parameters to implement the multimodal prompt tuning.
A thorough evaluation on six cross-modal benchmarks shows that it not only outperforms the state-of-the-art but even outperforms the full fine-tuning approach.
arXiv Detail & Related papers (2023-05-15T06:40:56Z) - Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z) - Consolidated learning -- a domain-specific model-free optimization
strategy with examples for XGBoost and MIMIC-IV [4.370097023410272]
This paper proposes a new formulation of the tuning problem, called consolidated learning.
In such settings, we are interested in the total optimization time rather than tuning for a single task.
We demonstrate the effectiveness of this approach through an empirical study for XGBoost algorithm and the collection of predictive tasks extracted from the MIMIC-IV medical database.
arXiv Detail & Related papers (2022-01-27T21:38:53Z) - Multi-Objectivizing Software Configuration Tuning (for a single
performance concern) [7.285442358509729]
We propose a meta-objectivization model (MMO) that considers an auxiliary performance objective.
Our model is statistically more effective than state-of-the-art single-objective counterparts in overcoming local optima.
arXiv Detail & Related papers (2021-05-31T03:03:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.