Related papers: Controllable Prompt Tuning For Balancing Group Distributional Robustness

URL: http://arxiv.org/abs/2403.02695v2
Date: Tue, 4 Jun 2024 21:25:20 GMT
Title: Controllable Prompt Tuning For Balancing Group Distributional Robustness
Authors: Hoang Phan, Andrew Gordon Wilson, Qi Lei
Abstract summary: We introduce an optimization scheme to achieve good performance across groups and find a good solution for all without severely sacrificing performance on any of them. We propose Controllable Prompt Tuning (CPT), which couples our approach with prompt-tuning techniques. On spurious correlation benchmarks, our procedures achieve state-of-the-art results across both transformer and non-transformer architectures, as well as unimodal and multimodal data.
Score: 53.336515056479705
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Models trained on data composed of different groups or domains can suffer from severe performance degradation under distribution shifts. While recent methods have largely focused on optimizing the worst-group objective, this often comes at the expense of good performance on other groups. To address this problem, we introduce an optimization scheme to achieve good performance across groups and find a good solution for all without severely sacrificing performance on any of them. However, directly applying such optimization involves updating the parameters of the entire network, making it both computationally expensive and challenging. Thus, we introduce Controllable Prompt Tuning (CPT), which couples our approach with prompt-tuning techniques. On spurious correlation benchmarks, our procedures achieve state-of-the-art results across both transformer and non-transformer architectures, as well as unimodal and multimodal data, while requiring only 0.4% tunable parameters.

Related papers

From Parameter to Representation: A Closed-Form Approach for Controllable Model Merging [22.794831741556468]
Model merging combines expert models for multitask performance but faces challenges from parameter interference. Existing approaches employ a compile-then-query paradigm, performing a costly offline multi-objective optimization to enable fast, preference-aware model generation. We model this correction as an optimal linear transformation, yielding a closed-form solution that replaces the entire offline optimization process with a single-step, architecture-agnostic computation.
arXiv Detail & Related papers (2025-11-14T04:09:25Z)
Relation-Aware Bayesian Optimization of DBMS Configurations Guided by Affinity Scores [2.474203056060563]
Database Management Systems (DBMSs) are fundamental for managing large-scale and heterogeneous data, and their performance is critically influenced by configuration parameters. Recent research has focused on automated configuration optimization using machine learning; however, existing approaches still exhibit several key limitations. We propose RelTune, a novel framework that represents parameter dependencies as a graph and learns GNN-based latent embeddings that encode performance-relevant semantics.
arXiv Detail & Related papers (2025-10-31T03:46:42Z)
VAGPO: Vision-augmented Asymmetric Group Preference Optimization for the Routing Problems [2.150410718150006]
We propose a novel Vision-Augmented Asymmetric Group Preference Optimization (VAGPO) approach for solving routing problems. By leveraging ResNet-based visual encoding and Transformer-based sequential modeling, VAGPO captures both spatial structure and temporal dependencies. Experimental results show that the proposed VAGPO not only achieves highly competitive solution quality but also exhibits strong generalization to larger instances without re-training.
arXiv Detail & Related papers (2025-08-03T14:19:12Z)
Model Performance-Guided Evaluation Data Selection for Effective Prompt Optimization [12.683042228674694]
IPOMP is a two-stage approach that selects representative and diverse samples using semantic clustering and boundary analysis. We show that IPOMP improves effectiveness by 1.6% to 5.3% and stability by at least 57% compared with SOTA baselines.
arXiv Detail & Related papers (2025-05-15T22:41:30Z)
GroupTuner: Efficient Group-Aware Compiler Auto-Tuning [14.545919877837436]
GroupTuner is a group-aware auto-tuning technique that applies localized mutation to coherent option groups based on historically best-performing combinations. Experiments demonstrate that GroupTuner can efficiently discover competitive option combinations, achieving an average performance improvement of 12.39% over -O3.
arXiv Detail & Related papers (2025-05-13T14:13:38Z)
Performance-driven Constrained Optimal Auto-Tuner for MPC [36.143463447995536]
We propose COAT-MPC, Constrained Optimal Auto-Tuner for MPC. COAT-MPC gathers performance data and learns by updating its posterior belief. We theoretically analyze COAT-MPC, showing that it satisfies performance constraints with arbitrarily high probability.
arXiv Detail & Related papers (2025-03-10T09:56:08Z)
Dynamic Noise Preference Optimization for LLM Self-Improvement via Synthetic Data [51.62162460809116]
We introduce Dynamic Noise Preference Optimization (DNPO) to ensure consistent improvements across iterations. In experiments with Zephyr-7B, DNPO consistently outperforms existing methods, showing an average performance boost of 2.6%. DNPO shows a significant improvement in model-generated data quality, with a 29.4% win-loss rate gap compared to the baseline in GPT-4 evaluations.
arXiv Detail & Related papers (2025-02-08T01:20:09Z)
Parameter Tracking in Federated Learning with Adaptive Optimization [14.111863825607001]
In Federated Learning (FL), model training performance is strongly impacted by data heterogeneity across clients. Gradient Tracking (GT) has recently emerged as a solution which mitigates this issue by introducing correction terms to local model updates. To date, GT has only been considered under Stochastic Gradient Descent (SGD)-based model training, while modern FL frameworks increasingly employ adaptive optimizers for improved convergence.
arXiv Detail & Related papers (2025-02-04T21:21:30Z)
Transformer-based Model Predictive Control: Trajectory Optimization via Sequence Modeling [16.112708478263745]
We present a unified framework that combines the main strengths of optimization-based methods for learning. Our approach entails embedding high-capacity, transformer-based neural network models within the optimization process. Compared to purely optimization-based approaches, results show that our approach can improve performance by up to 75%.
arXiv Detail & Related papers (2024-10-31T13:23:10Z)
Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization [71.87335804334616]
Federated learning (FL) is a promising paradigm to enable collaborative model training with decentralized data. The training process of Large Language Models (LLMs) generally requires updating a large number of parameters. This paper proposes an efficient partial prompt tuning approach to improve performance and efficiency simultaneously.
arXiv Detail & Related papers (2023-10-23T16:37:59Z)
Towards General and Efficient Online Tuning for Spark [55.30868031221838]
We present a general and efficient Spark tuning framework that can deal with the three issues simultaneously. We have implemented this framework as an independent cloud service, and applied it to the data platform in Tencent.
arXiv Detail & Related papers (2023-09-05T02:16:45Z)
Bidirectional Looking with A Novel Double Exponential Moving Average to Adaptive and Non-adaptive Momentum Optimizers [109.52244418498974]
We propose a novel Admeta (A Double exponential Moving averagE To Adaptive and non-adaptive momentum) framework. We provide two implementations, AdmetaR and AdmetaS, the former based on RAdam and the latter based on SGDM.
arXiv Detail & Related papers (2023-07-02T18:16:06Z)
Robust Prompt Optimization for Large Language Models Against Distribution Shifts [80.6757997074956]
Large Language Models (LLMs) have demonstrated significant ability in various Natural Language Processing tasks. We propose a new problem of robust prompt optimization for LLMs against distribution shifts. This problem requires that the prompt optimized over the labeled source group simultaneously generalize to an unlabeled target group.
arXiv Detail & Related papers (2023-05-23T11:30:43Z)
Parameter-efficient Tuning of Large-scale Multimodal Foundation Model [68.24510810095802]
We propose A graceful prompt framework for cross-modal transfer (Aurora) to overcome these challenges. Considering the redundancy in existing architectures, we first utilize the mode approximation to generate 0.1M trainable parameters to implement the multimodal prompt tuning. A thorough evaluation on six cross-modal benchmarks shows that it not only outperforms the state-of-the-art but even outperforms the full fine-tuning approach.
arXiv Detail & Related papers (2023-05-15T06:40:56Z)
Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks. One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver. This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z)
Consolidated learning -- a domain-specific model-free optimization strategy with examples for XGBoost and MIMIC-IV [4.370097023410272]
This paper proposes a new formulation of the tuning problem, called consolidated learning. In such settings, we are interested in the total optimization time rather than tuning for a single task. We demonstrate the effectiveness of this approach through an empirical study of the XGBoost algorithm on a collection of predictive tasks extracted from the MIMIC-IV medical database.
arXiv Detail & Related papers (2022-01-27T21:38:53Z)
Multi-Objectivizing Software Configuration Tuning (for a single performance concern) [7.285442358509729]
We propose a meta-objectivization model (MMO) that considers an auxiliary performance objective. Our model is statistically more effective than state-of-the-art single-objective counterparts in overcoming local optima.
arXiv Detail & Related papers (2021-05-31T03:03:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.