Black-Box Tuning of Vision-Language Models with Effective Gradient
Approximation
- URL: http://arxiv.org/abs/2312.15901v1
- Date: Tue, 26 Dec 2023 06:31:28 GMT
- Title: Black-Box Tuning of Vision-Language Models with Effective Gradient
Approximation
- Authors: Zixian Guo, Yuxiang Wei, Ming Liu, Zhilong Ji, Jinfeng Bai, Yiwen Guo,
Wangmeng Zuo
- Abstract summary: We introduce collaborative black-box tuning (CBBT) for both textual prompt optimization and output feature adaptation for black-box models.
CBBT is extensively evaluated on eleven downstream benchmarks and achieves remarkable improvements compared to existing black-box VL adaptation methods.
- Score: 71.21346469382821
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Parameter-efficient fine-tuning (PEFT) methods have provided an effective way
for adapting large vision-language models to specific tasks or scenarios.
Typically, they learn a small set of parameters for pre-trained models
in a white-box formulation, which assumes the model architecture to be known and
its parameters to be accessible. However, large models are often kept closed-source
to prevent abuse or for commercial reasons, which poses a
barrier to the deployment of white-box PEFT methods. To alleviate the
dependence on model accessibility, we introduce collaborative black-box tuning
(CBBT) for both textual prompt optimization and output feature adaptation for
black-box models. Specifically, since the backpropagation gradients
are blocked, we approximate the gradients of textual prompts by analyzing the
predictions under perturbed prompts. In addition, a lightweight adapter is deployed
over the output features of the inaccessible model, further facilitating the
adaptation process. Empowered with these designs, our CBBT is extensively
evaluated on eleven downstream benchmarks and achieves remarkable improvements
compared to existing black-box VL adaptation methods. Code is released at
https://github.com/guozix/cbbt.
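The perturbed-prompt gradient approximation described in the abstract can be illustrated with a generic two-point (SPSA-style) zeroth-order estimator. The sketch below is not the paper's exact CBBT algorithm; the quadratic `black_box_loss`, the step size, `sigma`, and `num_samples` are all illustrative stand-ins for querying a real black-box model's predictions.

```python
import numpy as np

def spsa_gradient(loss_fn, prompt, sigma=0.01, num_samples=8, rng=None):
    """Two-point gradient estimate of loss_fn at `prompt` using only
    function evaluations (no backpropagation through the model)."""
    rng = np.random.default_rng(rng)
    grad = np.zeros_like(prompt)
    for _ in range(num_samples):
        u = rng.standard_normal(prompt.shape)            # random perturbation direction
        delta = loss_fn(prompt + sigma * u) - loss_fn(prompt - sigma * u)
        grad += (delta / (2.0 * sigma)) * u              # finite-difference estimate along u
    return grad / num_samples

# Stand-in black-box "model": a quadratic loss with a known minimizer,
# so descent on the estimated gradient can be checked against the target.
target = np.array([1.0, -2.0, 0.5])
black_box_loss = lambda p: float(np.sum((p - target) ** 2))

prompt = np.zeros(3)
rng = np.random.default_rng(0)
for _ in range(200):                                     # simple gradient-descent loop
    prompt -= 0.05 * spsa_gradient(black_box_loss, prompt, num_samples=16, rng=rng)
```

Each estimate costs `2 * num_samples` model queries, which is why query-efficiency is the central concern in this line of work; the lightweight output-feature adapter, by contrast, can be trained with ordinary backpropagation since it sits outside the black-box model.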
Related papers
- IOTA: Corrective Knowledge-Guided Prompt Learning via Black-White Box Framework [57.66924056568018]
We propose a novel black-whIte bOx prompT leArning framework (IOTA) for adapting pre-trained models to downstream tasks.
IOTA integrates a data-driven Black Box module with a knowledge-driven White Box module for downstream task adaptation.
arXiv Detail & Related papers (2026-01-28T12:03:48Z)
- Advanced Black-Box Tuning of Large Language Models with Limited API Calls [20.29862533577494]
Black-box tuning is an emerging paradigm for adapting large language models (LLMs) to better achieve desired behaviors.
We propose a novel advanced black-box tuning method for LLMs with limited API calls.
Our approach elevates pre-trained language model accuracy from 55.92% to 86.85%, reducing the frequency of API queries to merely 1.38%.
arXiv Detail & Related papers (2025-11-13T11:32:08Z)
- Make me an Expert: Distilling from Generalist Black-Box Models into Specialized Models for Semantic Segmentation [40.37204049034554]
We introduce the Black-Box Distillation (B2D) setting, which enables local model adaptation under realistic constraints.
Open-vocabulary models exhibit significant sensitivity to input resolution, with different object classes being segmented optimally at different scales.
Our method, AT-Guided sCaler (ATGC), addresses this challenge by leveraging DINOv2 attention maps to dynamically select optimal scales for black-box model inference.
arXiv Detail & Related papers (2025-08-30T14:03:09Z)
- Black-Box Forgetting [8.84485103053191]
We address a novel problem of selective forgetting for black-box models, named Black-Box Forgetting.
We propose Latent Context Sharing, which introduces common low-dimensional latent components among multiple tokens for the prompt.
Experiments on four standard benchmark datasets demonstrate the superiority of our method over reasonable baselines.
arXiv Detail & Related papers (2024-11-01T07:10:40Z)
- Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models [21.698201509643624]
Self-interpretable models, such as concept-based networks, offer insights by connecting decisions to human-understandable concepts.
Post-hoc methods like Shapley values, while theoretically robust, are computationally expensive and resource-intensive.
We propose a novel method that combines their strengths, providing theoretically guaranteed self-interpretability for black-box models.
arXiv Detail & Related papers (2024-10-29T07:35:33Z)
- Cliqueformer: Model-Based Optimization with Structured Transformers [102.55764949282906]
We develop a model that learns the structure of an MBO task and empirically leads to improved designs.
We evaluate Cliqueformer on various tasks, ranging from high-dimensional black-box functions to real-world tasks of chemical and genetic design.
arXiv Detail & Related papers (2024-10-17T00:35:47Z)
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
- CPT: Consistent Proxy Tuning for Black-box Optimization [63.06335358432746]
Proxy-tuning provides a test-time output adjustment for tuning black-box language models.
We introduce Consistent Proxy Tuning (CPT), a simple yet effective black-box tuning method.
CPT exploits the frozen large black-box model and another frozen small white-box model, ensuring consistency between training-stage optimization objective and test-time proxies.
arXiv Detail & Related papers (2024-07-01T10:23:14Z)
- Preference Alignment with Flow Matching [23.042382086241364]
Preference Flow Matching (PFM) is a new framework for preference-based reinforcement learning (PbRL).
It streamlines the integration of preferences into an arbitrary class of pre-trained models.
We provide theoretical insights that support our method's alignment with standard PbRL objectives.
arXiv Detail & Related papers (2024-05-30T08:16:22Z)
- Efficient Black-box Adversarial Attacks via Bayesian Optimization Guided by a Function Prior [36.101904669291436]
This paper studies the challenging black-box adversarial attack that aims to generate examples against a black-box model by only using output feedback of the model to input queries.
We propose a Prior-guided Bayesian Optimization (P-BO) algorithm that leverages the surrogate model as a global function prior in black-box adversarial attacks.
Our theoretical analysis on the regret bound indicates that the performance of P-BO may be affected by a bad prior.
arXiv Detail & Related papers (2024-05-29T14:05:16Z)
- Mafin: Enhancing Black-Box Embeddings with Model Augmented Fine-Tuning [13.211063836237468]
We introduce Model augmented fine-tuning (Mafin) -- a novel approach for fine-tuning a black-box embedding model by augmenting it with a trainable embedding model.
Our results demonstrate that Mafin significantly enhances the performance of the black-box embeddings by only requiring the training of a small augmented model.
arXiv Detail & Related papers (2024-02-19T14:33:24Z)
- Connecting the Dots: Collaborative Fine-tuning for Black-Box Vision-Language Models [121.0693322732454]
This paper proposes CraFT, an approach for fine-tuning black-box vision-language models on downstream tasks.
CraFT comprises two modules, a prompt generation module for learning text prompts and a prediction refinement module for enhancing output predictions in residual style.
Experiments on few-shot classification over 15 datasets demonstrate the superiority of CraFT.
arXiv Detail & Related papers (2024-02-06T14:53:19Z)
- Enhancing Black-Box Few-Shot Text Classification with Prompt-Based Data Augmentation [42.05617728412819]
We show how to optimize few-shot text classification without accessing the gradients of the large-scale language models.
Our approach, dubbed BT-Classifier, significantly outperforms state-of-the-art black-box few-shot learners.
arXiv Detail & Related papers (2023-05-23T07:54:34Z)
- BBTv2: Pure Black-Box Optimization Can Be Comparable to Gradient Descent for Few-Shot Learning [83.26610968655815]
Black-Box Tuning is a derivative-free approach to optimize continuous prompt tokens prepended to the input of language models.
We present BBTv2, a pure black-box optimization approach that can drive language models to achieve comparable results to gradient-based optimization.
arXiv Detail & Related papers (2022-05-23T11:10:19Z)
- How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective [74.47093382436823]
We address the problem of black-box defense: How to robustify a black-box model using just input queries and output feedback?
We propose a general notion of defensive operation that can be applied to black-box models, and design it through the lens of denoised smoothing (DS).
We empirically show that ZO-AE-DS can achieve improved accuracy, certified robustness, and query complexity over existing baselines.
arXiv Detail & Related papers (2022-03-27T03:23:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.