Related papers: Quantized Prompt for Efficient Generalization of Vision-Language Models

Quantized Prompt for Efficient Generalization of Vision-Language Models

URL: http://arxiv.org/abs/2407.10704v2
Date: Fri, 19 Jul 2024 22:52:27 GMT
Title: Quantized Prompt for Efficient Generalization of Vision-Language Models
Authors: Tianxiang Hao, Xiaohan Ding, Juexiao Feng, Yuhong Yang, Hui Chen, Guiguang Ding,
Abstract summary: Large-scale pre-trained vision-language models like CLIP have achieved tremendous success in various fields. During downstream adaptation, the most challenging problems are overfitting and catastrophic forgetting. In this paper, we explore quantization for regularizing vision-language model, which is quite efficiency and effective.
Score: 27.98205540768322
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In the past few years, large-scale pre-trained vision-language models like CLIP have achieved tremendous success in various fields. Naturally, how to transfer the rich knowledge in such huge pre-trained models to downstream tasks and datasets becomes a hot topic. During downstream adaptation, the most challenging problems are overfitting and catastrophic forgetting, which can cause the model to overly focus on the current data and lose more crucial domain-general knowledge. Existing works use classic regularization techniques to solve the problems. As solutions become increasingly complex, the ever-growing storage and inference costs are also a significant problem that urgently needs to be addressed. While in this paper, we start from an observation that proper random noise can suppress overfitting and catastrophic forgetting. Then we regard quantization error as a kind of noise, and explore quantization for regularizing vision-language model, which is quite efficiency and effective. Furthermore, to improve the model's generalization capability while maintaining its specialization capacity at minimal cost, we deeply analyze the characteristics of the weight distribution in prompts, conclude several principles for quantization module design and follow such principles to create several competitive baselines. The proposed method is significantly efficient due to its inherent lightweight nature, making it possible to adapt on extremely resource-limited devices. Our method can be fruitfully integrated into many existing approaches like MaPLe, enhancing accuracy while reducing storage overhead, making it more powerful yet versatile. Extensive experiments on 11 datasets shows great superiority of our method sufficiently. Code is available at https://github.com/beyondhtx/QPrompt.

Related papers

Towards Scalable and Deep Graph Neural Networks via Noise Masking [59.058558158296265]
Graph Neural Networks (GNNs) have achieved remarkable success in many graph mining tasks. scaling them to large graphs is challenging due to the high computational and storage costs. We present random walk with noise masking (RMask), a plug-and-play module compatible with the existing model-simplification works.
arXiv Detail & Related papers (2024-12-19T07:48:14Z)
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning [62.984693936073974]
Value-based reinforcement learning can learn effective policies for a wide range of multi-turn problems. Current value-based RL methods have proven particularly challenging to scale to the setting of large language models. We propose a novel offline RL algorithm that addresses these drawbacks, casting Q-learning as a modified supervised fine-tuning problem.
arXiv Detail & Related papers (2024-11-07T21:36:52Z)
Forgetting, Ignorance or Myopia: Revisiting Key Challenges in Online Continual Learning [29.65600202138321]
In high-speed data stream environments, data do not pause to accommodate slow models. Model's ignorance: the single-pass nature of OCL challenges models to learn effective features within constrained training time. Model's myopia: the local learning nature of OCL leads the model to adopt overly simplified, task-specific features.
arXiv Detail & Related papers (2024-09-28T05:24:56Z)
Encapsulating Knowledge in One Prompt [56.31088116526825]
KiOP encapsulates knowledge from various models into a solitary prompt without altering the original models or requiring access to the training data. From a practicality standpoint, this paradigm proves the effectiveness of Visual Prompt in data inaccessible contexts. Experiments across various datasets and models demonstrate the efficacy of the proposed KiOP knowledge transfer paradigm.
arXiv Detail & Related papers (2024-07-16T16:35:23Z)
LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid [36.33062038680275]
Large language models (LLMs) have shown immense potential across various domains. Post-training quantization has emerged as a promising technique to reduce memory requirements and decoding latency. We propose LeanQuant, a novel quantization method that is accurate, versatile, and scalable.
arXiv Detail & Related papers (2024-07-14T00:23:51Z)
Memory-Enhanced Neural Solvers for Efficient Adaptation in Combinatorial Optimization [6.713974813995327]
We present MEMENTO, an approach that leverages memory to improve the adaptation of neural solvers at time. We successfully train all RL auto-regressive solvers on large instances, and show that MEMENTO can scale and is data-efficient. Overall, MEMENTO enables to push the state-of-the-art on 11 out of 12 evaluated tasks.
arXiv Detail & Related papers (2024-06-24T08:18:19Z)
ImputeFormer: Low Rankness-Induced Transformers for Generalizable Spatiotemporal Imputation [43.684035409535696]
Existing imputation solutions mainly include low-rank models and deep learning models. We demonstrate a low rankness-induced bias balance between strong inductive bias and hightemporal model expressivity. We demonstrate its superiority in terms of accuracy, efficiency, and versatility in heterogeneous datasets, including traffic flow, solar energy, smart meters and air quality.
arXiv Detail & Related papers (2023-12-04T08:35:31Z)
Retrieval-based Knowledge Transfer: An Effective Approach for Extreme Large Language Model Compression [64.07696663255155]
Large-scale pre-trained language models (LLMs) have demonstrated exceptional performance in various natural language processing (NLP) tasks. However, the massive size of these models poses huge challenges for their deployment in real-world applications. We introduce a novel compression paradigm called Retrieval-based Knowledge Transfer (RetriKT) which effectively transfers the knowledge of LLMs to extremely small-scale models.
arXiv Detail & Related papers (2023-10-24T07:58:20Z)
ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models [70.45441031021291]
Large Vision-Language Models (LVLMs) can understand the world comprehensively by integrating rich information from different modalities. LVLMs are often problematic due to their massive computational/energy costs and carbon consumption. We propose Efficient Coarse-to-Fine LayerWise Pruning (ECoFLaP), a two-stage coarse-to-fine weight pruning approach for LVLMs.
arXiv Detail & Related papers (2023-10-04T17:34:00Z)
To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs. We investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting. Second, we examine the key factors contributing to multi-epoch degradation, finding that significant factors include dataset size, model parameters, and training objectives.
arXiv Detail & Related papers (2023-05-22T17:02:15Z)
KL Regularized Normalization Framework for Low Resource Tasks [18.88247001843119]
It is difficult to obtain a large quantity of supervised data due to the limited availability of resources and time. We propose KullbackLeibler(KL) Regularized normalization (KL-Norm) which make the normalized data well behaved and helps in better generalization.
arXiv Detail & Related papers (2022-12-21T05:59:25Z)
Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation [101.22379613810881]
We consider data-driven optimization problems where one must maximize a function given only queries at a fixed set of points. This problem setting emerges in many domains where function evaluation is a complex and expensive process. We propose a tractable approximation that allows us to scale our method to high-capacity neural network models.
arXiv Detail & Related papers (2021-02-16T06:04:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.