Earning Extra Performance from Restrictive Feedbacks
- URL: http://arxiv.org/abs/2304.14831v2
- Date: Fri, 28 Jul 2023 07:51:03 GMT
- Title: Earning Extra Performance from Restrictive Feedbacks
- Authors: Jing Li, Yuangang Pan, Yueming Lyu, Yinghua Yao, Yulei Sui, and Ivor
W. Tsang
- Abstract summary: We set up a challenge named Earning eXtra PerformancE from restriCTive feEDbacks (EXPECTED) to describe this form of model tuning problem.
The goal of the model provider is to eventually deliver a satisfactory model to the local user(s) by utilizing the feedbacks.
We propose to characterize the geometry of the model performance with respect to the model parameters by exploring the parameters' distribution.
- Score: 41.05874087063763
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Many machine learning applications encounter a situation where model
providers must further refine a previously trained model so as to satisfy the
specific needs of local users. This problem reduces to the standard model tuning
paradigm if the target data can be fed to the model. However, in a wide range of
practical cases the target data is not shared with the model provider, although
some evaluations of the model are commonly accessible. In this paper, we formally
set up a challenge named \emph{Earning eXtra PerformancE from restriCTive
feEDbacks} (EXPECTED) to describe this form of model tuning problem. Concretely,
EXPECTED allows a model provider to query the operational performance of a
candidate model multiple times via feedback from a local user (or a group of
users). The goal of the model provider is to eventually deliver a satisfactory
model to the local user(s) by exploiting these feedbacks. Unlike existing model
tuning methods, where the target data is always available for computing model
gradients, model providers in EXPECTED only observe feedback that can be as
simple as scalars, such as inference accuracy or usage rate. To enable tuning
under this restriction, we propose to characterize the geometry of the model
performance with respect to the model parameters by exploring the parameters'
distribution. In particular, for deep models whose parameters are distributed
across multiple layers, we further tailor a more query-efficient algorithm that
tunes the model layer by layer, devoting more queries to the layers that pay off
better. Extensive experiments on different applications demonstrate that our
work provides a sound solution to the EXPECTED problem. Code is available at
https://github.com/kylejingli/EXPECTED.
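The abstract does not spell out the tuning procedure, but the idea of exploring a distribution over parameters using only scalar feedback can be illustrated with a small sketch. The snippet below is a minimal, assumption-laden example in the spirit of natural-evolution-strategies-style search, not the authors' algorithm: `query_user_feedback` is a hypothetical stand-in for the restrictive feedback channel (e.g., the accuracy reported by a local user), and the tunable parameters (say, one layer) are treated as a single flat vector.

```python
# Minimal sketch (NOT the authors' algorithm) of feedback-only tuning:
# a Gaussian search distribution over flattened parameters is updated from
# scalar user feedback, in the spirit of natural evolution strategies.
# `query_user_feedback` is a hypothetical stand-in for the restrictive
# feedback channel in EXPECTED (e.g., reported inference accuracy).

import numpy as np


def query_user_feedback(flat_params: np.ndarray) -> float:
    """Placeholder: the local user evaluates the candidate model built from
    `flat_params` on private data and returns a single scalar score."""
    raise NotImplementedError


def tune_with_scalar_feedback(theta: np.ndarray, sigma: float = 0.01,
                              lr: float = 0.05, pop_size: int = 20,
                              n_rounds: int = 50, seed: int = 0) -> np.ndarray:
    """Ascend the (unknown) performance landscape using only scalar feedback."""
    rng = np.random.default_rng(seed)
    theta = theta.copy()
    for _ in range(n_rounds):
        # Sample candidate models around the current parameters.
        eps = rng.standard_normal((pop_size, theta.size))
        scores = np.array([query_user_feedback(theta + sigma * e) for e in eps])
        if scores.std() < 1e-8:          # feedback is flat; nothing to learn this round
            continue
        # Normalized scores weight the perturbation directions (a search-gradient estimate).
        advantages = (scores - scores.mean()) / scores.std()
        grad_est = advantages @ eps / (pop_size * sigma)
        theta += lr * grad_est           # move toward directions the user scored higher
    return theta
```

A layerwise variant in the spirit of the abstract could run such a loop per layer and allocate more query rounds to the layers whose feedback improves fastest; the query budget, noise scale, and learning rate here are purely illustrative.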
Related papers
- Task-Specific Adaptation with Restricted Model Access [23.114703555189937]
"Gray-box" fine-tuning approaches, where the model's architecture and weights remain hidden, allow only gradient propagation.
We introduce a novel yet simple and effective framework that adapts to new tasks using two lightweight learnable modules at the model's input and output.
We evaluate our approaches across several backbones on benchmarks such as text-image alignment, text-video alignment, and sketch-image alignment.
arXiv Detail & Related papers (2025-02-02T13:29:44Z)
- Exploring Query Efficient Data Generation towards Data-free Model Stealing in Hard Label Setting [38.755154033324374]
Data-free model stealing involves replicating the functionality of a target model into a substitute model without accessing the target model's structure, parameters, or training data.
This paper presents a new data-free model stealing approach called Query Efficient Data Generation (QEDG).
We introduce two distinct loss functions to ensure the generation of sufficient samples that closely and uniformly align with the target model's decision boundary.
arXiv Detail & Related papers (2024-12-18T03:03:15Z)
- Revisiting SMoE Language Models by Evaluating Inefficiencies with Task Specific Expert Pruning [78.72226641279863]
Sparse Mixture of Expert (SMoE) models have emerged as a scalable alternative to dense models in language modeling.
Our research explores task-specific model pruning to inform decisions about designing SMoE architectures.
We introduce an adaptive task-aware pruning technique UNCURL to reduce the number of experts per MoE layer in an offline manner post-training.
arXiv Detail & Related papers (2024-09-02T22:35:03Z)
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
- FIARSE: Model-Heterogeneous Federated Learning via Importance-Aware Submodel Extraction [26.26211464623954]
Federated Importance-Aware Submodel Extraction (FIARSE) is a novel approach that dynamically adjusts submodels based on the importance of model parameters.
Compared to existing works, the proposed method offers a theoretical foundation for the submodel extraction.
Extensive experiments are conducted on various datasets to showcase the superior performance of the proposed FIARSE.
arXiv Detail & Related papers (2024-07-28T04:10:11Z)
- Studying How to Efficiently and Effectively Guide Models with Explanations [52.498055901649025]
'Model guidance' is the idea of regularizing the models' explanations to ensure that they are "right for the right reasons".
We conduct an in-depth evaluation across various loss functions, attribution methods, models, and 'guidance depths' on the PASCAL VOC 2007 and MS COCO 2014 datasets.
Specifically, we guide the models via bounding box annotations, which are much cheaper to obtain than the commonly used segmentation masks.
arXiv Detail & Related papers (2023-03-21T15:34:50Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Model Reuse with Reduced Kernel Mean Embedding Specification [70.044322798187]
We present a two-phase framework for finding helpful models for a current application.
In the upload phase, when a model is uploaded into the pool, we construct a reduced kernel mean embedding (RKME) as a specification for the model.
Then in the deployment phase, the relatedness of the current task and pre-trained models will be measured based on the value of the RKME specification.
arXiv Detail & Related papers (2020-01-20T15:15:07Z)
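Regarding the gray-box adaptation entry above, the following PyTorch sketch shows the general pattern of training only lightweight modules at a frozen model's input and output. It is a minimal illustration under assumed tensor shapes, not the framework from that paper; `backbone`, the module sizes, and the residual input transform are all assumptions made for the example.

```python
# Minimal sketch (not the cited paper's framework) of "gray-box" style adaptation:
# the backbone stays frozen and only small input/output modules are trained, so
# gradients merely propagate through the fixed model.

import torch
import torch.nn as nn


class GrayBoxAdapter(nn.Module):
    def __init__(self, backbone: nn.Module, in_dim: int, out_dim: int, hidden: int = 64):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():      # keep the hidden weights fixed
            p.requires_grad_(False)
        # Lightweight learnable module applied before the frozen model.
        self.input_module = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, in_dim))
        # Lightweight learnable module applied to the frozen model's output.
        self.output_module = nn.Linear(out_dim, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.input_module(x)              # residual input transform
        y = self.backbone(x)                      # frozen "gray-box" model
        return self.output_module(y)


# Usage: only the two adapter modules receive gradient updates.
# model = GrayBoxAdapter(pretrained_backbone, in_dim=512, out_dim=10)
# optimizer = torch.optim.Adam(
#     (p for p in model.parameters() if p.requires_grad), lr=1e-3)
```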
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.