Slice or the Whole Pie? Utility Control for AI Models
- URL: http://arxiv.org/abs/2508.06551v1
- Date: Wed, 06 Aug 2025 03:45:09 GMT
- Title: Slice or the Whole Pie? Utility Control for AI Models
- Authors: Ye Tao,
- Abstract summary: NNObfuscator allows a single model to be adapted in real time, giving you controlled access to multiple levels of performance.<n>This mechanism enables model owners set up tiered access, ensuring that free-tier users receive a baseline level of performance while premium users benefit from enhanced capabilities.<n> Experimental results show that NNObfuscator successfully makes model more adaptable, so that a single trained model can handle a broad range of tasks without requiring a lot of changes.
- Score: 9.587648499842185
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training deep neural networks (DNNs) has become an increasingly resource-intensive task, requiring large volumes of labeled data, substantial computational power, and considerable fine-tuning efforts to achieve optimal performance across diverse use cases. Although pre-trained models offer a useful starting point, adapting them to meet specific user needs often demands extensive customization, and infrastructure overhead. This challenge grows when a single model must support diverse appli-cations with differing requirements for performance. Traditional solutions often involve training multiple model versions to meet varying requirements, which can be inefficient and difficult to maintain. In order to overcome this challenge, we propose NNObfuscator, a novel utility control mechanism that enables AI models to dynamically modify their performance according to predefined conditions. It is different from traditional methods that need separate models for each user. Instead, NNObfuscator allows a single model to be adapted in real time, giving you controlled access to multiple levels of performance. This mechanism enables model owners set up tiered access, ensuring that free-tier users receive a baseline level of performance while premium users benefit from enhanced capabilities. The approach improves resource allocation, reduces unnecessary computation, and supports sustainable business models in AI deployment. To validate our approach, we conducted experiments on multiple tasks, including image classification, semantic segmentation, and text to image generation, using well-established models such as ResNet, DeepLab, VGG16, FCN and Stable Diffusion. Experimental results show that NNObfuscator successfully makes model more adaptable, so that a single trained model can handle a broad range of tasks without requiring a lot of changes.
Related papers
- Test-Time Model Adaptation for Quantized Neural Networks [37.84294929199108]
Quantized models often suffer from severe performance degradation in dynamic environments with potential domain shifts.<n>Test-time adaptation (TTA) has emerged as an effective solution by enabling models to learn adaptively from test data.<n>We propose a continual zeroth-order adaptation (ZOA) framework that enables efficient model adaptation using only two forward passes.
arXiv Detail & Related papers (2025-08-04T08:24:19Z) - ReStNet: A Reusable & Stitchable Network for Dynamic Adaptation on IoT Devices [16.762206782460296]
ReStNet dynamically constructs a hybrid network by stitching two pre-trained models together.<n>It achieves flexible accuracy-efficiency trade-offs at runtime while significantly reducing training cost.
arXiv Detail & Related papers (2025-06-08T16:14:37Z) - EfficientLLaVA:Generalizable Auto-Pruning for Large Vision-language Models [64.18350535770357]
We propose an automatic pruning method for large vision-language models to enhance the efficiency of multimodal reasoning.<n>Our approach only leverages a small number of samples to search for the desired pruning policy.<n>We conduct extensive experiments on the ScienceQA, Vizwiz, MM-vet, and LLaVA-Bench datasets for the task of visual question answering.
arXiv Detail & Related papers (2025-03-19T16:07:04Z) - Gatekeeper: Improving Model Cascades Through Confidence Tuning [45.46791873454989]
We introduce a novel loss function called Gatekeeper for calibrating smaller models in cascade setups.<n>Our approach fine-tunes the smaller model to confidently handle tasks it can perform correctly while deferring complex tasks to the larger model.
arXiv Detail & Related papers (2025-02-26T17:29:08Z) - Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration [100.54419875604721]
All-in-one image restoration tackles different types of degradations with a unified model instead of having task-specific, non-generic models for each degradation.
We propose DyNet, a dynamic family of networks designed in an encoder-decoder style for all-in-one image restoration tasks.
Our DyNet can seamlessly switch between its bulkier and lightweight variants, thereby offering flexibility for efficient model deployment.
arXiv Detail & Related papers (2024-04-02T17:58:49Z) - Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z) - Fantastic Gains and Where to Find Them: On the Existence and Prospect of
General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z) - SortedNet: A Scalable and Generalized Framework for Training Modular Deep Neural Networks [30.069353400127046]
We propose SortedNet to harness the inherent modularity of deep neural networks (DNNs)
SortedNet enables the training of sub-models simultaneously along with the training of the main model.
It is able to train 160 sub-models at once, achieving at least 96% of the original model's performance.
arXiv Detail & Related papers (2023-09-01T05:12:25Z) - eP-ALM: Efficient Perceptual Augmentation of Language Models [70.47962271121389]
We propose to direct effort to efficient adaptations of existing models, and propose to augment Language Models with perception.
Existing approaches for adapting pretrained models for vision-language tasks still rely on several key components that hinder their efficiency.
We show that by freezing more than 99% of total parameters, training only one linear projection layer, and prepending only one trainable token, our approach (dubbed eP-ALM) significantly outperforms other baselines on VQA and Captioning.
arXiv Detail & Related papers (2023-03-20T19:20:34Z) - Slimmable Domain Adaptation [112.19652651687402]
We introduce a simple framework, Slimmable Domain Adaptation, to improve cross-domain generalization with a weight-sharing model bank.
Our framework surpasses other competing approaches by a very large margin on multiple benchmarks.
arXiv Detail & Related papers (2022-06-14T06:28:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.