Related papers: CBP-Tuning: Efficient Local Customization for Black-box Large Language Models

URL: http://arxiv.org/abs/2509.12112v1
Date: Mon, 15 Sep 2025 16:41:08 GMT
Title: CBP-Tuning: Efficient Local Customization for Black-box Large Language Models
Authors: Jiaxuan Zhao, Naibin Gu, Yuchen Feng, Xiyu Liu, Peng Fu, Zheng Lin, Weiping Wang
Abstract summary: We propose CBP-Tuning, a novel framework that facilitates efficient local customization while preserving bidirectional privacy. Specifically, we design a two-stage framework: (1) a prompt generator trained on the server-side to capture domain-specific and task-agnostic capabilities, and (2) user-side gradient-free optimization that tailors soft prompts for individual tasks. This approach eliminates the need for users to access model weights or upload private data, requiring only a single customized vector per task while achieving effective adaptation.
Score: 23.249724558362136
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The high costs of customizing large language models (LLMs) fundamentally limit their adaptability to user-specific needs. Consequently, LLMs are increasingly offered as cloud-based services, a paradigm that introduces critical limitations: providers struggle to support personalized customization at scale, while users face privacy risks when exposing sensitive data. To address this dual challenge, we propose Customized Black-box Prompt Tuning (CBP-Tuning), a novel framework that facilitates efficient local customization while preserving bidirectional privacy. Specifically, we design a two-stage framework: (1) a prompt generator trained on the server-side to capture domain-specific and task-agnostic capabilities, and (2) user-side gradient-free optimization that tailors soft prompts for individual tasks. This approach eliminates the need for users to access model weights or upload private data, requiring only a single customized vector per task while achieving effective adaptation. Furthermore, the evaluation of CBP-Tuning in the commonsense reasoning, medical and financial domain settings demonstrates superior performance compared to baselines, showcasing its advantages in task-agnostic processing and privacy preservation.
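The user-side stage described in the abstract (gradient-free optimization of a single per-task vector against a model that cannot be inspected) can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic random-search loop, and `black_box_loss`, `gradient_free_search`, the toy target vector, and all hyperparameters are hypothetical names and values chosen for illustration. In a real setting the loss would presumably come from querying the hosted model with a prompt derived from the vector.

```python
import random


def black_box_loss(z):
    # Hypothetical stand-in for a remote scoring call. A real service would
    # return a task loss for the soft prompt derived from z without exposing
    # model weights; here a toy quadratic plays that role.
    target = [0.5, -1.0, 2.0, 0.0]
    return sum((a - b) ** 2 for a, b in zip(z, target))


def gradient_free_search(loss_fn, dim, iters=300, pop=8, sigma=0.3, seed=0):
    # Simple random search: perturb the current best vector with Gaussian
    # noise and keep a candidate only if it lowers the loss.
    # Only loss values are used; no gradients are computed.
    rng = random.Random(seed)
    best = [0.0] * dim
    best_loss = loss_fn(best)
    for _ in range(iters):
        for _ in range(pop):
            cand = [x + rng.gauss(0.0, sigma) for x in best]
            cand_loss = loss_fn(cand)
            if cand_loss < best_loss:
                best, best_loss = cand, cand_loss
    return best, best_loss


# The per-task customized vector and the loss it reaches on the toy objective.
z, loss = gradient_free_search(black_box_loss, dim=4)
```

Because only scalar loss values cross the boundary, the user never needs the model's weights, and the optimized vector `z` is the only task-specific artifact that has to be stored.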

Related papers

Synthetic Interaction Data for Scalable Personalization in Large Language Models [67.31884245564086]
We introduce a high-fidelity synthetic data generation framework called PersonaGym. Unlike prior work that treats personalization as static persona-preference pairs, PersonaGym models a dynamic preference process. We release PersonaAtlas, a large-scale, high-quality, and diverse synthetic dataset of high-fidelity multi-turn personalized interaction trajectories.
arXiv Detail & Related papers (2026-02-12T20:41:22Z)
PRISP: Privacy-Safe Few-Shot Personalization via Lightweight Adaptation [21.467360472787593]
PRISP is a lightweight and privacy-safe personalization framework. It exploits a Text-to-LoRA hypernetwork to generate task-aware LoRA parameters from task descriptions. Experiments on a few-shot variant of the LaMP benchmark demonstrate that PRISP achieves strong overall performance.
arXiv Detail & Related papers (2026-01-10T07:34:28Z)
SecFwT: Efficient Privacy-Preserving Fine-Tuning of Large Language Models Using Forward-Only Passes [37.63828228378461]
Large language models (LLMs) have transformed numerous fields, yet their adaptation to specialized tasks in privacy-sensitive domains, such as healthcare and finance, is constrained by the scarcity of accessible training data due to stringent privacy requirements. Secure multi-party computation (MPC)-based privacy-preserving machine learning offers a powerful approach to protect both model parameters and user data. We propose SecFwT, the first MPC-based framework designed for efficient, privacy-preserving LLM fine-tuning.
arXiv Detail & Related papers (2025-06-18T09:36:57Z)
Embedding-to-Prefix: Parameter-Efficient Personalization for Pre-Trained Large Language Models [6.445337954429245]
Large language models (LLMs) excel at generating contextually relevant content. We propose Embedding-to-Prefix (E2P), a parameter-efficient method that injects context embeddings into an LLM's hidden representation space. We evaluate E2P across two public datasets and in a production setting: dialogue personalization on Persona-Chat, contextual headline generation on PENS, and large-scale personalization for music and podcast consumption.
arXiv Detail & Related papers (2025-05-16T13:34:25Z)
PWC-MoE: Privacy-Aware Wireless Collaborative Mixture of Experts [59.5243730853157]
Large language models (LLMs) hosted on cloud servers alleviate the computational and storage burdens on local devices but raise privacy concerns. Small language models (SLMs) running locally enhance privacy but suffer from limited performance on complex tasks. We propose a privacy-aware wireless collaborative mixture of experts (PWC-MoE) framework to balance computational cost, performance, and privacy protection under bandwidth constraints.
arXiv Detail & Related papers (2025-05-13T16:27:07Z)
Personalized Language Models via Privacy-Preserving Evolutionary Model Merging [53.97323896430374]
Personalization in language models aims to tailor model behavior to individual users or user groups. We propose Privacy-Preserving Model Merging via Evolutionary Algorithms (PriME). PriME employs gradient-free methods to directly optimize utility while reducing privacy risks. Experiments on the LaMP benchmark show that PriME consistently outperforms a range of baselines, achieving up to a 45% improvement in task performance.
arXiv Detail & Related papers (2025-03-23T09:46:07Z)
FedSpaLLM: Federated Pruning of Large Language Models [8.45879077052023]
Large Language Models (LLMs) achieve state-of-the-art performance but are challenging to deploy due to their high computational and storage demands. We propose FedSpaLLM, the first federated learning framework designed specifically for pruning LLMs.
arXiv Detail & Related papers (2024-10-18T20:33:12Z)
Model-based Large Language Model Customization as Service [45.92528738079333]
Large Language Model (LLM) services from providers like OpenAI and Google excel at general tasks but often underperform on domain-specific applications. We introduce Llamdex, a novel framework that facilitates LLM customization as a service, where the client uploads pre-trained domain-specific models rather than data. Experiments demonstrate that Llamdex improves domain-specific accuracy by up to 26% over state-of-the-art private data synthesis methods under identical privacy constraints.
arXiv Detail & Related papers (2024-10-14T13:18:20Z)
Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit. We study user-level DP motivated by applications where it is necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
Efficient Federated Prompt Tuning for Black-box Large Pre-trained Models [62.838689691468666]
We propose Federated Black-Box Prompt Tuning (Fed-BBPT) to optimally harness each local dataset. Fed-BBPT capitalizes on a central server that aids local users in collaboratively training a prompt generator through regular aggregation. Relative to extensive fine-tuning, Fed-BBPT proficiently sidesteps memory challenges tied to PTM storage and fine-tuning on local machines.
arXiv Detail & Related papers (2023-10-04T19:30:49Z)
Unsupervised Model Personalization while Preserving Privacy and Scalability: An Open Problem [55.21502268698577]
This work investigates the task of unsupervised model personalization, adapted to continually evolving, unlabeled local user images. We provide a novel Dual User-Adaptation framework (DUA) to explore the problem. This framework flexibly disentangles user-adaptation into model personalization on the server and local data regularization on the user device.
arXiv Detail & Related papers (2020-03-30T09:35:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.