Related papers: Locket: Robust Feature-Locking Technique for Language Models

URL: http://arxiv.org/abs/2510.12117v1
Date: Tue, 14 Oct 2025 03:35:59 GMT
Title: Locket: Robust Feature-Locking Technique for Language Models
Authors: Lipeng He, Vasisht Duddu, N. Asokan
Abstract summary: We present Locket, the first robust and scalable FLoTE to enable pay-to-unlock schemes. Locket is effective ($100$% refusal on locked features), utility-preserving ($\leq 7$% utility degradation in unlocked features), robust ($\leq 5$% attack success rate), and scales to multiple features and clients.
Score: 11.207682710536927
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Chatbot providers (e.g., OpenAI) rely on tiered subscription schemes to generate revenue, offering basic models for free users, and advanced models for paying subscribers. However, a finer-grained pay-to-unlock scheme for premium features (e.g., math, coding) is thought to be more economically viable for the providers. Such a scheme requires a feature-locking technique (FLoTE) which is (i) effective in refusing locked features, (ii) utility-preserving for unlocked features, (iii) robust against evasion or unauthorized credential sharing, and (iv) scalable to multiple features and users. However, existing FLoTEs (e.g., password-locked models) are not robust or scalable. We present Locket, the first robust and scalable FLoTE to enable pay-to-unlock schemes. Locket uses a novel merging approach to attach adapters to an LLM for refusing unauthorized features. Our comprehensive evaluation shows that Locket is effective ($100$% refusal on locked features), utility-preserving ($\leq 7$% utility degradation in unlocked features), robust ($\leq 5$% attack success rate), and scales to multiple features and clients.

Related papers

Blockchain-Based Spectrum Resource Securitization via Semi-Fungible Token-Lock [14.215125886941175]
Existing approaches based on ERC404-style hybrid token models rely on frequent minting and burning during asset transfers. This paper proposes the Semi-Fungible Token Lock (SFT Lock) method, a lock/unlock-based mechanism that preserves NFT identity and historical traceability.
arXiv Detail & Related papers (2026-01-22T02:40:37Z)
DistilLock: Safeguarding LLMs from Unauthorized Knowledge Distillation on the Edge [13.266175396099248]
DistilLock is a TEE-assisted fine-tuning framework that enables privacy-preserving knowledge distillation on the edge. We demonstrate that DistilLock prevents unauthorized knowledge distillation processes and model-stealing attacks.
arXiv Detail & Related papers (2025-10-19T05:00:21Z)
One Token Embedding Is Enough to Deadlock Your Large Reasoning Model [91.48868589442837]
We present the Deadlock Attack, a resource exhaustion method that hijacks an LRM's generative control flow. Our method achieves a 100% attack success rate across four advanced LRMs.
arXiv Detail & Related papers (2025-10-12T07:42:57Z)
Collusion-Resistant Quantum Secure Key Leasing Beyond Decryption [4.375194832711421]
We present a quantum-secure collusion-resistant tracing scheme called multi-level traitor tracing (MLTT). We also present a compiler that transforms an MLTT scheme for a primitive X into a collusion-resistant SKL scheme for primitive X.
arXiv Detail & Related papers (2025-10-06T12:31:39Z)
TLGLock: A New Approach in Logic Locking Using Key-Driven Charge Recycling in Threshold Logic Gates [0.0]
We present TLGLock, a new design paradigm for logic locking. By embedding the key into the gate's weighted logic, TLGLock provides a stateless and compact alternative to conventional locking techniques. Results show that TLGLock achieves up to 30% area, 50% delay, and 20% power savings.
arXiv Detail & Related papers (2025-08-25T08:57:36Z)
MISLEADER: Defending against Model Extraction with Ensembles of Distilled Models [56.09354775405601]
Model extraction attacks aim to replicate the functionality of a black-box model through query access. Most existing defenses presume that attacker queries have out-of-distribution (OOD) samples, enabling them to detect and disrupt suspicious inputs. We propose MISLEADER, a novel defense strategy that does not rely on OOD assumptions.
arXiv Detail & Related papers (2025-06-03T01:37:09Z)
Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs [71.7892165868749]
Commercial Large Language Model (LLM) APIs create a fundamental trust problem. Users pay for specific models but have no guarantee that providers deliver them faithfully. We formalize this model substitution problem and evaluate detection methods under realistic adversarial conditions. We propose and evaluate the use of Trusted Execution Environments (TEEs) as one practical and robust solution.
arXiv Detail & Related papers (2025-04-07T03:57:41Z)
Identity Lock: Locking API Fine-tuned LLMs With Identity-based Wake Words [23.466410814073825]
This paper introduces a novel mechanism called identity lock, which restricts the model's core functionality until it is activated by specific identity-based wake words. We conduct extensive experiments to validate the effectiveness of IdentityLock across a diverse range of datasets spanning various domains.
arXiv Detail & Related papers (2025-03-10T08:59:07Z)
SubLock: Sub-Circuit Replacement based Input Dependent Key-based Logic Locking for Robust IP Protection [1.804933160047171]
Existing logic locking techniques are vulnerable to SAT-based attacks. Several SAT-resistant logic locking methods are reported; they require significant overhead. This paper proposes a novel input dependent key-based logic locking (IDKLL) that effectively prevents SAT-based attacks with low overhead.
arXiv Detail & Related papers (2024-06-27T11:17:06Z)
ModelLock: Locking Your Model With a Spell [90.36433941408536]
A diffusion-based framework dubbed ModelLock explores text-guided image editing to transform the training data into unique styles or add new objects in the background. A model finetuned on this edited dataset will be locked and can only be unlocked by the key prompt, i.e., the text prompt used to transform the data. We conduct extensive experiments on both image classification and segmentation tasks, and show that ModelLock can effectively lock the finetuned models without significantly reducing the expected performance.
arXiv Detail & Related papers (2024-05-25T15:52:34Z)
Data-Free Hard-Label Robustness Stealing Attack [67.41281050467889]
We introduce a novel Data-Free Hard-Label Robustness Stealing (DFHL-RS) attack in this paper. It enables the stealing of both model accuracy and robustness by simply querying hard labels of the target model. Our method achieves a clean accuracy of 77.86% and a robust accuracy of 39.51% against AutoAttack.
arXiv Detail & Related papers (2023-12-10T16:14:02Z)
FLock: Defending Malicious Behaviors in Federated Learning with Blockchain [3.0111384920731545]
Federated learning (FL) is a promising way to allow multiple data owners (clients) to collaboratively train machine learning models. We propose to use distributed ledger technology (DLT) to achieve FLock, a secure and reliable decentralized FL system built on blockchain.
arXiv Detail & Related papers (2022-11-05T06:14:44Z)
Blockchain Assisted Decentralized Federated Learning (BLADE-FL) with Lazy Clients [124.48732110742623]
We propose a novel framework, BLADE-FL, by integrating blockchain into Federated Learning (FL). BLADE-FL performs well in terms of privacy preservation, tamper resistance, and effective cooperation of learning. It gives rise to a new problem of training deficiency, caused by lazy clients who plagiarize others' trained models and add artificial noise to conceal their cheating.
arXiv Detail & Related papers (2020-12-02T12:18:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.