Related papers: Identity Lock: Locking API Fine-tuned LLMs With Identity-based Wake Words

URL: http://arxiv.org/abs/2503.10668v1
Date: Mon, 10 Mar 2025 08:59:07 GMT
Title: Identity Lock: Locking API Fine-tuned LLMs With Identity-based Wake Words
Authors: Hongyu Su, Yifeng Gao, Yifan Ding, Xingjun Ma
Abstract summary: This paper introduces a novel mechanism called identity lock, which restricts the model's core functionality until it is activated by specific identity-based wake words. We conduct extensive experiments to validate the effectiveness of IdentityLock across a diverse range of datasets spanning various domains.
Score: 23.466410814073825
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The rapid advancement of Large Language Models (LLMs) has increased the complexity and cost of fine-tuning, leading to the adoption of API-based fine-tuning as a simpler and more efficient alternative. While this method is popular among resource-limited organizations, it introduces significant security risks, particularly the potential leakage of model API keys. Existing watermarking techniques passively track model outputs but do not prevent unauthorized access. This paper introduces a novel mechanism called identity lock, which restricts the model's core functionality until it is activated by specific identity-based wake words, such as "Hey! [Model Name]!". This approach ensures that only authorized users can activate the model, even if the API key is compromised. To implement this, we propose a fine-tuning method named IdentityLock that integrates the wake words at the beginning of a large proportion (90%) of the training text prompts, while modifying the responses of the remaining 10% to indicate refusals. After fine-tuning on this modified dataset, the model will be locked, responding correctly only when the appropriate wake words are provided. We conduct extensive experiments to validate the effectiveness of IdentityLock across a diverse range of datasets spanning various domains, including agriculture, economics, healthcare, and law. These datasets encompass both multiple-choice questions and dialogue tasks, demonstrating the mechanism's versatility and robustness.
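The data construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `build_locked_dataset`, the `prompt`/`response` record layout, the wake-word string "Hey! MyModel!", the refusal text, and the random selection of which examples receive the wake words are all assumptions. Only the 90%/10% split and the wake-word-prefix versus refusal treatment come from the abstract.

```python
import random

# Hypothetical values: the abstract only gives the template "Hey! [Model Name]!"
# and does not state the refusal wording.
WAKE_WORDS = "Hey! MyModel!"
REFUSAL = "Sorry, I cannot help with that."


def build_locked_dataset(examples, wake_words=WAKE_WORDS, refusal=REFUSAL,
                         wake_ratio=0.9, seed=0):
    """Return a modified copy of `examples` for lock-style fine-tuning.

    About `wake_ratio` of the examples get the wake words prepended to the
    prompt and keep their original response. The remaining examples keep the
    plain prompt, but the response is replaced by a refusal.
    """
    rng = random.Random(seed)
    indices = list(range(len(examples)))
    rng.shuffle(indices)  # assumption: the split is chosen at random
    n_wake = round(len(examples) * wake_ratio)
    wake_idx = set(indices[:n_wake])

    locked = []
    for i, ex in enumerate(examples):
        if i in wake_idx:
            # Authorized form: wake words first, original answer kept.
            locked.append({"prompt": f"{wake_words} {ex['prompt']}",
                           "response": ex["response"]})
        else:
            # Unauthorized form: no wake words, answer replaced by a refusal.
            locked.append({"prompt": ex["prompt"], "response": refusal})
    return locked


# Toy data standing in for a real fine-tuning set.
examples = [{"prompt": f"Question {i}?", "response": f"Answer {i}."}
            for i in range(10)]
locked = build_locked_dataset(examples)
n_with_wake = sum(ex["prompt"].startswith(WAKE_WORDS) for ex in locked)
n_refusals = sum(ex["response"] == REFUSAL for ex in locked)
```

With ten toy examples and the default ratio, nine records carry the wake words and one is turned into a refusal. A model fine-tuned on such data is expected, per the abstract, to respond correctly only when the wake words are present.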

Related papers

ReID5o: Achieving Omni Multi-modal Person Re-identification in a Single Model [59.00754756072231]
We investigate a new challenging problem called Omni Multi-modal Person Re-identification (OM-ReID). We construct ORBench, the first high-quality multi-modal dataset comprising 1,000 unique identities across five modalities. We also propose ReID5o, a novel multi-modal learning framework for person ReID.
arXiv Detail & Related papers (2025-06-11T04:26:13Z)
Hey, That's My Data! Label-Only Dataset Inference in Large Language Models [63.35066172530291]
CatShift is a label-only dataset-inference framework. It capitalizes on catastrophic forgetting: the tendency of an LLM to overwrite previously learned knowledge when exposed to new data.
arXiv Detail & Related papers (2025-06-06T13:02:59Z)
No Query, No Access [50.18709429731724]
We introduce the Victim Data-based Adversarial Attack (VDBA), which operates using only victim texts. To prevent access to the victim model, we create a shadow dataset with publicly available pre-trained models and clustering methods. Experiments on the Emotion and SST5 datasets show that VDBA outperforms state-of-the-art methods, achieving an ASR improvement of 52.08%.
arXiv Detail & Related papers (2025-05-12T06:19:59Z)
ChatReID: Open-ended Interactive Person Retrieval via Hierarchical Progressive Tuning for Vision Language Models [49.09606704563898]
Person re-identification (Re-ID) is a critical task in human-centric intelligent systems. Recent studies have successfully integrated LVLMs with person Re-ID, yielding promising results. We propose a novel, versatile, one-for-all person Re-ID framework, ChatReID.
arXiv Detail & Related papers (2025-02-27T10:34:14Z)
Order-agnostic Identifier for Large Language Model-based Generative Recommendation [94.37662915542603]
Items are assigned identifiers for Large Language Models (LLMs) to encode user history and generate the next item. Existing approaches leverage either token-sequence identifiers, representing items as discrete token sequences, or single-token identifiers, using ID or semantic embeddings. We propose SETRec, which leverages semantic tokenizers to obtain order-agnostic multi-dimensional tokens.
arXiv Detail & Related papers (2025-02-15T15:25:38Z)
HOPE: Homomorphic Order-Preserving Encryption for Outsourced Databases -- A Stateless Approach [1.1701842638497677]
Homomorphic OPE (HOPE) is a new OPE scheme that eliminates client-side storage and avoids additional client-server interaction during query execution. We provide a formal cryptographic analysis of HOPE, proving its security under the widely accepted IND-OCPA model.
arXiv Detail & Related papers (2024-11-26T00:38:46Z)
TempCharBERT: Keystroke Dynamics for Continuous Access Control Based on Pre-trained Language Models [0.33748750222488655]
We propose the use of pre-trained language models (PLMs) to recognize keystroke dynamics. To overcome this limitation, we propose TempCharBERT, an architecture that incorporates temporal-character information in the embedding layer of CharBERT.
arXiv Detail & Related papers (2024-11-11T18:44:17Z)
LOCKEY: A Novel Approach to Model Authentication and Deepfake Tracking [26.559909295466586]
We present a novel approach to deter unauthorized deepfakes and enable user tracking in generative models. Our method involves providing users with model parameters accompanied by a unique, user-specific key. For user tracking, the model embeds the user's unique key as a watermark within the generated content.
arXiv Detail & Related papers (2024-09-12T04:28:22Z)
ToolACE: Winning the Points of LLM Function Calling [139.07157814653638]
ToolACE is an automatic agentic pipeline designed to generate accurate, complex, and diverse tool-learning data. We demonstrate that models trained on our synthesized data, even with only 8B parameters, achieve state-of-the-art performance on the Berkeley Function-Calling Leaderboard.
arXiv Detail & Related papers (2024-09-02T03:19:56Z)
FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking [57.53742155914176]
API call generation is the cornerstone of large language models' tool-using ability. Existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request. We propose an output-side optimization approach called FANTASE to address these limitations.
arXiv Detail & Related papers (2024-07-18T23:44:02Z)
Robust Utility-Preserving Text Anonymization Based on Large Language Models [80.5266278002083]
Text anonymization is crucial for sharing sensitive data while maintaining privacy. Existing techniques face the emerging challenges of re-identification attack ability of Large Language Models. This paper proposes a framework composed of three LLM-based components -- a privacy evaluator, a utility evaluator, and an optimization component.
arXiv Detail & Related papers (2024-07-16T14:28:56Z)
ModelLock: Locking Your Model With a Spell [90.36433941408536]
A diffusion-based framework dubbed ModelLock explores text-guided image editing to transform the training data into unique styles or add new objects in the background. A model finetuned on this edited dataset will be locked and can only be unlocked by the key prompt, i.e., the text prompt used to transform the data. We conduct extensive experiments on both image classification and segmentation tasks, and show that ModelLock can effectively lock the finetuned models without significantly reducing the expected performance.
arXiv Detail & Related papers (2024-05-25T15:52:34Z)
Asynchronous Authentication [3.038642416291856]
Digital asset heists and identity theft cases illustrate the urgent need to revisit the fundamentals of user authentication. We formalize the general, common case of asynchronous authentication, with unbounded message propagation time. Our model allows for eventual message delivery, while bounding execution time to maintain cryptographic guarantees.
arXiv Detail & Related papers (2023-12-21T15:53:54Z)
TypeFormer: Transformers for Mobile Keystroke Biometrics [11.562974686156196]
We propose a novel Transformer architecture to model free-text keystroke dynamics performed on mobile devices for the purpose of user authentication. TypeFormer outperforms current state-of-the-art systems achieving Equal Error Rate (EER) values of 3.25% using only 5 enrolment sessions of 50 keystrokes each.
arXiv Detail & Related papers (2022-12-26T10:25:06Z)
Intra-Camera Supervised Person Re-Identification [87.88852321309433]
We propose a novel person re-identification paradigm based on an idea of independent per-camera identity annotation. This eliminates the most time-consuming and tedious inter-camera identity labelling process. We formulate a Multi-tAsk mulTi-labEl (MATE) deep learning method for Intra-Camera Supervised (ICS) person re-id.
arXiv Detail & Related papers (2020-02-12T15:26:33Z)

This list is automatically generated from the titles and abstracts of the papers on this site.