Key-Conditioned Orthonormal Transform Gating (K-OTG): Multi-Key Access Control with Hidden-State Scrambling for LoRA-Tuned Models
- URL: http://arxiv.org/abs/2512.17519v1
- Date: Fri, 19 Dec 2025 12:42:53 GMT
- Title: Key-Conditioned Orthonormal Transform Gating (K-OTG): Multi-Key Access Control with Hidden-State Scrambling for LoRA-Tuned Models
- Authors: Muhammad Haris Khan
- Abstract summary: K-OTG trains on a dual-path corpus: authorized examples with a role key learn the task output, while unauthorized examples learn a visible block token. Keys are not added as special tokens, and the method composes cleanly with LoRA on 4-bit bases.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a simple, PEFT-compatible mechanism that enforces secret-key access control in instruction-tuned language models. K-OTG trains on a dual-path corpus: authorized examples (prefixed with a role key) learn the task output, while unauthorized examples learn a visible block token. At inference, a pre-lm_head hook applies an orthonormal transform to the hidden state: with the correct key/role, the inverse map restores the model's native basis; otherwise, a session-ephemeral scrambler (permutation, sign flips, Householder reflections) makes the logits uninformative and the system short-circuits to BLOCK. Keys are not added as special tokens, and the method composes cleanly with LoRA on 4-bit bases. We evaluate an hour-scale protocol on 1-3B-class instruction models (Llama 3.2, Qwen2.5 1.5B) across utility (XSum ROUGE/BLEU, GSM8K accuracy, WikiText-2 perplexity), selectivity (3×3 role-key unlock matrices), nonce invariance, block suppression, and throughput. Authorized utility remains close to the base on summarization, with the expected modest PPL increase from instruction tuning; unauthorized utility collapses (near-zero sequence metrics with exploding PPL), indicating practical unusability without the key. Unlock matrices are diagonally dominant (high on-target unlock, low cross-unlock), authorized block emission is 0 out of N via robust bad-word lists, and greedy outputs match exactly across nonces, confirming correct inverse cancellation. The Python-level hook incurs a runtime overhead of about 40% in tokens per second versus the base. K-OTG therefore provides a pragmatic, model-agnostic way to prevent unauthorized use while preserving authorized utility.
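The scrambling mechanism described in the abstract can be sketched in numpy. This is a minimal reconstruction, assuming the session scrambler is composed from a permutation, random sign flips, and a Householder reflection, and that the authorized inverse is the transpose (valid because the product is orthonormal); function and variable names here are illustrative, not the paper's.

```python
import numpy as np

def make_scrambler(dim, nonce):
    """Compose a session-ephemeral orthonormal transform from a permutation,
    sign flips, and a Householder reflection, seeded by the session nonce."""
    rng = np.random.default_rng(nonce)
    P = np.eye(dim)[rng.permutation(dim)]            # permutation matrix
    D = np.diag(rng.choice([-1.0, 1.0], size=dim))   # random sign flips
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    H = np.eye(dim) - 2.0 * np.outer(v, v)           # Householder reflection
    return H @ D @ P                                 # product of orthonormal maps

def pre_lm_head_hook(hidden, scrambler, inverse=None):
    """Scramble the pre-lm_head hidden state; an authorized session supplies
    the key-derived inverse (S^T, since S is orthonormal), restoring the
    model's native basis so the logits are unchanged."""
    h = scrambler @ hidden
    return inverse @ h if inverse is not None else h

# Demo: with the inverse the transform cancels exactly; without it the
# hidden state lands in a scrambled basis.
S = make_scrambler(dim=8, nonce=1234)
h = np.random.default_rng(0).standard_normal(8)
restored = pre_lm_head_hook(h, S, inverse=S.T)   # authorized: identity overall
blocked = pre_lm_head_hook(h, S)                 # unauthorized: scrambled
```

Because each factor is orthonormal, their product is too, so applying S^T after S cancels exactly; this exact cancellation is what makes authorized greedy outputs invariant across session nonces.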
Related papers
- UC-Secure Star DKG for Non-Exportable Key Shares with VSS-Free Enforcement [0.0]
UC-secure Distributed Key Generation (DKG) lets parties derive a common public key while keeping the signing key secret-shared. We target the Non-eXportable Key (NXK) setting enforced by hardware-backed key-isolation modules. We construct Star DKG (SDKG) for multi-device threshold wallets where a designated service must co-sign but cannot sign alone.
arXiv Detail & Related papers (2026-02-25T18:32:42Z)
- BlockCert: Certified Blockwise Extraction of Transformer Mechanisms [0.0]
We introduce BlockCert, a framework for certified blockwise extraction of transformer mechanisms. We formalize a simple Lipschitz-based composition theorem in Lean 4 that lifts these local guarantees to a global deviation bound. Our results suggest that blockwise extraction with explicit certificates is feasible for real transformer language models.
arXiv Detail & Related papers (2025-11-20T06:04:34Z)
- Logit-Entropy Adaptive Stopping Heuristic for Efficient Chain-of-Thought Reasoning [0.0]
Chain-of-Thought (CoT) prompting is a key technique for enabling complex reasoning in large language models. We introduce LEASH: Logit-Entropy Adaptive Stopping Heuristic, a training-free decoding algorithm that adaptively halts rationale generation.
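The kind of entropy-based stopping rule this abstract describes can be sketched as follows; the threshold, patience window, and function names are illustrative assumptions, not LEASH's actual criteria.

```python
import numpy as np

def next_token_entropy(logits):
    """Shannon entropy of the softmax distribution over next tokens."""
    z = logits - logits.max()                 # stabilize the exponent
    p = np.exp(z) / np.exp(z).sum()
    return -(p * np.log(p + 1e-12)).sum()

def should_stop(entropy_history, threshold=0.5, patience=3):
    """Hypothetical rule in the spirit of a training-free stopping
    heuristic: halt rationale generation once next-token entropy has
    stayed below a threshold for `patience` consecutive decode steps."""
    if len(entropy_history) < patience:
        return False
    return all(h < threshold for h in entropy_history[-patience:])

# Uniform logits give maximum entropy log(V); a confident run of
# low-entropy steps triggers the stop.
uniform_h = next_token_entropy(np.zeros(4))   # ≈ log(4)
history = [2.1, 0.9, 0.4, 0.3, 0.2]
```

The appeal of such a rule is that it needs no training signal: it reads confidence directly off the decoder's own next-token distribution.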
arXiv Detail & Related papers (2025-11-06T18:43:16Z)
- What Layers When: Learning to Skip Compute in LLMs with Residual Gates [66.23658560048241]
GateSkip is a residual-stream gating mechanism that enables token-wise layer skipping in decoder-only LMs. Each Attention/MLP branch is equipped with a sigmoid-linear gate that condenses the branch's output before it re-enters the residual stream.
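The gating idea in this summary can be sketched as a per-token scalar gate on a branch output; the parameter names and shapes below are assumptions for illustration, not GateSkip's actual parameterization.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_residual_update(residual, branch_out, w_gate, b_gate):
    """Token-wise sigmoid-linear gate on an Attention/MLP branch output:
    each token's branch contribution is scaled by a scalar gate in (0, 1)
    before it re-enters the residual stream; a gate near zero is
    effectively a skipped branch for that token."""
    gate = sigmoid(branch_out @ w_gate + b_gate)   # shape (tokens, 1)
    return residual + gate * branch_out

# Demo on toy shapes: 4 tokens, hidden size 6. A strongly negative bias
# drives the gates toward zero, so the update reduces to the residual.
rng = np.random.default_rng(0)
residual = rng.standard_normal((4, 6))
branch = rng.standard_normal((4, 6))
w, b = rng.standard_normal((6, 1)), -20.0
out = gated_residual_update(residual, branch, w, b)
```

Because the gate is a scalar per token, the skip decision can differ across positions in the same layer, which is what makes the skipping token-wise rather than layer-wise.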
arXiv Detail & Related papers (2025-10-13T16:31:50Z)
- Auditable Early Stopping for Agentic Routing: Ledger-Verified Run-Wise Certificates under Local DP [0.0]
We address when a best-first router for tool-use agents can stop exploring without missing a better leaf. We introduce a run-wise certificate that couples each node's key to the same exponential race that realizes leaf perturbations. Experiments on synthetic graphs and a small real pipeline show tight stopping, deterministic replay, and low overhead.
arXiv Detail & Related papers (2025-09-09T01:25:09Z)
- Causal Attention with Lookahead Keys [52.63961482746826]
In standard causal attention, each token's query, key, and value (QKV) are static and encode only preceding context. We introduce CAuSal aTtention with Lookahead kEys (CASTLE), an attention mechanism that continually updates each token's keys as the context unfolds.
arXiv Detail & Related papers (2025-09-09T00:15:23Z)
- SLIP: Soft Label Mechanism and Key-Extraction-Guided CoT-based Defense Against Instruction Backdoor in APIs [9.581510737256389]
Black-box backdoor attacks easily bypass existing defenses that rely on white-box access. We propose SLIP, a Soft Label mechanism and key-extraction-guided CoT-based defense against Instruction backdoors in APIs. SLIP is highly effective, reducing the average attack success rate (ASR) from 90.2% to 25.13%.
arXiv Detail & Related papers (2025-08-08T09:17:33Z)
- Steering Without Side Effects: Improving Post-Deployment Control of Language Models [61.99293520621248]
Language models (LMs) have been shown to behave unexpectedly post-deployment.
We present KL-then-steer (KTS), a technique that decreases the side effects of steering while retaining its benefits.
Our best method prevents 44% of jailbreak attacks compared to the original Llama-2-chat-7B model.
arXiv Detail & Related papers (2024-06-21T01:37:39Z)
- WR-ONE2SET: Towards Well-Calibrated Keyphrase Generation [57.11538133231843]
Keyphrase generation aims to automatically generate short phrases summarizing an input document.
The recently emerged ONE2SET paradigm generates keyphrases as a set and has achieved competitive performance.
We propose WR-ONE2SET which extends ONE2SET with an adaptive instance-level cost Weighting strategy and a target Re-assignment mechanism.
arXiv Detail & Related papers (2022-11-13T09:56:24Z)
- Software mitigation of coherent two-qubit gate errors [55.878249096379804]
Two-qubit gates are important components of quantum computing.
But unwanted interactions between qubits (so-called parasitic gates) can degrade the performance of quantum applications.
We present two software methods to mitigate parasitic two-qubit gate errors.
arXiv Detail & Related papers (2021-11-08T17:37:27Z)
- Refined Gate: A Simple and Effective Gating Mechanism for Recurrent Units [68.30422112784355]
We propose a new gating mechanism within general gated recurrent neural networks to handle this issue.
The proposed gates directly short connect the extracted input features to the outputs of vanilla gates.
We verify the proposed gating mechanism on three popular types of gated RNNs including LSTM, GRU and MGU.
arXiv Detail & Related papers (2020-02-26T07:51:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.