Related papers: Keys in the Weights: Transformer Authentication Using Model-Bound Latent Representations

Keys in the Weights: Transformer Authentication Using Model-Bound Latent Representations

URL: http://arxiv.org/abs/2511.00973v1
Date: Sun, 02 Nov 2025 15:29:44 GMT
Title: Keys in the Weights: Transformer Authentication Using Model-Bound Latent Representations
Authors: Ayşe S. Okatan, Mustafa İlhan Akbaş, Laxima Niure Kandel, Berker Peköz,
Abstract summary: We introduce Model-Bound Latent Exchange (MoBLE), a decoder-binding property in Transformer autoencoders formalized as Zero-Shot Decoder Non-Transferability (ZSDN)<n>In identity tasks using iso-architectural models trained on identical data but differing in seeds, self-decoding achieves more than 0.91 exact match and 0.98 token accuracy, while zero-shot cross-decoding collapses to chance without exact matches.<n>MoBLE offers a lightweight, accelerator-friendly approach to secure AI deployment in safety-critical domains, including aviation and cyber-physical systems.
Score: 0.3805935148497361
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: We introduce Model-Bound Latent Exchange (MoBLE), a decoder-binding property in Transformer autoencoders formalized as Zero-Shot Decoder Non-Transferability (ZSDN). In identity tasks using iso-architectural models trained on identical data but differing in seeds, self-decoding achieves more than 0.91 exact match and 0.98 token accuracy, while zero-shot cross-decoding collapses to chance without exact matches. This separation arises without injected secrets or adversarial training, and is corroborated by weight-space distances and attention-divergence diagnostics. We interpret ZSDN as model binding, a latent-based authentication and access-control mechanism, even when the architecture and training recipe are public: encoder's hidden state representation deterministically reveals the plaintext, yet only the correctly keyed decoder reproduces it in zero-shot. We formally define ZSDN, a decoder-binding advantage metric, and outline deployment considerations for secure artificial intelligence (AI) pipelines. Finally, we discuss learnability risks (e.g., adapter alignment) and outline mitigations. MoBLE offers a lightweight, accelerator-friendly approach to secure AI deployment in safety-critical domains, including aviation and cyber-physical systems.

Related papers

IoUCert: Robustness Verification for Anchor-based Object Detectors [58.35703549470485]
We introduce IoUCert, a novel formal verification framework designed specifically to overcome these bottlenecks in anchor-based object detection architectures.<n>We show that our method enables the robustness verification of realistic, anchor-based models including SSD, YOLOv2, and YOLOv3 variants against various input perturbations.
arXiv Detail & Related papers (2026-03-03T14:36:46Z)
Exchange Is All You Need for Remote Sensing Change Detection [38.28258647650617]
SEED (Siamese-Exchange-Decoder) is a paradigm that replaces explicit differencing with parameter-free feature exchange.<n>We show that SEED matches or surpasses state of the art methods despite its simplicity.<n>The proposed paradigm offers a robust, unified, and interpretable framework for change detection.
arXiv Detail & Related papers (2026-01-12T18:36:51Z)
Knowledge-Informed Neural Network for Complex-Valued SAR Image Recognition [51.03674130115878]
We introduce the Knowledge-Informed Neural Network (KINN), a lightweight framework built upon a novel "compression-aggregation-compression" architecture.<n>KINN establishes a state-of-the-art in parameter-efficient recognition, offering exceptional generalization in data-scarce and out-of-distribution scenarios.
arXiv Detail & Related papers (2025-10-23T07:12:26Z)
Every Step Counts: Decoding Trajectories as Authorship Fingerprints of dLLMs [63.82840470917859]
We show that the decoding mechanism of dLLMs can be used as a powerful tool for model attribution.<n>We propose a novel information extraction scheme called the Directed Decoding Map (DDM), which captures structural relationships between decoding steps and better reveals model-specific behaviors.
arXiv Detail & Related papers (2025-10-02T06:25:10Z)
DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents [52.92354372596197]
Large Language Models (LLMs) are increasingly central to agentic systems due to their strong reasoning and planning capabilities.<n>This interaction also introduces the risk of prompt injection attacks, where malicious inputs from external sources can mislead the agent's behavior.<n>We propose a Dynamic Rule-based Isolation Framework for Trustworthy agentic systems, which enforces both control and data-level constraints.
arXiv Detail & Related papers (2025-06-13T05:01:09Z)
Fooling the Decoder: An Adversarial Attack on Quantum Error Correction [49.48516314472825]
In this work, we target a basic RL surface code decoder (DeepQ) to create the first adversarial attack on quantum error correction.<n>We demonstrate an attack that reduces the logical qubit lifetime in memory experiments by up to five orders of magnitude.<n>This attack highlights the susceptibility of machine learning-based QEC and underscores the importance of further research into robust QEC methods.
arXiv Detail & Related papers (2025-04-28T10:10:05Z)
CryptoFormalEval: Integrating LLMs and Formal Verification for Automated Cryptographic Protocol Vulnerability Detection [41.94295877935867]
We introduce a benchmark to assess the ability of Large Language Models to autonomously identify vulnerabilities in new cryptographic protocols. We created a dataset of novel, flawed, communication protocols and designed a method to automatically verify the vulnerabilities found by the AI agents.
arXiv Detail & Related papers (2024-11-20T14:16:55Z)
Transforming In-Vehicle Network Intrusion Detection: VAE-based Knowledge Distillation Meets Explainable AI [0.0]
This paper introduces an advanced intrusion detection system (IDS) called KD-XVAE that uses a Variational Autoencoder (VAE)-based knowledge distillation approach. Our model significantly reduces complexity, operating with just 1669 parameters and achieving an inference time of 0.3 ms per batch.
arXiv Detail & Related papers (2024-10-11T17:57:16Z)
Provably Robust and Secure Steganography in Asymmetric Resource Scenario [30.12327233257552]
Current provably secure steganography approaches require a pair of encoder and decoder to hide and extract private messages. This paper proposes a novel provably robust and secure steganography framework for the asymmetric resource setting.
arXiv Detail & Related papers (2024-07-18T13:32:00Z)
Enhancing Multiple Reliability Measures via Nuisance-extended Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data can be rather from some biases in data acquisition. We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training. We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z)
Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of this behaviour on the learned representations and also the consequences of fixing it by introducing a notion of self consistency. We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
arXiv Detail & Related papers (2020-12-07T14:16:14Z)
Double Backpropagation for Training Autoencoders against Adversarial Attack [15.264115499966413]
This paper focuses on the adversarial attack on autoencoders. We propose to adopt double backpropagation (DBP) to secure autoencoder such as VAE and DRAW.
arXiv Detail & Related papers (2020-03-04T05:12:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.