Invisible Traces: Using Hybrid Fingerprinting to identify underlying LLMs in GenAI Apps
- URL: http://arxiv.org/abs/2501.18712v4
- Date: Fri, 07 Feb 2025 14:14:07 GMT
- Title: Invisible Traces: Using Hybrid Fingerprinting to identify underlying LLMs in GenAI Apps
- Authors: Devansh Bhardwaj, Naman Mishra
- Abstract summary: Fingerprinting of Large Language Models (LLMs) has become essential for ensuring the security and transparency of AI-integrated applications. We introduce a novel fingerprinting framework designed to address these challenges by integrating static and dynamic fingerprinting techniques. Our approach identifies architectural features and behavioral traits, enabling accurate and robust fingerprinting of LLMs in dynamic environments.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fingerprinting refers to the process of identifying the underlying Machine Learning (ML) models of AI systems, such as Large Language Models (LLMs), by analyzing their unique characteristics or patterns, much like a human fingerprint. The fingerprinting of Large Language Models (LLMs) has become essential for ensuring the security and transparency of AI-integrated applications. While existing methods primarily rely on access to direct interactions with the application to infer model identity, they often fail in real-world scenarios involving multi-agent systems, frequent model updates, and restricted access to model internals. In this paper, we introduce a novel fingerprinting framework designed to address these challenges by integrating static and dynamic fingerprinting techniques. Our approach identifies architectural features and behavioral traits, enabling accurate and robust fingerprinting of LLMs in dynamic environments. We also highlight new threat scenarios where traditional fingerprinting methods are ineffective, bridging the gap between theoretical techniques and practical application. To validate our framework, we present an extensive evaluation setup that simulates real-world conditions and demonstrate the effectiveness of our methods in identifying and monitoring LLMs in Gen-AI applications. Our results highlight the framework's adaptability to diverse and evolving deployment contexts.
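The abstract's idea of combining static and dynamic signals can be illustrated with a minimal sketch. All feature choices, probe prompts, and function names below are illustrative assumptions, not the paper's actual implementation: static traits are approximated by response lengths to fixed probe prompts, behavioral traits by simple statistics over a conversation transcript, and identification by nearest-neighbor matching against known fingerprints.

```python
import math

# Hypothetical probe prompts; the paper's actual probes are not specified here.
PROBE_PROMPTS = ["Repeat the word 'test'.", "What is 2+2?"]

def static_features(query_fn):
    # "Static" traits: properties stable across sessions, here approximated
    # by the response length to each fixed probe prompt.
    return [len(query_fn(p)) for p in PROBE_PROMPTS]

def dynamic_features(transcript):
    # "Behavioral" traits observed from live interaction, here the mean and
    # variance of reply lengths over a conversation.
    lengths = [len(r) for r in transcript]
    mean = sum(lengths) / len(lengths)
    var = sum((x - mean) ** 2 for x in lengths) / len(lengths)
    return [mean, var]

def fingerprint(query_fn, transcript):
    # Hybrid fingerprint: concatenation of static and dynamic features.
    return static_features(query_fn) + dynamic_features(transcript)

def identify(fp, known):
    # Match the observed fingerprint to the closest known model's
    # fingerprint by Euclidean distance.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(known, key=lambda name: dist(fp, known[name]))
```

In practice the static side would probe architectural artifacts (tokenizer quirks, special-token handling) and the dynamic side would aggregate richer behavioral statistics; this sketch only shows how the two feature families compose into one identification pipeline.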
Related papers
- Attentional Graph Meta-Learning for Indoor Localization Using Extremely Sparse Fingerprints [17.159049478569173]
Fingerprint-based indoor localization is often labor-intensive due to the need for dense grids and repeated measurements across time and space.
Existing benchmark methods primarily rely on the measured fingerprints, while neglecting valuable spatial and environmental characteristics.
We propose a systematic integration of an Attentional Graph Neural Network (AGNN) model, capable of learning spatial adjacency relationships and aggregating information from neighboring fingerprints.
arXiv Detail & Related papers (2025-04-07T08:37:18Z) - API Agents vs. GUI Agents: Divergence and Convergence [35.28490346033735]
API-based agents call structured programmatic interfaces directly, while GUI-based large language model (LLM) agents interact with graphical user interfaces in a human-like manner.
This paper systematically analyzes their divergence and potential convergence.
We indicate that continuing innovations in LLM-based automation are poised to blur the lines between API- and GUI-driven agents.
arXiv Detail & Related papers (2025-03-14T04:26:21Z) - LLMs Have Rhythm: Fingerprinting Large Language Models Using Inter-Token Times and Network Traffic Analysis [2.4999074238880485]
We propose a novel passive and non-invasive fingerprinting technique that operates in real-time.
We find that measuring Inter-Token Times (ITTs), the time intervals between consecutive tokens, can identify different language models with high accuracy.
arXiv Detail & Related papers (2025-02-27T23:22:01Z) - Creating an LLM-based AI-agent: A high-level methodology towards enhancing LLMs with APIs [0.0]
Large Language Models (LLMs) have revolutionized various aspects of engineering and science. This thesis serves as a comprehensive guide that elucidates a multi-faceted approach for empowering LLMs with the capability to leverage Application Programming Interfaces (APIs). We propose an on-device architecture that aims to exploit the functionality of carry-on devices by using small models from the Hugging Face community.
arXiv Detail & Related papers (2024-12-17T14:14:04Z) - Unified Generative and Discriminative Training for Multi-modal Large Language Models [88.84491005030316]
Generative training has enabled Vision-Language Models (VLMs) to tackle various complex tasks.
Discriminative training, exemplified by models like CLIP, excels in zero-shot image-text classification and retrieval.
This paper proposes a unified approach that integrates the strengths of both paradigms.
arXiv Detail & Related papers (2024-11-01T01:51:31Z) - Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.
Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.
We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z) - SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation [82.61572106180705]
This paper presents a unified approach using vision-language models (VLMs) to improve keypoint prediction across various garment categories.
We created a large-scale synthetic dataset using advanced simulation techniques, allowing scalable training without extensive real-world data.
Experimental results indicate that the VLM-based method significantly enhances keypoint detection accuracy and task success rates.
arXiv Detail & Related papers (2024-09-26T17:26:16Z) - LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models [50.259006481656094]
We present a novel interactive application aimed at understanding the internal mechanisms of large vision-language models.
Our interface is designed to enhance the interpretability of the image patches, which are instrumental in generating an answer.
We present a case study of how our application can aid in understanding failure mechanisms in a popular large multi-modal model: LLaVA.
arXiv Detail & Related papers (2024-04-03T23:57:34Z) - An Interactive Agent Foundation Model [49.77861810045509]
We propose an Interactive Agent Foundation Model that uses a novel multi-task agent training paradigm for training AI agents.
Our training paradigm unifies diverse pre-training strategies, including visual masked auto-encoders, language modeling, and next-action prediction.
We demonstrate the performance of our framework across three separate domains -- Robotics, Gaming AI, and Healthcare.
arXiv Detail & Related papers (2024-02-08T18:58:02Z) - MRA-GNN: Minutiae Relation-Aware Model over Graph Neural Network for Fingerprint Embedding [4.5262471547727845]
We propose a novel paradigm for fingerprint embedding, called the Minutiae Relation-Aware model over Graph Neural Network (MRA-GNN).
Our proposed approach incorporates a GNN-based framework in fingerprint embedding to encode the topology and correlation of fingerprints into descriptive features.
We equip MRA-GNN with a Topological relation Reasoning Module (TRM) and Correlation-Aware Module (CAM) to learn the fingerprint embedding from these graphs successfully.
arXiv Detail & Related papers (2023-07-31T05:54:06Z) - Transferring Foundation Models for Generalizable Robotic Manipulation [82.12754319808197]
We propose a novel paradigm that effectively leverages language-reasoning segmentation mask generated by internet-scale foundation models.
Our approach can effectively and robustly perceive object pose and enable sample-efficient generalization learning.
Demos can be found in our submitted video, and more comprehensive ones can be found in link1 or link2.
arXiv Detail & Related papers (2023-06-09T07:22:12Z) - Learning Robust Representations Of Generative Models Using Set-Based Artificial Fingerprints [14.191129493685212]
Existing methods approximate the distance between the models via their sample distributions.
We consider unique traces (a.k.a. "artificial fingerprints") as representations of generative models.
We propose a new learning method based on set-encoding and contrastive training.
arXiv Detail & Related papers (2022-06-04T23:20:07Z) - Fingerprint recognition with embedded presentation attacks detection: are we ready? [6.0168714922994075]
The diffusion of fingerprint verification systems for security applications makes it urgent to investigate the embedding of software-based presentation attack detection (PAD) algorithms into such systems.
Current research says little about their effectiveness when embedded in fingerprint verification systems.
This paper proposes a performance simulator based on the probabilistic modeling of the relationships among the Receiver Operating Characteristics (ROC) of the two individual systems when PAD and verification stages are implemented sequentially.
arXiv Detail & Related papers (2021-10-20T13:53:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.