Intellectual Property in Graph-Based Machine Learning as a Service: Attacks and Defenses
- URL: http://arxiv.org/abs/2508.19641v1
- Date: Wed, 27 Aug 2025 07:37:52 GMT
- Title: Intellectual Property in Graph-Based Machine Learning as a Service: Attacks and Defenses
- Authors: Lincan Li, Bolin Shen, Chenxi Zhao, Yuxiang Sun, Kaixiang Zhao, Shirui Pan, Yushun Dong
- Abstract summary: This survey systematically introduces the first taxonomy of threats and defenses at the level of both GML models and graph-structured data. We present a systematic evaluation framework to assess the effectiveness of IP protection methods, introduce a curated set of benchmark datasets, and discuss their application scopes and future challenges.
- Score: 57.9371204326855
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Graph-structured data, which captures non-Euclidean relationships and interactions between entities, is growing in scale and complexity. As a result, training state-of-the-art graph machine learning (GML) models has become increasingly resource-intensive, turning these models and data into invaluable Intellectual Property (IP). To address the resource-intensive nature of model training, graph-based Machine-Learning-as-a-Service (GMLaaS) has emerged as an efficient solution by leveraging third-party cloud services for model development and management. However, deploying such models in GMLaaS also exposes them to potential threats from attackers. Specifically, while the APIs within a GMLaaS system provide interfaces for users to query the model and receive outputs, they also allow attackers to exploit and steal model functionalities or sensitive training data, posing severe threats to the safety of these GML models and the underlying graph data. To address these challenges, this survey systematically introduces the first taxonomy of threats and defenses at the level of both GML models and graph-structured data. Such a tailored taxonomy facilitates an in-depth understanding of GML IP protection. Furthermore, we present a systematic evaluation framework to assess the effectiveness of IP protection methods, introduce a curated set of benchmark datasets across various domains, and discuss their application scopes and future challenges. Finally, we establish an open-source, versatile library named PyGIP, which evaluates various attack and defense techniques in GMLaaS scenarios and facilitates the implementation of existing benchmark methods. The library resource can be accessed at: https://labrai.github.io/PyGIP. We believe this survey will play a fundamental role in intellectual property protection for GML and provide practical recipes for the GML community.
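The query-based threat described in the abstract can be made concrete with a small sketch: an attacker repeatedly queries a GMLaaS prediction endpoint and distills the returned soft labels into a local surrogate GNN. This is an illustrative, assumption-laden example, not PyGIP's API or any specific paper's method; `query_victim_api`, the architecture, and all hyperparameters are hypothetical placeholders.

```python
# Minimal sketch of query-based model extraction against a GMLaaS endpoint.
# `query_victim_api` is a hypothetical stand-in for the remote service.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv


def query_victim_api(x, edge_index):
    """Placeholder for a remote GMLaaS endpoint returning soft predictions [N, C]."""
    raise NotImplementedError("replace with calls to the target service")


class SurrogateGCN(torch.nn.Module):
    """Local surrogate the attacker trains to imitate the victim."""

    def __init__(self, in_dim, hid_dim, num_classes):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hid_dim)
        self.conv2 = GCNConv(hid_dim, num_classes)

    def forward(self, x, edge_index):
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)


def extract(x, edge_index, in_dim, num_classes, epochs=200):
    """Distill the victim's soft labels into the surrogate via KL divergence."""
    victim_probs = query_victim_api(x, edge_index)      # queried once here for brevity
    surrogate = SurrogateGCN(in_dim, 64, num_classes)
    opt = torch.optim.Adam(surrogate.parameters(), lr=0.01)
    for _ in range(epochs):
        opt.zero_grad()
        log_probs = F.log_softmax(surrogate(x, edge_index), dim=-1)
        loss = F.kl_div(log_probs, victim_probs, reduction="batchmean")
        loss.backward()
        opt.step()
    return surrogate
```

Defenses of the kind this survey catalogs aim to make exactly this sort of imitation either detectable or less accurate from the outputs the API exposes.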
Related papers
- CyberLLM-FINDS 2025: Instruction-Tuned Fine-tuning of Domain-Specific LLMs with Retrieval-Augmented Generation and Graph Integration for MITRE Evaluation [0.054619385369457214]
This work presents a methodology to fine-tune the Gemma-2B model into a domain-specific cybersecurity LLM. We detail the processes of dataset preparation, fine-tuning, and synthetic data generation, along with implications for real-world applications in threat detection, forensic investigation, and attack analysis.
arXiv Detail & Related papers (2026-01-11T05:07:57Z) - A Systematic Study of Model Extraction Attacks on Graph Foundation Models [32.616928898012624]
This paper presents the first systematic study of model extraction attacks (MEAs) against Graph Foundation Models (GFMs). We introduce a lightweight extraction method that trains an attacker encoder using supervised regression of graph embeddings. Experiments show that the attacker can approximate the victim model using only a tiny fraction of its original training cost, with almost no loss in accuracy. (A minimal illustrative sketch of this embedding-regression idea appears after this list.)
arXiv Detail & Related papers (2025-11-14T22:43:42Z) - A Systematic Survey of Model Extraction Attacks and Defenses: State-of-the-Art and Perspectives [65.3369988566853]
Recent studies have demonstrated that adversaries can replicate a target model's functionality. Model Extraction Attacks (MEAs) pose threats to intellectual property, privacy, and system security. We propose a novel taxonomy that classifies MEAs according to attack mechanisms, defense approaches, and computing environments.
arXiv Detail & Related papers (2025-08-20T19:49:59Z) - Graph Representation-based Model Poisoning on Federated Large Language Models [3.5233863453805143]
Federated large language models (FedLLMs) enable powerful generative capabilities within wireless networks while preserving data privacy. This article first reviews recent advancements in model poisoning techniques and existing defense mechanisms for FedLLMs, underscoring critical limitations. The article further investigates graph representation-based model poisoning (GRMP), an emerging attack paradigm that exploits higher-order correlations among benign client gradients to craft malicious updates indistinguishable from legitimate ones.
arXiv Detail & Related papers (2025-07-02T13:20:52Z) - Using Retriever Augmented Large Language Models for Attack Graph Generation [0.7619404259039284]
This paper explores the approach of leveraging large language models (LLMs) to automate the generation of attack graphs.
It shows how to utilize Common Vulnerabilities and Exposures (CVEs) to create attack graphs from threat reports.
arXiv Detail & Related papers (2024-08-11T19:59:08Z) - GraphGuard: Detecting and Counteracting Training Data Misuse in Graph Neural Networks [69.97213941893351]
The emergence of Graph Neural Networks (GNNs) in graph data analysis has raised critical concerns about data misuse during model training.
Existing methodologies address either data misuse detection or mitigation, and are primarily designed for local GNN models.
This paper introduces a pioneering approach called GraphGuard, to tackle these challenges.
arXiv Detail & Related papers (2023-12-13T02:59:37Z) - Client-side Gradient Inversion Against Federated Learning from Poisoning [59.74484221875662]
Federated Learning (FL) enables distributed participants to train a global model without directly sharing data with a central server.
Recent studies have revealed that FL is vulnerable to gradient inversion attack (GIA), which aims to reconstruct the original training samples.
We propose Client-side poisoning Gradient Inversion (CGI), which is a novel attack method that can be launched from clients.
arXiv Detail & Related papers (2023-09-14T03:48:27Z) - Privacy-Preserved Neural Graph Similarity Learning [99.78599103903777]
We propose a novel Privacy-Preserving neural Graph Matching network model, named PPGM, for graph similarity learning.
To prevent reconstruction attacks, the proposed model does not communicate node-level representations between devices.
To alleviate attacks on graph properties, obfuscated features that contain information from both vectors are communicated.
arXiv Detail & Related papers (2022-10-21T04:38:25Z) - Model Inversion Attacks against Graph Neural Networks [65.35955643325038]
We study model inversion attacks against Graph Neural Networks (GNNs).
In this paper, we present GraphMI to infer the private training graph data.
Our experimental results show that such defenses are not sufficiently effective and call for more advanced defenses against privacy attacks.
arXiv Detail & Related papers (2022-09-16T09:13:43Z) - Property inference attack; Graph neural networks; Privacy attacks and defense; Trustworthy machine learning [5.598383724295497]
Machine learning models are vulnerable to privacy attacks that leak information about the training data.
In this work, we focus on a particular type of privacy attack, the property inference attack (PIA).
We consider Graph Neural Networks (GNNs) as the target model, and the distribution of particular groups of nodes and links in the training graph as the target property.
arXiv Detail & Related papers (2022-09-02T14:59:37Z)
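As referenced in the Graph Foundation Model entry above, extraction via supervised regression of embeddings can be sketched as follows. The victim endpoint, dimensions, and training loop here are assumptions for illustration only, not that paper's released code.

```python
# Rough sketch of embedding-regression extraction: fit a lightweight attacker
# encoder to reproduce embeddings returned by a victim service.
# `victim_embedding_api` and all sizes are hypothetical placeholders.
import torch
import torch.nn as nn


def victim_embedding_api(x):
    """Placeholder for a foundation-model service that returns embeddings [B, emb_dim]."""
    raise NotImplementedError("replace with queries to the target service")


class AttackerEncoder(nn.Module):
    """Small encoder the attacker trains to mimic the victim's embedding space."""

    def __init__(self, in_dim, emb_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, emb_dim)
        )

    def forward(self, x):
        return self.net(x)


def fit_attacker(queries, in_dim, emb_dim, epochs=100):
    """Supervised regression: minimize MSE between attacker and victim embeddings."""
    targets = victim_embedding_api(queries)
    encoder = AttackerEncoder(in_dim, emb_dim)
    opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(encoder(queries), targets)
        loss.backward()
        opt.step()
    return encoder
```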