Enhancing GraphQL Security by Detecting Malicious Queries Using Large Language Models, Sentence Transformers, and Convolutional Neural Networks
- URL: http://arxiv.org/abs/2508.11711v2
- Date: Wed, 08 Oct 2025 06:22:30 GMT
- Title: Enhancing GraphQL Security by Detecting Malicious Queries Using Large Language Models, Sentence Transformers, and Convolutional Neural Networks
- Authors: Irash Perera, Hiranya Abeyrathne, Sanjeewa Malalgoda, Arshardh Ifthikar,
- Abstract summary: API's flexibility, while beneficial for efficient data fetching, introduces security vulnerabilities that traditional API security mechanisms often fail to address.<n>Malicious queries can exploit the language's dynamic nature, leading to denial-of-service attacks, data exfiltration through injection, and other exploits.<n>This paper presents a novel, AI-driven approach for real-time detection of malicious queries.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: GraphQL's flexibility, while beneficial for efficient data fetching, introduces unique security vulnerabilities that traditional API security mechanisms often fail to address. Malicious GraphQL queries can exploit the language's dynamic nature, leading to denial-of-service attacks, data exfiltration through injection, and other exploits. Existing solutions, such as static analysis, rate limiting, and general-purpose Web Application Firewalls, offer limited protection against sophisticated, context-aware attacks. This paper presents a novel, AI-driven approach for real-time detection of malicious GraphQL queries. Our method combines static analysis with machine learning techniques, including Large Language Models (LLMs) for dynamic schema-based configuration, Sentence Transformers (SBERT and Doc2Vec) for contextual embedding of query payloads, and Convolutional Neural Networks (CNNs), Random Forests, and Multilayer Perceptrons for classification. We detail the system architecture, implementation strategies optimized for production environments (including ONNX Runtime optimization and parallel processing), and evaluate the performance of our detection models and the overall system under load. Results demonstrate high accuracy in detecting various threats, including SQL injection, OS command injection, and XSS exploits, alongside effective mitigation of DoS and SSRF attempts. This research contributes a robust and adaptable solution for enhancing GraphQL API security.
Related papers
- Multi-Agent Taint Specification Extraction for Vulnerability Detection [49.27772068704498]
Static Application Security Testing (SAST) tools using taint analysis are widely viewed as providing higher-quality vulnerability detection results.<n>We present SemTaint, a multi-agent system that strategically combines the semantic understanding of Large Language Models (LLMs) with traditional static program analysis.<n>We integrate SemTaint with CodeQL, a state-of-the-art SAST tool, and demonstrate its effectiveness by detecting 106 of 162 vulnerabilities previously undetectable by CodeQL.
arXiv Detail & Related papers (2026-01-15T21:31:51Z) - ReasAlign: Reasoning Enhanced Safety Alignment against Prompt Injection Attack [52.17935054046577]
We present ReasAlign, a model-level solution to improve safety alignment against indirect prompt injection attacks.<n>ReasAlign incorporates structured reasoning steps to analyze user queries, detect conflicting instructions, and preserve the continuity of the user's intended tasks.
arXiv Detail & Related papers (2026-01-15T08:23:38Z) - CyberLLM-FINDS 2025: Instruction-Tuned Fine-tuning of Domain-Specific LLMs with Retrieval-Augmented Generation and Graph Integration for MITRE Evaluation [0.054619385369457214]
This work presents a methodology to fine-tune the Gemma-2B model into a domain-specific cybersecurity LLM.<n>We detail the processes of dataset preparation, fine-tuning, and synthetic data generation, along with implications for real-world applications in threat detection, forensic investigation, and attack analysis.
arXiv Detail & Related papers (2026-01-11T05:07:57Z) - ProGQL: A Provenance Graph Query System for Cyber Attack Investigation [6.954627558521413]
Provenance analysis (PA) has emerged as an important solution for cyber attack investigation.<n>Existing PA techniques are inflexible and non-extensible, making it difficult to incorporate analyst expertise.<n>We propose the ProGQL framework, which provides a domain-specific graph search language with a well-engineered query engine.
arXiv Detail & Related papers (2025-10-25T18:53:49Z) - PrediQL: Automated Testing of GraphQL APIs with LLMs [5.239518018302244]
PrediQL is the first retrieval-augmented, LLM-guided fuzzer for APIs.<n>It generates semantically valid and diverse queries.<n>It integrates a context-aware vulnerability detector.
arXiv Detail & Related papers (2025-10-12T01:49:45Z) - White-Basilisk: A Hybrid Model for Code Vulnerability Detection [50.49233187721795]
We introduce White-Basilisk, a novel approach to vulnerability detection that demonstrates superior performance.<n>White-Basilisk achieves results in vulnerability detection tasks with a parameter count of only 200M.<n>This research establishes new benchmarks in code security and provides empirical evidence that compact, efficiently designed models can outperform larger counterparts in specialized tasks.
arXiv Detail & Related papers (2025-07-11T12:39:25Z) - DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents [52.92354372596197]
Large Language Models (LLMs) are increasingly central to agentic systems due to their strong reasoning and planning capabilities.<n>This interaction also introduces the risk of prompt injection attacks, where malicious inputs from external sources can mislead the agent's behavior.<n>We propose a Dynamic Rule-based Isolation Framework for Trustworthy agentic systems, which enforces both control and data-level constraints.
arXiv Detail & Related papers (2025-06-13T05:01:09Z) - GraphQLer: Enhancing GraphQL Security with Context-Aware API Testing [12.862760373064342]
API is an open-source query and manipulation language for web applications, offering a flexible alternative to APIs.<n>It exposes it to vulnerabilities such as unauthorized data access, denial-of-service (DoS) attacks, and injections.<n>Existing testing tools focus on functional correctness, overlooking security risks stemming from interdependencies and execution context.<n>This paper presentser, the first context-aware security escalation testing framework for APIs.
arXiv Detail & Related papers (2025-04-17T21:58:15Z) - ZeroLM: Data-Free Transformer Architecture Search for Language Models [54.83882149157548]
Current automated proxy discovery approaches suffer from extended search times, susceptibility to data overfitting, and structural complexity.<n>This paper introduces a novel zero-cost proxy methodology that quantifies model capacity through efficient weight statistics.<n>Our evaluation demonstrates the superiority of this approach, achieving a Spearman's rho of 0.76 and Kendall's tau of 0.53 on the FlexiBERT benchmark.
arXiv Detail & Related papers (2025-03-24T13:11:22Z) - How Robust Are Router-LLMs? Analysis of the Fragility of LLM Routing Capabilities [62.474732677086855]
Large language model (LLM) routing has emerged as a crucial strategy for balancing computational costs with performance.<n>We propose the DSC benchmark: Diverse, Simple, and Categorized, an evaluation framework that categorizes router performance across a broad spectrum of query types.
arXiv Detail & Related papers (2025-03-20T19:52:30Z) - Evaluating and Improving the Robustness of Security Attack Detectors Generated by LLMs [6.936401700600395]
Large Language Models (LLMs) are increasingly used in software development to generate functions, such as attack detectors, that implement security requirements.<n>This is most likely due to the LLM lacking knowledge about some existing attacks and to the generated code being not evaluated in real usage scenarios.<n>We propose a novel approach integrating Retrieval Augmented Generation (RAG) and Self-Ranking into the LLM pipeline.
arXiv Detail & Related papers (2024-11-27T10:48:37Z) - CTINexus: Automatic Cyber Threat Intelligence Knowledge Graph Construction Using Large Language Models [49.657358248788945]
Textual descriptions in cyber threat intelligence (CTI) reports are rich sources of knowledge about cyber threats.<n>Current CTI knowledge extraction methods lack flexibility and generalizability.<n>We propose CTINexus, a novel framework for data-efficient CTI knowledge extraction and high-quality cybersecurity knowledge graph (CSKG) construction.
arXiv Detail & Related papers (2024-10-28T14:18:32Z) - A Lean Transformer Model for Dynamic Malware Analysis and Detection [0.0]
Malware is a fast-growing threat to the modern computing world and existing lines of defense are not efficient enough to address this issue.
Previous works have shown some success leveraging Neural Networks and API calls sequences extracted from execution reports.
In this paper, we design an emulation-Only model, based on the Transformers architecture, to detect malicious files.
arXiv Detail & Related papers (2024-08-05T08:46:46Z) - RoBERTa-Augmented Synthesis for Detecting Malicious API Requests [9.035212370386846]
We introduce a GAN-inspired learning framework that extends limited API traffic datasets through targeted, domain-aware synthesis.<n>We evaluate our framework on two benchmark datasets, CSIC 2010 and ATRDF 2023, and compare it with a previous data augmentation technique.<n>Our method achieves up to a 4.94% increase in F1 score on CSIC 2010 and up to 21.10% on ATRDF 2023.
arXiv Detail & Related papers (2024-05-18T11:10:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.