Autonomous Chain-of-Thought Distillation for Graph-Based Fraud Detection
- URL: http://arxiv.org/abs/2601.22949v1
- Date: Fri, 30 Jan 2026 13:12:12 GMT
- Title: Autonomous Chain-of-Thought Distillation for Graph-Based Fraud Detection
- Authors: Yuan Li, Jun Hu, Bryan Hooi, Bingsheng He, Cheng Chen
- Abstract summary: Graph-based fraud detection on text-attributed graphs (TAGs) requires jointly modeling rich textual semantics and relational dependencies. We propose FraudCoT, a unified framework that advances TAG-based fraud detection through autonomous, graph-aware chain-of-thought (CoT) reasoning and scalable LLM-GNN co-training.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph-based fraud detection on text-attributed graphs (TAGs) requires jointly modeling rich textual semantics and relational dependencies. However, existing LLM-enhanced GNN approaches are constrained by predefined prompting and decoupled training pipelines, limiting reasoning autonomy and weakening semantic-structural alignment. We propose FraudCoT, a unified framework that advances TAG-based fraud detection through autonomous, graph-aware chain-of-thought (CoT) reasoning and scalable LLM-GNN co-training. To address the limitations of predefined prompts, we introduce a fraud-aware selective CoT distillation mechanism that generates diverse reasoning paths and enhances semantic-structural understanding. These distilled CoTs are integrated into node texts, providing GNNs with enriched, multi-hop semantic and structural cues for fraud detection. Furthermore, we develop an efficient asymmetric co-training strategy that enables end-to-end optimization while significantly reducing the computational cost of naive joint training. Extensive experiments on public and industrial benchmarks demonstrate that FraudCoT achieves up to 8.8% AUPRC improvement over state-of-the-art methods and delivers up to 1,066x speedup in training throughput, substantially advancing both detection performance and efficiency.
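The abstract reports its detection gains in AUPRC, the area under the precision-recall curve, a metric commonly used when positives (here, fraud cases) are rare. As a minimal sketch of how that metric can be computed, the following uses the step-wise "average precision" estimate. The function name `average_precision` and the toy labels and scores are illustrative assumptions, not taken from the paper, and the paper's own evaluation code may differ (for example in how tied scores are handled).

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.

    labels: 1 for fraud (positive), 0 for benign.
    scores: predicted fraud scores; higher means more suspicious.
    Tied scores are ranked in input order (a simplification).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank node indices from most to least suspicious.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    true_positives = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_positives += 1
            # Precision at the rank where recall increases.
            precision_sum += true_positives / rank
    return precision_sum / n_pos


# Hypothetical toy data: 2 fraud nodes among 8.
labels = [0, 0, 1, 0, 1, 0, 0, 0]
scores = [0.10, 0.40, 0.90, 0.20, 0.35, 0.30, 0.05, 0.15]
print(f"{average_precision(labels, scores):.4f}")  # prints 0.8333
```

In this toy ranking the two fraud nodes land at ranks 1 and 3, so the precisions at those points are 1/1 and 2/3, and their mean is 5/6.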
Related papers
- Graph Reasoning Paradigm: Structured and Symbolic Reasoning with Topology-Aware Reinforcement Learning for Large Language Models (2026-01-19)
  Long Chain-of-Thought (LCoT) has proven effective in enhancing the reasoning capabilities of Large Language Models (LLMs). Despite RLVR-based optimization, existing methods still suffer from coarse-grained supervision, reward hacking, high training costs, and poor generalization. We propose the Graph Reasoning Paradigm (GRP), which realizes structured and symbolic reasoning, implemented via graph-structured representations with step-level cognitive labels.
- Accelerate Speculative Decoding with Sparse Computation in Verification (2025-12-26)
  Speculative decoding accelerates autoregressive language model inference by verifying multiple draft tokens in parallel. Existing sparsification methods are designed primarily for standard token-by-token autoregressive decoding. We propose a sparse verification framework that jointly sparsifies attention, FFN, and MoE components during the verification stage to reduce the dominant computation cost.
- GLOW: Graph-Language Co-Reasoning for Agentic Workflow Performance Prediction (2025-12-11)
  We propose GLOW, a unified framework for AW performance prediction. GLOW combines the graph-structure modeling capabilities of GNNs with the reasoning power of LLMs. Experiments on FLORA-Bench show that GLOW outperforms state-of-the-art baselines in prediction accuracy and ranking utility.
- Scaling Graph Chain-of-Thought Reasoning: A Multi-Agent Framework with Efficient LLM Serving (2025-11-03)
  Graph Chain-of-Thought (Graph-CoT) enables large language models (LLMs) to perform step-by-step reasoning over graph-structured knowledge. Existing pipelines suffer from low accuracy, excessive token usage, high latency, and low throughput. We present GLM, the first multi-agent Graph-CoT system co-designed with an optimized LLM serving architecture.
- Latent Chain-of-Thought for Visual Reasoning (2025-10-27)
  Chain-of-thought (CoT) reasoning is critical for improving the interpretability and reliability of Large Vision-Language Models (LVLMs). We reformulate reasoning in LVLMs as posterior inference and propose a scalable training algorithm based on amortized variational inference. We empirically demonstrate that the proposed method enhances the state-of-the-art LVLMs on seven reasoning benchmarks.
- CoT Vectors: Transferring and Probing the Reasoning Mechanisms of LLMs (2025-10-01)
  Chain-of-Thought prompting has emerged as a powerful approach to enhancing the reasoning capabilities of Large Language Models. Existing implementations, such as in-context learning and fine-tuning, remain costly and inefficient. We introduce CoT Vectors, compact representations that encode task-general, multi-step reasoning knowledge.
- SIM-CoT: Supervised Implicit Chain-of-Thought (2025-09-24)
  Implicit Chain-of-Thought (CoT) methods offer a token-efficient alternative to explicit CoT reasoning in Large Language Models. We identify a core latent instability issue when scaling the computational budget of implicit CoT. We propose SIM-CoT, a plug-and-play training module that introduces step-level supervision to stabilize and enrich the latent reasoning space.
- Towards Improving Long-Tail Entity Predictions in Temporal Knowledge Graphs through Global Similarity and Weighted Sampling (2025-07-25)
  Temporal Knowledge Graph (TKG) completion models traditionally assume access to the entire graph during training. We present an incremental training framework specifically designed for TKGs, aiming to address entities that are either not observed during training or have sparse connections. Our approach combines a model-agnostic enhancement layer with a weighted sampling strategy that can be added to, and improve, any existing TKG completion method.
- On the Consistency of GNN Explanations for Malware Detection (2025-04-22)
  Control Flow Graphs (CFGs) are critical for analyzing program execution and characterizing malware behavior. This study proposes a novel framework that dynamically constructs CFGs and embeds node features using a hybrid approach. A GNN-based classifier is then constructed to detect malicious behavior from the resulting graph representations.
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency (2023-05-31)
  Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text. As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.