Related papers: Autonomous Chain-of-Thought Distillation for Graph-Based Fraud Detection

Autonomous Chain-of-Thought Distillation for Graph-Based Fraud Detection

URL: http://arxiv.org/abs/2601.22949v1
Date: Fri, 30 Jan 2026 13:12:12 GMT
Title: Autonomous Chain-of-Thought Distillation for Graph-Based Fraud Detection
Authors: Yuan Li, Jun Hu, Bryan Hooi, Bingsheng He, Cheng Chen,
Abstract summary: Graph-based fraud detection on text-attributed graphs (TAGs) requires jointly modeling rich textual semantics and relational dependencies.<n>We propose FraudCoT, a unified framework that advances TAG-based fraud detection through autonomous, graph-aware chain-of-thought (CoT) reasoning and scalable LLM-GNN co-training.
Score: 73.9189065770752
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Graph-based fraud detection on text-attributed graphs (TAGs) requires jointly modeling rich textual semantics and relational dependencies. However, existing LLM-enhanced GNN approaches are constrained by predefined prompting and decoupled training pipelines, limiting reasoning autonomy and weakening semantic-structural alignment. We propose FraudCoT, a unified framework that advances TAG-based fraud detection through autonomous, graph-aware chain-of-thought (CoT) reasoning and scalable LLM-GNN co-training. To address the limitations of predefined prompts, we introduce a fraud-aware selective CoT distillation mechanism that generates diverse reasoning paths and enhances semantic-structural understanding. These distilled CoTs are integrated into node texts, providing GNNs with enriched, multi-hop semantic and structural cues for fraud detection. Furthermore, we develop an efficient asymmetric co-training strategy that enables end-to-end optimization while significantly reducing the computational cost of naive joint training. Extensive experiments on public and industrial benchmarks demonstrate that FraudCoT achieves up to 8.8% AUPRC improvement over state-of-the-art methods and delivers up to 1,066x speedup in training throughput, substantially advancing both detection performance and efficiency.

Related papers

Graph Reasoning Paradigm: Structured and Symbolic Reasoning with Topology-Aware Reinforcement Learning for Large Language Models [45.28250076657801]
Long Chain-of-Thought (LCoT) has proven effective in enhancing the reasoning capabilities of Large Language Models (LLMs)<n>Despite RLVR-based optimization, existing methods still suffer from coarse-grained supervision, reward hacking, high training costs, and poor generalization.<n>We propose the Graph Reasoning Paradigm (GRP), which realizes structured and symbolic reasoning, implemented via graph-structured representations with step-level cognitive labels.
arXiv Detail & Related papers (2026-01-19T12:23:00Z)
Accelerate Speculative Decoding with Sparse Computation in Verification [49.74839681322316]
Speculative decoding accelerates autoregressive language model inference by verifying multiple draft tokens in parallel.<n>Existing sparsification methods are designed primarily for standard token-by-token autoregressive decoding.<n>We propose a sparse verification framework that jointly sparsifies attention, FFN, and MoE components during the verification stage to reduce the dominant computation cost.
arXiv Detail & Related papers (2025-12-26T07:53:41Z)
GLOW: Graph-Language Co-Reasoning for Agentic Workflow Performance Prediction [51.83437071408662]
We propose GLOW, a unified framework for AW performance prediction.<n>GLOW combines the graph-structure modeling capabilities of GNNs with the reasoning power of LLMs.<n>Experiments on FLORA-Bench show that GLOW outperforms state-of-the-art baselines in prediction accuracy and ranking utility.
arXiv Detail & Related papers (2025-12-11T13:30:46Z)
Scaling Graph Chain-of-Thought Reasoning: A Multi-Agent Framework with Efficient LLM Serving [38.059017394879284]
Graph Chain-of-Thought (Graph-CoT) enables large language models (LLMs) to perform step-by-step reasoning over graph-structured knowledge.<n>Existing pipelines suffer from low accuracy, excessive token usage, high latency, and low throughput.<n>We present GLM, the first multi-agent Graph-CoT system co-designed with an optimized LLM serving architecture.
arXiv Detail & Related papers (2025-11-03T14:42:53Z)
Latent Chain-of-Thought for Visual Reasoning [53.541579327424046]
Chain-of-thought (CoT) reasoning is critical for improving the interpretability and reliability of Large Vision-Language Models (LVLMs)<n>We reformulate reasoning in LVLMs as posterior inference and propose a scalable training algorithm based on amortized variational inference.<n>We empirically demonstrate that the proposed method enhances the state-of-the-art LVLMs on seven reasoning benchmarks.
arXiv Detail & Related papers (2025-10-27T23:10:06Z)
CoT Vectors: Transferring and Probing the Reasoning Mechanisms of LLMs [33.63911145333626]
Chain-of-Thought prompting has emerged as a powerful approach to enhancing the reasoning capabilities of Large Language Models.<n>Existing implementations, such as in-context learning and fine-tuning, remain costly and inefficient.<n>We introduce CoT Vectors, compact representations that encode task-general, multi-step reasoning knowledge.
arXiv Detail & Related papers (2025-10-01T06:58:23Z)
SIM-CoT: Supervised Implicit Chain-of-Thought [108.30049193668083]
Implicit Chain-of-Thought (CoT) methods offer a token-efficient alternative to explicit CoT reasoning in Large Language Models.<n>We identify a core latent instability issue when scaling the computational budget of implicit CoT.<n>We propose SIM-CoT, a plug-and-play training module that introduces step-level supervision to stabilize and enrich the latent reasoning space.
arXiv Detail & Related papers (2025-09-24T17:01:32Z)
Towards Improving Long-Tail Entity Predictions in Temporal Knowledge Graphs through Global Similarity and Weighted Sampling [53.11315884128402]
Temporal Knowledge Graph (TKG) completion models traditionally assume access to the entire graph during training.<n>We present an incremental training framework specifically designed for TKGs, aiming to address entities that are either not observed during training or have sparse connections.<n>Our approach combines a model-agnostic enhancement layer with a weighted sampling strategy, that can be augmented to and improve any existing TKG completion method.
arXiv Detail & Related papers (2025-07-25T06:02:48Z)
On the Consistency of GNN Explanations for Malware Detection [2.464148828287322]
Control Flow Graphs (CFGs) are critical for analyzing program execution and characterizing malware behavior.<n>This study proposes a novel framework that dynamically constructs CFGs and embeds node features using a hybrid approach.<n>A GNN-based classifier is then constructed to detect malicious behavior from the resulting graph representations.
arXiv Detail & Related papers (2025-04-22T23:25:12Z)
Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic being trained is used to generate annotations for unlabeled text. As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z)

This list is automatically generated from the titles and abstracts of the papers in this site.