CleverCatch: A Knowledge-Guided Weak Supervision Model for Fraud Detection
- URL: http://arxiv.org/abs/2510.13205v1
- Date: Wed, 15 Oct 2025 06:49:31 GMT
- Title: CleverCatch: A Knowledge-Guided Weak Supervision Model for Fraud Detection
- Authors: Amirhossein Mozafari, Kourosh Hashemi, Erfan Shafagh, Soroush Motamedi, Azar Taheri Tayebi, Mohammad A. Tayebi,
- Abstract summary: CleverCatch is a knowledge-guided weak supervision model designed to detect fraudulent prescription behaviors.<n>Our approach integrates structured domain expertise into a neural architecture that aligns rules and data samples within a shared embedding space.<n> Experiments on the large-scale real-world dataset demonstrate that CleverCatch outperforms four state-of-the-art anomaly detection baselines.
- Score: 0.36944296923226316
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Healthcare fraud detection remains a critical challenge due to limited availability of labeled data, constantly evolving fraud tactics, and the high dimensionality of medical records. Traditional supervised methods are challenged by extreme label scarcity, while purely unsupervised approaches often fail to capture clinically meaningful anomalies. In this work, we introduce CleverCatch, a knowledge-guided weak supervision model designed to detect fraudulent prescription behaviors with improved accuracy and interpretability. Our approach integrates structured domain expertise into a neural architecture that aligns rules and data samples within a shared embedding space. By training encoders jointly on synthetic data representing both compliance and violation, CleverCatch learns soft rule embeddings that generalize to complex, real-world datasets. This hybrid design enables data-driven learning to be enhanced by domain-informed constraints, bridging the gap between expert heuristics and machine learning. Experiments on the large-scale real-world dataset demonstrate that CleverCatch outperforms four state-of-the-art anomaly detection baselines, yielding average improvements of 1.3\% in AUC and 3.4\% in recall. Our ablation study further highlights the complementary role of expert rules, confirming the adaptability of the framework. The results suggest that embedding expert rules into the learning process not only improves detection accuracy but also increases transparency, offering an interpretable approach for high-stakes domains such as healthcare fraud detection.
Related papers
- Robust Molecular Property Prediction via Densifying Scarce Labeled Data [53.24886143129006]
In drug discovery, compounds most critical for advancing research often lie beyond the training set.<n>We propose a novel bilevel optimization approach that leverages unlabeled data to interpolate between in-distribution (ID) and out-of-distribution (OOD) data.
arXiv Detail & Related papers (2025-06-13T15:27:40Z) - Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detection [50.343419243749054]
Anomaly detection is critical in fields such as medical diagnostics and industrial defect detection.<n> CLIP's coarse-grained image-text alignment limits localization and detection performance for fine-grained anomalies.<n>Crane improves the state-of-the-art ZSAD from 2% to 28%, at both image and pixel levels, while remaining competitive in inference speed.
arXiv Detail & Related papers (2025-04-15T10:42:25Z) - Reduction of Supervision for Biomedical Knowledge Discovery [28.68816381566995]
It is essential to employ automated methods for knowledge extraction and processing.<n>Finding the right balance between the level of supervision and the effectiveness of models poses a significant challenge.<n>Our study addresses the challenge of identifying semantic relationships between biomedical entities in unstructured text.
arXiv Detail & Related papers (2025-04-13T14:05:40Z) - Lie Detector: Unified Backdoor Detection via Cross-Examination Framework [68.45399098884364]
We propose a unified backdoor detection framework in the semi-honest setting.<n>Our method achieves superior detection performance, improving accuracy by 5.4%, 1.6%, and 11.9% over SoTA baselines.<n> Notably, it is the first to effectively detect backdoors in multimodal large language models.
arXiv Detail & Related papers (2025-03-21T06:12:06Z) - Ensuring Medical AI Safety: Interpretability-Driven Detection and Mitigation of Spurious Model Behavior and Associated Data [14.991686165405959]
We show the applicability of the framework using four medical datasets across two modalities.<n>We successfully identify and unlearn these biases in VGG16, ResNet50, and contemporary Vision Transformer models.
arXiv Detail & Related papers (2025-01-23T16:39:09Z) - Weakly Supervised Anomaly Detection via Knowledge-Data Alignment [24.125871437370357]
Anomaly detection plays a pivotal role in numerous web-based applications, including malware detection, anti-money laundering, device failure detection, and network fault analysis.
Weakly Supervised Anomaly Detection (WSAD) has been introduced with a limited number of labeled anomaly samples to enhance model performance.
We introduce a novel framework Knowledge-Data Alignment (KDAlign) to integrate rule knowledge, typically summarized by human experts, to supplement the limited labeled data.
arXiv Detail & Related papers (2024-02-06T07:57:13Z) - Self-Supervised Graph Transformer for Deepfake Detection [1.8133635752982105]
Deepfake detection methods have shown promising results in recognizing forgeries within a given dataset.
Deepfake detection system must remain impartial to forgery types, appearance, and quality for guaranteed generalizable detection performance.
This study introduces a deepfake detection framework, leveraging a self-supervised pre-training model that delivers exceptional generalization ability.
arXiv Detail & Related papers (2023-07-27T17:22:41Z) - Hierarchical Semi-Supervised Contrastive Learning for
Contamination-Resistant Anomaly Detection [81.07346419422605]
Anomaly detection aims at identifying deviant samples from the normal data distribution.
Contrastive learning has provided a successful way to sample representation that enables effective discrimination on anomalies.
We propose a novel hierarchical semi-supervised contrastive learning framework, for contamination-resistant anomaly detection.
arXiv Detail & Related papers (2022-07-24T18:49:26Z) - Uncertainty-Aware Deep Co-training for Semi-supervised Medical Image
Segmentation [4.935055133266873]
We propose a novel uncertainty-aware scheme to make models learn regions purposefully.
Specifically, we employ Monte Carlo Sampling as an estimation method to attain an uncertainty map.
In the backward process, we joint unsupervised and supervised losses to accelerate the convergence of the network.
arXiv Detail & Related papers (2021-11-23T03:26:24Z) - Federated Semi-supervised Medical Image Classification via Inter-client
Relation Matching [58.26619456972598]
Federated learning (FL) has emerged with increasing popularity to collaborate distributed medical institutions for training deep networks.
This paper studies a practical yet challenging FL problem, named textitFederated Semi-supervised Learning (FSSL)
We present a novel approach for this problem, which improves over traditional consistency regularization mechanism with a new inter-client relation matching scheme.
arXiv Detail & Related papers (2021-06-16T07:58:00Z) - Adversarial Imitation Learning with Trajectorial Augmentation and
Correction [61.924411952657756]
We introduce a novel augmentation method which preserves the success of the augmented trajectories.
We develop an adversarial data augmented imitation architecture to train an imitation agent using synthetic experts.
Experiments show that our data augmentation strategy can improve accuracy and convergence time of adversarial imitation.
arXiv Detail & Related papers (2021-03-25T14:49:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.