Learning Robust Representations for Malicious Content Detection via Contrastive Sampling and Uncertainty Estimation
- URL: http://arxiv.org/abs/2512.08969v1
- Date: Mon, 01 Dec 2025 22:06:06 GMT
- Title: Learning Robust Representations for Malicious Content Detection via Contrastive Sampling and Uncertainty Estimation
- Authors: Elias Hossain, Umesh Biswas, Charan Gudla, Sai Phani Parsa,
- Abstract summary: The Uncertainty Contrastive Framework (UCF) integrates uncertainty-aware contrastive loss, adaptive temperature scaling, and a self-attention-guided LSTM encoder to improve classification under noisy and imbalanced conditions. UCF dynamically adjusts contrastive weighting based on sample confidence, stabilizes training using positive anchors, and adapts temperature parameters to batch-level variability.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose the Uncertainty Contrastive Framework (UCF), a Positive-Unlabeled (PU) representation learning framework that integrates uncertainty-aware contrastive loss, adaptive temperature scaling, and a self-attention-guided LSTM encoder to improve classification under noisy and imbalanced conditions. UCF dynamically adjusts contrastive weighting based on sample confidence, stabilizes training using positive anchors, and adapts temperature parameters to batch-level variability. Applied to malicious content classification, UCF-generated embeddings enable multiple traditional classifiers to achieve more than 93.38% accuracy, precision above 0.93, and near-perfect recall, with minimal false negatives and competitive ROC-AUC scores. Visual analyses confirm clear separation between positive and unlabeled instances, highlighting the framework's ability to produce calibrated, discriminative embeddings. These results position UCF as a robust and scalable solution for PU learning in high-stakes domains such as cybersecurity and biomedical text mining.
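The confidence-weighted contrastive loss with adaptive temperature described in the abstract can be illustrated with a minimal sketch. This is a hypothetical NumPy implementation of the general idea (InfoNCE-style loss, anchor contributions scaled by per-sample confidence, temperature adapted to batch similarity spread), not the authors' published UCF code; all function and parameter names are invented for illustration.

```python
import numpy as np

def uncertainty_weighted_contrastive_loss(emb, anchor_idx, pos_idx,
                                          confidence, base_tau=0.07):
    """Illustrative InfoNCE-style loss: each anchor's contribution is
    scaled by its confidence, and the temperature adapts to how variable
    the batch similarities are. Hypothetical sketch, not the UCF paper's
    exact formulation."""
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # cosine space
    sims = emb @ emb.T
    # adaptive temperature: soften the distribution when the batch's
    # pairwise similarities are highly variable
    tau = base_tau * (1.0 + sims.std())
    losses = []
    for a, p in zip(anchor_idx, pos_idx):
        logits = np.delete(sims[a], a) / tau      # exclude self-similarity
        pos = sims[a, p] / tau                    # positive-anchor logit
        loss = np.log(np.exp(logits).sum()) - pos
        losses.append(confidence[a] * loss)       # down-weight uncertain anchors
    return float(np.mean(losses))
```

Confident anchors dominate the gradient while noisy, low-confidence samples are softly suppressed, which is the mechanism the abstract attributes to UCF's dynamic contrastive weighting.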
Related papers
- LATA: Laplacian-Assisted Transductive Adaptation for Conformal Uncertainty in Medical VLMs [61.06744611795341]
Medical vision-language models (VLMs) are strong zero-shot recognizers for medical imaging. We propose LATA (Laplacian-Assisted Transductive Adaptation), a training- and label-free refinement. LATA sharpens zero-shot predictions without compromising exchangeability.
arXiv Detail & Related papers (2026-02-19T16:45:38Z)
- A Confidence-Variance Theory for Pseudo-Label Selection in Semi-Supervised Learning [15.149171763610662]
This paper introduces a Confidence-Variance (CoVar) theory framework that provides a principled joint reliability criterion for pseudo-label selection. We show that reliable pseudo-labels should have both high MC and low RCV, and that the influence of RCV increases as confidence grows. We integrate CoVar as a plug-in module into representative semi-supervised semantic segmentation and image classification methods.
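A joint confidence-variance selection rule of the kind summarized above can be sketched as follows. This is a toy interpretation (MC read as mean max-class confidence across stochastic runs, RCV as run-to-run variance of the predicted class's probability); the thresholds and the coupling rule are illustrative, not the paper's exact criterion.

```python
import numpy as np

def covar_select(prob_runs, conf_thresh=0.9, var_scale=0.05):
    """Toy pseudo-label selector in the spirit of a confidence-variance
    criterion: keep samples whose mean confidence is high AND whose
    run-to-run variance is low, with the variance requirement tightening
    as confidence grows. prob_runs has shape (runs, samples, classes)."""
    mean_prob = prob_runs.mean(axis=0)           # average over stochastic runs
    labels = mean_prob.argmax(axis=1)            # candidate pseudo-labels
    mc = mean_prob.max(axis=1)                   # mean confidence per sample
    # variance of the predicted class's probability across runs
    rcv = prob_runs[:, np.arange(len(labels)), labels].var(axis=0)
    # variance is weighted more heavily as confidence grows
    keep = (mc >= conf_thresh) & (rcv <= var_scale * (1.0 - mc) + 1e-3)
    return labels, keep
```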
arXiv Detail & Related papers (2026-01-16T02:51:59Z)
- SCS-SupCon: Sigmoid-based Common and Style Supervised Contrastive Learning with Adaptive Decision Boundaries [13.983602516442454]
We propose Sigmoid-based Common and Style Supervised Contrastive Learning (SCS-SupCon). Our framework introduces a sigmoid-based pairwise contrastive loss with learnable temperature and bias parameters to enable adaptive decision boundaries. SCS-SupCon achieves state-of-the-art performance across both CNN and Transformer backbones.
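A sigmoid-based pairwise contrastive loss with temperature and bias, as summarized above, can be sketched in a few lines. This is a generic SigLIP-style formulation for illustration; in the paper the temperature and bias are learnable, whereas here they are fixed inputs, and the exact published loss may differ.

```python
import numpy as np

def sigmoid_pairwise_contrastive_loss(emb, labels, temperature=10.0, bias=-5.0):
    """Illustrative sigmoid-based pairwise contrastive loss: every pair is
    an independent binary problem (same class vs. different class), with a
    scale (temperature) and shift (bias) shaping the decision boundary."""
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sims = emb @ emb.T
    same = (labels[:, None] == labels[None, :]).astype(float)
    signs = 2.0 * same - 1.0                     # +1 same class, -1 otherwise
    logits = signs * (temperature * sims + bias) # learnable scale and shift
    np.fill_diagonal(logits, np.inf)             # self-pairs contribute 0 loss
    # numerically stable -log(sigmoid(x)) == log(1 + exp(-x))
    pair_loss = np.logaddexp(0.0, -logits)
    return float(pair_loss.mean())
```

Because each pair is scored independently through a sigmoid rather than a batch-wide softmax, the bias term can move the similarity threshold between positives and negatives, which is one way a loss like this supports adaptive decision boundaries.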
arXiv Detail & Related papers (2025-12-17T15:55:47Z)
- Semi-Supervised Regression with Heteroscedastic Pseudo-Labels [50.54050677867914]
We propose an uncertainty-aware pseudo-labeling framework that dynamically adjusts pseudo-label influence from a bi-level optimization perspective. We provide theoretical insights and extensive experiments to validate our approach across various benchmark SSR datasets.
arXiv Detail & Related papers (2025-10-17T03:06:23Z)
- Deciding When Not to Decide: Indeterminacy-Aware Intrusion Detection with NeutroSENSE [0.0]
NeutroSENSE is a neutrosophic-enhanced ensemble framework for interpretable intrusion detection in IoT environments. The system decomposes prediction confidence into truth (T), falsity (F), and indeterminacy (I) components, enabling uncertainty quantification and abstention.
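The T/F/I decomposition with abstention described above can be illustrated with a small sketch for a binary detector. The decomposition (truth as mean ensemble support, falsity as its complement, indeterminacy as member disagreement) and the abstention threshold are a hypothetical reading of the summary, not NeutroSENSE's published formulation.

```python
import numpy as np

def neutrosophic_decision(ensemble_probs, abstain_thresh=0.3):
    """Illustrative neutrosophic decision for a binary intrusion detector:
    truth = mean ensemble support for the predicted class, falsity = its
    complement, indeterminacy = disagreement across ensemble members.
    Abstain when indeterminacy is high."""
    p_attack = np.asarray(ensemble_probs, dtype=float)  # one prob per member
    votes = p_attack >= 0.5
    label = int(votes.mean() >= 0.5)                    # majority vote
    t = p_attack.mean() if label == 1 else 1.0 - p_attack.mean()
    f = 1.0 - t
    i = p_attack.std()                                  # member disagreement
    if i >= abstain_thresh:
        return "abstain", (t, f, i)
    return ("attack" if label == 1 else "benign"), (t, f, i)
```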
arXiv Detail & Related papers (2025-06-05T11:43:31Z)
- Adversarial Robustness Overestimation and Instability in TRADES [4.063518154926961]
TRADES sometimes yields disproportionately high PGD validation accuracy compared to the AutoAttack testing accuracy in the multiclass classification task.
This discrepancy highlights a significant overestimation of robustness for these instances, potentially linked to gradient masking.
arXiv Detail & Related papers (2024-10-10T07:32:40Z)
- MixedNUTS: Training-Free Accuracy-Robustness Balance via Nonlinearly Mixed Classifiers [41.56951365163419]
"MixedNUTS" is a training-free method where the output logits of a robust classifier are processed by nonlinear transformations with only three parameters.
MixedNUTS then converts the transformed logits into probabilities and mixes them as the overall output.
On CIFAR-10, CIFAR-100, and ImageNet datasets, experimental results with custom strong adaptive attacks demonstrate MixedNUTS's vastly improved accuracy and near-SOTA robustness.
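The nonlinear transform-then-mix scheme summarized above can be sketched as follows. This is an illustrative interpretation (a clamp/power/scale three-parameter transform on the robust model's logits, followed by probability mixing with weight `alpha`); the exact published transform and mixing rule may differ, and all parameter names here are assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mixed_nonlinear_ensemble(acc_logits, rob_logits,
                             s=1.0, p=2.0, c=0.0, alpha=0.5):
    """Sketch of a training-free accuracy-robustness mix in the spirit of
    MixedNUTS: the robust classifier's logits pass through a three-parameter
    nonlinearity (clamp at c, raise to power p, scale by s), then the two
    models' probabilities are averaged with weight alpha."""
    transformed = s * np.maximum(rob_logits - c, 0.0) ** p
    return alpha * softmax(acc_logits) + (1.0 - alpha) * softmax(transformed)
```

The nonlinearity sharpens the robust model's confident predictions before mixing, so the accurate model dominates on clean, easy inputs while the robust model's high-margin decisions survive the mix.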
arXiv Detail & Related papers (2024-02-03T21:12:36Z)
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
- Binary Classification with Confidence Difference [100.08818204756093]
This paper delves into a novel weakly supervised binary classification problem called confidence-difference (ConfDiff) classification.
We propose a risk-consistent approach to tackle this problem and show that the estimation error bound achieves the optimal convergence rate.
We also introduce a risk correction approach to mitigate overfitting problems, whose consistency and convergence rate are also proven.
arXiv Detail & Related papers (2023-10-09T11:44:50Z)
- Towards Reliable Medical Image Segmentation by Modeling Evidential Calibrated Uncertainty [57.023423137202485]
Concerns regarding the reliability of medical image segmentation persist among clinicians. We introduce DEviS, an easily implementable foundational model that seamlessly integrates into various medical image segmentation networks. By leveraging subjective logic theory, we explicitly model probability and uncertainty for medical image segmentation.
arXiv Detail & Related papers (2023-01-01T05:02:46Z)
- SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness [61.212486108346695]
We propose a training scheme, coined SmoothMix, to control the robustness of smoothed classifiers via self-mixup.
The proposed procedure effectively identifies over-confident, near off-class samples as a cause of limited robustness.
Our experimental results demonstrate that the proposed method can significantly improve the certified $\ell$-robustness of smoothed classifiers.
arXiv Detail & Related papers (2021-11-17T18:20:59Z)
- Adversarial Training with Rectified Rejection [114.83821848791206]
We propose to use true confidence (T-Con) as a certainty oracle, and learn to predict T-Con by rectifying confidence.
We prove that under mild conditions, a rectified confidence (R-Con) rejector and a confidence rejector can be coupled to distinguish any wrongly classified input from correctly classified ones.
arXiv Detail & Related papers (2021-05-31T08:24:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.