Related papers: Adaptive Federated Learning Defences via Trust-Aware Deep Q-Networks

Related papers

ThreatFormer-IDS: Robust Transformer Intrusion Detection with Zero-Day Generalization and Explainable Attribution [0.0]
Intrusion detection in IoT and industrial networks requires models that can detect rare attacks at low false-positive rates while remaining reliable under evolving traffic and limited labels.<n>We propose ThreatFormer- IDS, a Transformer-based sequence modeling framework that converts flow records into time-ordered windows and learns contextual representations for robust intrusion screening.<n>On the ToN IoT benchmark with chronological evaluation, ThreatFormer-IDS achieves AUCROC 0.994, AUC-PR 0.956, and Recall@1%FPR 0.910, outperforming strong tree-based and sequence baselines.
arXiv Detail & Related papers (2026-02-26T23:20:42Z)
BadCLIP++: Stealthy and Persistent Backdoors in Multimodal Contrastive Learning [73.46118996284888]
Research on backdoor attacks against multimodal contrastive learning models faces two key challenges: stealthiness and persistence.<n>We propose BadCLIP++, a unified framework that tackles both challenges.<n>For stealthiness, we introduce a semantic-fusion QR micro-trigger that embeds imperceptible patterns near task-relevant regions.<n>For persistence, we stabilize trigger embeddings via radius shrinkage and centroid alignment.
arXiv Detail & Related papers (2026-02-19T08:31:16Z)
Constraint-Rectified Training for Efficient Chain-of-Thought [60.52883907721588]
Chain-of-Thought (CoT) has significantly enhanced the reasoning capabilities of Large Language Models (LLMs)<n>While longer reasoning traces can improve answer quality and unlock abilities such as self-correction, they also incur high inference costs and often introduce redundant steps, known as overthinking.<n>Recent research seeks to develop efficient reasoning strategies that balance reasoning length and accuracy.
arXiv Detail & Related papers (2026-02-13T02:13:45Z)
On Robustness and Chain-of-Thought Consistency of RL-Finetuned VLMs [15.301640007799735]
We show that simple, controlled textual perturbations--misleading captions or incorrect chain-of-thought (CoT) traces--cause substantial drops in robustness and confidence.<n>To better understand these vulnerabilities, we analyze RL fine-tuning dynamics and uncover an accuracy-faithfulness trade-off.
arXiv Detail & Related papers (2026-02-13T01:12:00Z)
Empirical Analysis of Adversarial Robustness and Explainability Drift in Cybersecurity Classifiers [0.0]
This paper presents an empirical study of adversarial robustness and explainability drift across two cybersecurity domains.<n>We introduce a quantitative metric, the Robustness Index (RI), defined as the area under the accuracy perturbation curve.<n>Experiments on the Phishing Websites and NB15 datasets show consistent robustness trends, with adversarial training improving RI by up to 9 percent while maintaining clean-data accuracy.
arXiv Detail & Related papers (2026-02-06T05:30:37Z)
Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency [78.91846841708586]
We show that even facts answered with perfect self-consistency can rapidly collapse under mild contextual interference.<n>We propose Neighbor-Consistency Belief (NCB), a structural measure of belief that evaluates response coherence across a conceptual neighborhood.<n>We also present Structure-Aware Training (SAT), which optimize context-invariant belief structure and reduces long-tail knowledge brittleness by approximately 30%.
arXiv Detail & Related papers (2026-01-09T16:23:21Z)
Parent-Guided Adaptive Reliability (PGAR): A Behavioural Meta-Learning Framework for Stable and Trustworthy AI [0.0]
Parent-Guided Adaptive Reliability (PGAR) is a lightweight behavioural meta-learning framework.<n>It adds a supervisory "parent" layer on top of a standard learner to improve stability, calibration, and recovery under disturbances.<n>PGAR functions as a plug-in reliability layer for existing optimization and learning pipelines, supporting interpretable traces in safety-relevant settings.
arXiv Detail & Related papers (2026-01-07T06:02:34Z)
Confidence-Based Response Abstinence: Improving LLM Trustworthiness via Activation-Based Uncertainty Estimation [7.3923284353934875]
We propose a method for confidence estimation in retrieval-augmented generation (RAG) systems that aligns closely with the correctness of large language model (LLM) outputs.<n>Our approach extends prior uncertainty quantification methods by leveraging raw feed-forward network (FFN) activations as auto-regressive signals.<n>Our results demonstrate that activation-based confidence modeling offers a scalable, architecture-aware path toward trustworthy RAG deployment.
arXiv Detail & Related papers (2025-10-15T16:55:56Z)
Distributionally Robust Safety Verification of Neural Networks via Worst-Case CVaR [3.0458514384586404]
This paper builds on Fazlyab's quadratic-constraint (QC) and semidefinite-programming (SDP) framework for neural network verification.<n>The integration broadens input-uncertainty geometry-covering ellipsoids, polytopes, and hyperplanes-and extends applicability to safety-critical domains.
arXiv Detail & Related papers (2025-09-22T07:04:53Z)
Certainty-Guided Reasoning in Large Language Models: A Dynamic Thinking Budget Approach [0.15749416770494704]
We show that Certainty-Guided Reasoning (CGR) improves baseline accuracy while reducing token usage.<n>CGR can eliminate millions of tokens in aggregate, with tunable trade-offs between certainty thresholds and efficiency.<n>By integrating confidence into the reasoning process, CGR makes large reasoning language models more adaptive, trustworthy, and resource efficient.
arXiv Detail & Related papers (2025-09-09T14:57:15Z)
ConCISE: Confidence-guided Compression in Step-by-step Efficient Reasoning [64.93140713419561]
Large Reasoning Models (LRMs) perform strongly in complex reasoning tasks via Chain-of-Thought (CoT) prompting, but often suffer from verbose outputs.<n>Existing fine-tuning-based compression methods either operate post-hoc pruning, risking disruption to reasoning coherence, or rely on sampling-based selection.<n>We introduce ConCISE, a framework designed to generate concise reasoning chains, integrating Confidence Injection to boost reasoning confidence, and Early Stopping to terminate reasoning when confidence is sufficient.
arXiv Detail & Related papers (2025-05-08T01:40:40Z)
AuditVotes: A Framework Towards More Deployable Certified Robustness for Graph Neural Networks [16.75401687734174]
AuditVotes is a framework to achieve high clean accuracy and certifiably robust accuracy for Graph Neural Networks (GNNs)<n>It integrates randomized smoothing with two key components, underlineaugmentation and conunderlineditional smoothing.<n>It significantly enhances clean accuracy, certified robustness, and empirical robustness while maintaining high computational efficiency.
arXiv Detail & Related papers (2025-03-29T07:27:32Z)
Adversarial Robustness Overestimation and Instability in TRADES [4.063518154926961]
TRADES sometimes yields disproportionately high PGD validation accuracy compared to the AutoAttack testing accuracy in the multiclass classification task. This discrepancy highlights a significant overestimation of robustness for these instances, potentially linked to gradient masking.
arXiv Detail & Related papers (2024-10-10T07:32:40Z)
ReliOcc: Towards Reliable Semantic Occupancy Prediction via Uncertainty Learning [26.369237406972577]
Vision-centric semantic occupancy prediction plays a crucial role in autonomous driving. There is still few research effort to explore the reliability in predicting semantic occupancy from camera. We propose ReliOcc, a method designed to enhance the reliability of camera-based occupancy networks.
arXiv Detail & Related papers (2024-09-26T16:33:16Z)
Certified Adversarial Defenses Meet Out-of-Distribution Corruptions: Benchmarking Robustness and Simple Baselines [65.0803400763215]
This work critically examines how adversarial robustness guarantees change when state-of-the-art certifiably robust models encounter out-of-distribution data. We propose a novel data augmentation scheme, FourierMix, that produces augmentations to improve the spectral coverage of the training data. We find that FourierMix augmentations help eliminate the spectral bias of certifiably robust models enabling them to achieve significantly better robustness guarantees on a range of OOD benchmarks.
arXiv Detail & Related papers (2021-12-01T17:11:22Z)
Adversarial Robustness under Long-Tailed Distribution [93.50792075460336]
Adversarial robustness has attracted extensive studies recently by revealing the vulnerability and intrinsic characteristics of deep networks. In this work we investigate the adversarial vulnerability as well as defense under long-tailed distributions. We propose a clean yet effective framework, RoBal, which consists of two dedicated modules, a scale-invariant and data re-balancing.
arXiv Detail & Related papers (2021-04-06T17:53:08Z)
Uncertainty-Aware Deep Calibrated Salient Object Detection [74.58153220370527]
Existing deep neural network based salient object detection (SOD) methods mainly focus on pursuing high network accuracy. These methods overlook the gap between network accuracy and prediction confidence, known as the confidence uncalibration problem. We introduce an uncertaintyaware deep SOD network, and propose two strategies to prevent deep SOD networks from being overconfident.
arXiv Detail & Related papers (2020-12-10T23:28:36Z)
Adversarial Robustness on In- and Out-Distribution Improves Explainability [109.68938066821246]
RATIO is a training procedure for robustness via Adversarial Training on In- and Out-distribution. RATIO achieves state-of-the-art $l$-adrial on CIFAR10 and maintains better clean accuracy.
arXiv Detail & Related papers (2020-03-20T18:57:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.