Temperature Scaling Attack Disrupting Model Confidence in Federated Learning
- URL: http://arxiv.org/abs/2602.06638v2
- Date: Mon, 09 Feb 2026 01:55:58 GMT
- Title: Temperature Scaling Attack Disrupting Model Confidence in Federated Learning
- Authors: Kichang Lee, Jaeho Jin, JaeYeon Park, Songkuk Kim, JeongGil Ko
- Abstract summary: We present the Temperature Scaling Attack (TSA), a training-time attack that degrades calibration while preserving accuracy. Our results establish calibration integrity as a critical attack surface in federated learning.
- Score: 4.863985119779627
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Predictive confidence serves as a foundational control signal in mission-critical systems, directly governing risk-aware logic such as escalation, abstention, and conservative fallback. While prior federated learning attacks predominantly target accuracy or implant backdoors, we identify confidence calibration as a distinct attack objective. We present the Temperature Scaling Attack (TSA), a training-time attack that degrades calibration while preserving accuracy. By injecting temperature scaling with learning rate-temperature coupling during local training, malicious updates maintain benign-like optimization behavior, evading accuracy-based monitoring and similarity-based detection. We provide a convergence analysis under non-IID settings, showing that this coupling preserves standard convergence bounds while systematically distorting confidence. Across three benchmarks, TSA substantially shifts calibration (e.g., a 145% error increase on CIFAR-100) with <2% accuracy change, and remains effective under robust aggregation and post-hoc calibration defenses. Case studies further show that confidence manipulation can cause up to 7.2x increases in missed critical cases (healthcare) or false alarms (autonomous driving), even when accuracy is unchanged. Overall, our results establish calibration integrity as a critical attack surface in federated learning.
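The attack's core primitive is ordinary temperature scaling: dividing logits z by a temperature T before the softmax, i.e., softmax(z / T), flattens confidence for T > 1 (or sharpens it for T < 1) without changing the argmax, so accuracy is untouched. Below is a minimal PyTorch sketch of how a malicious client might fold such scaling into its local objective; the function name, the lr-temperature coupling rule, and all hyperparameters are illustrative assumptions for exposition, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def malicious_local_step(model, optimizer, x, y, T=2.0, base_lr=0.01):
    """One illustrative malicious local training step (not the paper's exact method)."""
    # Hypothetical lr-temperature coupling: gradients through z / T shrink
    # roughly by 1/T, so scaling the learning rate by T keeps update
    # magnitudes close to a benign client's, evading similarity checks.
    for group in optimizer.param_groups:
        group["lr"] = base_lr * T
    optimizer.zero_grad()
    logits = model(x)
    # Temperature-scaled objective: softmax(logits / T) flattens confidence
    # for T > 1 without changing the predicted class, preserving accuracy.
    loss = F.cross_entropy(logits / T, y)
    loss.backward()
    optimizer.step()
    return loss.item()
```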
Related papers
- Decision-Aware Trust Signal Alignment for SOC Alert Triage [0.0]
This paper presents a decision-aware trust signal alignment scheme for SOC alert triage. The framework combines calibrated confidence, lightweight uncertainty cues, and cost-sensitive decision thresholds into a coherent decision-support layer. We show that misaligned confidence displays greatly amplify false negatives, whereas cost-weighted loss decreases by orders of magnitude for models with decision-aligned trust signals.
arXiv Detail & Related papers (2026-01-08T01:41:54Z) - Learning Robust Representations for Malicious Content Detection via Contrastive Sampling and Uncertainty Estimation [0.0]
The Uncertainty Contrastive Framework (UCF) integrates an uncertainty-aware contrastive loss, adaptive temperature scaling, and a self-attention-guided LSTM encoder to improve classification under noisy and imbalanced conditions. UCF dynamically adjusts contrastive weighting based on sample confidence, stabilizes training using positive anchors, and adapts temperature parameters to batch-level variability.
arXiv Detail & Related papers (2025-12-01T22:06:06Z) - Uncertainty-Aware Post-Hoc Calibration: Mitigating Confidently Incorrect Predictions Beyond Calibration Metrics [6.9681910774977815]
This paper presents a post-hoc calibration framework to enhance calibration quality and uncertainty-aware decision-making. A comprehensive evaluation is conducted using calibration metrics, uncertainty-aware performance measures, and empirical conformal coverage. Experiments show that the proposed method yields fewer confidently incorrect predictions and competitive Expected Calibration Error compared with isotonic and focal-loss baselines.
arXiv Detail & Related papers (2025-10-19T23:55:36Z) - DATS: Distance-Aware Temperature Scaling for Calibrated Class-Incremental Learning [13.864609787260298]
Continual Learning (CL) is gaining increasing attention for its ability to enable a single model to learn incrementally from a sequence of new classes. In safety-critical applications, predictive models should also be able to reliably communicate their uncertainty in a calibrated manner, that is, with confidence scores aligned to the true frequencies of target events. We propose Distance-Aware Temperature Scaling (DATS), which combines prototype-based distance estimation with distance-aware calibration to infer task proximity and assign adaptive temperatures without prior task information.
arXiv Detail & Related papers (2025-09-25T13:46:56Z) - Trust, or Don't Predict: Introducing the CWSA Family for Confidence-Aware Model Evaluation [0.0]
We introduce two new metrics: Confidence-Weighted Selective Accuracy (CWSA) and its normalized variant, CWSA+. CWSA offers a principled and interpretable way to evaluate predictive models under confidence thresholds. We show that CWSA and CWSA+ effectively detect nuanced failure modes and outperform classical metrics in trust-sensitive tests.
arXiv Detail & Related papers (2025-05-24T10:07:48Z) - Coverage-Guaranteed Speech Emotion Recognition via Calibrated Uncertainty-Adaptive Prediction Sets [0.0]
Road rage, often triggered by emotional suppression and sudden outbursts, significantly threatens road safety by causing collisions and aggressive behavior. Speech emotion recognition technologies can mitigate this risk by identifying negative emotions early and issuing timely alerts. We propose a novel risk-controlled prediction framework providing statistically rigorous guarantees on prediction accuracy.
arXiv Detail & Related papers (2025-03-24T12:26:28Z) - Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning [53.42244686183879]
Conformal prediction provides model-agnostic and distribution-free uncertainty quantification. Yet, conformal prediction is not reliable under poisoning attacks where adversaries manipulate both training and calibration data. We propose reliable prediction sets (RPS): the first efficient method for constructing conformal prediction sets with provable reliability guarantees under poisoning.
arXiv Detail & Related papers (2024-10-13T15:37:11Z) - Calibrating Language Models with Adaptive Temperature Scaling [58.056023173579625]
We introduce Adaptive Temperature Scaling (ATS), a post-hoc calibration method that predicts a temperature scaling parameter for each token prediction.
ATS improves calibration by 10-50% across three downstream natural language evaluation benchmarks compared with prior calibration methods.
arXiv Detail & Related papers (2024-09-29T22:54:31Z) - Towards Certification of Uncertainty Calibration under Adversarial Attacks [96.48317453951418]
We show that attacks can significantly harm calibration, and thus propose certified calibration as worst-case bounds on calibration under adversarial perturbations. We propose novel calibration attacks and demonstrate how they can improve model calibration through adversarial calibration training.
arXiv Detail & Related papers (2024-05-22T18:52:09Z) - Revisiting Confidence Estimation: Towards Reliable Failure Prediction [53.79160907725975]
We find a general, widespread, yet largely neglected phenomenon: most confidence estimation methods are harmful for detecting misclassification errors.
We propose to enlarge the confidence gap by finding flat minima, which yields state-of-the-art failure prediction performance.
arXiv Detail & Related papers (2024-03-05T11:44:14Z) - Selective Learning: Towards Robust Calibration with Dynamic Regularization [79.92633587914659]
Miscalibration in deep learning refers to a discrepancy between predicted confidence and actual performance.
We introduce Dynamic Regularization (DReg), which aims to learn what should be learned during training, thereby circumventing the confidence-adjustment trade-off.
arXiv Detail & Related papers (2024-02-13T11:25:20Z) - Sample-dependent Adaptive Temperature Scaling for Improved Calibration [95.7477042886242]
A common post-hoc approach to compensating for neural network miscalibration is temperature scaling.
We propose to predict a different temperature value for each input, allowing us to adjust the mismatch between confidence and accuracy.
We test our method on the ResNet50 and WideResNet28-10 architectures using the CIFAR10/100 and Tiny-ImageNet datasets.
arXiv Detail & Related papers (2022-07-13T14:13:49Z)
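Several entries above (ATS, DATS, and the sample-dependent scaling paper) build on the same post-hoc temperature scaling primitive that TSA weaponizes. For reference, here is a minimal sketch of the classic single-temperature recipe (Guo et al., 2017), fit by minimizing negative log-likelihood on held-out logits; the function name and hyperparameters are illustrative defaults, not any one paper's implementation.

```python
import torch
import torch.nn.functional as F

def fit_temperature(val_logits, val_labels, iters=200, lr=0.05):
    """Fit a single scalar temperature on a held-out calibration set."""
    # Optimize log T so the temperature stays strictly positive.
    log_T = torch.zeros(1, requires_grad=True)
    opt = torch.optim.Adam([log_T], lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        # NLL of temperature-scaled logits; rescaling cannot change the
        # argmax, so accuracy is unaffected while confidence is recalibrated.
        loss = F.cross_entropy(val_logits / log_T.exp(), val_labels)
        loss.backward()
        opt.step()
    return log_T.exp().item()

# Usage: T = fit_temperature(val_logits, val_labels)
# calibrated_probs = F.softmax(test_logits / T, dim=-1)
```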