Exposing Vulnerabilities in Explanation for Time Series Classifiers via Dual-Target Attacks
- URL: http://arxiv.org/abs/2602.02763v2
- Date: Sun, 08 Feb 2026 20:37:28 GMT
- Title: Exposing Vulnerabilities in Explanation for Time Series Classifiers via Dual-Target Attacks
- Authors: Bohan Wang, Zewen Liu, Lu Lin, Hui Liu, Li Xiong, Ming Jin, Wei Jin
- Abstract summary: Interpretable time series deep learning systems are often assessed by checking the temporal consistency of explanations. We show that predictions and explanations can be adversarially decoupled, enabling targeted misclassification. We propose TSEF (Time Series Explanation Fooler), a dual-target attack that jointly manipulates the classifier and explainer outputs.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Interpretable time series deep learning systems are often assessed by checking the temporal consistency of explanations, implicitly treating this as evidence of robustness. We show that this assumption can fail: predictions and explanations can be adversarially decoupled, enabling targeted misclassification while the explanation remains plausible and consistent with a chosen reference rationale. We propose TSEF (Time Series Explanation Fooler), a dual-target attack that jointly manipulates the classifier and explainer outputs. In contrast to single-objective misclassification attacks, which disrupt explanations and spread attribution mass broadly, TSEF achieves targeted prediction changes while keeping explanations consistent with the reference. Across multiple datasets and explainer backbones, our results consistently reveal that explanation stability is a misleading proxy for decision robustness and motivate coupling-aware robustness evaluations for trustworthy time series tasks.
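To make the dual-target idea concrete, below is a minimal, hypothetical PyTorch sketch of such an attack: a PGD-style loop whose loss combines a targeted cross-entropy term with a term that pins an input-gradient saliency map to a chosen reference rationale. The toy 1D-CNN, the choice of input-gradient saliency as the explainer, and all hyperparameters (`eps`, `alpha`, `lam`) are illustrative assumptions, not the authors' TSEF implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def saliency(model, x):
    """Input-gradient saliency, kept differentiable for double backprop."""
    logits = model(x)
    grad, = torch.autograd.grad(logits.max(dim=1).values.sum(), x,
                                create_graph=True)
    return grad.abs()

def dual_target_attack(model, x, y_target, ref_expl,
                       eps=0.1, alpha=0.01, steps=50, lam=1.0):
    """PGD on a joint objective: hit y_target, keep saliency near ref_expl."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        x_adv = x + delta
        cls_loss = F.cross_entropy(model(x_adv), y_target)        # flip the label
        expl_loss = F.mse_loss(saliency(model, x_adv), ref_expl)  # pin the rationale
        (cls_loss + lam * expl_loss).backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)
        delta.grad.zero_()
    return (x + delta).detach()

# Toy usage on a tiny 1D-CNN and random series (shapes only; no real data).
model = nn.Sequential(nn.Conv1d(1, 8, 5, padding=2), nn.Tanh(),
                      nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(8, 3))
x = torch.randn(4, 1, 128)
ref_expl = saliency(model, x.clone().requires_grad_(True)).detach()
x_adv = dual_target_attack(model, x, torch.tensor([2, 2, 2, 2]), ref_expl)
```

A Tanh activation is used so that the double backpropagation through the saliency term has nonzero second derivatives; with ReLU the explanation-consistency loss would receive almost no gradient.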
Related papers
- CORE: Context-Robust Remasking for Diffusion Language Models [51.59514489363897]
We propose Context-Robust Remasking (CORE), a training-free framework for inference-time revision. Rather than trusting static token probabilities, CORE identifies context-brittle tokens by probing their sensitivity to targeted masked-context perturbations. On LLaDA-8B-Base, CORE delivers consistent improvements across reasoning and code benchmarks, outperforming compute-matched baselines and improving MBPP by up to 9.2 percentage points.
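As a rough illustration of the probing idea in this summary (not LLaDA's or CORE's actual interface), the sketch below scores each drafted token by how much its probability drops when random context positions are masked, then hands the most brittle positions back for remasking; `logits_fn`, `MASK_ID`, and all thresholds are placeholders.

```python
import torch

MASK_ID = 0  # hypothetical mask-token id

def brittleness(logits_fn, tokens, n_probes=8, p_mask=0.15):
    """Score tokens by their probability drop under masked-context probes."""
    base = torch.softmax(logits_fn(tokens), dim=-1)
    base_p = base.gather(-1, tokens.unsqueeze(-1)).squeeze(-1)
    drops = torch.zeros_like(base_p)
    for _ in range(n_probes):
        probe = tokens.clone()
        probe[torch.rand(tokens.shape) < p_mask] = MASK_ID  # perturb context
        p = torch.softmax(logits_fn(probe), dim=-1)
        p_tok = p.gather(-1, tokens.unsqueeze(-1)).squeeze(-1)
        drops += (base_p - p_tok).clamp(min=0)
    return drops / n_probes  # high value = context-brittle token

def remask_brittle(tokens, scores, k=4):
    """Hand the k most context-brittle positions back to the denoiser."""
    out = tokens.clone()
    out[scores.topk(k).indices] = MASK_ID
    return out

# Dummy stand-in for a diffusion LM's per-position logits.
logits_fn = lambda t: torch.randn(t.numel(), 1000)
tokens = torch.randint(1, 1000, (32,))
print(remask_brittle(tokens, brittleness(logits_fn, tokens)))
```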
arXiv Detail & Related papers (2026-02-04T00:12:30Z)
- From Observations to States: Latent Time Series Forecasting [65.98504021691666]
We propose Latent Time Series Forecasting (LatentTSF), a novel paradigm that shifts TSF from observation regression to latent state prediction. Specifically, LatentTSF employs an AutoEncoder to project observations at each time step into a higher-dimensional latent state space. Our proposed latent objectives implicitly maximize mutual information between predicted latent states and ground-truth states and observations.
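A minimal sketch of the latent-forecasting paradigm described above, under assumptions: per-step observations are encoded into a higher-dimensional latent state, a forecaster predicts future latent states, and a decoder maps them back to observations. The GRU forecaster, the dimensions, and the two-term loss are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class LatentForecaster(nn.Module):
    """Forecast in a higher-dimensional latent state space, then decode."""
    def __init__(self, obs_dim=7, latent_dim=64, horizon=24):
        super().__init__()
        self.enc = nn.Linear(obs_dim, latent_dim)   # observation -> latent state
        self.dec = nn.Linear(latent_dim, obs_dim)   # latent state -> observation
        self.rnn = nn.GRU(latent_dim, latent_dim, batch_first=True)
        self.head = nn.Linear(latent_dim, horizon * latent_dim)
        self.horizon, self.latent_dim = horizon, latent_dim

    def forward(self, x):                           # x: (batch, time, obs_dim)
        z = self.enc(x)                              # per-step latent states
        _, h = self.rnn(z)                           # summarize latent history
        z_hat = self.head(h[-1]).view(-1, self.horizon, self.latent_dim)
        return self.dec(z_hat), z_hat                # forecasts + latent states

model = LatentForecaster()
x_hist, x_fut = torch.randn(8, 96, 7), torch.randn(8, 24, 7)
y_hat, z_hat = model(x_hist)
# Train on both spaces: match future observations and their encoded states.
loss = nn.functional.mse_loss(y_hat, x_fut) \
     + nn.functional.mse_loss(z_hat, model.enc(x_fut).detach())
```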
arXiv Detail & Related papers (2026-01-30T20:39:44Z)
- Explanation Multiplicity in SHAP: Characterization and Assessment [28.413883186555438]
Post-hoc explanations are widely used to justify, contest, and review automated decisions in high-stakes domains such as lending, employment, and healthcare. In practice, however, SHAP explanations can differ substantially across repeated runs, even when the individual, prediction task, and trained model are held fixed. We conceptualize and name this phenomenon explanation multiplicity: the existence of multiple, internally valid but substantively different explanations for the same decision.
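A small, hypothetical experiment makes the phenomenon easy to reproduce: explain the same prediction several times with KernelSHAP, varying only the sampling seed, and measure how much the attributions disagree. The model, the data, and the rank-correlation metric below are illustrative choices, not the paper's protocol.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.ensemble import RandomForestClassifier
import shap

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

background, x_explain, runs = X[:50], X[:1], []
for seed in range(5):
    np.random.seed(seed)  # vary only the explainer's sampling seed
    expl = shap.KernelExplainer(lambda d: model.predict_proba(d)[:, 1], background)
    runs.append(expl.shap_values(x_explain, nsamples=100)[0])

# Pairwise rank agreement: low values signal explanation multiplicity.
for i in range(len(runs)):
    for j in range(i + 1, len(runs)):
        print(i, j, spearmanr(runs[i], runs[j])[0])
```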
arXiv Detail & Related papers (2026-01-19T02:01:18Z)
- Critical or Compliant? The Double-Edged Sword of Reasoning in Chain-of-Thought Explanations [60.27156500679296]
We study the role of Chain-of-Thought (CoT) explanations in moral scenarios by systematically perturbing reasoning chains and manipulating delivery tones. Our findings reveal two key effects: (1) users often base their trust on outcome agreement, sustaining reliance even when the reasoning is flawed. These results highlight how CoT explanations can simultaneously clarify and mislead, underscoring the need for NLP systems to provide explanations that encourage scrutiny and critical thinking rather than blind trust.
arXiv Detail & Related papers (2025-11-15T02:38:49Z)
- Faithful and Interpretable Explanations for Complex Ensemble Time Series Forecasts using Surrogate Models and Forecastability Analysis [1.5751034894694789]
We develop a surrogate-based explanation methodology that bridges the accuracy-interpretability gap. We integrate spectral predictability analysis to quantify each series' inherent forecastability. The resulting framework delivers interpretable, instance-level explanations for state-of-the-art ensemble forecasts.
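For reference, one common spectral-predictability measure that a framework like this might use is normalized spectral entropy (a concentrated power spectrum signals a more forecastable series); the exact metric in the paper may differ. A minimal sketch:

```python
import numpy as np

def spectral_predictability(x):
    """Return 1 minus the normalized spectral entropy of a 1-D series."""
    psd = np.abs(np.fft.rfft(x - x.mean())) ** 2
    p = psd / psd.sum()
    p = p[p > 0]
    entropy = -(p * np.log(p)).sum() / np.log(len(psd))
    return 1.0 - entropy  # near 1: strongly periodic; near 0: noise-like

t = np.arange(512)
print(spectral_predictability(np.sin(0.1 * t)))                             # high
print(spectral_predictability(np.random.default_rng(0).normal(size=512)))  # low
```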
arXiv Detail & Related papers (2025-10-09T18:49:45Z)
- Error-quantified Conformal Inference for Time Series [55.11926160774831]
Uncertainty quantification in time series prediction is challenging due to the temporal dependence and distribution shift in sequential data. We propose Error-quantified Conformal Inference (ECI) by smoothing the quantile loss function. ECI can achieve valid miscoverage control and output tighter prediction sets than other baselines.
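The summary suggests replacing the hard 0/1 miscoverage indicator of adaptive conformal methods with a smoothed error signal. A minimal sketch in that spirit follows; the sigmoid smoothing form, the step sizes, and the naive one-step forecaster are assumptions, not the paper's formulation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adaptive_intervals(preds, y, alpha=0.1, gamma=0.02, tau=5.0, warmup=30):
    scores = np.abs(y - preds)                    # nonconformity scores
    q_level, covered = 1 - alpha, []
    for t in range(warmup, len(y)):
        q = np.quantile(scores[:t], min(max(q_level, 0.0), 1.0))
        covered.append(preds[t] - q <= y[t] <= preds[t] + q)
        # Smoothed miscoverage: ~1 far outside the interval, ~0 well inside.
        err = sigmoid(tau * (scores[t] - q))
        q_level += gamma * (err - alpha)          # widen after (near-)misses
    return np.mean(covered)

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=500))               # random-walk series
preds = np.r_[0.0, y[:-1]]                        # naive one-step forecast
print("empirical coverage:", adaptive_intervals(preds, y))
```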
arXiv Detail & Related papers (2025-02-02T15:02:36Z)
- Rethinking Distance Metrics for Counterfactual Explainability [53.436414009687]
We investigate a framing for counterfactual generation methods that considers counterfactuals not as independent draws from a region around the reference, but as jointly sampled with the reference from the underlying data distribution.
We derive a distance metric tailored to counterfactual similarity that can be applied to a broad range of settings.
arXiv Detail & Related papers (2024-10-18T15:06:50Z)
- Self-Interpretable Time Series Prediction with Counterfactual Explanations [4.658166900129066]
Interpretable time series prediction is crucial for safety-critical areas such as healthcare and autonomous driving.
Most existing methods focus on interpreting predictions by assigning importance scores to segments of the time series.
We develop a self-interpretable model, dubbed Counterfactual Time Series (CounTS), which generates counterfactual and actionable explanations for time series predictions.
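CounTS itself is a self-interpretable variational model; for orientation only, here is a generic gradient-based counterfactual search for a time series classifier (not the paper's method): find a sparse perturbation that flips the prediction while staying close to the original series.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def counterfactual(model, x, y_cf, steps=200, lr=0.05, lam=0.1):
    """Gradient search for a sparse perturbation that flips the prediction."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(model(x + delta), y_cf) \
             + lam * delta.abs().mean()           # stay close to the original
        loss.backward()
        opt.step()
    return (x + delta).detach()

# Toy usage with a small 1D-CNN on a random series.
model = nn.Sequential(nn.Conv1d(1, 8, 5, padding=2), nn.ReLU(),
                      nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(8, 2))
x = torch.randn(1, 1, 100)
x_cf = counterfactual(model, x, torch.tensor([1]))
print(model(x).argmax(1), model(x_cf).argmax(1))
```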
arXiv Detail & Related papers (2023-06-09T16:42:52Z)
- Adversarial Counterfactual Visual Explanations [0.7366405857677227]
This paper proposes an elegant method to turn adversarial attacks into semantically meaningful perturbations.
The proposed approach hypothesizes that Denoising Diffusion Probabilistic Models are excellent regularizers for avoiding high-frequency and out-of-distribution perturbations.
arXiv Detail & Related papers (2023-03-17T13:34:38Z)
- Generic Temporal Reasoning with Differential Analysis and Explanation [61.96034987217583]
We introduce a novel task named TODAY that bridges the gap with temporal differential analysis.
TODAY evaluates whether systems can correctly understand the effect of incremental changes.
We show that TODAY's supervision style and explanation annotations can be used in joint learning.
arXiv Detail & Related papers (2022-12-20T17:40:03Z)
- Fooling SHAP with Stealthily Biased Sampling [7.476901945542385]
SHAP explanations aim at identifying which features contribute the most to the difference in model prediction at a specific input versus a background distribution.
Recent studies have shown that they can be manipulated by malicious adversaries to produce arbitrary desired explanations.
We propose a complementary family of attacks that leave the model intact and manipulate SHAP explanations using stealthily biased sampling of the data points used to approximate expectations w.r.t. the background distribution.
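A hedged illustration of this attack family: keep the model fixed, but pick the background sample so that it already agrees with the explained instance on a sensitive feature, which drives that feature's SHAP attribution toward zero. The matching rule below is a crude stand-in for the paper's stealthier sampling.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
import shap

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] > 0).astype(int)                 # feature 0 is "sensitive"
model = LogisticRegression().fit(X, y)
f = lambda d: model.predict_proba(d)[:, 1]

x = X[:1]
honest_bg = X[rng.choice(len(X), 100, replace=False)]
# Biased background: only points that already agree with x on feature 0,
# so the expectation gap attributable to feature 0 nearly vanishes.
biased_bg = X[np.abs(X[:, 0] - x[0, 0]) < 0.2][:100]

for name, bg in [("honest", honest_bg), ("biased", biased_bg)]:
    sv = shap.KernelExplainer(f, bg).shap_values(x, nsamples=200)[0]
    print(name, "feature-0 attribution:", round(float(sv[0]), 4))
```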
arXiv Detail & Related papers (2022-05-30T20:33:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this list (including all of the information) and is not responsible for any consequences of its use.