ModalImmune: Immunity Driven Unlearning via Self Destructive Training
- URL: http://arxiv.org/abs/2602.16197v1
- Date: Wed, 18 Feb 2026 05:35:32 GMT
- Title: ModalImmune: Immunity Driven Unlearning via Self Destructive Training
- Authors: Rong Fu, Jia Yee Tan, Wenxin Zhang, Zijian Zhang, Ziming Wang, Zhaolu Kang, Muge Qi, Shuning Zhang, Simon Fong
- Abstract summary: ModalImmune enforces modality immunity by intentionally collapsing selected modality information during training. The framework combines a spectrum-adaptive collapse regularizer, an information-gain guided controller for targeted interventions, and curvature-aware gradient masking to stabilize destructive updates. Empirical evaluation on standard multimodal benchmarks demonstrates that ModalImmune improves resilience to modality removal and corruption while retaining convergence stability and reconstruction capacity.
- Score: 21.940530514137947
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multimodal systems are vulnerable to partial or complete loss of input channels at deployment, which undermines reliability in real-world settings. This paper presents ModalImmune, a training framework that enforces modality immunity by intentionally and controllably collapsing selected modality information during training so the model learns joint representations that are robust to destructive modality influence. The framework combines a spectrum-adaptive collapse regularizer, an information-gain guided controller for targeted interventions, curvature-aware gradient masking to stabilize destructive updates, and a certified Neumann-truncated hyper-gradient procedure for automatic meta-parameter adaptation. Empirical evaluation on standard multimodal benchmarks demonstrates that ModalImmune improves resilience to modality removal and corruption while retaining convergence stability and reconstruction capacity.
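The core idea of the abstract, controllably collapsing a modality during training so the joint representation survives its loss at deployment, can be illustrated with a minimal sketch. This is not the paper's actual spectrum-adaptive regularizer; the function name, the zeroing intervention, and the two-modality setup are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

def collapse_modalities(features, p_collapse=0.5):
    """Destructively zero out each modality's feature block with probability
    p_collapse, simulating modality loss at deployment so a fusion model
    trained on the result must learn representations robust to it.

    `features` maps modality name -> feature array (hypothetical layout).
    """
    collapsed = {name: (np.zeros_like(x) if rng.random() < p_collapse else x)
                 for name, x in features.items()}
    # Keep at least one modality alive so the training input is never empty.
    if all(not v.any() for v in collapsed.values()):
        keep = rng.choice(sorted(features))
        collapsed[keep] = features[keep]
    return collapsed

feats = {"vision": np.ones((4, 8)), "audio": np.ones((4, 8))}
robust_batch = collapse_modalities(feats, p_collapse=1.0)
# With p_collapse=1.0, exactly one randomly chosen modality survives.
```

In the paper's framework the intervention is guided by an information-gain controller rather than uniform random dropping, and the destructive updates are stabilized by curvature-aware gradient masking; the sketch above only shows the destructive-collapse mechanism itself.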
Related papers
- Safe Reinforcement Learning via Recovery-based Shielding with Gaussian Process Dynamics Models [57.006252510102506]
Reinforcement learning (RL) is a powerful framework for optimal decision-making and control but often lacks provable guarantees for safety-critical applications. We introduce a novel recovery-based shielding framework that enables safe RL with a provable safety lower bound for unknown and non-linear continuous dynamical systems.
arXiv Detail & Related papers (2026-02-12T22:03:35Z) - When Evaluation Becomes a Side Channel: Regime Leakage and Structural Mitigations for Alignment Assessment [0.0]
Safety evaluation for advanced AI systems assumes that behavior observed under evaluation predicts behavior in deployment. We recast alignment evaluation as a problem of information flow under partial observability. We study regime-blind mechanisms, training-time interventions that restrict access to regime cues.
arXiv Detail & Related papers (2026-02-09T10:00:24Z) - Automatic Stability and Recovery for Neural Network Training [1.9544213396776273]
Training modern neural networks is increasingly fragile, with rare but severe destabilizing updates often causing irreversible divergence or silent degradation. Existing optimization methods rely on preventive mechanisms embedded within stability probes, offering limited ability to detect and recover from instability. We introduce a supervisory framework that treats optimization as a controlled runtime process.
arXiv Detail & Related papers (2026-01-24T15:14:54Z) - Explainability-Guided Defense: Attribution-Aware Model Refinement Against Adversarial Data Attacks [6.573058520271728]
We identify a connection between interpretability and robustness that can be directly leveraged during training. We introduce an attribution-guided refinement framework that transforms Local Interpretable Model-Agnostic Explanations into an active training signal.
arXiv Detail & Related papers (2026-01-02T19:36:03Z) - Verifying Closed-Loop Contractivity of Learning-Based Controllers via Partitioning [52.23804865017831]
We address the problem of verifying closed-loop contraction in nonlinear control systems whose controller and contraction metric are both parameterized by neural networks. We derive a tractable and scalable sufficient condition for closed-loop contractivity that reduces to checking that the dominant eigenvalue of a symmetric Metzler matrix is nonpositive.
arXiv Detail & Related papers (2025-12-01T23:06:56Z) - Balance Equation-based Distributionally Robust Offline Imitation Learning [8.607736795429638]
Imitation Learning (IL) has proven highly effective for robotic and control tasks where manually designing reward functions or explicit controllers is infeasible. Standard IL methods implicitly assume that the environment dynamics remain fixed between training and deployment. We address this challenge through Balance Equation-based Distributionally Robust Offline Learning. We formulate the problem as a distributionally robust optimization over an uncertainty set of transition models, seeking a policy that minimizes the imitation loss under the worst-case transition distribution.
arXiv Detail & Related papers (2025-11-11T07:48:09Z) - Drift No More? Context Equilibria in Multi-Turn LLM Interactions [58.69551510148673]
Context drift is the gradual divergence of a model's outputs from goal-consistent behavior across turns. Unlike single-turn errors, drift unfolds temporally and is poorly captured by static evaluation metrics. We show that multi-turn drift can be understood as a controllable equilibrium phenomenon rather than as inevitable decay.
arXiv Detail & Related papers (2025-10-09T04:48:49Z) - Modality Equilibrium Matters: Minor-Modality-Aware Adaptive Alternating for Cross-Modal Memory Enhancement [13.424541949553964]
We propose a Shapley-guided alternating training framework that adaptively prioritizes minor modalities to balance and thus enhance the fusion. We evaluate the performance in both balance and accuracy across four multimodal benchmark datasets, where our method achieves state-of-the-art (SOTA) results.
arXiv Detail & Related papers (2025-05-26T02:02:57Z) - TrustLoRA: Low-Rank Adaptation for Failure Detection under Out-of-distribution Data [62.22804234013273]
We propose a simple failure detection framework to unify and facilitate classification with rejection under both covariate and semantic shifts. Our key insight is that by separating and consolidating failure-specific reliability knowledge with low-rank adapters, we can enhance the failure detection ability effectively and flexibly.
arXiv Detail & Related papers (2025-04-20T09:20:55Z) - Improving Domain Generalization in Self-supervised Monocular Depth Estimation via Stabilized Adversarial Training [61.35809887986553]
We propose a general adversarial training framework, named Stabilized Conflict-optimization Adversarial Training (SCAT)
SCAT integrates adversarial data augmentation into self-supervised MDE methods to achieve a balance between stability and generalization.
Experiments on five benchmarks demonstrate that SCAT can achieve state-of-the-art performance and significantly improve the generalization capability of existing self-supervised MDE methods.
arXiv Detail & Related papers (2024-11-04T15:06:57Z) - Joint Differentiable Optimization and Verification for Certified Reinforcement Learning [91.93635157885055]
In model-based reinforcement learning for safety-critical control systems, it is important to formally certify system properties.
We propose a framework that jointly conducts reinforcement learning and formal verification.
arXiv Detail & Related papers (2022-01-28T16:53:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.