Related papers: Time-Aware Feature Selection: Adaptive Temporal Masking for Stable Sparse Autoencoder Training

URL: http://arxiv.org/abs/2510.08855v1
Date: Thu, 09 Oct 2025 23:12:51 GMT
Title: Time-Aware Feature Selection: Adaptive Temporal Masking for Stable Sparse Autoencoder Training
Authors: T. Ed Li, Junyu Ren
Abstract summary: We introduce Adaptive Temporal Masking (ATM), a novel training approach that adjusts feature selection by tracking activation magnitudes, frequencies, and reconstruction contributions to compute importance scores that evolve over time. ATM achieves substantially lower absorption scores compared to existing methods like TopK and JumpReLU SAEs, while maintaining excellent reconstruction quality.
Score: 0.47745223151611654
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Understanding the internal representations of large language models is crucial for ensuring their reliability and safety, with sparse autoencoders (SAEs) emerging as a promising interpretability approach. However, current SAE training methods face feature absorption, where features (or neurons) are absorbed into each other to minimize $L_1$ penalty, making it difficult to consistently identify and analyze model behaviors. We introduce Adaptive Temporal Masking (ATM), a novel training approach that dynamically adjusts feature selection by tracking activation magnitudes, frequencies, and reconstruction contributions to compute importance scores that evolve over time. ATM applies a probabilistic masking mechanism based on statistical thresholding of these importance scores, creating a more natural feature selection process. Through extensive experiments on the Gemma-2-2b model, we demonstrate that ATM achieves substantially lower absorption scores compared to existing methods like TopK and JumpReLU SAEs, while maintaining excellent reconstruction quality. These results establish ATM as a principled solution for learning stable, interpretable features in neural networks, providing a foundation for more reliable model analysis.
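
The abstract describes the mechanism only at a high level, so the following is a minimal sketch of one way such a scheme could look, not the authors' implementation. It assumes an exponential moving average for the evolving importance score, an equal weighting of magnitude, firing frequency, and reconstruction contribution, a mean-plus-k-standard-deviations threshold, and a sigmoid to turn scores into keep probabilities. All function names and parameters (update_importance, keep_probabilities, sample_mask, decay, k, temperature) are hypothetical.

```python
import math
import random
from statistics import mean, pstdev


def update_importance(scores, activations, contributions, decay=0.9):
    """Exponential-moving-average update of per-feature importance.

    Assumption: the instantaneous signal is an unweighted sum of activation
    magnitude, a fired/not-fired indicator, and reconstruction contribution.
    """
    updated = []
    for score, act, contrib in zip(scores, activations, contributions):
        fired = 1.0 if act != 0 else 0.0
        instant = abs(act) + fired + contrib
        updated.append(decay * score + (1.0 - decay) * instant)
    return updated


def keep_probabilities(scores, k=0.0, temperature=1.0):
    """Map scores to keep probabilities around a statistical threshold.

    Assumption: threshold = mean + k * standard deviation of the scores,
    and a sigmoid converts the distance from the threshold to a probability.
    """
    threshold = mean(scores) + k * pstdev(scores)
    return [1.0 / (1.0 + math.exp(-(s - threshold) / temperature)) for s in scores]


def sample_mask(probs, rng):
    """Draw one Bernoulli keep/drop decision per feature."""
    return [1 if rng.random() < p else 0 for p in probs]


scores = update_importance([0.0, 0.0, 0.0], [2.0, 0.0, 0.5], [1.0, 0.0, 0.2])
probs = keep_probabilities(scores)
mask = sample_mask(probs, random.Random(0))
```

In this sketch, features whose running score sits above the threshold are kept with probability greater than 0.5, and the rest are more likely to be masked; the paper's actual scoring and thresholding rules may differ.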

Related papers

Self-Supervised Learning via Flow-Guided Neural Operator on Time-Series Data [57.85958428020496]
Flow-Guided Neural Operator (FGNO) is a novel framework combining operator learning with flow matching for SSL training. FGNO learns mappings in functional spaces by using the Short-Time Fourier Transform to unify different time resolutions. Unlike prior generative SSL methods that use noisy inputs during inference, we propose using clean inputs for representation extraction while learning representations with noise.
arXiv Detail & Related papers (2026-02-12T18:54:57Z)
Optimizing Algorithms for Mobile Health Interventions with Active Querying Optimization [0.0]
Reinforcement learning in mobile health interventions requires balancing intervention efficacy with user burden. The Act-Then-Measure (ATM) algorithm relies on a temporal-difference-inspired Q-learning method, which is prone to instability in sparse and noisy environments. We propose a Bayesian extension to ATM that replaces standard Q-learning with a Kalman filter-style Bayesian update, maintaining uncertainty-aware estimates of Q-values.
arXiv Detail & Related papers (2025-11-27T14:21:47Z)
FAIM: Frequency-Aware Interactive Mamba for Time Series Classification [87.84511960413715]
Time series classification (TSC) is crucial in numerous real-world applications, such as environmental monitoring, medical diagnosis, and posture recognition. We propose FAIM, a lightweight Frequency-Aware Interactive Mamba model. We show that FAIM consistently outperforms existing state-of-the-art (SOTA) methods, achieving a superior trade-off between accuracy and efficiency.
arXiv Detail & Related papers (2025-11-26T08:36:33Z)
Abstain Mask Retain Core: Time Series Prediction by Adaptive Masking Loss with Representation Consistency [4.047219770183742]
Time series forecasting plays a pivotal role in critical domains such as energy management and financial markets. This study reveals a counterintuitive phenomenon: appropriately truncating historical data can enhance prediction accuracy. We propose an innovative solution termed Adaptive Masking Loss with Representation Consistency.
arXiv Detail & Related papers (2025-10-22T19:23:53Z)
The Curious Case of In-Training Compression of State Space Models [49.819321766705514]
State Space Models (SSMs) tackle long sequence modeling tasks efficiently, offering both parallelizable training and fast inference. A key design challenge is striking the right balance between maximizing expressivity and limiting the computational burden. Our approach, CompreSSM, applies to Linear Time-Invariant SSMs such as Linear Recurrent Units, but is also extendable to selective models.
arXiv Detail & Related papers (2025-10-03T09:02:33Z)
Sycophancy Mitigation Through Reinforcement Learning with Uncertainty-Aware Adaptive Reasoning Trajectories [58.988535279557546]
We introduce Sycophancy Mitigation through Adaptive Reasoning Trajectories (SMART). We show that SMART significantly reduces sycophantic behavior while preserving strong performance on out-of-distribution inputs.
arXiv Detail & Related papers (2025-09-20T17:09:14Z)
Artificial Intelligence-Based Multiscale Temporal Modeling for Anomaly Detection in Cloud Services [10.421371572062595]
This study proposes an anomaly detection method based on the Transformer architecture with integrated multiscale feature perception. The proposed method outperforms mainstream baseline models in key metrics, including precision, recall, AUC, and F1-score.
arXiv Detail & Related papers (2025-08-20T07:52:36Z)
Taming Polysemanticity in LLMs: Provable Feature Recovery via Sparse Autoencoders [50.52694757593443]
Existing SAE training algorithms often lack rigorous mathematical guarantees and suffer from practical limitations. We first propose a novel statistical framework for the feature recovery problem, which includes a new notion of feature identifiability. We introduce a new SAE training algorithm based on "bias adaptation", a technique that adaptively adjusts neural network bias parameters to ensure appropriate activation sparsity.
arXiv Detail & Related papers (2025-06-16T20:58:05Z)
Generative QoE Modeling: A Lightweight Approach for Telecom Networks [6.473372512447993]
This study introduces a lightweight generative modeling framework that balances computational efficiency, interpretability, and predictive accuracy. By validating the use of Vector Quantization (VQ) as a preprocessing technique, continuous network features are effectively transformed into discrete categorical symbols. This VQ-HMM pipeline enhances the model's capacity to capture dynamic QoE patterns while supporting probabilistic inference on new and unseen data.
arXiv Detail & Related papers (2025-04-30T06:19:37Z)
A Novel Framework for Learning Stochastic Representations for Sequence Generation and Recognition [0.0]
The ability to generate and recognize sequential data is fundamental for autonomous systems operating in dynamic environments. We propose a novel Recurrent Network with Parametric Biases (RNNPB). Our approach provides a framework for modeling temporal patterns and advances the development of robust systems in artificial intelligence and robotics.
arXiv Detail & Related papers (2024-12-30T07:27:50Z)
Embedded feature selection in LSTM networks with multi-objective evolutionary ensemble learning for time series forecasting [49.1574468325115]
We present a novel feature selection method embedded in Long Short-Term Memory networks. Our approach optimizes the weights and biases of the LSTM in a partitioned manner. Experimental evaluations on air quality time series data from Italy and southeast Spain demonstrate that our method substantially improves the generalization ability of conventional LSTMs.
arXiv Detail & Related papers (2023-12-29T08:42:10Z)
Understanding Self-attention Mechanism via Dynamical System Perspective [58.024376086269015]
The self-attention mechanism (SAM) is widely used in various fields of artificial intelligence. We show that the intrinsic stiffness phenomenon (SP) in the high-precision solution of ordinary differential equations (ODEs) also widely exists in high-performance neural networks (NN). We show that the SAM is also a stiffness-aware step size adaptor that can enhance the model's representational ability to measure intrinsic SP.
arXiv Detail & Related papers (2023-08-19T08:17:41Z)
Stabilizing Machine Learning Prediction of Dynamics: Noise and Noise-inspired Regularization [58.720142291102135]
Recent work has shown that machine learning (ML) models can be trained to accurately forecast the dynamics of chaotic dynamical systems. In the absence of mitigating techniques, this technique can result in artificially rapid error growth, leading to inaccurate predictions and/or climate instability. We introduce Linearized Multi-Noise Training (LMNT), a regularization technique that deterministically approximates the effect of many small, independent noise realizations added to the model input during training.
arXiv Detail & Related papers (2022-11-09T23:40:52Z)
Automated Learning of Interpretable Models with Quantified Uncertainty [0.0]
We introduce a new framework for genetic-programming-based symbolic regression (GPSR). GPSR uses model evidence to formulate replacement probability during the selection phase of evolution. It is shown to increase interpretability, improve robustness to noise, and reduce overfitting when compared to a conventional GPSR implementation.
arXiv Detail & Related papers (2022-04-12T19:56:42Z)

This list is automatically generated from the titles and abstracts of the papers on this site.