Related papers: Enhancing AI System Resiliency: Formulation and Guarantee for LSTM Resilience Based on Control Theory

Related papers

WSM: Decay-Free Learning Rate Schedule via Checkpoint Merging for LLM Pre-training [64.0932926819307]
We present Warmup-Stable and Merge (WSM), a framework that establishes a formal connection between learning rate decay and model merging.<n>WSM provides a unified theoretical foundation for emulating various decay strategies.<n>Our framework consistently outperforms the widely-adopted Warmup-Stable-Decay (WSD) approach across multiple benchmarks.
arXiv Detail & Related papers (2025-07-23T16:02:06Z)
A Simple Baseline for Stable and Plastic Neural Networks [3.2635082758250693]
Continual learning in computer vision requires that models adapt to a continuous stream of tasks without forgetting prior knowledge.<n>We introduce RDBP, a low-overhead baseline that unites two complementary mechanisms: ReLUDown, a lightweight activation modification that preserves feature sensitivity while preventing neuron dormancy, and Decreasing Backpropagation, a biologically inspired gradient-scheduling scheme that progressively shields early layers from catastrophic updates.
arXiv Detail & Related papers (2025-07-14T13:18:26Z)
Taming Polysemanticity in LLMs: Provable Feature Recovery via Sparse Autoencoders [50.52694757593443]
Existing SAE training algorithms often lack rigorous mathematical guarantees and suffer from practical limitations.<n>We first propose a novel statistical framework for the feature recovery problem, which includes a new notion of feature identifiability.<n>We introduce a new SAE training algorithm based on bias adaptation'', a technique that adaptively adjusts neural network bias parameters to ensure appropriate activation sparsity.
arXiv Detail & Related papers (2025-06-16T20:58:05Z)
SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning [95.28059121743831]
Reinforcement Learning with Verifiable Rewards (RLVR) has proven effective for training large language models (LLMs) on complex reasoning tasks.<n>We introduce a Self-aware Weakness-driven problem Synthesis framework (SwS) that systematically identifies model deficiencies and leverages them for problem augmentation.<n>SwS enables robust generalization byempowering the model to self-identify and address its weaknesses in RL, yielding average performance gains of 10.0% and 7.7% on 7B and 32B models.
arXiv Detail & Related papers (2025-06-10T17:02:00Z)
On the Definition of Robustness and Resilience of AI Agents for Real-time Congestion Management [0.0]
The European Union's Artificial Intelligence (AI) Act defines robustness, resilience, and security requirements for high-risk sectors.<n>This paper introduces a novel framework for quantitatively evaluating the robustness and resilience of reinforcement learning agents in congestion management.
arXiv Detail & Related papers (2025-04-17T20:01:48Z)
Federated Quantum-Train Long Short-Term Memory for Gravitational Wave Signal [3.360429911727189]
We present Federated QT-LSTM, a novel framework that combines the Quantum-Train (QT) methodology with Long Short-Term Memory (LSTM) networks in a federated learning setup.<n>By leveraging quantum neural networks (QNNs) to generate classical LSTM model parameters during training, the framework effectively addresses challenges in model compression, scalability, and computational efficiency.
arXiv Detail & Related papers (2025-03-20T11:34:13Z)
Bridging SFT and DPO for Diffusion Model Alignment with Self-Sampling Preference Optimization [67.8738082040299]
Self-Sampling Preference Optimization (SSPO) is a new alignment method for post-training reinforcement learning.<n>SSPO eliminates the need for paired data and reward models while retaining the training stability of SFT.<n>SSPO surpasses all previous approaches on the text-to-image benchmarks and demonstrates outstanding performance on the text-to-video benchmarks.
arXiv Detail & Related papers (2024-10-07T17:56:53Z)
Unlocking the Power of LSTM for Long Term Time Series Forecasting [27.245021350821638]
We propose a simple yet efficient algorithm named P-sLSTM built upon sLSTM by incorporating patching and channel independence.<n>These modifications substantially enhance sLSTM's performance in TSF, achieving state-of-the-art results.
arXiv Detail & Related papers (2024-08-19T13:59:26Z)
Lyapunov-stable Neural Control for State and Output Feedback: A Novel Formulation [67.63756749551924]
Learning-based neural network (NN) control policies have shown impressive empirical performance in a wide range of tasks in robotics and control. Lyapunov stability guarantees over the region-of-attraction (ROA) for NN controllers with nonlinear dynamical systems are challenging to obtain. We demonstrate a new framework for learning NN controllers together with Lyapunov certificates using fast empirical falsification and strategic regularizations.
arXiv Detail & Related papers (2024-04-11T17:49:15Z)
An Investigation of Time Reversal Symmetry in Reinforcement Learning [18.375784421726287]
We formalize a concept of time reversal symmetry in a Markov decision process (MDP) We observe that utilizing the structure of time reversal in an MDP allows every environment transition experienced by an agent to be transformed into a feasible reverse-time transition. To test the usefulness of this newly synthesized data, we develop a novel approach called time symmetric data augmentation (TSDA)
arXiv Detail & Related papers (2023-11-28T18:02:06Z)
Investigating Robustness in Cyber-Physical Systems: Specification-Centric Analysis in the face of System Deviations [8.8690305802668]
A critical attribute of cyber-physical systems (CPS) is robustness, denoting its capacity to operate safely. This paper proposes a novel specification-based robustness, which characterizes the effectiveness of a controller in meeting a specified system requirement. We present an innovative two-layer simulation-based analysis framework designed to identify subtle robustness violations.
arXiv Detail & Related papers (2023-11-13T16:44:43Z)
Active RIS-aided EH-NOMA Networks: A Deep Reinforcement Learning Approach [66.53364438507208]
An active reconfigurable intelligent surface (RIS)-aided multi-user downlink communication system is investigated. Non-orthogonal multiple access (NOMA) is employed to improve spectral efficiency, and the active RIS is powered by energy harvesting (EH) An advanced LSTM based algorithm is developed to predict users' dynamic communication state. A DDPG based algorithm is proposed to joint control the amplification matrix and phase shift matrix RIS.
arXiv Detail & Related papers (2023-04-11T13:16:28Z)
AttNS: Attention-Inspired Numerical Solving For Limited Data Scenarios [51.94807626839365]
We propose the attention-inspired numerical solver (AttNS) to solve differential equations due to limited data.<n>AttNS is inspired by the effectiveness of attention modules in Residual Neural Networks (ResNet) in enhancing model generalization and robustness.
arXiv Detail & Related papers (2023-02-05T01:39:21Z)
Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers. We then present the pointwise feasibility conditions of the resulting safety controller. We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
Joint Differentiable Optimization and Verification for Certified Reinforcement Learning [91.93635157885055]
In model-based reinforcement learning for safety-critical control systems, it is important to formally certify system properties. We propose a framework that jointly conducts reinforcement learning and formal verification.
arXiv Detail & Related papers (2022-01-28T16:53:56Z)
Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design. We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
arXiv Detail & Related papers (2021-05-17T08:36:18Z)
Reinforcement Learning Control of Constrained Dynamic Systems with Uniformly Ultimate Boundedness Stability Guarantee [12.368097742148128]
Reinforcement learning (RL) is promising for complicated nonlinear control problems. The data-based learning approach is notorious for not guaranteeing stability, which is the most fundamental property for any control system. In this paper, the classic Lyapunov's method is explored to analyze the uniformly ultimate boundedness stability (UUB) solely based on data.
arXiv Detail & Related papers (2020-11-13T12:41:56Z)
Reinforcement Learning Control of Robotic Knee with Human in the Loop by Flexible Policy Iteration [17.365135977882215]
This study fills important voids by introducing innovative features to the policy algorithm. We show system level performances including convergence of the approximate value function, (sub)optimality of the solution, and stability of the system.
arXiv Detail & Related papers (2020-06-16T09:09:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.