Dynamic Symmetric Point Tracking: Tackling Non-ideal Reference in Analog In-memory Training
- URL: http://arxiv.org/abs/2602.21321v1
- Date: Tue, 24 Feb 2026 19:41:34 GMT
- Title: Dynamic Symmetric Point Tracking: Tackling Non-ideal Reference in Analog In-memory Training
- Authors: Quan Xiao, Jindan Li, Zhaoxian Wu, Tayfun Gokmen, Tianyi Chen
- Abstract summary: We present the first theoretical characterization of the pulse complexity of SP calibration and the resulting estimation error. We propose a dynamic SP estimation method that tracks the SP during model training, and establish its convergence guarantees.
- Score: 46.75046795995564
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Analog in-memory computing (AIMC) performs computation directly within resistive crossbar arrays, offering an energy-efficient platform to scale large vision and language models. However, non-ideal analog device properties make training on AIMC devices challenging. In particular, update asymmetry can induce a systematic drift of weight updates towards a device-specific symmetric point (SP), which typically does not align with the optimum of the training objective. To mitigate this bias, most existing works assume the SP is known and pre-calibrate it to zero before training by setting the reference point as the SP. Nevertheless, calibrating AIMC devices requires costly pulse updates, and residual calibration error can directly degrade training accuracy. In this work, we present the first theoretical characterization of the pulse complexity of SP calibration and the resulting estimation error. We further propose a dynamic SP estimation method that tracks the SP during model training, and establish its convergence guarantees. In addition, we develop an enhanced variant based on chopping and filtering techniques from digital signal processing. Numerical experiments demonstrate both the efficiency and effectiveness of the proposed method.
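As a rough illustration of the symmetric point described in the abstract, the sketch below simulates a single device under a soft-bounds response model. The device model, the step sizes, and all function names are assumptions made for illustration and are not taken from the paper. Alternating up and down pulses drives the weight toward the point where the two step sizes balance, and the finite pulse size leaves a small residual offset.

```python
# Hypothetical soft-bounds device: the step size shrinks as the weight
# approaches the bound it is moving towards. All constants are illustrative.
DW_UP = 0.01     # nominal up-pulse step at w = 0 (assumed)
DW_DOWN = 0.012  # nominal down-pulse step at w = 0 (assumed)
TAU_MAX = 1.0    # upper saturation bound (assumed)
TAU_MIN = 1.0    # magnitude of the lower saturation bound (assumed)


def pulse(w, direction):
    """Apply one up (+1) or down (-1) pulse and return the new weight."""
    if direction > 0:
        return w + DW_UP * (1.0 - w / TAU_MAX)
    return w - DW_DOWN * (1.0 + w / TAU_MIN)


def symmetric_point():
    """Weight at which the up and down step magnitudes are equal."""
    return (DW_UP - DW_DOWN) / (DW_UP / TAU_MAX + DW_DOWN / TAU_MIN)


def alternate_pulses(w0, n_pairs):
    """Apply n_pairs of (up, down) pulses starting from w0."""
    w = w0
    for _ in range(n_pairs):
        w = pulse(w, +1)
        w = pulse(w, -1)
    return w


if __name__ == "__main__":
    sp = symmetric_point()
    for w0 in (0.8, -0.8):
        w_final = alternate_pulses(w0, 500)
        print(f"start={w0:+.2f}  final={w_final:+.4f}  SP={sp:+.4f}")
```

In this toy model the final weight lands within about one pulse step of the analytic symmetric point regardless of the starting value, which gives some intuition for why calibration costs pulses and why a residual error remains.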
Related papers
- From Evaluation to Design: Using Potential Energy Surface Smoothness Metrics to Guide Machine Learning Interatomic Potential Architectures [12.68400434984463]
MLIPs fail to reproduce the physical smoothness of the quantum potential energy surface. Existing evaluations, such as microcanonical molecular dynamics, are computationally expensive and primarily probe near-equilibrium states. We introduce the Bond Smoothness Characterization Test (BSCT) to improve evaluation metrics for MLIPs.
arXiv Detail & Related papers (2026-02-04T18:50:10Z) - ECO: Quantized Training without Full-Precision Master Weights [58.97082407934466]
Error-Compensating (ECO) eliminates master weights by applying updates directly to quantized parameters. We show that ECO converges to a constant-radius neighborhood of the optimum, while naive master-weight removal can incur an error that is inversely proportional to the learning rate.
arXiv Detail & Related papers (2026-01-29T18:35:01Z) - RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models [53.571195477043496]
We propose an algorithm named Rotated Straight-Through-Estimator (RoSTE). RoSTE combines quantization-aware supervised fine-tuning (QA-SFT) with an adaptive rotation strategy to reduce activation outliers. Our findings reveal that the prediction error is directly proportional to the quantization error of the converged weights, which can be effectively managed through an optimized rotation configuration.
arXiv Detail & Related papers (2025-02-13T06:44:33Z) - Bisimulation metric for Model Predictive Control [44.301098448479195]
Bisimulation Metric for Model Predictive Control (BS-MPC) is a novel approach that incorporates bisimulation metric loss in its objective function to directly optimize the encoder.
BS-MPC improves training stability, robustness against input noise, and computational efficiency by reducing training time.
We evaluate BS-MPC on both continuous control and image-based tasks from the DeepMind Control Suite.
arXiv Detail & Related papers (2024-10-06T17:12:10Z) - Deterministic and statistical calibration of constitutive models from full-field data with parametric physics-informed neural networks [36.136619420474766]
Parametric physics-informed neural networks (PINNs) for model calibration from full-field displacement data are investigated. Due to the fast evaluation of PINNs, calibration can be performed in near real-time.
arXiv Detail & Related papers (2024-05-28T16:02:11Z) - Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC is capable of performing efficiently, entirely on-the-fly, both parameter estimation and particle proposal adaptation.
arXiv Detail & Related papers (2023-12-19T21:45:38Z) - ESD: Expected Squared Difference as a Tuning-Free Trainable Calibration Measure [35.996971010199196]
Expected Squared Difference (ESD) is a tuning-free trainable calibration objective loss.
We show that ESD yields the best-calibrated results compared with previous approaches.
ESD drastically reduces the computational costs required for calibration during training.
arXiv Detail & Related papers (2023-03-04T18:06:36Z) - Deep Equilibrium Optical Flow Estimation [80.80992684796566]
Recent state-of-the-art (SOTA) optical flow models use finite-step recurrent update operations to emulate traditional algorithms.
These RNNs impose large computation and memory overheads, and are not directly trained to model such stable estimation.
We propose deep equilibrium (DEQ) flow estimators, an approach that directly solves for the flow as the infinite-level fixed point of an implicit layer.
arXiv Detail & Related papers (2022-04-18T17:53:44Z) - On feedforward control using physics-guided neural networks: Training cost regularization and optimized initialization [0.0]
Performance of model-based feedforward controllers is typically limited by the accuracy of the inverse system dynamics model.
This paper proposes a regularization method via identified physical parameters.
It is validated on a real-life industrial linear motor, where it delivers better tracking accuracy and extrapolation.
arXiv Detail & Related papers (2022-01-28T12:51:25Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)