A Frobenius-Optimal Projection for Enforcing Linear Conservation in Learned Dynamical Models
- URL: http://arxiv.org/abs/2512.22084v1
- Date: Fri, 26 Dec 2025 17:11:16 GMT
- Title: A Frobenius-Optimal Projection for Enforcing Linear Conservation in Learned Dynamical Models
- Authors: John M. Mango, Ronald Katende,
- Abstract summary: We show that $Astar$ enforces exact conservation while minimally perturbing the dynamics.<n>We prove that $Astar$ enforces exact conservation while minimally perturbing the dynamics.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of restoring linear conservation laws in data-driven linear dynamical models. Given a learned operator $\widehat{A}$ and a full-rank constraint matrix $C$ encoding one or more invariants, we show that the matrix closest to $\widehat{A}$ in the Frobenius norm and satisfying $C^\top A = 0$ is the orthogonal projection $A^\star = \widehat{A} - C(C^\top C)^{-1}C^\top \widehat{A}$. This correction is uniquely defined, low rank and fully determined by the violation $C^\top \widehat{A}$. In the single-invariant case it reduces to a rank-one update. We prove that $A^\star$ enforces exact conservation while minimally perturbing the dynamics, and we verify these properties numerically on a Markov-type example. The projection provides an elementary and general mechanism for embedding exact invariants into any learned linear model.
Related papers
- The Exploration of Neural Collapse under Imbalanced Data [0.0]
We consider the $L$-extended unconstrained feature model with a bias term and provide a theoretical analysis of global minimizer.
Our findings include: (1) Features within the same class converge to their class mean, similar to both the balanced case and the imbalanced case without bias.
arXiv Detail & Related papers (2024-11-26T09:59:34Z) - Optimal Sketching for Residual Error Estimation for Matrix and Vector Norms [50.15964512954274]
We study the problem of residual error estimation for matrix and vector norms using a linear sketch.
We demonstrate that this gives a substantial advantage empirically, for roughly the same sketch size and accuracy as in previous work.
We also show an $Omega(k2/pn1-2/p)$ lower bound for the sparse recovery problem, which is tight up to a $mathrmpoly(log n)$ factor.
arXiv Detail & Related papers (2024-08-16T02:33:07Z) - Fine-grained Analysis and Faster Algorithms for Iteratively Solving Linear Systems [9.30306458153248]
Despite being a key bottleneck in many machine learning tasks, the cost of solving large linear systems has proven challenging to quantify.<n>We consider a fine-grained notion of complexity for solving linear systems, which is motivated by applications where the data exhibits low-dimensional structure.<n>We give a algorithm based on the Sketch-and-Project paradigm, that solves the linear system $Ax = b$, that is, finds $barx$ such that $|Abarx - b| le epsilon |b|$, in time
arXiv Detail & Related papers (2024-05-09T14:56:49Z) - Learning linear dynamical systems under convex constraints [4.13951084724473]
We consider the problem of finite-time identification of linear dynamical systems from samples of a single trajectory.<n>We show that $A*$ can be reliably estimated for values much much smaller than what is needed for the unconstrained setting.
arXiv Detail & Related papers (2023-03-27T11:49:40Z) - Optimal Query Complexities for Dynamic Trace Estimation [59.032228008383484]
We consider the problem of minimizing the number of matrix-vector queries needed for accurate trace estimation in the dynamic setting where our underlying matrix is changing slowly.
We provide a novel binary tree summation procedure that simultaneously estimates all $m$ traces up to $epsilon$ error with $delta$ failure probability.
Our lower bounds (1) give the first tight bounds for Hutchinson's estimator in the matrix-vector product model with Frobenius norm error even in the static setting, and (2) are the first unconditional lower bounds for dynamic trace estimation.
arXiv Detail & Related papers (2022-09-30T04:15:44Z) - On the well-spread property and its relation to linear regression [4.619541348328937]
We show that consistent recovery of the parameter vector in a robust linear regression model is information-theoretically impossible.
We show that it is possible to efficiently certify whether a given $n$-by-$d$ matrix is well-spread if the number of observations is quadratic in the ambient dimension.
arXiv Detail & Related papers (2022-06-16T11:17:44Z) - High-dimensional Asymptotics of Feature Learning: How One Gradient Step
Improves the Representation [89.21686761957383]
We study the first gradient descent step on the first-layer parameters $boldsymbolW$ in a two-layer network.
Our results demonstrate that even one step can lead to a considerable advantage over random features.
arXiv Detail & Related papers (2022-05-03T12:09:59Z) - Spectral properties of sample covariance matrices arising from random
matrices with independent non identically distributed columns [50.053491972003656]
It was previously shown that the functionals $texttr(AR(z))$, for $R(z) = (frac1nXXT- zI_p)-1$ and $Ain mathcal M_p$ deterministic, have a standard deviation of order $O(|A|_* / sqrt n)$.
Here, we show that $|mathbb E[R(z)] - tilde R(z)|_F
arXiv Detail & Related papers (2021-09-06T14:21:43Z) - Learning a Latent Simplex in Input-Sparsity Time [58.30321592603066]
We consider the problem of learning a latent $k$-vertex simplex $KsubsetmathbbRdtimes n$, given access to $AinmathbbRdtimes n$.
We show that the dependence on $k$ in the running time is unnecessary given a natural assumption about the mass of the top $k$ singular values of $A$.
arXiv Detail & Related papers (2021-05-17T16:40:48Z) - Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e. learner has access to only zeroth-order oracle) where cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $min(sqrtdT, T3/4)$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm sbcalg that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
arXiv Detail & Related papers (2021-02-15T08:16:51Z) - Truncated Linear Regression in High Dimensions [26.41623833920794]
In truncated linear regression, $(A_i, y_i)_i$ whose dependent variable equals $y_i= A_irm T cdot x* + eta_i$ is some fixed unknown vector of interest.
The goal is to recover $x*$ under some favorable conditions on the $A_i$'s and the noise distribution.
We prove that there exists a computationally and statistically efficient method for recovering $k$-sparse $n$-dimensional vectors $x*$ from $m$ truncated samples.
arXiv Detail & Related papers (2020-07-29T00:31:34Z) - Learning nonlinear dynamical systems from a single trajectory [102.60042167341956]
We introduce algorithms for learning nonlinear dynamical systems of the form $x_t+1=sigma(Thetastarx_t)+varepsilon_t$.
We give an algorithm that recovers the weight matrix $Thetastar$ from a single trajectory with optimal sample complexity and linear running time.
arXiv Detail & Related papers (2020-04-30T10:42:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.