Border Basis Computation with Gradient-Weighted Norm
- URL: http://arxiv.org/abs/2101.00401v2
- Date: Thu, 14 Jan 2021 05:59:05 GMT
- Title: Border Basis Computation with Gradient-Weighted Norm
- Authors: Hiroshi Kera
- Abstract summary: We propose gradient-weighted normalization for the approximate border basis of vanishing ideals.
With a slight modification, the analysis of algorithms with coefficient normalization still works with gradient-weighted normalization.
- Score: 5.863264019032882
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Normalization of polynomials plays an essential role in the approximate basis
computation of vanishing ideals. In computer algebra, coefficient
normalization, which normalizes a polynomial by its coefficient norm, is the
most common method. In this study, we propose gradient-weighted normalization
for the approximate border basis computation of vanishing ideals, inspired by
the recent results in machine learning. The data-dependent nature of
gradient-weighted normalization leads to powerful properties such as better
stability against perturbation and consistency in the scaling of input points,
which cannot be attained by the conventional coefficient normalization. With a
slight modification, the analysis of algorithms with coefficient normalization
still works with gradient-weighted normalization and the time complexity does
not change. We also provide an upper bound on the coefficient norm based on the
gradient-weighted norm, which allows us to discuss the approximate border bases
with gradient-weighted normalization from the perspective of the coefficient
norm.
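To make the contrast between the two normalizations concrete, here is a small numerical sketch. It is illustrative only: it assumes, as a simplification rather than the paper's exact definition, that the gradient-weighted norm of a polynomial is the root-mean-square Euclidean norm of its gradient evaluated over the input points; the example polynomial and the synthetic point set are hypothetical.

```python
# Hedged sketch: coefficient norm vs. a data-dependent gradient-weighted norm.
# The precise norm used in the paper should be taken from the paper itself.
import numpy as np
import sympy as sp

x, y = sp.symbols("x y")
g = 2 * x**2 + 3 * x * y - y + 0.5          # illustrative polynomial

# Coefficient norm: Euclidean norm of the coefficient vector (data-independent).
coeffs = np.array([float(c) for c in sp.Poly(g, x, y).coeffs()])
coeff_norm = np.linalg.norm(coeffs)

# Assumed gradient-weighted norm: RMS of ||grad g(p)|| over the point set P.
grad_fn = sp.lambdify((x, y), [sp.diff(g, v) for v in (x, y)], "numpy")
P = np.random.default_rng(0).normal(size=(100, 2))      # synthetic input points
grads = np.stack(grad_fn(P[:, 0], P[:, 1]), axis=-1)    # shape (N, 2)
gw_norm = np.sqrt(np.mean(np.sum(grads**2, axis=1)))

print("coefficient norm:      ", coeff_norm)
print("gradient-weighted norm:", gw_norm)

# Rescaling the points changes the gradient-weighted norm consistently,
# whereas the coefficient norm ignores the data entirely.
P10 = 10.0 * P
grads10 = np.stack(grad_fn(P10[:, 0], P10[:, 1]), axis=-1)
print("gradient-weighted norm on scaled points:",
      np.sqrt(np.mean(np.sum(grads10**2, axis=1))))
```

The last two lines gesture at the scaling-consistency property mentioned in the abstract: because the norm is evaluated on the data, it tracks how the input points are scaled, which a purely coefficient-based norm cannot do.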
Related papers
- Differentially Private Optimization with Sparse Gradients [60.853074897282625]
We study differentially private (DP) optimization problems under sparsity of individual gradients.
Building on this, we obtain pure- and approximate-DP algorithms with almost optimal rates for convex optimization with sparse gradients.
arXiv Detail & Related papers (2024-04-16T20:01:10Z)
- Minimum norm interpolation by perceptra: Explicit regularization and implicit bias [0.3499042782396683]
We investigate how shallow ReLU networks interpolate between known regions.
We numerically study the implicit bias of common optimization algorithms towards known minimum norm interpolants.
arXiv Detail & Related papers (2023-11-10T15:55:47Z)
- The Inductive Bias of Flatness Regularization for Deep Matrix Factorization [58.851514333119255]
This work takes the first step toward understanding the inductive bias of minimum-trace-of-the-Hessian solutions in deep linear networks.
We show that for all depths greater than one, under the standard Restricted Isometry Property (RIP) on the measurements, minimizing the trace of the Hessian is approximately equivalent to minimizing the Schatten 1-norm of the corresponding end-to-end matrix parameters.
arXiv Detail & Related papers (2023-06-22T23:14:57Z)
- The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks [117.93273337740442]
We show that gradient descent converges to a uniform margin classifier on the training data with an $\exp(-\Omega(\log^2 t))$ convergence rate.
We also show that batch normalization has an implicit bias towards a patch-wise uniform margin.
arXiv Detail & Related papers (2023-06-20T16:58:00Z)
- Robust Implicit Regularization via Weight Normalization [5.37610807422229]
We show that weight normalization enables a robust bias that persists even when the weights are initialized at practically large scale.
Experiments suggest that both the convergence speed and the robustness of the implicit bias improve dramatically when weight normalization is used.
arXiv Detail & Related papers (2023-05-09T13:38:55Z)
- Penalising the biases in norm regularisation enforces sparsity [28.86954341732928]
This work shows the parameters' norm required to represent a function is given by the total variation of its second derivative, weighted by a $\sqrt{1+x^2}$ factor.
Notably, this weighting factor disappears when the norm of bias terms is not regularised.
arXiv Detail & Related papers (2023-03-02T15:33:18Z)
- On the Importance of Gradient Norm in PAC-Bayesian Bounds [92.82627080794491]
We propose a new generalization bound that exploits the contractivity of the log-Sobolev inequalities.
We empirically analyze the effect of this new loss-gradient norm term on different neural architectures.
arXiv Detail & Related papers (2022-10-12T12:49:20Z)
- Stability vs Implicit Bias of Gradient Methods on Separable Data and Beyond [33.593203156666746]
We focus on the generalization properties of unregularized gradient-based learning procedures applied to separable linear classification.
We give an additional unified explanation for this generalization, which we refer to as realizability and self-boundedness.
In some of these cases, our bounds significantly improve upon the existing generalization error bounds in the literature.
arXiv Detail & Related papers (2022-02-27T19:56:36Z)
- Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z)
- Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central to preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z)
- Exact Linear Convergence Rate Analysis for Low-Rank Symmetric Matrix Completion via Gradient Descent [22.851500417035947]
Factorization-based gradient descent is a scalable and efficient algorithm for solving low-rank matrix completion.
We show that gradient descent enjoys fast convergence to a global solution of the problem; a minimal sketch of factorization-based gradient descent appears after this list.
arXiv Detail & Related papers (2021-02-04T03:41:54Z)
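For the last entry above, the algorithm being analyzed is of a familiar form, so a minimal sketch may help fix ideas. The loss, step size, initialization, and observation model below are generic choices for illustration, not the setup or analysis of the cited paper.

```python
# Hedged sketch: factorization-based gradient descent for symmetric
# low-rank matrix completion, M ~ X X^T with only some entries observed.
import numpy as np

rng = np.random.default_rng(0)
n, r = 50, 3
X_true = rng.normal(size=(n, r))
M = X_true @ X_true.T                       # ground-truth low-rank symmetric matrix
upper = np.triu(rng.random((n, n)) < 0.4)
mask = upper | upper.T                      # symmetric set of observed entries

X = 0.1 * rng.normal(size=(n, r))           # small random initialization
eta = 1e-3                                  # untuned step size
for _ in range(5000):
    R = mask * (X @ X.T - M)                # residual on observed entries
    grad = (R + R.T) @ X                    # gradient of 0.5 * ||R||_F^2 w.r.t. X
    X -= eta * grad

rel_err = np.linalg.norm(X @ X.T - M) / np.linalg.norm(M)
print(f"relative reconstruction error: {rel_err:.2e}")
```

The factorized parameterization keeps the iterate at rank r throughout, which is what makes the approach scalable; the cited paper's contribution is the exact linear convergence rate analysis for this kind of iteration.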