Generalizing and Improving Jacobian and Hessian Regularization
- URL: http://arxiv.org/abs/2212.00311v1
- Date: Thu, 1 Dec 2022 07:01:59 GMT
- Title: Generalizing and Improving Jacobian and Hessian Regularization
- Authors: Chenwei Cui, Zehao Yan, Guangshen Liu, Liangfu Lu
- Abstract summary: We generalize previous efforts by extending the target matrix from zero to any matrix that admits efficient matrix-vector products.
The proposed paradigm allows us to construct novel regularization terms that enforce symmetry or diagonality on square Jacobian and Hessian matrices.
We introduce Lanczos-based spectral norm minimization to tackle the main difficulty for such regularization: its high computational complexity.
- Score: 1.926971915834451
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Jacobian and Hessian regularization aim to reduce the magnitude of the first
and second-order partial derivatives with respect to neural network inputs, and
they are predominantly used to ensure the adversarial robustness of image
classifiers. In this work, we generalize previous efforts by extending the
target matrix from zero to any matrix that admits efficient matrix-vector
products. The proposed paradigm allows us to construct novel regularization
terms that enforce symmetry or diagonality on square Jacobian and Hessian
matrices. On the other hand, the major challenge for Jacobian and Hessian
regularization has been high computational complexity. We introduce
Lanczos-based spectral norm minimization to tackle this difficulty. This
technique uses a parallelized implementation of the Lanczos algorithm and is
capable of effective and stable regularization of large Jacobian and Hessian
matrices. Theoretical justifications and empirical evidence are provided for
the proposed paradigm and technique. We carry out exploratory experiments to
validate the effectiveness of our novel regularization terms. We also conduct
comparative experiments to evaluate Lanczos-based spectral norm minimization
against prior methods. Results show that the proposed methodologies are
advantageous for a wide range of tasks.
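To make both the paradigm and the technique concrete, the following is a minimal sketch (our illustration, not the authors' released code) of Lanczos-based estimation of the spectral norm of the residual D = J - M using only matrix-vector products; choosing the target M = J^T yields the symmetry-enforcing regularizer, since ||J - J^T||_2 = 0 exactly when a square Jacobian J is symmetric. In a real network, the products D v and D^T v would come from forward- and reverse-mode automatic differentiation rather than from explicit matrices.

```python
# A minimal sketch, assuming explicit matvec callbacks; in practice D v and
# D^T v would be Jacobian-vector and vector-Jacobian products from autodiff.
import numpy as np

def lanczos_spectral_norm(matvec, rmatvec, dim, num_iter=20, seed=0):
    """Estimate ||D||_2 by running Lanczos on the symmetric operator D^T D."""
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(dim)
    q /= np.linalg.norm(q)
    q_prev = np.zeros(dim)
    alphas, betas, beta = [], [], 0.0
    for _ in range(num_iter):
        w = rmatvec(matvec(q))                  # w = D^T (D q): two products
        alpha = q @ w
        w -= alpha * q + beta * q_prev          # three-term Lanczos recurrence
        alphas.append(alpha)
        beta = np.linalg.norm(w)
        betas.append(beta)
        if beta < 1e-10:                        # exact invariant subspace found
            break
        q_prev, q = q, w / beta
    # The top eigenvalue of the small tridiagonal matrix T approximates
    # lambda_max(D^T D), so its square root approximates ||D||_2.
    T = np.diag(alphas) + np.diag(betas[:-1], 1) + np.diag(betas[:-1], -1)
    return np.sqrt(np.linalg.eigvalsh(T)[-1])

# Toy check: with target M = J^T, the penalty ||J - J^T||_2 vanishes exactly
# when the square Jacobian J is symmetric.
J = np.array([[2.0, 1.0], [0.0, 2.0]])
D = J - J.T
est = lanczos_spectral_norm(lambda v: D @ v, lambda v: D.T @ v, dim=2)
print(est, np.linalg.norm(D, 2))                # both are 1.0 here
```

A plain power iteration on D^T D would also work, but Lanczos typically resolves the extreme eigenvalues in far fewer matrix-vector products, which is what makes regularizing large Jacobian and Hessian matrices affordable.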
Related papers
- Regularized Projection Matrix Approximation with Applications to Community Detection [1.3761665705201904]
This paper introduces a regularized projection matrix approximation framework designed to recover cluster information from the affinity matrix.
We investigate three distinct penalty functions, each specifically tailored to address bounded, positive, and sparse scenarios.
Numerical experiments conducted on both synthetic and real-world datasets reveal that our regularized projection matrix approximation approach significantly outperforms state-of-the-art methods in clustering performance (the unregularized base problem is sketched below).
arXiv Detail & Related papers (2024-05-26T15:18:22Z)
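For reference alongside the entry above, the unregularized core of this problem has a classical closed-form answer; the sketch below (our illustration on a toy affinity matrix) computes the nearest rank-k orthogonal projection in Frobenius norm, the base problem that the paper's bounded, positive, and sparse penalties then refine.

```python
# A sketch of the unregularized base problem: the closest rank-k orthogonal
# projection to a symmetric affinity matrix A (in Frobenius norm) is U U^T,
# where U holds the top-k eigenvectors of A.
import numpy as np

def nearest_projection(A, k):
    """argmin_P ||A - P||_F over rank-k orthogonal projection matrices P."""
    w, V = np.linalg.eigh((A + A.T) / 2)    # symmetrize, then eigendecompose
    U = V[:, np.argsort(w)[-k:]]            # top-k eigenvectors
    return U @ U.T

# Toy affinity with two perfect clusters of size 3; the recovered projection
# is block-diagonal, exposing the cluster structure.
A = np.kron(np.eye(2), np.ones((3, 3)))
print(np.round(nearest_projection(A, k=2), 2))
```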
- The Inductive Bias of Flatness Regularization for Deep Matrix Factorization [58.851514333119255]
This work takes the first step toward understanding the inductive bias of minimum-Hessian-trace solutions in deep linear networks.
We show that for all depths greater than one, under the standard Restricted Isometry Property (RIP) on the measurements, minimizing the trace of the Hessian is approximately equivalent to minimizing the Schatten 1-norm of the corresponding end-to-end matrix parameters; a toy numerical check of this norm follows below.
arXiv Detail & Related papers (2023-06-22T23:14:57Z)
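A toy numerical check of the quantity in the entry above (our code, not the paper's): for a deep linear network the end-to-end map is a single matrix product, and its Schatten 1-norm is simply the sum of its singular values.

```python
# Schatten 1-norm (nuclear norm) of the end-to-end matrix of a depth-3
# linear network; per the result summarized above, flatness regularization
# implicitly minimizes this quantity under RIP.
import numpy as np

rng = np.random.default_rng(0)
W1, W2, W3 = (rng.standard_normal((4, 4)) for _ in range(3))
end_to_end = W3 @ W2 @ W1                    # the network's end-to-end map
schatten_1 = np.linalg.svd(end_to_end, compute_uv=False).sum()
print(schatten_1)
```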
- Semi-Supervised Subspace Clustering via Tensor Low-Rank Representation [64.49871502193477]
We propose a novel semi-supervised subspace clustering method, which is able to simultaneously augment the initial supervisory information and construct a discriminative affinity matrix.
Comprehensive experimental results on six commonly-used benchmark datasets demonstrate the superiority of our method over state-of-the-art methods.
arXiv Detail & Related papers (2022-05-21T01:47:17Z)
- Riemannian statistics meets random matrix theory: towards learning from high-dimensional covariance matrices [2.352645870795664]
This paper addresses the absence of any practical method of computing the normalising factors associated with Riemannian Gaussian distributions on spaces of high-dimensional covariance matrices.
It shows that the missing method emerges from an unexpected new connection with random matrix theory.
Numerical experiments demonstrate how the resulting approximation overcomes the difficulties that have impeded applications to real-world datasets.
arXiv Detail & Related papers (2022-03-01T03:16:50Z)
- Learning a Compressive Sensing Matrix with Structural Constraints via Maximum Mean Discrepancy Optimization [17.104994036477308]
We introduce a learning-based algorithm to obtain a measurement matrix for compressive sensing-related recovery problems.
The recent success of metrics such as the maximum mean discrepancy (MMD) in neural network-related topics motivates a machine learning-based solution to the problem (a generic MMD estimator is sketched below).
arXiv Detail & Related papers (2021-10-14T08:35:54Z)
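Since the entry above hinges on the maximum mean discrepancy, here is a generic biased Gaussian-kernel MMD estimator in its standard textbook form; the paper's actual training objective, kernel, and data model may differ.

```python
# Biased estimate of squared MMD between two samples under a Gaussian kernel.
import numpy as np

def gaussian_mmd2(X, Y, bandwidth=1.0):
    def kernel(A, B):
        sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    return kernel(X, X).mean() + kernel(Y, Y).mean() - 2.0 * kernel(X, Y).mean()

rng = np.random.default_rng(0)
same = gaussian_mmd2(rng.standard_normal((200, 3)), rng.standard_normal((200, 3)))
shifted = gaussian_mmd2(rng.standard_normal((200, 3)), rng.standard_normal((200, 3)) + 2.0)
print(same, shifted)   # near zero for matched distributions, clearly positive otherwise
```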
- Adversarially-Trained Nonnegative Matrix Factorization [77.34726150561087]
We consider an adversarially-trained version of nonnegative matrix factorization.
In our formulation, an attacker adds an arbitrary matrix of bounded norm to the given data matrix.
We design efficient algorithms inspired by adversarial training to optimize for dictionary and coefficient matrices.
arXiv Detail & Related papers (2021-04-10T13:13:17Z)
- On the Efficient Implementation of the Matrix Exponentiated Gradient Algorithm for Low-Rank Matrix Optimization [26.858608065417663]
Convex optimization over the spectrahedron has important applications in machine learning, signal processing and statistics.
We propose efficient implementations of the matrix exponentiated gradient (MEG) algorithm, tailored for optimization with low-rank matrices, which use only a single low-rank SVD per iteration.
We also provide efficiently-computable certificates for the correct convergence of our methods (a dense baseline of the MEG update is sketched below).
arXiv Detail & Related papers (2020-12-18T19:14:51Z)
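For context on the entry above, the sketch below is a dense baseline of one matrix exponentiated gradient step over the spectrahedron, with the matrix logarithm and exponential computed via a full eigendecomposition; this full decomposition is exactly the per-iteration cost that the paper's low-rank SVD implementation avoids. The toy objective is our choice for illustration.

```python
# Dense baseline of the MEG update over {X : X PSD, tr(X) = 1}; the matrix
# log/exp go through a full eigendecomposition, the cost the paper's
# low-rank variant avoids.
import numpy as np

def meg_step(X, grad, lr=0.1):
    w, V = np.linalg.eigh(X)
    log_X = (V * np.log(w)) @ V.T            # matrix logarithm of X
    Y = log_X - lr * (grad + grad.T) / 2     # gradient step in the dual space
    w2, V2 = np.linalg.eigh(Y)
    exp_Y = (V2 * np.exp(w2)) @ V2.T         # matrix exponential of Y
    return exp_Y / np.trace(exp_Y)           # renormalize to unit trace

# Toy usage: minimizing <C, X> over the spectrahedron drives the objective
# toward the smallest eigenvalue of C.
rng = np.random.default_rng(0)
C = rng.standard_normal((4, 4)); C = (C + C.T) / 2
X = np.eye(4) / 4
for _ in range(50):
    X = meg_step(X, C)
print(np.trace(C @ X), np.linalg.eigvalsh(C).min())   # nearly equal
```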
- Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
- Controllable Orthogonalization in Training DNNs [96.1365404059924]
Orthogonality is widely used for training deep neural networks (DNNs) due to its ability to maintain all singular values of the Jacobian close to 1.
This paper proposes a computationally efficient and numerically stable orthogonalization method using Newton's iteration (ONI).
We show that our method improves the performance of image classification networks by effectively controlling the orthogonality to provide an optimal tradeoff between optimization benefits and representational capacity reduction.
We also show that ONI stabilizes the training of generative adversarial networks (GANs) by maintaining the Lipschitz continuity of a network, similar to spectral normalization; a generic Newton-Schulz sketch of the iteration follows below.
arXiv Detail & Related papers (2020-04-02T10:14:27Z)
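The entry above refers to orthogonalization by Newton's iteration; below is a generic Newton-Schulz sketch of that iteration (our illustration of the underlying scheme; ONI's exact formulation, pre-conditioning, and integration into training may differ).

```python
# Newton-Schulz iteration: drives all singular values of Y toward 1, i.e.
# converges to the orthogonal polar factor of W, using only matrix products.
import numpy as np

def newton_schulz_orthogonalize(W, num_iter=15):
    Y = W / np.linalg.norm(W, 2)             # pre-scale: singular values in (0, 1]
    I = np.eye(W.shape[0])
    for _ in range(num_iter):
        Y = 0.5 * Y @ (3.0 * I - Y.T @ Y)    # sigma <- sigma * (3 - sigma^2) / 2
    return Y

rng = np.random.default_rng(0)
W = rng.standard_normal((5, 5))
Q = newton_schulz_orthogonalize(W)
print(np.linalg.norm(Q.T @ Q - np.eye(5)))   # ~0: Q is numerically orthogonal
```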
- Accurate Optimization of Weighted Nuclear Norm for Non-Rigid Structure from Motion [15.641335104467982]
We show that more accurate results can be achieved with 2nd order methods.
Our main result shows how to construct bilinear formulations for a general class of regularizers.
We show experimentally, on a number of structure from motion problems, that our approach outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-03-23T13:52:16Z)
- Optimal Iterative Sketching with the Subsampled Randomized Hadamard Transform [64.90148466525754]
We study the performance of iterative sketching for least-squares problems.
We show that the convergence rates for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon those of random projections.
These techniques may be applied to other algorithms that employ randomized dimension reduction (a minimal iterative-sketching loop is sketched below).
arXiv Detail & Related papers (2020-02-03T16:17:50Z)
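To ground the final entry, here is a minimal iterative-sketching loop for least squares using a subsampled randomized Hadamard transform (our illustration of the standard iterative Hessian sketch; the dimensions, step scheme, and iteration count are toy choices).

```python
# Iterative sketching for min_x ||A x - b||_2: factor the sketched Hessian
# once, then refine with exact gradients.
import numpy as np
from scipy.linalg import hadamard

rng = np.random.default_rng(0)
n, d, m = 256, 10, 64                        # n must be a power of 2 here
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

# SRHT: random sign flips, Hadamard transform, uniform row subsampling.
signs = rng.choice([-1.0, 1.0], size=n)
H = hadamard(n) / np.sqrt(n)                 # orthonormal Hadamard matrix
rows = rng.choice(n, size=m, replace=False)
SA = np.sqrt(n / m) * (H * signs)[rows] @ A  # S A with S = sqrt(n/m) P H D

x = np.zeros(d)
sketched_hessian = SA.T @ SA                 # cheap stand-in for A^T A
for _ in range(20):
    gradient = A.T @ (A @ x - b)
    x -= np.linalg.solve(sketched_hessian, gradient)

x_exact = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.linalg.norm(x - x_exact))           # small: iterates reach the LS solution
```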
This list is automatically generated from the titles and abstracts of the papers on this site.