Black Boxes and Looking Glasses: Multilevel Symmetries, Reflection Planes, and Convex Optimization in Deep Networks
- URL: http://arxiv.org/abs/2410.04279v2
- Date: Fri, 11 Oct 2024 19:40:20 GMT
- Title: Black Boxes and Looking Glasses: Multilevel Symmetries, Reflection Planes, and Convex Optimization in Deep Networks
- Authors: Emi Zeger, Mert Pilanci
- Abstract summary: We show that training deep neural networks (DNNs) with absolute value activation and arbitrary input dimension can be formulated as equivalent convex Lasso problems.
This formulation reveals geometric structures encoding symmetry in neural networks.
- Score: 46.337104465755075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We show that training deep neural networks (DNNs) with absolute value activation and arbitrary input dimension can be formulated as equivalent convex Lasso problems with novel features expressed using geometric algebra. This formulation reveals geometric structures encoding symmetry in neural networks. Using the equivalent Lasso form of DNNs, we formally prove a fundamental distinction between deep and shallow networks: deep networks inherently favor symmetric structures in their fitted functions, with greater depth enabling multilevel symmetries, i.e., symmetries within symmetries. Moreover, Lasso features represent distances to hyperplanes that are reflected across training points. These reflection hyperplanes are spanned by training data and are orthogonal to optimal weight vectors. Numerical experiments support theory and demonstrate theoretically predicted features when training networks using embeddings generated by Large Language Models.
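As a hedged, one-dimensional illustration of the abstract's Lasso claim: a two-layer network with absolute-value activation computes f(x) = Σ_j c_j·|x − a_j| (plus a bias), and |x − a_j| is exactly the distance from x to the point a_j. Fitting such a network with weight decay can then be viewed as a Lasso problem over distance features centered at training points. The sketch below is a minimal toy, not code from the paper: the data, the penalty strength, and the helper names (`lasso_abs_features`, `predict`) are all invented for illustration.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of the L1 penalty."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_abs_features(xs, ys, lam, iters=500):
    """Fit ys ~ sum_j c[j] * |x - xs[j]| by Lasso coordinate descent."""
    n = len(xs)
    # Dictionary column j: the feature |x - xs[j]| evaluated at each training x.
    A = [[abs(xi - xj) for xj in xs] for xi in xs]
    c = [0.0] * n
    for _ in range(iters):
        for j in range(n):
            # Residual with coordinate j held out.
            r = [ys[i] - sum(A[i][k] * c[k] for k in range(n) if k != j)
                 for i in range(n)]
            num = sum(A[i][j] * r[i] for i in range(n))
            den = sum(A[i][j] ** 2 for i in range(n))
            if den > 0.0:
                c[j] = soft_threshold(num / den, lam / den)
    return c

def predict(c, xs, x):
    return sum(cj * abs(x - xj) for cj, xj in zip(c, xs))

# Toy data: y = |x - 1| sampled at six points; with a small penalty the
# Lasso should concentrate weight on the feature centered at x = 1.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [abs(x - 1.0) for x in xs]
coef = lasso_abs_features(xs, ys, lam=1e-3)
```

Because the fitted function is piecewise linear with breakpoints only at the training points, a near-exact fit on the samples pins down its values between them as well.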
Related papers
- The Role of Fibration Symmetries in Geometric Deep Learning [0.0]
Geometric Deep Learning (GDL) unifies a broad class of machine learning techniques from the perspective of symmetry.
We propose to relax GDL to allow for local symmetries, specifically fibration symmetries in graphs, to leverage regularities of realistic instances.
Applying the inductive bias of fibration symmetries to GNNs yields a tighter upper bound on their expressive power.
arXiv Detail & Related papers (2024-08-28T16:04:40Z)
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach develops within a recently introduced framework for learning neural network-based surrogate models of Lattice Boltzmann collision operators.
Our work opens the way toward practical use of machine learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
- A Library of Mirrors: Deep Neural Nets in Low Dimensions are Convex Lasso Models with Reflection Features [54.83898311047626]
We consider neural networks with piecewise linear activations, ranging from 2 layers to an arbitrary but finite number of layers.
We first show that two-layer networks with piecewise linear activations are Lasso models using a discrete dictionary of ramp functions.
arXiv Detail & Related papers (2024-03-02T00:33:45Z)
- The Convex Landscape of Neural Networks: Characterizing Global Optima and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) models are difficult to analyze because their training objectives are nonconvex.
In this paper we examine the use of convex neural network recovery models.
We show that all stationary points of the nonconvex training objective can be characterized as global optima of a subsampled convex program.
arXiv Detail & Related papers (2023-12-19T23:04:56Z)
- From Complexity to Clarity: Analytical Expressions of Deep Neural Network Weights via Clifford's Geometric Algebra and Convexity [54.01594785269913]
We show that optimal weights of deep ReLU neural networks are given by the wedge product of training samples when trained with standard regularized loss.
The training problem reduces to convex optimization over wedge product features, which encode the geometric structure of the training dataset.
arXiv Detail & Related papers (2023-09-28T15:19:30Z)
- Encoding Involutory Invariance in Neural Networks [1.6371837018687636]
In certain situations, Neural Networks (NN) are trained upon data that obey underlying physical symmetries.
In this work, we explore a special kind of symmetry where functions are invariant with respect to involutory linear/affine transformations up to parity.
Numerical experiments indicate that the proposed models outperform baseline networks while respecting the imposed symmetry.
An adaptation of our technique to convolutional NN classification tasks for datasets with inherent horizontal/vertical reflection symmetry is also proposed.
arXiv Detail & Related papers (2021-06-07T16:07:15Z)
- Deep Networks and the Multiple Manifold Problem [15.144495799445824]
We study the multiple manifold problem, a binary classification task modeled on applications in machine vision, in which a deep fully-connected neural network is trained to separate two low-dimensional submanifolds of the unit sphere.
We prove for a simple manifold configuration that when the network depth $L$ is large relative to certain geometric and statistical properties of the data, and the network width grows as a sufficiently large polynomial in $L$, the network successfully separates the two manifolds.
Our analysis demonstrates concrete benefits of depth and width in the context of a practically-motivated model problem.
arXiv Detail & Related papers (2020-08-25T19:20:00Z)
- Convex Geometry and Duality of Over-parameterized Neural Networks [70.15611146583068]
We develop a convex analytic approach to analyze finite width two-layer ReLU networks.
We show that optimal solutions to the regularized training problem can be characterized as extreme points of a convex set.
In higher dimensions, we show that the training problem can be cast as a finite dimensional convex problem with infinitely many constraints.
arXiv Detail & Related papers (2020-02-25T23:05:33Z)
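Several entries above (the reflection-feature Lasso paper and the convex geometry and duality paper) build on the fact that, in one dimension, a two-layer ReLU network is just a linear combination of ramp features max(0, x − t). A minimal sketch of that identity follows; the weights and breakpoints are invented for illustration and do not come from any of the papers listed.

```python
def relu(z):
    return max(0.0, z)

def two_layer(x, W1, b1, W2):
    """Two-layer ReLU network: f(x) = sum_j W2[j] * relu(W1[j]*x + b1[j])."""
    return sum(w2 * relu(w1 * x + b) for w1, b, w2 in zip(W1, b1, W2))

def ramp(x, t):
    """Dictionary feature with breakpoint t."""
    return max(0.0, x - t)

# With unit first-layer weights, neuron j is exactly the ramp with
# breakpoint t_j = -b1[j]; these example weights build a "hat" bump on [0, 2].
W1 = [1.0, 1.0, 1.0]
b1 = [0.0, -1.0, -2.0]
W2 = [1.0, -2.0, 1.0]

def hat(x):
    return ramp(x, 0.0) - 2.0 * ramp(x, 1.0) + ramp(x, 2.0)
```

The network and the dictionary expansion agree pointwise, which is the starting point for recasting regularized training over such networks as a convex (Lasso-type) program over the feature coefficients.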
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.