Who Said Neural Networks Aren't Linear?
- URL: http://arxiv.org/abs/2510.08570v1
- Date: Thu, 09 Oct 2025 17:59:57 GMT
- Title: Who Said Neural Networks Aren't Linear?
- Authors: Nimrod Berman, Assaf Hallak, Assaf Shocher
- Abstract summary: This paper introduces a method that makes such vector spaces explicit by construction. We find that if we sandwich a linear operator $A$ between two invertible neural networks, $f(x)=g_y^{-1}(A g_x(x))$, then the corresponding vector spaces $X$ and $Y$ are induced by newly defined addition and scaling actions.
- Score: 10.340966855587405
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural networks are famously nonlinear. However, linearity is defined relative to a pair of vector spaces, $f: X \to Y$. Is it possible to identify a pair of non-standard vector spaces for which a conventionally nonlinear function is, in fact, linear? This paper introduces a method that makes such vector spaces explicit by construction. We find that if we sandwich a linear operator $A$ between two invertible neural networks, $f(x)=g_y^{-1}(A g_x(x))$, then the corresponding vector spaces $X$ and $Y$ are induced by newly defined addition and scaling actions derived from $g_x$ and $g_y$. We term this kind of architecture a Linearizer. This framework makes the entire arsenal of linear algebra, including SVD, pseudo-inverse, orthogonal projection and more, applicable to nonlinear mappings. Furthermore, we show that the composition of two Linearizers that share a neural network is also a Linearizer. We leverage this property and demonstrate that training diffusion models using our architecture makes the hundreds of sampling steps collapse into a single step. We further utilize our framework to enforce idempotency (i.e. $f(f(x))=f(x)$) on networks, leading to a globally projective generative model, and to demonstrate modular style transfer.
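The sandwiched construction described in the abstract can be checked numerically. The sketch below is illustrative, not the paper's implementation: the invertible networks $g_x$ and $g_y$ are stood in for by an elementwise sinh (exactly invertible via arcsinh), and $A$ is a random matrix. The induced addition and scaling are obtained by pulling the standard operations back through $g_x$ and $g_y$, and the assertions verify that $f$ is additive and homogeneous with respect to them.

```python
# Minimal numerical sketch of the "Linearizer" idea: f(x) = g_y^{-1}(A g_x(x))
# is linear with respect to the vector-space structures induced by the
# invertible maps g_x and g_y. The choice of elementwise sinh for g_x, g_y and
# a random matrix for A is an illustrative assumption, not the paper's setup.
import numpy as np

rng = np.random.default_rng(0)
d = 4

# "Invertible networks": elementwise sinh with exact inverse arcsinh.
g_x, g_x_inv = np.sinh, np.arcsinh
g_y, g_y_inv = np.sinh, np.arcsinh
A = rng.normal(size=(d, d))  # the sandwiched linear operator

def f(x):
    """Conventionally nonlinear map f(x) = g_y^{-1}(A g_x(x))."""
    return g_y_inv(A @ g_x(x))

# Induced operations on the input space X ...
def add_x(u, v):   return g_x_inv(g_x(u) + g_x(v))
def scale_x(a, u): return g_x_inv(a * g_x(u))
# ... and on the output space Y.
def add_y(u, v):   return g_y_inv(g_y(u) + g_y(v))
def scale_y(a, u): return g_y_inv(a * g_y(u))

x1, x2, alpha = rng.normal(size=d), rng.normal(size=d), 0.7

# Additivity: f(x1 "plus" x2) equals f(x1) "plus" f(x2) under the induced additions.
assert np.allclose(f(add_x(x1, x2)), add_y(f(x1), f(x2)))
# Homogeneity: f(alpha "times" x1) equals alpha "times" f(x1) under the induced scalings.
assert np.allclose(f(scale_x(alpha, x1)), scale_y(alpha, f(x1)))
print("f is linear w.r.t. the induced vector-space structures")
```

Since the latent operator is an ordinary matrix, standard linear-algebra tools (SVD, pseudo-inverse, orthogonal projection) can be applied to $A$ and transported back through $g_x$ and $g_y$, which is what makes the linear toolbox available to the nonlinear map $f$.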
Related papers
- Pseudo-Invertible Neural Networks [7.337082724885154]
We introduce Surjective Pseudo-invertible Neural Networks (SPNN), a class of architectures explicitly designed to admit a tractable non-linear pseudo-inverse (PInv). The proposed non-linear PInv and its implementation in SPNN satisfy fundamental geometric properties.
arXiv Detail & Related papers (2026-02-05T18:59:58Z) - Phase Transitions for Feature Learning in Neural Networks [27.411134657066267]
We study the gradient descent dynamics of two-layer neural networks in the proportional regime where the number of neurons $n$ and the input dimension $d$ grow together, $n, d \to \infty$ with $n/d$ converging to a constant. Our characterization opens the way to study the dependence of learning dynamics on the network architecture and training algorithm.
arXiv Detail & Related papers (2026-02-01T20:47:36Z) - Geometry of fibers of the multiplication map of deep linear neural networks [0.0]
We study the geometry of the set of tuples of composable matrices which multiply to a fixed matrix. Our solution is presented in three forms: a Poincaré series in equivariant cohomology, a quadratic integer program, and an explicit formula.
arXiv Detail & Related papers (2024-11-29T18:36:03Z) - A Walsh Hadamard Derived Linear Vector Symbolic Architecture [83.27945465029167]
Vector Symbolic Architectures (VSAs) are an approach to developing Neuro-symbolic AI.
The proposed Hadamard-derived linear Binding (HLB) is designed to have favorable computational efficiency and efficacy in classic VSA tasks.
arXiv Detail & Related papers (2024-10-30T03:42:59Z) - Weight-based Decomposition: A Case for Bilinear MLPs [0.0]
Gated Linear Units (GLUs) have become a common building block in modern foundation models.
Bilinear layers drop the non-linearity in the "gate" but still have comparable performance to other GLUs.
We develop a method to decompose the bilinear tensor into a set of interacting eigenvectors.
arXiv Detail & Related papers (2024-06-06T10:46:51Z) - Neural Networks can Learn Representations with Gradient Descent [68.95262816363288]
In specific regimes, neural networks trained by gradient descent behave like kernel methods.
In practice, it is known that neural networks strongly outperform their associated kernels.
arXiv Detail & Related papers (2022-06-30T09:24:02Z) - High-dimensional Asymptotics of Feature Learning: How One Gradient Step
Improves the Representation [89.21686761957383]
We study the first gradient descent step on the first-layer parameters $\boldsymbol{W}$ in a two-layer network.
Our results demonstrate that even one step can lead to a considerable advantage over random features.
arXiv Detail & Related papers (2022-05-03T12:09:59Z) - A Differential Geometry Perspective on Orthogonal Recurrent Models [56.09491978954866]
We employ tools and insights from differential geometry to offer a novel perspective on orthogonal RNNs.
We show that orthogonal RNNs may be viewed as optimizing in the space of divergence-free vector fields.
Motivated by this observation, we study a new recurrent model, which spans the entire space of vector fields.
arXiv Detail & Related papers (2021-02-18T19:39:22Z) - A deep network construction that adapts to intrinsic dimensionality
beyond the domain [79.23797234241471]
We study the approximation of two-layer compositions $f(x) = g(\phi(x))$ via deep networks with ReLU activation.
We focus on two intuitive and practically relevant choices for $\phi$: the projection onto a low-dimensional embedded submanifold and a distance to a collection of low-dimensional sets.
arXiv Detail & Related papers (2020-08-06T09:50:29Z) - Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK [58.5766737343951]
We consider the dynamics of gradient descent for learning a two-layer neural network.
We show that an over-parametrized two-layer neural network trained with gradient descent can provably achieve loss close to that of the ground-truth network, beyond the Neural Tangent Kernel regime.
arXiv Detail & Related papers (2020-07-09T07:09:28Z) - Affine symmetries and neural network identifiability [0.0]
We consider arbitrary nonlinearities with potentially complicated affine symmetries.
We show that the symmetries can be used to find a rich set of networks giving rise to the same function $f$.
arXiv Detail & Related papers (2020-06-21T07:09:30Z)