Who Said Neural Networks Aren't Linear?
- URL: http://arxiv.org/abs/2510.08570v1
- Date: Thu, 09 Oct 2025 17:59:57 GMT
- Title: Who Said Neural Networks Aren't Linear?
- Authors: Nimrod Berman, Assaf Hallak, Assaf Shocher
- Abstract summary: This paper introduces a method that makes such vector spaces explicit by construction. We find that if we sandwich a linear operator $A$ between two invertible neural networks, $f(x)=g_y^{-1}(A g_x(x))$, then the corresponding vector spaces $X$ and $Y$ are induced by newly defined addition and scaling actions.
- Score: 10.340966855587405
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural networks are famously nonlinear. However, linearity is defined relative to a pair of vector spaces, $f: X \to Y$. Is it possible to identify a pair of non-standard vector spaces for which a conventionally nonlinear function is, in fact, linear? This paper introduces a method that makes such vector spaces explicit by construction. We find that if we sandwich a linear operator $A$ between two invertible neural networks, $f(x)=g_y^{-1}(A g_x(x))$, then the corresponding vector spaces $X$ and $Y$ are induced by newly defined addition and scaling actions derived from $g_x$ and $g_y$. We term this kind of architecture a Linearizer. This framework makes the entire arsenal of linear algebra, including SVD, pseudo-inverse, orthogonal projection and more, applicable to nonlinear mappings. Furthermore, we show that the composition of two Linearizers that share a neural network is also a Linearizer. We leverage this property and demonstrate that training diffusion models using our architecture makes the hundreds of sampling steps collapse into a single step. We further utilize our framework to enforce idempotency (i.e. $f(f(x))=f(x)$) on networks, leading to a globally projective generative model, and to demonstrate modular style transfer.
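The sandwiched construction described in the abstract can be checked numerically. The sketch below is illustrative, not the paper's implementation: the invertible networks $g_x$ and $g_y$ are stood in for by an elementwise sinh (exactly invertible via arcsinh), and $A$ is a random matrix. The induced addition and scaling are obtained by pulling the standard operations back through $g_x$ and $g_y$, and the assertions verify that $f$ is additive and homogeneous with respect to them.

```python
# Minimal numerical sketch of the "Linearizer" idea: f(x) = g_y^{-1}(A g_x(x))
# is linear with respect to the vector-space structures induced by the
# invertible maps g_x and g_y. The choice of elementwise sinh for g_x, g_y and
# a random matrix for A is an illustrative assumption, not the paper's setup.
import numpy as np

rng = np.random.default_rng(0)
d = 4

# "Invertible networks": elementwise sinh with exact inverse arcsinh.
g_x, g_x_inv = np.sinh, np.arcsinh
g_y, g_y_inv = np.sinh, np.arcsinh
A = rng.normal(size=(d, d))  # the sandwiched linear operator

def f(x):
    """Conventionally nonlinear map f(x) = g_y^{-1}(A g_x(x))."""
    return g_y_inv(A @ g_x(x))

# Induced operations on the input space X ...
def add_x(u, v):   return g_x_inv(g_x(u) + g_x(v))
def scale_x(a, u): return g_x_inv(a * g_x(u))
# ... and on the output space Y.
def add_y(u, v):   return g_y_inv(g_y(u) + g_y(v))
def scale_y(a, u): return g_y_inv(a * g_y(u))

x1, x2, alpha = rng.normal(size=d), rng.normal(size=d), 0.7

# Additivity: f(x1 "plus" x2) equals f(x1) "plus" f(x2) under the induced additions.
assert np.allclose(f(add_x(x1, x2)), add_y(f(x1), f(x2)))
# Homogeneity: f(alpha "times" x1) equals alpha "times" f(x1) under the induced scalings.
assert np.allclose(f(scale_x(alpha, x1)), scale_y(alpha, f(x1)))
print("f is linear w.r.t. the induced vector-space structures")
```

Since the latent operator is an ordinary matrix, standard linear-algebra tools (SVD, pseudo-inverse, orthogonal projection) can be applied to $A$ and transported back through $g_x$ and $g_y$, which is what makes the linear toolbox available to the nonlinear map $f$.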
Related papers
- Pseudo-Invertible Neural Networks [7.337082724885154]
We introduce Surjective Pseudo-invertible Neural Networks (SPNN), a class of architectures explicitly designed to admit a tractable non-linear pseudo-inverse (PInv). The proposed non-linear PInv and its implementation in SPNN satisfy fundamental geometric properties.
arXiv Detail & Related papers (2026-02-05T18:59:58Z) - Phase Transitions for Feature Learning in Neural Networks [27.411134657066267]
We study the gradient descent dynamics of two-layer neural networks in the proportional regime where the number of neurons $n$ and the input dimension $d$ grow together, $n, d \to \infty$ with $n/d$ converging to a constant. Our characterization opens the way to study the dependence of learning dynamics on the network architecture and training algorithm.
arXiv Detail & Related papers (2026-02-01T20:47:36Z) - Geometry of fibers of the multiplication map of deep linear neural networks [0.0]
We study the geometry of the set of tuples of composable matrices which multiply to a fixed matrix. Our solution is presented in three forms: a Poincaré series in equivariant cohomology, a quadratic integer program, and an explicit formula.
arXiv Detail & Related papers (2024-11-29T18:36:03Z) - A Walsh Hadamard Derived Linear Vector Symbolic Architecture [83.27945465029167]
Vector Symbolic Architectures (VSAs) are an approach to developing Neuro-symbolic AI.
The proposed Hadamard-derived linear Binding (HLB) is designed to have favorable computational efficiency and efficacy in classic VSA tasks.
arXiv Detail & Related papers (2024-10-30T03:42:59Z) - Weight-based Decomposition: A Case for Bilinear MLPs [0.0]
Gated Linear Units (GLUs) have become a common building block in modern foundation models.
Bilinear layers drop the non-linearity in the "gate" but still have comparable performance to other GLUs.
We develop a method to decompose the bilinear tensor into a set of interacting eigenvectors.
arXiv Detail & Related papers (2024-06-06T10:46:51Z) - Neural Networks can Learn Representations with Gradient Descent [68.95262816363288]
In specific regimes, neural networks trained by gradient descent behave like kernel methods.
In practice, it is known that neural networks strongly outperform their associated kernels.
arXiv Detail & Related papers (2022-06-30T09:24:02Z) - High-dimensional Asymptotics of Feature Learning: How One Gradient Step
Improves the Representation [89.21686761957383]
We study the first gradient descent step on the first-layer parameters $\boldsymbol{W}$ in a two-layer network.
Our results demonstrate that even one step can lead to a considerable advantage over random features.
arXiv Detail & Related papers (2022-05-03T12:09:59Z) - A Differential Geometry Perspective on Orthogonal Recurrent Models [56.09491978954866]
We employ tools and insights from differential geometry to offer a novel perspective on orthogonal RNNs.
We show that orthogonal RNNs may be viewed as optimizing in the space of divergence-free vector fields.
Motivated by this observation, we study a new recurrent model, which spans the entire space of vector fields.
arXiv Detail & Related papers (2021-02-18T19:39:22Z) - A deep network construction that adapts to intrinsic dimensionality
beyond the domain [79.23797234241471]
We study the approximation of two-layer compositions $f(x) = g(\phi(x))$ via deep networks with ReLU activation.
We focus on two intuitive and practically relevant choices for $\phi$: the projection onto a low-dimensional embedded submanifold and a distance to a collection of low-dimensional sets.
arXiv Detail & Related papers (2020-08-06T09:50:29Z) - Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK [58.5766737343951]
We consider the dynamics of gradient descent for learning a two-layer neural network.
We show that an over-parametrized two-layer neural network trained with gradient descent can provably achieve loss close to that of the ground-truth network, beyond the Neural Tangent Kernel regime.
arXiv Detail & Related papers (2020-07-09T07:09:28Z) - Affine symmetries and neural network identifiability [0.0]
We consider arbitrary nonlinearities with potentially complicated affine symmetries.
We show that the symmetries can be used to find a rich set of networks giving rise to the same function $f$.
arXiv Detail & Related papers (2020-06-21T07:09:30Z)