Symmetric Convolutional Filters: A Novel Way to Constrain Parameters in CNN
- URL: http://arxiv.org/abs/2202.13099v1
- Date: Sat, 26 Feb 2022 09:45:30 GMT
- Title: Symmetric Convolutional Filters: A Novel Way to Constrain Parameters in CNN
- Authors: Harish Agrawal, Sumana T., S.K. Nandy
- Abstract summary: We propose a novel technique to constrain parameters in CNN based on symmetric filters.
We demonstrate that our models offer effective generalisation and a structured elimination of redundancy in parameters.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a novel technique for constraining parameters in CNNs based on
symmetric filters. We investigate the impact on SOTA networks of varying the
combinations of symmetry imposed on the filters. We demonstrate that our models offer
effective generalisation and a structured elimination of redundancy in parameters. We
conclude by comparing our method with other pruning techniques.
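The core idea, imposing mirror symmetries on each filter so that fewer free parameters determine all k x k weights, can be sketched as follows. This is an illustrative reconstruction based only on the abstract, not the authors' implementation; the projection `symmetrize` and the parameter count are our assumptions.

```python
import numpy as np

def symmetrize(w):
    """Project a square filter onto the space of filters invariant
    under horizontal and vertical flips (4-fold mirror symmetry).
    This is a hypothetical constraint operator, not the paper's code."""
    return (w + w[::-1, :] + w[:, ::-1] + w[::-1, ::-1]) / 4.0

rng = np.random.default_rng(0)
w = rng.standard_normal((5, 5))
ws = symmetrize(w)

# The projected filter is unchanged by either flip:
assert np.allclose(ws, ws[::-1, :]) and np.allclose(ws, ws[:, ::-1])

# Free parameters drop from k*k to ceil(k/2)**2 for a k x k filter:
k = 5
free = ((k + 1) // 2) ** 2  # 9 instead of 25
```

Applying such a projection after each gradient step (or reparameterizing the filter by its free entries) keeps the filter in the symmetric subspace throughout training.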
Related papers
- Symmetry-Aware Graph Metanetwork Autoencoders: Model Merging through Parameter Canonicalization [37.457412158271126]
We present an autoencoder framework utilizing ScaleGMNs as invariant encoders.
We show that our method aligns Implicit Neural Representations (INRs) and Convolutional Neural Networks (CNNs) under both permutation and scaling symmetries.
This approach ensures that similar networks naturally converge within the same basin, facilitating model merging and avoiding regions of high loss.
arXiv Detail & Related papers (2025-11-16T13:57:50Z) - Generalized Linear Mode Connectivity for Transformers [87.32299363530996]
A striking phenomenon is linear mode connectivity (LMC), where independently trained models can be connected by low- or zero-loss paths.
Prior work has predominantly focused on neuron re-ordering through permutations, but such approaches are limited in scope.
We introduce a unified framework that captures four symmetry classes: permutations, semi-permutations, transformations, and general invertible maps.
This generalization enables, for the first time, the discovery of low- and zero-barrier linear paths between independently trained Vision Transformers and GPT-2 models.
arXiv Detail & Related papers (2025-06-28T01:46:36Z) - Symmetry in Neural Network Parameter Spaces [32.732734207891745]
A significant portion of redundancy is explained by symmetries in the parameter space: transformations that leave the network function unchanged.
These symmetries shape the loss landscape and constrain learning dynamics, offering a new lens for understanding optimization, generalization, and model complexity.
We summarize existing literature, uncover connections between symmetry and learning theory, and identify gaps and opportunities in this emerging field.
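As a minimal illustration of such a parameter-space symmetry (our own sketch, not code from the survey): permuting the hidden units of a two-layer MLP, with weights and biases permuted consistently, leaves the network function unchanged.

```python
import numpy as np

rng = np.random.default_rng(3)
W1 = rng.standard_normal((4, 3)); b1 = rng.standard_normal(4)
W2 = rng.standard_normal((2, 4))

def mlp(x, W1, b1, W2):
    # A tiny two-layer network; sizes are arbitrary for illustration.
    return W2 @ np.tanh(W1 @ x + b1)

P = np.eye(4)[[2, 0, 3, 1]]  # a permutation matrix on the hidden units
x = rng.standard_normal(3)

# Permuting hidden rows of (W1, b1) and columns of W2 is a symmetry:
assert np.allclose(mlp(x, W1, b1, W2),
                   mlp(x, P @ W1, P @ b1, W2 @ P.T))
```

The identity holds because the elementwise `tanh` commutes with permutations and `P.T @ P` is the identity, so the permuted parameters realize exactly the same function.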
arXiv Detail & Related papers (2025-06-16T00:59:12Z) - The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof [50.49582712378289]
We investigate the impact of neural parameter symmetries by introducing new neural network architectures.
We develop two methods, with some provable guarantees, of modifying standard neural networks to reduce parameter space symmetries.
Our experiments reveal several interesting observations on the empirical impact of parameter symmetries.
arXiv Detail & Related papers (2024-05-30T16:32:31Z) - Optimizing Likelihood-free Inference using Self-supervised Neural Symmetry Embeddings [0.24084786718197512]
We show a technique of optimizing likelihood-free inference to make it even faster by marginalizing symmetries in a physical problem.
We present this approach on two simple physical problems and show faster convergence with fewer parameters than a normalizing flow.
arXiv Detail & Related papers (2023-12-11T21:06:07Z) - Low-rank extended Kalman filtering for online learning of neural networks from streaming data [71.97861600347959]
We propose an efficient online approximate Bayesian inference algorithm for estimating the parameters of a nonlinear function from a potentially non-stationary data stream.
The method is based on the extended Kalman filter (EKF), but uses a novel low-rank plus diagonal decomposition of the posterior matrix.
In contrast to methods based on variational inference, our method is fully deterministic, and does not require step-size tuning.
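The advantage of a low-rank plus diagonal posterior can be illustrated with a small sketch (the names and sizes below are hypothetical; this is not the paper's EKF code): storing Sigma = diag(d) + U U^T costs O(n*r) numbers instead of O(n^2) and supports O(n*r) matrix-vector products.

```python
import numpy as np

n, r = 1000, 5                      # n parameters, rank-r correction
rng = np.random.default_rng(1)
d = rng.uniform(0.5, 1.5, n)        # diagonal part of the covariance
U = rng.standard_normal((n, r))     # low-rank factor

def matvec(v):
    # Sigma @ v computed in O(n*r), never forming the n x n matrix.
    return d * v + U @ (U.T @ v)

# Check against the explicit dense covariance on this small example:
v = rng.standard_normal(n)
full = np.diag(d) + U @ U.T
assert np.allclose(matvec(v), full @ v)
```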
arXiv Detail & Related papers (2023-05-31T03:48:49Z) - Adaptive aggregation of Monte Carlo augmented decomposed filters for efficient group-equivariant convolutional neural network [0.36122488107441414]
Group-equivariant convolutional neural networks (G-CNNs) rely heavily on parameter sharing to increase CNNs' data efficiency and performance.
We propose a non-parameter-sharing approach for group-equivariant neural networks.
The proposed method adaptively aggregates a diverse range of filters via a weighted sum of Monte Carlo augmented decomposed filters.
arXiv Detail & Related papers (2023-05-17T10:18:02Z) - Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z) - Encoding Involutory Invariance in Neural Networks [1.6371837018687636]
In certain situations, Neural Networks (NN) are trained on data that obey underlying physical symmetries.
In this work, we explore a special kind of symmetry where functions are invariant with respect to involutory linear/affine transformations up to parity.
Numerical experiments indicate that the proposed models outperform baseline networks while respecting the imposed symmetry.
We also propose an adaptation of our technique to convolutional NN classification tasks for datasets with inherent horizontal/vertical reflection symmetry.
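One simple way to obtain exact invariance under an involution A (a map with A A = I) is to symmetrize the model's output over the two-element group {I, A}. This is our own hedged sketch in the spirit of the paper, not its actual architecture, which handles the more general "up to parity" case.

```python
import numpy as np

def invariant_wrap(f, A):
    """Wrap a model f so its output is exactly invariant under
    x -> A @ x, assuming A is an involution (A @ A = identity)."""
    return lambda x: 0.5 * (f(x) + f(A @ x))

A = np.diag([1.0, -1.0])            # a reflection: A @ A = I
f = lambda x: x[0] + x[1]**3 + x[1] # a toy model, not invariant itself
g = invariant_wrap(f, A)

x = np.array([0.7, -1.3])
assert np.allclose(g(x), g(A @ x))  # invariance holds exactly
```

Invariance follows because g(A x) averages f(A x) and f(A A x) = f(x), the same two terms as g(x).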
arXiv Detail & Related papers (2021-06-07T16:07:15Z) - Spectral Tensor Train Parameterization of Deep Learning Layers [136.4761580842396]
We study low-rank parameterizations of weight matrices with embedded spectral properties in the Deep Learning context.
We show the effects of neural network compression in the classification setting and both compression and improved stability training in the generative adversarial training setting.
arXiv Detail & Related papers (2021-03-07T00:15:44Z) - Sampling asymmetric open quantum systems for artificial neural networks [77.34726150561087]
We present a hybrid sampling strategy which takes asymmetric properties explicitly into account, achieving fast convergence times and high scalability for asymmetric open systems.
We highlight the universal applicability of artificial neural networks in this setting.
arXiv Detail & Related papers (2020-12-20T18:25:29Z) - ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computation.
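The decomposition named in the title, each filter expressed as coefficients over a small shared dictionary of "atoms", can be sketched as follows (the sizes F, K, k and variable names are made up for illustration; this is not the ACDC implementation):

```python
import numpy as np

# F filters, each a linear combination of K shared atom filters of
# size k x k, so parameters go from F*k*k to K*k*k + F*K when K << F.
F, K, k = 64, 6, 3
rng = np.random.default_rng(2)
atoms = rng.standard_normal((K, k * k))   # shared dictionary (atoms)
coeff = rng.standard_normal((F, K))       # per-filter coefficients
filters = (coeff @ atoms).reshape(F, k, k)

assert filters.shape == (64, 3, 3)
shared_params = K * k * k + F * K         # 438 instead of F*k*k = 576
```

Regularizing or shrinking `coeff` then prunes whole filters in a structured way, since every filter lives in the span of the same K atoms.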
arXiv Detail & Related papers (2020-09-04T20:41:47Z) - Dense Steerable Filter CNNs for Exploiting Rotational Symmetry in Histology Images [3.053417311299492]
Histology images are inherently symmetric under rotation, with each orientation equally likely to appear.
Dense Steerable Filter CNNs (DSF-CNNs) use group convolutions with multiple rotated copies of each filter in a densely connected framework.
We show that DSF-CNNs achieve state-of-the-art performance, with significantly fewer parameters, when applied to three different tasks in computational pathology.
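A minimal sketch of the parameter sharing behind rotated filter copies (here for the four 90-degree rotations, C4): one base filter generates the whole oriented bank. This is a generic group-convolution illustration, not the DSF-CNN implementation, which uses densely connected steerable filters for finer rotations.

```python
import numpy as np

def oriented_copies(w):
    """Derive four rotated copies of one base filter (C4 group),
    sharing the same parameters across all orientations."""
    return np.stack([np.rot90(w, k) for k in range(4)])

w = np.arange(9.0).reshape(3, 3)
bank = oriented_copies(w)
assert bank.shape == (4, 3, 3)

# Rotating the base filter only cyclically shifts the bank, which is
# what makes responses equivariant to 90-degree input rotations:
bank_rot = oriented_copies(np.rot90(w))
assert np.allclose(bank_rot, np.roll(bank, -1, axis=0))
```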
arXiv Detail & Related papers (2020-04-06T23:12:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and accepts no responsibility for any consequences of its use.