Deformable Butterfly: A Highly Structured and Sparse Linear Transform
- URL: http://arxiv.org/abs/2203.13556v1
- Date: Fri, 25 Mar 2022 10:20:50 GMT
- Title: Deformable Butterfly: A Highly Structured and Sparse Linear Transform
- Authors: Rui Lin, Jie Ran, King Hung Chiu, Graziano Chesi, and Ngai Wong
- Abstract summary: We introduce a new kind of linear transform named Deformable Butterfly (DeBut) that generalizes the conventional butterfly matrices.
It inherits the fine-to-coarse-grained learnable hierarchy of traditional butterflies, and when deployed in neural networks, the prominent structures and sparsity in a DeBut layer constitute a new way for network compression.
- Score: 5.695853802236908
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a new kind of linear transform named Deformable Butterfly
(DeBut) that generalizes the conventional butterfly matrices and can be adapted
to various input-output dimensions. It inherits the fine-to-coarse-grained
learnable hierarchy of traditional butterflies, and when deployed in neural
networks, the prominent structures and sparsity in a DeBut layer constitute a
new way for network compression. We apply DeBut as a drop-in replacement for
standard fully connected and convolutional layers, and demonstrate its
superiority in homogenizing a neural network and endowing it with favorable
properties such as light weight and low inference complexity, without
compromising accuracy. The natural complexity-accuracy tradeoff arising from
the myriad deformations of a DeBut layer also opens up new avenues for
analytical and practical research. The code and Appendix are publicly available at:
https://github.com/ruilin0212/DeBut.
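For orientation, below is a minimal PyTorch sketch of a plain square, radix-2 butterfly layer, the structure that DeBut generalizes to arbitrary input-output dimensions via rectangular block factors. It is not the authors' released implementation (see the GitHub link above); the class name `ButterflyLinear`, the initialization, and all sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn


class ButterflyLinear(nn.Module):
    """Minimal sketch of a square, radix-2 butterfly-factorized linear layer.

    A dense n x n weight (n a power of two) is replaced by log2(n) sparse
    factors; each factor mixes pairs of coordinates at one stride, giving
    O(n log n) parameters instead of O(n^2). DeBut generalizes these factors
    to rectangular blocks so input and output sizes need not match (not shown).
    """

    def __init__(self, n: int):
        super().__init__()
        assert n > 1 and n & (n - 1) == 0, "this sketch assumes n is a power of two"
        self.n = n
        self.num_factors = n.bit_length() - 1  # log2(n)
        # One set of n/2 learnable 2x2 mixing blocks per butterfly factor.
        self.blocks = nn.ParameterList(
            [nn.Parameter(torch.randn(n // 2, 2, 2) * 0.5) for _ in range(self.num_factors)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n)
        batch = x.shape[0]
        for level, blk in enumerate(self.blocks):
            stride = 1 << level
            groups = self.n // (2 * stride)
            # Pair coordinate i with i + stride inside each group of size 2*stride.
            y = x.reshape(batch, groups, 2, stride)    # (batch, group, half, within)
            w = blk.reshape(groups, stride, 2, 2)      # (group, within, out_half, in_half)
            y = torch.einsum("bghj,gjoh->bgoj", y, w)  # apply one 2x2 block per pair
            x = y.reshape(batch, self.n)
        return x


# Usage: a stand-in for nn.Linear(64, 64) with far fewer parameters.
layer = ButterflyLinear(64)
out = layer(torch.randn(8, 64))
print(out.shape)  # torch.Size([8, 64])
```

Replacing a dense 64x64 layer this way keeps only 6 x 32 x 4 = 768 learnable entries instead of 4096, illustrating the O(n log n) parameter count that makes butterfly-style layers attractive for network compression.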
Related papers
- Lite it fly: An All-Deformable-Butterfly Network [7.8460795568982435]
Most deep neural networks (DNNs) consist fundamentally of convolutional and/or fully connected layers.
The recently proposed deformable butterfly (DeBut) decomposes the filter matrix into generalized, butterfly-like factors.
This work reveals an intimate link between DeBut and a systematic hierarchy of depthwise and pointwise convolutions.
arXiv Detail & Related papers (2023-11-14T12:41:22Z) - Equivariant Architectures for Learning in Deep Weight Spaces [54.61765488960555]
We present a novel network architecture for learning in deep weight spaces.
It takes as input a concatenation of the weights and biases of a pre-trained MLP.
We show how these layers can be implemented using three basic operations.
arXiv Detail & Related papers (2023-01-30T10:50:33Z) - ButterflyFlow: Building Invertible Layers with Butterfly Matrices [80.83142511616262]
We propose a new family of invertible linear layers based on butterfly layers.
Based on our invertible butterfly layers, we construct a new class of normalizing flow models called ButterflyFlow.
arXiv Detail & Related papers (2022-09-28T01:58:18Z) - Generalization by design: Shortcuts to Generalization in Deep Learning [7.751691910877239]
We show that good generalization may be instigated by bounded spectral products over layers leading to a novel geometric regularizer.
Backed up by theory we further demonstrate that "generalization by design" is practically possible and that good generalization may be encoded into the structure of the network.
arXiv Detail & Related papers (2021-07-05T20:01:23Z) - Rethinking Skip Connection with Layer Normalization in Transformers and
ResNets [49.87919454950763]
Skip connection is a widely-used technique to improve the performance of deep neural networks.
In this work, we investigate how the scale factors affect the effectiveness of the skip connection.
arXiv Detail & Related papers (2021-05-15T11:44:49Z) - Epigenetic evolution of deep convolutional models [81.21462458089142]
We build upon a previously proposed neuroevolution framework to evolve deep convolutional models.
We propose a convolutional layer layout which allows kernels of different shapes and sizes to coexist within the same layer.
The proposed layout enables the size and shape of individual kernels within a convolutional layer to be evolved with a corresponding new mutation operator.
arXiv Detail & Related papers (2021-04-12T12:45:16Z) - Lattice gauge equivariant convolutional neural networks [0.0]
We propose Lattice gauge equivariant Convolutional Neural Networks (L-CNNs) for generic machine learning applications.
We show that L-CNNs can learn and generalize gauge invariant quantities that traditional convolutional neural networks are incapable of finding.
arXiv Detail & Related papers (2020-12-23T19:00:01Z) - Shape Adaptor: A Learnable Resizing Module [59.940372879848624]
We present a novel resizing module for neural networks: shape adaptor, a drop-in enhancement built on top of traditional resizing layers.
Our implementation enables shape adaptors to be trained end-to-end without any additional supervision.
We show the effectiveness of shape adaptors on two other applications: network compression and transfer learning.
arXiv Detail & Related papers (2020-08-03T14:15:52Z) - Sparse Linear Networks with a Fixed Butterfly Structure: Theory and
Practice [4.3400407844814985]
We propose to replace a dense linear layer in any neural network by an architecture based on the butterfly network.
In a collection of experiments, including supervised prediction on both NLP and vision data, we show that this produces results that match and at times outperform those of existing well-known architectures.
arXiv Detail & Related papers (2020-07-17T09:45:03Z) - Neural Subdivision [58.97214948753937]
This paper introduces Neural Subdivision, a novel framework for data-driven coarse-to-fine geometry modeling.
We optimize for the same set of network weights across all local mesh patches, thus providing an architecture that is not constrained to a specific input mesh, fixed genus, or category.
We demonstrate that even when trained on a single high-resolution mesh our method generates reasonable subdivisions for novel shapes.
arXiv Detail & Related papers (2020-05-04T20:03:21Z)