Two-level Group Convolution
- URL: http://arxiv.org/abs/2110.05060v1
- Date: Mon, 11 Oct 2021 07:54:49 GMT
- Title: Two-level Group Convolution
- Authors: Youngkyu Lee, Jongho Park and Chang-Ock Lee
- Abstract summary: Group convolution has been widely used in order to reduce the computation time of convolution.
We propose a new convolution methodology called "two-level" group convolution that is robust with respect to the increase of the number of groups.
- Score: 2.2344764434954256
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Group convolution has been widely used in order to reduce the computation
time of convolution, which takes most of the training time of convolutional
neural networks. However, it is well known that a large number of groups
significantly reduce the performance of group convolution. In this paper, we
propose a new convolution methodology called "two-level" group convolution
that is robust with respect to the increase of the number of groups and
suitable for multi-GPU parallel computation. We first observe that the group
convolution can be interpreted as a one-level block Jacobi approximation of the
standard convolution, which is a popular notion in the field of numerical
analysis. In numerical analysis, there have been numerous studies on the
two-level method that introduces an intergroup structure that resolves the
performance degradation issue without disturbing parallel computation.
Motivated by these, we introduce a coarse-level structure which promotes
intergroup communication without being a bottleneck in the group convolution.
We show that all the additional work induced by the coarse-level structure can
be efficiently processed in a distributed memory system. Numerical results that
verify the robustness of the proposed method with respect to the number of
groups are presented. Moreover, we compare the proposed method to various
approaches for group convolution in order to highlight the superiority of the
proposed method in terms of execution time, memory efficiency, and performance.
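To make the construction concrete, below is a minimal PyTorch sketch (not the authors' implementation) contrasting a plain one-level group convolution with a hypothetical two-level variant. The fine level is an ordinary grouped convolution, which acts like a block Jacobi approximation in that each group only sees its own channels; the coarse level averages each input group down to one channel, mixes the resulting coarse channels with a cheap 1x1 convolution, and adds the correction back to every group so that information can flow between groups. The class name `TwoLevelGroupConv2d`, the channel-averaging restriction, and the 1x1 `coarse_mix` layer are illustrative assumptions, not the paper's exact construction.

```python
import torch
import torch.nn as nn


class TwoLevelGroupConv2d(nn.Module):
    """Illustrative two-level group convolution (sketch, not the paper's code).

    Fine level: an ordinary grouped convolution, i.e. a block-Jacobi-like
    approximation in which each group only sees its own input channels.
    Coarse level: each input group is averaged down to a single channel,
    the resulting `groups` coarse channels are mixed by a cheap 1x1
    convolution, and the correction is added back to every output channel
    of the corresponding group, restoring intergroup communication.
    """

    def __init__(self, in_ch, out_ch, kernel_size, groups, padding=1):
        super().__init__()
        assert in_ch % groups == 0 and out_ch % groups == 0
        self.groups = groups
        # Fine level: standard group convolution (no coupling between groups).
        self.fine = nn.Conv2d(in_ch, out_ch, kernel_size,
                              padding=padding, groups=groups)
        # Coarse level: one channel per group, mixed by a 1x1 convolution.
        self.coarse_mix = nn.Conv2d(groups, groups, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        out = self.fine(x)                                    # per-group output
        # Restrict: average the channels inside each input group -> (n, G, h, w).
        coarse = x.reshape(n, self.groups, c // self.groups, h, w).mean(dim=2)
        coarse = self.coarse_mix(coarse)                      # intergroup mixing
        # Prolong: broadcast each group's coarse correction to its output channels.
        per_group = out.shape[1] // self.groups
        return out + coarse.repeat_interleave(per_group, dim=1)


if __name__ == "__main__":
    x = torch.randn(2, 64, 32, 32)
    one_level = nn.Conv2d(64, 64, 3, padding=1, groups=8)    # plain group conv
    two_level = TwoLevelGroupConv2d(64, 64, 3, groups=8)
    print(one_level(x).shape, two_level(x).shape)             # both (2, 64, 32, 32)
```

For scale, a standard 3x3 convolution with 64 input and 64 output channels has 64*64*9 = 36,864 weights, the grouped version with 8 groups has 64*8*9 = 4,608, and the hypothetical coarse 1x1 mixing layer adds only 8*8 = 64 more, which is consistent with the abstract's claim that the coarse-level structure need not become a bottleneck.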
Related papers
- GroupedMixer: An Entropy Model with Group-wise Token-Mixers for Learned Image Compression [64.47244912937204]
We propose a novel transformer-based entropy model called GroupedMixer.
GroupedMixer enjoys both faster coding speed and better compression performance than previous transformer-based methods.
Experimental results demonstrate that the proposed GroupedMixer yields the state-of-the-art rate-distortion performance with fast compression speed.
arXiv Detail & Related papers (2024-05-02T10:48:22Z) - Balanced Group Convolution: An Improved Group Convolution Based on Approximability Estimates [1.927926533063962]
Group convolution effectively reduces the computational cost by grouping channels.
We mathematically analyze the approximation of the group convolution to the standard convolution.
We propose a novel variant of the group convolution called balanced group convolution, which achieves a closer approximation to the standard convolution at a small additional computational cost.
arXiv Detail & Related papers (2023-10-19T04:39:38Z) - An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute the underlying covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z) - Achieving Sample and Computational Efficient Reinforcement Learning by Action Space Reduction via Grouping [7.691755449724638]
Reinforcement learning often needs to deal with the exponential growth of states and actions in high-dimensional spaces.
We learn the inherent structure of action-wise similar MDPs to appropriately balance performance degradation against sample/computational complexity.
arXiv Detail & Related papers (2023-06-22T15:40:10Z) - Lattice-Based Methods Surpass Sum-of-Squares in Clustering [98.46302040220395]
Clustering is a fundamental primitive in unsupervised learning.
Recent work has established lower bounds against the class of low-degree methods.
We show that, perhaps surprisingly, this particular clustering model does not exhibit a statistical-to-computational gap.
arXiv Detail & Related papers (2021-12-07T18:50:17Z) - Shift of Pairwise Similarities for Data Clustering [7.462336024223667]
We consider the case where the regularization term is the sum of the squared sizes of the clusters, and then generalize it to adaptive regularization of the pairwise similarities.
This leads to shifting (adaptively) the pairwise similarities, which might make some of them negative; a small numeric sketch of this shift is included after this list.
We then propose an efficient local search optimization algorithm with a fast theoretical convergence rate to solve the new clustering problem.
arXiv Detail & Related papers (2021-10-25T16:55:07Z) - Exploiting Redundancy: Separable Group Convolutional Networks on Lie Groups [14.029933823101084]
Group convolutional neural networks (G-CNNs) have been shown to increase parameter efficiency and model accuracy.
In this work, we investigate the properties of representations learned by regular G-CNNs, and show considerable parameter redundancy in group convolution kernels.
We introduce convolution kernels that are separable over the subgroup and channel dimensions.
arXiv Detail & Related papers (2021-10-25T15:56:53Z) - Exclusive Group Lasso for Structured Variable Selection [10.86544864007391]
A structured variable selection problem is considered.
A composite norm can be properly designed to promote such exclusive group sparsity patterns (see the sketch of one such norm after this list).
An active set algorithm is proposed that builds the solution by including structure atoms into the estimated support.
arXiv Detail & Related papers (2021-08-23T16:55:13Z) - Partition-based formulations for mixed-integer optimization of trained ReLU neural networks [66.88252321870085]
This paper introduces a class of mixed-integer formulations for trained ReLU neural networks.
At one extreme, one partition per input recovers the convex hull of a node, i.e., the tightest possible formulation for each node.
arXiv Detail & Related papers (2021-02-08T17:27:34Z) - Clustering Ensemble Meets Low-rank Tensor Approximation [50.21581880045667]
This paper explores the problem of clustering ensemble, which aims to combine multiple base clusterings into a result that performs better than any individual one.
We propose a novel low-rank tensor approximation-based method to solve the problem from a global perspective.
Experimental results over 7 benchmark data sets show that the proposed model achieves a breakthrough in clustering performance, compared with 12 state-of-the-art methods.
arXiv Detail & Related papers (2020-12-16T13:01:37Z) - Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, flows and metric learning.
arXiv Detail & Related papers (2020-03-30T15:37:50Z)
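For "Shift of Pairwise Similarities for Data Clustering" above, the following NumPy sketch illustrates the mechanism the summary describes: adding a penalty proportional to the sum of squared cluster sizes to a within-cluster similarity objective is algebraically the same as subtracting a constant from every pairwise similarity, which can make some similarities negative. The toy objective and the uniform shift are illustrative assumptions consistent with the summary, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy symmetric similarity matrix for 6 points and a candidate clustering.
S = rng.uniform(0.0, 1.0, size=(6, 6))
S = (S + S.T) / 2.0
labels = np.array([0, 0, 0, 1, 1, 1])   # two clusters of size 3
lam = 0.6                                # regularization weight


def with_size_penalty(S, labels, lam):
    """Within-cluster similarity minus lam * (sum of squared cluster sizes)."""
    total = 0.0
    for k in np.unique(labels):
        idx = np.where(labels == k)[0]
        total += S[np.ix_(idx, idx)].sum() - lam * len(idx) ** 2
    return total


def with_shifted_similarities(S, labels, lam):
    """Equivalent form: subtract lam from every pairwise similarity."""
    S_shift = S - lam                    # some entries may become negative
    total = 0.0
    for k in np.unique(labels):
        idx = np.where(labels == k)[0]
        total += S_shift[np.ix_(idx, idx)].sum()
    return total


print(np.isclose(with_size_penalty(S, labels, lam),
                 with_shifted_similarities(S, labels, lam)))   # True
print(int((S - lam < 0).sum()), "entries of the shifted matrix are negative")
```

Running the script prints True, confirming that the penalized and shifted forms of the objective coincide, and reports how many entries of the shifted matrix became negative.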
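For "Exclusive Group Lasso for Structured Variable Selection" above, one composite norm commonly used to promote exclusive (within-group) sparsity is the sum over groups of the squared L1 norm of each group's coefficients. The sketch below evaluates that penalty on two toy coefficient vectors; it is an assumption about the kind of norm the summary refers to, not the paper's exact estimator.

```python
import numpy as np


def exclusive_group_penalty(w, groups):
    """Sum over groups of the squared L1 norm of each group's coefficients.

    Penalties of this form favor solutions with only a few active variables
    per group (exclusive sparsity), in contrast to the group lasso, which
    switches whole groups on or off.
    """
    return sum(np.abs(w[idx]).sum() ** 2 for idx in groups)


# Two coefficient vectors with the same total L1 norm (3.0).
groups = [np.array([0, 1, 2]), np.array([3, 4, 5])]
w_concentrated = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])  # all mass in one group
w_exclusive = np.array([1.5, 0.0, 0.0, 1.5, 0.0, 0.0])     # one variable per group

print(exclusive_group_penalty(w_concentrated, groups))  # 9.0
print(exclusive_group_penalty(w_exclusive, groups))     # 4.5
```

With equal total L1 mass, the penalty is smaller when the nonzeros are spread so that each group contains only a few active variables, which is the exclusive-sparsity pattern.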
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.