Two-level Group Convolution
- URL: http://arxiv.org/abs/2110.05060v1
- Date: Mon, 11 Oct 2021 07:54:49 GMT
- Title: Two-level Group Convolution
- Authors: Youngkyu Lee, Jongho Park and Chang-Ock Lee
- Abstract summary: Group convolution has been widely used in order to reduce the computation time of convolution.
We propose a new convolution methodology called "two-level" group convolution that is robust with respect to the increase of the number of groups.
- Score: 2.2344764434954256
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Group convolution has been widely used in order to reduce the computation
time of convolution, which takes most of the training time of convolutional
neural networks. However, it is well known that a large number of groups
significantly reduce the performance of group convolution. In this paper, we
propose a new convolution methodology called "two-level" group convolution
that is robust with respect to the increase of the number of groups and
suitable for multi-GPU parallel computation. We first observe that the group
convolution can be interpreted as a one-level block Jacobi approximation of the
standard convolution, which is a popular notion in the field of numerical
analysis. In numerical analysis, there have been numerous studies on the
two-level method that introduces an intergroup structure that resolves the
performance degradation issue without disturbing parallel computation.
Motivated by these, we introduce a coarse-level structure which promotes
intergroup communication without being a bottleneck in the group convolution.
We show that all the additional work induced by the coarse-level structure can
be efficiently processed in a distributed memory system. Numerical results that
verify the robustness of the proposed method with respect to the number of
groups are presented. Moreover, we compare the proposed method to various
approaches for group convolution in order to highlight the superiority of the
proposed method in terms of execution time, memory efficiency, and performance.
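To make the construction concrete, below is a minimal PyTorch sketch (not the authors' implementation) contrasting a plain one-level group convolution with a hypothetical two-level variant. The fine level is an ordinary grouped convolution, which acts like a block Jacobi approximation in that each group only sees its own channels; the coarse level averages each input group down to one channel, mixes the resulting coarse channels with a cheap 1x1 convolution, and adds the correction back to every group so that information can flow between groups. The class name `TwoLevelGroupConv2d`, the channel-averaging restriction, and the 1x1 `coarse_mix` layer are illustrative assumptions, not the paper's exact construction.

```python
import torch
import torch.nn as nn


class TwoLevelGroupConv2d(nn.Module):
    """Illustrative two-level group convolution (sketch, not the paper's code).

    Fine level: an ordinary grouped convolution, i.e. a block-Jacobi-like
    approximation in which each group only sees its own input channels.
    Coarse level: each input group is averaged down to a single channel,
    the resulting `groups` coarse channels are mixed by a cheap 1x1
    convolution, and the correction is added back to every output channel
    of the corresponding group, restoring intergroup communication.
    """

    def __init__(self, in_ch, out_ch, kernel_size, groups, padding=1):
        super().__init__()
        assert in_ch % groups == 0 and out_ch % groups == 0
        self.groups = groups
        # Fine level: standard group convolution (no coupling between groups).
        self.fine = nn.Conv2d(in_ch, out_ch, kernel_size,
                              padding=padding, groups=groups)
        # Coarse level: one channel per group, mixed by a 1x1 convolution.
        self.coarse_mix = nn.Conv2d(groups, groups, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        out = self.fine(x)                                    # per-group output
        # Restrict: average the channels inside each input group -> (n, G, h, w).
        coarse = x.reshape(n, self.groups, c // self.groups, h, w).mean(dim=2)
        coarse = self.coarse_mix(coarse)                      # intergroup mixing
        # Prolong: broadcast each group's coarse correction to its output channels.
        per_group = out.shape[1] // self.groups
        return out + coarse.repeat_interleave(per_group, dim=1)


if __name__ == "__main__":
    x = torch.randn(2, 64, 32, 32)
    one_level = nn.Conv2d(64, 64, 3, padding=1, groups=8)    # plain group conv
    two_level = TwoLevelGroupConv2d(64, 64, 3, groups=8)
    print(one_level(x).shape, two_level(x).shape)             # both (2, 64, 32, 32)
```

For scale, a standard 3x3 convolution with 64 input and 64 output channels has 64*64*9 = 36,864 weights, the grouped version with 8 groups has 64*8*9 = 4,608, and the hypothetical coarse 1x1 mixing layer adds only 8*8 = 64 more, which is consistent with the abstract's claim that the coarse-level structure need not become a bottleneck.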
Related papers
- GroupedMixer: An Entropy Model with Group-wise Token-Mixers for Learned Image Compression [64.47244912937204]
We propose a novel transformer-based entropy model called GroupedMixer.
GroupedMixer enjoys both faster coding speed and better compression performance than previous transformer-based methods.
Experimental results demonstrate that the proposed GroupedMixer yields the state-of-the-art rate-distortion performance with fast compression speed.
arXiv Detail & Related papers (2024-05-02T10:48:22Z) - Balanced Group Convolution: An Improved Group Convolution Based on Approximability Estimates [1.927926533063962]
Group convolution effectively reduces the computational cost by grouping channels.
We mathematically analyze the approximation of the group convolution to the standard convolution.
We propose a novel variant of the group convolution called balanced group convolution, which achieves a closer approximation to the standard convolution at a small additional computational cost.
arXiv Detail & Related papers (2023-10-19T04:39:38Z) - An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute the underlying covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z) - Achieving Sample and Computational Efficient Reinforcement Learning by Action Space Reduction via Grouping [7.691755449724638]
Reinforcement learning often needs to deal with the exponential growth of states and actions in high-dimensional spaces.
We learn the inherent structure of action-wise similar MDPs to appropriately balance performance degradation against sample/computational complexity.
arXiv Detail & Related papers (2023-06-22T15:40:10Z) - Lattice-Based Methods Surpass Sum-of-Squares in Clustering [98.46302040220395]
Clustering is a fundamental primitive in unsupervised learning.
Recent work has established lower bounds against the class of low-degree methods.
We show that, perhaps surprisingly, this particular clustering model does not exhibit a statistical-to-computational gap.
arXiv Detail & Related papers (2021-12-07T18:50:17Z) - Shift of Pairwise Similarities for Data Clustering [7.462336024223667]
We consider the case where the regularization term is the sum of the squared sizes of the clusters, and then generalize it to adaptive regularization of the pairwise similarities.
This leads to shifting (adaptively) the pairwise similarities, which might make some of them negative; a small numeric sketch of this shift is included after this list.
We then propose an efficient local search optimization algorithm with a fast theoretical convergence rate to solve the new clustering problem.
arXiv Detail & Related papers (2021-10-25T16:55:07Z) - Exploiting Redundancy: Separable Group Convolutional Networks on Lie Groups [14.029933823101084]
Group convolutional neural networks (G-CNNs) have been shown to increase parameter efficiency and model accuracy.
In this work, we investigate the properties of representations learned by regular G-CNNs, and show considerable parameter redundancy in group convolution kernels.
We introduce convolution kernels that are separable over the subgroup and channel dimensions.
arXiv Detail & Related papers (2021-10-25T15:56:53Z) - Exclusive Group Lasso for Structured Variable Selection [10.86544864007391]
A structured variable selection problem is considered.
A composite norm can be properly designed to promote such exclusive group sparsity patterns (see the sketch of one such norm after this list).
An active set algorithm is proposed that builds the solution by including structure atoms into the estimated support.
arXiv Detail & Related papers (2021-08-23T16:55:13Z) - Partition-based formulations for mixed-integer optimization of trained ReLU neural networks [66.88252321870085]
This paper introduces a class of mixed-integer formulations for trained ReLU neural networks.
At one extreme, one partition per input recovers the convex hull of a node, i.e., the tightest possible formulation for each node.
arXiv Detail & Related papers (2021-02-08T17:27:34Z) - Clustering Ensemble Meets Low-rank Tensor Approximation [50.21581880045667]
This paper explores the problem of clustering ensemble, which aims to combine multiple base clusterings into a result that performs better than any individual one.
We propose a novel low-rank tensor approximation-based method to solve the problem from a global perspective.
Experimental results over 7 benchmark data sets show that the proposed model achieves a breakthrough in clustering performance, compared with 12 state-of-the-art methods.
arXiv Detail & Related papers (2020-12-16T13:01:37Z) - Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, flows and metric learning.
arXiv Detail & Related papers (2020-03-30T15:37:50Z)
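For "Shift of Pairwise Similarities for Data Clustering" above, the following NumPy sketch illustrates the mechanism the summary describes: adding a penalty proportional to the sum of squared cluster sizes to a within-cluster similarity objective is algebraically the same as subtracting a constant from every pairwise similarity, which can make some similarities negative. The toy objective and the uniform shift are illustrative assumptions consistent with the summary, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy symmetric similarity matrix for 6 points and a candidate clustering.
S = rng.uniform(0.0, 1.0, size=(6, 6))
S = (S + S.T) / 2.0
labels = np.array([0, 0, 0, 1, 1, 1])   # two clusters of size 3
lam = 0.6                                # regularization weight


def with_size_penalty(S, labels, lam):
    """Within-cluster similarity minus lam * (sum of squared cluster sizes)."""
    total = 0.0
    for k in np.unique(labels):
        idx = np.where(labels == k)[0]
        total += S[np.ix_(idx, idx)].sum() - lam * len(idx) ** 2
    return total


def with_shifted_similarities(S, labels, lam):
    """Equivalent form: subtract lam from every pairwise similarity."""
    S_shift = S - lam                    # some entries may become negative
    total = 0.0
    for k in np.unique(labels):
        idx = np.where(labels == k)[0]
        total += S_shift[np.ix_(idx, idx)].sum()
    return total


print(np.isclose(with_size_penalty(S, labels, lam),
                 with_shifted_similarities(S, labels, lam)))   # True
print(int((S - lam < 0).sum()), "entries of the shifted matrix are negative")
```

Running the script prints True, confirming that the penalized and shifted forms of the objective coincide, and reports how many entries of the shifted matrix became negative.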
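For "Exclusive Group Lasso for Structured Variable Selection" above, one composite norm commonly used to promote exclusive (within-group) sparsity is the sum over groups of the squared L1 norm of each group's coefficients. The sketch below evaluates that penalty on two toy coefficient vectors; it is an assumption about the kind of norm the summary refers to, not the paper's exact estimator.

```python
import numpy as np


def exclusive_group_penalty(w, groups):
    """Sum over groups of the squared L1 norm of each group's coefficients.

    Penalties of this form favor solutions with only a few active variables
    per group (exclusive sparsity), in contrast to the group lasso, which
    switches whole groups on or off.
    """
    return sum(np.abs(w[idx]).sum() ** 2 for idx in groups)


# Two coefficient vectors with the same total L1 norm (3.0).
groups = [np.array([0, 1, 2]), np.array([3, 4, 5])]
w_concentrated = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])  # all mass in one group
w_exclusive = np.array([1.5, 0.0, 0.0, 1.5, 0.0, 0.0])     # one variable per group

print(exclusive_group_penalty(w_concentrated, groups))  # 9.0
print(exclusive_group_penalty(w_exclusive, groups))     # 4.5
```

With equal total L1 mass, the penalty is smaller when the nonzeros are spread so that each group contains only a few active variables, which is the exclusive-sparsity pattern.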
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.