Channel Equilibrium Networks for Learning Deep Representation
- URL: http://arxiv.org/abs/2003.00214v1
- Date: Sat, 29 Feb 2020 09:02:31 GMT
- Title: Channel Equilibrium Networks for Learning Deep Representation
- Authors: Wenqi Shao, Shitao Tang, Xingang Pan, Ping Tan, Xiaogang Wang, Ping Luo
- Abstract summary: This work shows that the combination of normalization and rectified linear function leads to inhibited channels.
Unlike prior works that simply remove the inhibited channels, we propose to "wake them up" during training by designing a novel neural building block.
The Channel Equilibrium (CE) block enables channels at the same layer to contribute equally to the learned representation.
- Score: 63.76618960820138
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional Neural Networks (CNNs) are typically constructed by stacking
multiple building blocks, each of which contains a normalization layer such as
batch normalization (BN) and a rectified linear function such as ReLU. However,
this work shows that the combination of normalization and rectified linear
function leads to inhibited channels, which have small magnitude and contribute
little to the learned feature representation, impeding the generalization
ability of CNNs. Unlike prior works that simply remove the inhibited channels,
we propose to "wake them up" during training by designing a novel neural
building block, termed Channel Equilibrium (CE) block, which enables channels
at the same layer to contribute equally to the learned representation. We show
that CE is able to prevent inhibited channels both empirically and
theoretically. CE has several appealing benefits. (1) It can be integrated into
many advanced CNN architectures such as ResNet and MobileNet, outperforming
their original networks. (2) CE has an interesting connection with the Nash
Equilibrium, a well-known solution concept of non-cooperative games. (3) Extensive
experiments show that CE achieves state-of-the-art performance on various
challenging benchmarks such as ImageNet and COCO.
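The abstract's diagnosis is easy to probe empirically: after a normalization + ReLU pair, an inhibited channel shows near-zero magnitude across inputs. Below is a minimal PyTorch sketch of such a probe; the stage, threshold, and helper name are my illustrative choices rather than artifacts of the paper, and it measures the symptom CE targets instead of implementing the CE block itself.

```python
import torch
import torch.nn as nn

# Hypothetical conv -> BN -> ReLU stage, the combination the paper analyzes.
stage = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, padding=1),
    nn.BatchNorm2d(64),
    nn.ReLU(inplace=True),
)

@torch.no_grad()
def inhibited_channel_ratio(block: nn.Module, x: torch.Tensor,
                            eps: float = 1e-3) -> float:
    """Fraction of channels whose mean post-ReLU magnitude is near zero.

    `eps` is an arbitrary illustrative threshold, not a value from the paper.
    """
    block.eval()
    y = block(x)                                # (N, C, H, W)
    per_channel = y.abs().mean(dim=(0, 2, 3))   # mean magnitude per channel
    return (per_channel < eps).float().mean().item()

x = torch.randn(32, 3, 56, 56)
print(f"inhibited channels: {100 * inhibited_channel_ratio(stage, x):.1f}%")
```

In a trained network, a nonzero ratio flags channels that, per the paper, contribute little to the learned representation; CE is designed to reactivate such channels during training rather than prune them.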
Related papers
- TBSN: Transformer-Based Blind-Spot Network for Self-Supervised Image Denoising [94.09442506816724]
Blind-spot networks (BSN) have been prevalent network architectures in self-supervised image denoising (SSID).
We present a transformer-based blind-spot network (TBSN) by analyzing and redesigning the transformer operators that meet the blind-spot requirement.
For spatial self-attention, an elaborate mask is applied to the attention matrix to restrict its receptive field, thus mimicking dilated convolution (a minimal masked-attention sketch follows this entry).
For channel self-attention, we observe that it may leak blind-spot information when the channel number is greater than the spatial size in the deep layers of multi-scale architectures.
arXiv Detail & Related papers (2024-04-11T15:39:10Z)
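The spatial-attention masking described above can be made concrete: apply the mask to the attention logits before the softmax so each position attends only to an allowed set of keys. The sketch below is my own construction with a toy mask, not TBSN's exact operator.

```python
import torch
import torch.nn.functional as F

def masked_self_attention(x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Single-head self-attention over flattened spatial positions.

    x:    (N, L, D) tokens (L = H*W spatial positions)
    mask: (L, L) boolean; True where attention is allowed.
    """
    d = x.shape[-1]
    scores = x @ x.transpose(-2, -1) / d ** 0.5        # (N, L, L) logits
    scores = scores.masked_fill(~mask, float("-inf"))  # forbid masked keys
    return F.softmax(scores, dim=-1) @ x

# Toy mask: each position attends only to positions at even offsets,
# loosely mimicking a dilated receptive field (my choice, not TBSN's mask).
L = 16
idx = torch.arange(L)
mask = ((idx[None, :] - idx[:, None]) % 2 == 0)
out = masked_self_attention(torch.randn(2, L, 32), mask)
print(out.shape)  # torch.Size([2, 16, 32])
```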
- Interference Cancellation GAN Framework for Dynamic Channels [74.22393885274728]
We introduce an online training framework that can adapt to any changes in the channel.
Our framework significantly outperforms recent neural network models on highly dynamic channels.
arXiv Detail & Related papers (2022-08-17T02:01:18Z)
- End-to-end learnable EEG channel selection with deep neural networks [72.21556656008156]
We propose a framework to embed the EEG channel selection in the neural network itself.
We deal with the discrete nature of this new optimization problem by employing continuous relaxations of the discrete channel selection parameters (a minimal sketch of such a relaxation follows this entry).
This generic approach is evaluated on two different EEG tasks.
arXiv Detail & Related papers (2021-02-11T13:44:07Z)
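A common continuous relaxation of discrete selection is the Gumbel-softmax; the sketch below uses it to make a k-of-C EEG channel choice differentiable. The layer name, the k-head parametrization, and the use of Gumbel-softmax specifically are my assumptions for illustration, not details taken from this summary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelSelector(nn.Module):
    """Selects k of C input channels via a Gumbel-softmax relaxation.

    Illustrative only; the paper's exact parametrization may differ.
    """
    def __init__(self, in_channels: int, k: int, tau: float = 1.0):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(k, in_channels))  # k selection heads
        self.tau = tau

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, T) EEG with C channels and T time samples.
        # Each row of `weights` is a soft one-hot over the C channels;
        # annealing `tau` toward 0 sharpens it into a hard selection.
        weights = F.gumbel_softmax(self.logits, tau=self.tau, hard=False, dim=-1)
        return torch.einsum("kc,nct->nkt", weights, x)  # (N, k, T)

sel = ChannelSelector(in_channels=64, k=8)
print(sel(torch.randn(4, 64, 256)).shape)  # torch.Size([4, 8, 256])
```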
"Inter-layer Collision" (IC) structure can be integrated into existing CNNs to improve their performance.
A new training method, weak logit distillation (WLD), is proposed to speed up the training of IC networks.
In the ImageNet experiment, we integrate the IC structure into ResNet-50 and reduce the top-1 error from 22.38% to 21.75%.
arXiv Detail & Related papers (2021-02-06T03:15:43Z)
- Self-Organized Operational Neural Networks for Severe Image Restoration Problems [25.838282412957675]
Discriminative learning based on convolutional neural networks (CNNs) aims to perform image restoration by learning from training examples of noisy-clean image pairs.
We claim that this is due to the inherent linear nature of convolution-based transformation, which is inadequate for handling severe restoration problems.
We propose a self-organizing variant of ONNs, Self-ONNs, for image restoration, which synthesizes novel nodal transformations on-the-fly (a sketch of one way to realize this follows this entry).
arXiv Detail & Related papers (2020-08-29T02:19:41Z)
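In the Self-ONN literature, one published way to synthesize nodal transformations on-the-fly is a truncated MacLaurin series: a sum of convolutions over element-wise powers of the input. The sketch below follows that formulation under my own naming and hyperparameter choices; it is an illustration, not this paper's exact layer.

```python
import torch
import torch.nn as nn

class SelfONNConv2d(nn.Module):
    """Generative-neuron style layer: y = sum_q conv_q(x ** q).

    A truncated MacLaurin-series approximation of a learnable nodal
    transformation (illustrative sketch; hyperparameters are assumptions).
    """
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3, q: int = 3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, kernel_size,
                      padding=kernel_size // 2, bias=(i == 0))
            for i in range(q)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # tanh keeps the powers x, x^2, ..., x^q bounded before the series.
        x = torch.tanh(x)
        return sum(conv(x ** (i + 1)) for i, conv in enumerate(self.convs))

layer = SelfONNConv2d(3, 16)
print(layer(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 16, 32, 32])
```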
- A Deep Learning Framework for Hybrid Beamforming Without Instantaneous CSI Feedback [4.771833920251869]
We propose a deep learning (DL) framework to deal with both hybrid beamforming and channel estimation.
The proposed framework exhibits at least 10 times lower computational complexity compared to conventional optimization-based approaches.
arXiv Detail & Related papers (2020-06-19T05:47:25Z)
- The Curious Case of Convex Neural Networks [12.56278477726461]
We show that convexity constraints can be enforced on both fully connected and convolutional layers (a minimal sketch of one such constraint follows this entry).
We draw three valuable insights: (a) Input Output Convex Neural Networks (IOC-NNs) self-regularize and reduce the problem of overfitting; (b) although heavily constrained, they outperform base multi-layer perceptrons and achieve performance similar to base convolutional architectures.
arXiv Detail & Related papers (2020-06-09T08:16:38Z)
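A standard recipe for input-output convexity (used by input-convex networks generally; I am assuming IOC-NNs follow the same principle) is to keep all weights after the first layer non-negative and use convex, non-decreasing activations such as ReLU:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvexMLP(nn.Module):
    """f(x) convex in x: non-negative weights past layer 1 + ReLU.

    A composition g(h(x)) is convex when h is convex and g is convex and
    non-decreasing; non-negative weights keep the last layer non-decreasing.
    """
    def __init__(self, d_in: int, d_hidden: int, d_out: int):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)            # first layer: unconstrained
        self.w2 = nn.Parameter(torch.randn(d_out, d_hidden) * 0.1)
        self.b2 = nn.Parameter(torch.zeros(d_out))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = F.relu(self.fc1(x))                         # each unit convex in x
        w2 = F.softplus(self.w2)                        # reparametrize: weights >= 0
        return h @ w2.t() + self.b2                     # non-negative combination

f = ConvexMLP(10, 64, 1)
print(f(torch.randn(5, 10)).shape)  # torch.Size([5, 1])
```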
- Decentralized Learning for Channel Allocation in IoT Networks over Unlicensed Bandwidth as a Contextual Multi-player Multi-armed Bandit Game [134.88020946767404]
We study a decentralized channel allocation problem in an ad-hoc Internet of Things network underlaying the spectrum licensed to a primary cellular network.
Our study maps this problem into a contextual multi-player, multi-armed bandit game, and proposes a purely decentralized, three-stage policy learning algorithm through trial-and-error.
arXiv Detail & Related papers (2020-03-30T10:05:35Z)
- Discrimination-aware Network Pruning for Deep Model Compression [79.44318503847136]
Existing pruning methods either train from scratch with sparsity constraints or minimize the reconstruction error between the feature maps of the pre-trained models and the compressed ones.
We propose a simple-yet-effective method called discrimination-aware channel pruning (DCP) to choose the channels that actually contribute to the discriminative power (a toy channel-scoring sketch follows this entry).
Experiments on both image classification and face recognition demonstrate the effectiveness of our methods.
arXiv Detail & Related papers (2020-01-04T07:07:41Z)
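The contrast drawn above, selecting channels by discriminative power rather than reconstruction error, can be illustrated with a toy scoring rule. The gradient-times-activation saliency below is a common heuristic and my own stand-in, not DCP's actual criterion:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def discriminative_channel_scores(feat: torch.Tensor, head: nn.Module,
                                  labels: torch.Tensor) -> torch.Tensor:
    """Score channels of `feat` (N, C, H, W) by |grad * activation|.

    Toy saliency w.r.t. a discriminative (classification) loss; channels
    with the lowest scores become pruning candidates.
    """
    feat = feat.detach().requires_grad_(True)
    loss = F.cross_entropy(head(feat.mean(dim=(2, 3))), labels)
    (grad,) = torch.autograd.grad(loss, feat)
    return (grad * feat).abs().sum(dim=(0, 2, 3))  # one score per channel

feat = torch.randn(8, 32, 7, 7)          # features from a pre-trained layer
head = nn.Linear(32, 10)                 # auxiliary discriminative head
scores = discriminative_channel_scores(feat, head, torch.randint(0, 10, (8,)))
keep = scores.topk(k=16).indices         # keep the 16 most discriminative channels
print(sorted(keep.tolist()))
```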