Learn to Enhance the Negative Information in Convolutional Neural
Network
- URL: http://arxiv.org/abs/2306.10536v1
- Date: Sun, 18 Jun 2023 12:02:36 GMT
- Title: Learn to Enhance the Negative Information in Convolutional Neural
Network
- Authors: Zhicheng Cai, Chenglei Peng, Qiu Shen
- Abstract summary: This paper proposes a learnable nonlinear activation mechanism specifically for convolutional neural networks (CNNs), termed LENI.
In sharp contrast to ReLU, which cuts off the negative neurons and suffers from the "dying ReLU" problem, LENI can reconstruct the dead neurons and reduce the information loss.
- Score: 6.910916428810853
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a learnable nonlinear activation mechanism
specifically for convolutional neural networks (CNNs), termed LENI, which
learns to enhance the negative information in CNNs. In sharp contrast to ReLU,
which cuts off the negative neurons and suffers from the "dying ReLU" problem,
LENI can reconstruct the dead neurons and reduce the information loss.
Compared with improved variants of ReLU, LENI introduces a learnable approach
to processing the negative-phase information more appropriately. In this way,
LENI significantly enhances the model's representational capacity while
retaining the original advantages of ReLU. As a generic activation mechanism,
LENI is portable and can be used in any CNN model by simply replacing the
activation layers with the LENI block. Extensive experiments validate that
LENI improves the performance of various baseline models on various benchmark
datasets by a clear margin (up to 1.24% higher top-1 accuracy on ImageNet-1k)
with negligible extra parameters. Further experiments show that LENI can act
as a channel compensation mechanism, offering competitive or even better
performance with fewer learned parameters than the baseline models. In
addition, LENI introduces asymmetry into the model structure, which further
contributes to the representational capacity. Through visualization
experiments, we validate that LENI retains more information and learns richer
representations.
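As a concrete illustration of the mechanism described above, the following is a minimal PyTorch-style sketch of what a LENI-like block could look like. The abstract does not spell out the block's internal design, so the particular choices here are assumptions: the ReLU path for the positive phase, the learnable 1x1 convolution plus batch normalization applied to the negative phase, and the additive merge, as well as the names LENIBlock and neg_transform, are hypothetical and only illustrate the idea of learning to re-use negative information instead of discarding it.

```python
# Minimal sketch of a LENI-like activation block (illustrative assumptions,
# not the paper's exact design): keep the positive phase as plain ReLU and
# pass the negative phase through a small learnable transform before merging.
import torch
import torch.nn as nn


class LENIBlock(nn.Module):
    """Hypothetical drop-in replacement for an activation layer."""

    def __init__(self, channels: int):
        super().__init__()
        # Learnable path for the negative phase (assumed: 1x1 conv + BN).
        self.neg_transform = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pos = torch.relu(x)    # positive phase, exactly what ReLU keeps
        neg = torch.relu(-x)   # magnitude of the negative phase ReLU discards
        # Instead of zeroing the negative neurons ("dying ReLU"),
        # learn how to re-inject their information into the feature map.
        return pos + self.neg_transform(neg)


if __name__ == "__main__":
    block = LENIBlock(channels=16)
    y = block(torch.randn(2, 16, 32, 32))
    print(y.shape)  # torch.Size([2, 16, 32, 32])
```

Under this reading, the channel-compensation result is also easy to picture: because such a block recovers information from the negative phase, the convolution feeding it could in principle use fewer output channels while keeping comparable representational power, though the paper should be consulted for the actual block design.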
Related papers
- Understanding the Benefits of SimCLR Pre-Training in Two-Layer Convolutional Neural Networks [10.55004012983524]
SimCLR is one of the most popular contrastive learning methods for vision tasks.
We consider training a two-layer convolutional neural network (CNN) to learn a toy image data model.
We show that, under certain conditions on the number of labeled data, SimCLR pre-training combined with supervised fine-tuning achieves almost optimal test loss.
arXiv Detail & Related papers (2024-09-27T12:19:41Z)
- Forget but Recall: Incremental Latent Rectification in Continual Learning [21.600690867361617]
Intrinsic capability to continuously learn from a changing data stream is a desideratum of deep neural networks (DNNs).
Existing Continual Learning approaches either retain exemplars for replay, regularize learning, or allocate dedicated capacity for new tasks.
This paper investigates an unexplored CL direction for incremental learning called Incremental Latent Rectification or ILR.
arXiv Detail & Related papers (2024-06-25T08:57:47Z)
- U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation [48.40120035775506]
Kolmogorov-Arnold Networks (KANs) reshape neural network learning by stacking non-linear learnable activation functions.
We investigate, modify and re-design the established U-Net pipeline by integrating dedicated KAN layers on the tokenized intermediate representation, termed U-KAN.
We further explore the potential of U-KAN as an alternative U-Net noise predictor in diffusion models, demonstrating its applicability to generating task-oriented model architectures.
arXiv Detail & Related papers (2024-06-05T04:13:03Z)
- A Method on Searching Better Activation Functions [15.180864683908878]
We propose the Entropy-based Activation Function Optimization (EAFO) methodology for designing static activation functions in deep neural networks.
We derive a novel activation function from ReLU, known as Correction Regularized ReLU (CRReLU).
arXiv Detail & Related papers (2024-05-19T03:48:05Z)
- Why do Learning Rates Transfer? Reconciling Optimization and Scaling Limits for Deep Learning [77.82908213345864]
We find empirical evidence that learning rate transfer can be attributed to the fact that under $\mu$P and its depth extension, the largest eigenvalue of the training loss Hessian is largely independent of the width and depth of the network.
We show that under the neural tangent kernel (NTK) regime, the sharpness exhibits very different dynamics at different scales, thus preventing learning rate transfer.
arXiv Detail & Related papers (2024-02-27T12:28:01Z)
- KLIF: An optimized spiking neuron unit for tuning surrogate gradient slope and membrane potential [0.0]
Spiking neural networks (SNNs) have attracted much attention due to their ability to process temporal information.
It is still challenging to develop efficient and high-performing learning algorithms for SNNs.
We propose a novel k-based leaky Integrate-and-Fire neuron model to improve the learning ability of SNNs.
arXiv Detail & Related papers (2023-02-18T05:18:18Z)
- Distribution-sensitive Information Retention for Accurate Binary Neural Network [49.971345958676196]
We present a novel Distribution-sensitive Information Retention Network (DIR-Net) to retain the information of the forward activations and backward gradients.
Our DIR-Net consistently outperforms the SOTA binarization approaches under mainstream and compact architectures.
We deploy DIR-Net on real-world resource-limited devices, achieving 11.1 times storage saving and 5.4 times speedup.
arXiv Detail & Related papers (2021-09-25T10:59:39Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Reborn Mechanism: Rethinking the Negative Phase Information Flow in Convolutional Neural Network [14.929863072047318]
This paper proposes a novel nonlinear activation mechanism specifically for convolutional neural networks (CNNs).
In sharp contrast to ReLU which cuts off the negative phase value, the reborn mechanism enjoys the capacity to reconstruct dead neurons.
arXiv Detail & Related papers (2021-06-13T15:33:49Z)
- RIFLE: Backpropagation in Depth for Deep Transfer Learning through Re-Initializing the Fully-connected LayEr [60.07531696857743]
Fine-tuning a deep convolutional neural network (CNN) from a pre-trained model helps transfer knowledge learned on larger datasets to the target task.
We propose RIFLE, a strategy that deepens backpropagation in transfer learning settings.
RIFLE brings meaningful updates to the weights of deep CNN layers and improves low-level feature learning.
arXiv Detail & Related papers (2020-07-07T11:27:43Z)
- Iterative Network for Image Super-Resolution [69.07361550998318]
Single image super-resolution (SISR) has been greatly revitalized by the recent development of convolutional neural networks (CNNs).
This paper provides new insight into the conventional SISR algorithm and proposes a substantially different approach relying on iterative optimization.
A novel iterative super-resolution network (ISRN) is proposed on top of this iterative optimization.
arXiv Detail & Related papers (2020-05-20T11:11:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.