Learn to Enhance the Negative Information in Convolutional Neural
Network
- URL: http://arxiv.org/abs/2306.10536v1
- Date: Sun, 18 Jun 2023 12:02:36 GMT
- Title: Learn to Enhance the Negative Information in Convolutional Neural
Network
- Authors: Zhicheng Cai, Chenglei Peng, Qiu Shen
- Abstract summary: This paper proposes a learnable nonlinear activation mechanism specifically for convolutional neural networks (CNNs), termed LENI.
In sharp contrast to ReLU, which cuts off the negative neurons and suffers from the "dying ReLU" problem, LENI can reconstruct the dead neurons and reduce the information loss.
- Score: 6.910916428810853
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a learnable nonlinear activation mechanism
specifically for convolutional neural networks (CNNs), termed LENI, which
learns to enhance the negative information in CNNs. In sharp contrast to ReLU,
which cuts off the negative neurons and suffers from the "dying ReLU" problem,
LENI can reconstruct the dead neurons and reduce the information loss.
Compared with improved variants of ReLU, LENI introduces a learnable approach
to processing the negative-phase information more appropriately. In this way,
LENI significantly enhances the model's representational capacity while
retaining the original advantages of ReLU. As a generic activation mechanism,
LENI is portable and can be used in any CNN model by simply replacing the
activation layers with the LENI block. Extensive experiments validate that
LENI improves the performance of various baseline models on various benchmark
datasets by a clear margin (up to 1.24% higher top-1 accuracy on ImageNet-1k)
with negligible extra parameters. Further experiments show that LENI can act
as a channel compensation mechanism, offering competitive or even better
performance with fewer learned parameters than the baseline models. In
addition, LENI introduces asymmetry into the model structure, which further
contributes to the representational capacity. Through visualization
experiments, we validate that LENI retains more information and learns richer
representations.
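As a concrete illustration of the mechanism described above, the following is a minimal PyTorch-style sketch of what a LENI-like block could look like. The abstract does not spell out the block's internal design, so the particular choices here are assumptions: the ReLU path for the positive phase, the learnable 1x1 convolution plus batch normalization applied to the negative phase, and the additive merge, as well as the names LENIBlock and neg_transform, are hypothetical and only illustrate the idea of learning to re-use negative information instead of discarding it.

```python
# Minimal sketch of a LENI-like activation block (illustrative assumptions,
# not the paper's exact design): keep the positive phase as plain ReLU and
# pass the negative phase through a small learnable transform before merging.
import torch
import torch.nn as nn


class LENIBlock(nn.Module):
    """Hypothetical drop-in replacement for an activation layer."""

    def __init__(self, channels: int):
        super().__init__()
        # Learnable path for the negative phase (assumed: 1x1 conv + BN).
        self.neg_transform = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pos = torch.relu(x)    # positive phase, exactly what ReLU keeps
        neg = torch.relu(-x)   # magnitude of the negative phase ReLU discards
        # Instead of zeroing the negative neurons ("dying ReLU"),
        # learn how to re-inject their information into the feature map.
        return pos + self.neg_transform(neg)


if __name__ == "__main__":
    block = LENIBlock(channels=16)
    y = block(torch.randn(2, 16, 32, 32))
    print(y.shape)  # torch.Size([2, 16, 32, 32])
```

Under this reading, the channel-compensation result is also easy to picture: because such a block recovers information from the negative phase, the convolution feeding it could in principle use fewer output channels while keeping comparable representational power, though the paper should be consulted for the actual block design.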
Related papers
- Understanding the Benefits of SimCLR Pre-Training in Two-Layer Convolutional Neural Networks [10.55004012983524]
SimCLR is one of the most popular contrastive learning methods for vision tasks.
We consider training a two-layer convolutional neural network (CNN) to learn a toy image data model.
We show that, under certain conditions on the number of labeled data, SimCLR pre-training combined with supervised fine-tuning achieves almost optimal test loss.
arXiv Detail & Related papers (2024-09-27T12:19:41Z)
- Forget but Recall: Incremental Latent Rectification in Continual Learning [21.600690867361617]
Intrinsic capability to continuously learn from a changing data stream is a desideratum of deep neural networks (DNNs).
Existing Continual Learning approaches either retain exemplars for replay, regularize learning, or allocate dedicated capacity for new tasks.
This paper investigates an unexplored CL direction for incremental learning called Incremental Latent Rectification or ILR.
arXiv Detail & Related papers (2024-06-25T08:57:47Z)
- U-KAN Makes Strong Backbone for Medical Image Segmentation and Generation [48.40120035775506]
Kolmogorov-Arnold Networks (KANs) reshape neural network learning by stacking non-linear learnable activation functions.
We investigate, modify and re-design the established U-Net pipeline by integrating dedicated KAN layers on the tokenized intermediate representation, termed U-KAN.
We further explore the potential of U-KAN as an alternative U-Net noise predictor in diffusion models, demonstrating its applicability to generating task-oriented model architectures.
arXiv Detail & Related papers (2024-06-05T04:13:03Z)
- A Method on Searching Better Activation Functions [15.180864683908878]
We propose the Entropy-based Activation Function Optimization (EAFO) methodology for designing static activation functions in deep neural networks.
We derive a novel activation function from ReLU, known as Correction Regularized ReLU (CRReLU).
arXiv Detail & Related papers (2024-05-19T03:48:05Z)
- Why do Learning Rates Transfer? Reconciling Optimization and Scaling Limits for Deep Learning [77.82908213345864]
We find empirical evidence that learning rate transfer can be attributed to the fact that under $\mu$P and its depth extension, the largest eigenvalue of the training loss Hessian is largely independent of the width and depth of the network.
We show that under the neural tangent kernel (NTK) regime, the sharpness exhibits very different dynamics at different scales, thus preventing learning rate transfer.
arXiv Detail & Related papers (2024-02-27T12:28:01Z)
- KLIF: An optimized spiking neuron unit for tuning surrogate gradient slope and membrane potential [0.0]
Spiking neural networks (SNNs) have attracted much attention due to their ability to process temporal information.
It is still challenging to develop efficient and high-performing learning algorithms for SNNs.
We propose a novel k-based leaky Integrate-and-Fire neuron model to improve the learning ability of SNNs.
arXiv Detail & Related papers (2023-02-18T05:18:18Z)
- Distribution-sensitive Information Retention for Accurate Binary Neural Network [49.971345958676196]
We present a novel Distribution-sensitive Information Retention Network (DIR-Net) to retain the information of the forward activations and backward gradients.
Our DIR-Net consistently outperforms the SOTA binarization approaches under mainstream and compact architectures.
We deploy DIR-Net on real-world resource-limited devices, achieving 11.1 times storage saving and 5.4 times speedup.
arXiv Detail & Related papers (2021-09-25T10:59:39Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Reborn Mechanism: Rethinking the Negative Phase Information Flow in Convolutional Neural Network [14.929863072047318]
This paper proposes a novel nonlinear activation mechanism specifically for convolutional neural networks (CNNs).
In sharp contrast to ReLU which cuts off the negative phase value, the reborn mechanism enjoys the capacity to reconstruct dead neurons.
arXiv Detail & Related papers (2021-06-13T15:33:49Z)
- RIFLE: Backpropagation in Depth for Deep Transfer Learning through Re-Initializing the Fully-connected LayEr [60.07531696857743]
Fine-tuning a deep convolutional neural network (CNN) from a pre-trained model helps transfer knowledge learned on larger datasets to the target task.
We propose RIFLE, a strategy that deepens backpropagation in transfer learning settings.
RIFLE brings meaningful updates to the weights of deep CNN layers and improves low-level feature learning.
arXiv Detail & Related papers (2020-07-07T11:27:43Z)
- Iterative Network for Image Super-Resolution [69.07361550998318]
Single image super-resolution (SISR) has been greatly revitalized by the recent development of convolutional neural networks (CNNs).
This paper provides new insight into the conventional SISR algorithm and proposes a substantially different approach relying on iterative optimization.
A novel iterative super-resolution network (ISRN) is proposed on top of this iterative optimization.
arXiv Detail & Related papers (2020-05-20T11:11:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.