SECNN: Squeeze-and-Excitation Convolutional Neural Network for Sentence
Classification
- URL: http://arxiv.org/abs/2312.06088v1
- Date: Mon, 11 Dec 2023 03:26:36 GMT
- Title: SECNN: Squeeze-and-Excitation Convolutional Neural Network for Sentence
Classification
- Authors: Shandong Yuan
- Abstract summary: Convolutional neural networks (CNNs) can extract n-gram features through convolutional filters.
We propose a Squeeze-and-Excitation Convolutional Neural Network (SECNN) for sentence classification.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sentence classification is one of the basic tasks of natural
language processing. A convolutional neural network (CNN) can extract n-gram
features through convolutional filters and capture local correlations between
consecutive words in parallel, so CNNs are a popular architecture for this
task. Restricted by the width of its convolutional filters, however, a CNN has
difficulty capturing long-term contextual dependencies. Attention is a
mechanism that considers global information and pays more attention to the
keywords in a sentence, so attention mechanisms are often combined with CNNs
to improve performance on sentence classification. In our work, we focus not
on which keywords in a sentence matter, but on which of the CNN's output
feature maps are more important. We propose a Squeeze-and-Excitation
Convolutional Neural Network (SECNN) for sentence classification. SECNN takes
the feature maps from multiple CNNs as different channels of the sentence
representation; we can then apply a channel attention mechanism, namely the SE
attention mechanism, so that the model learns attention weights for the
different channel features. The results show that our model achieves strong
performance on the sentence classification task.
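The abstract describes the core mechanism but not its implementation. Below is a minimal PyTorch sketch of the idea: feature maps from parallel 1-D convolutions are stacked as channels and reweighted by a squeeze-and-excitation (SE) block before pooling and classification. The class names, embedding size, kernel widths, reduction ratio, and pooling choice here are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: squeeze each channel by global average
    pooling, excite through a two-layer bottleneck, rescale channels."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc1 = nn.Linear(channels, channels // reduction)
        self.fc2 = nn.Linear(channels // reduction, channels)

    def forward(self, x):                      # x: (batch, channels, length)
        s = x.mean(dim=2)                      # squeeze -> (batch, channels)
        w = torch.sigmoid(self.fc2(F.relu(self.fc1(s))))  # channel weights
        return x * w.unsqueeze(2)              # reweight each feature map

class SECNNSketch(nn.Module):
    """Hypothetical SECNN-style classifier: parallel 1-D convolutions over
    word embeddings produce feature maps that are stacked as channels,
    reweighted by the SE block, max-pooled over time, and classified."""
    def __init__(self, vocab_size, embed_dim=128, num_filters=100,
                 kernel_sizes=(3, 5, 7), num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Odd kernel widths with padding k//2 keep every output the same
        # length, so the feature maps can be concatenated as channels.
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k, padding=k // 2)
            for k in kernel_sizes)
        total = num_filters * len(kernel_sizes)
        self.se = SEBlock(total)
        self.fc = nn.Linear(total, num_classes)

    def forward(self, tokens):                  # tokens: (batch, seq_len)
        x = self.embed(tokens).transpose(1, 2)  # (batch, embed_dim, seq_len)
        maps = torch.cat([F.relu(c(x)) for c in self.convs], dim=1)
        maps = self.se(maps)                    # SE channel attention
        pooled = maps.max(dim=2).values         # max-over-time pooling
        return self.fc(pooled)                  # class logits

# Forward pass on a random token batch to check shapes: (8, 2) logits.
logits = SECNNSketch(vocab_size=10000)(torch.randint(0, 10000, (8, 40)))
```

The learned sigmoid weights in `SEBlock` are what allow the model to emphasize some convolutional feature maps over others, which is the channel-attention behavior the abstract describes.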
Related papers
- CNN2GNN: How to Bridge CNN with GNN [59.42117676779735]
We propose a novel CNN2GNN framework to unify CNN and GNN together via distillation.
The distilled "boosted" two-layer GNN achieves much higher performance on Mini-ImageNet than CNNs containing dozens of layers, such as ResNet152.
arXiv Detail & Related papers (2024-04-23T08:19:08Z)
- PICNN: A Pathway towards Interpretable Convolutional Neural Networks [12.31424771480963]
We introduce a novel pathway to alleviate the entanglement between filters and image classes.
We use Bernoulli sampling to generate the filter-cluster assignment matrix from a learnable filter-class correspondence matrix.
We evaluate the effectiveness of our method on ten widely used network architectures.
arXiv Detail & Related papers (2023-12-19T11:36:03Z)
- A novel feature-scrambling approach reveals the capacity of convolutional neural networks to learn spatial relations [0.0]
Convolutional neural networks (CNNs) are one of the most successful computer vision systems to solve object recognition.
Yet it remains poorly understood how CNNs actually make their decisions, what the nature of their internal representations is, and how their recognition strategies differ from humans.
arXiv Detail & Related papers (2022-12-12T16:40:29Z)
- What Can Be Learnt With Wide Convolutional Neural Networks? [69.55323565255631]
We study infinitely-wide deep CNNs in the kernel regime.
We prove that deep CNNs adapt to the spatial scale of the target function.
We conclude by computing the generalisation error of a deep CNN trained on the output of another deep CNN.
arXiv Detail & Related papers (2022-08-01T17:19:32Z)
- The Mind's Eye: Visualizing Class-Agnostic Features of CNNs [92.39082696657874]
We propose an approach to visually interpret CNN features given a set of images by creating corresponding images that depict the most informative features of a specific layer.
Our method uses a dual-objective activation and distance loss, without requiring a generator network or modifications to the original model.
arXiv Detail & Related papers (2021-01-29T07:46:39Z)
- Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters [64.46270549587004]
Convolutional neural networks (CNNs) have been successfully used in a range of tasks.
CNNs are often viewed as "black boxes" that lack interpretability.
We propose a novel strategy to train interpretable CNNs by encouraging class-specific filters.
arXiv Detail & Related papers (2020-07-16T09:12:26Z)
- Multichannel CNN with Attention for Text Classification [5.1545224296246275]
This paper proposes Attention-based Multichannel Convolutional Neural Network (AMCNN) for text classification.
AMCNN uses a bi-directional long short-term memory to encode the history and future information of words into high-dimensional representations.
The experimental results on the benchmark datasets demonstrate that AMCNN achieves better performance than state-of-the-art methods.
arXiv Detail & Related papers (2020-06-29T16:37:51Z)
- Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose an elegant curriculum-based scheme that smooths the feature embeddings of a CNN using anti-aliasing or low-pass filters.
As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data.
arXiv Detail & Related papers (2020-03-03T07:27:44Z)
- Hybrid Tiled Convolutional Neural Networks for Text Sentiment Classification [3.0204693431381515]
We adjust the architecture of the tiled convolutional neural network (tiled CNN) to improve its extraction of salient features for sentiment analysis.
Knowing that the major drawback of the tiled CNN in the NLP field is its inflexible filter structure, we propose a novel architecture called hybrid tiled CNN.
Experiments on the datasets of IMDB movie reviews and SemEval 2017 demonstrate the efficiency of the hybrid tiled CNN.
arXiv Detail & Related papers (2020-01-31T14:08:15Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show that a ResNet-type CNN can attain minimax-optimal error rates in
important function classes. We derive approximation and estimation error rates
for the aforementioned type of CNNs on the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)