Multichannel CNN with Attention for Text Classification
- URL: http://arxiv.org/abs/2006.16174v1
- Date: Mon, 29 Jun 2020 16:37:51 GMT
- Title: Multichannel CNN with Attention for Text Classification
- Authors: Zhenyu Liu, Haiwei Huang, Chaohong Lu, Shengfei Lyu
- Abstract summary: This paper proposes Attention-based Multichannel Convolutional Neural Network (AMCNN) for text classification.
AMCNN uses a bi-directional long short-term memory to encode the history and future information of words into high-dimensional representations.
The experimental results on the benchmark datasets demonstrate that AMCNN achieves better performance than state-of-the-art methods.
- Score: 5.1545224296246275
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, approaches based on neural networks have shown remarkable
potential for sentence modeling. There are two main neural network structures:
the recurrent neural network (RNN) and the convolutional neural network (CNN). RNN can
capture long-term dependencies and store the semantics of previous
information in a fixed-sized vector. However, RNN is a biased model, and its
ability to extract global semantics is restricted by the fixed-sized vector.
Alternatively, CNN is able to capture n-gram features of texts by utilizing
convolutional filters, but the width of the convolutional filters restricts its
performance. In order to combine the strengths of the two kinds of networks and
alleviate their shortcomings, this paper proposes Attention-based Multichannel
Convolutional Neural Network (AMCNN) for text classification. AMCNN utilizes a
bi-directional long short-term memory to encode the history and future
information of words into high-dimensional representations, so that information
from both the front and the back of the sentence can be fully expressed.
Then the scalar attention and vectorial attention are applied to obtain
multichannel representations. The scalar attention can calculate the word-level
importance and the vectorial attention can calculate the feature-level
importance. In the classification task, AMCNN uses a CNN structure to capture
word relations on the representations generated by the scalar and vectorial
attention mechanisms instead of calculating weighted sums. It can
effectively extract the n-gram features of the text. The experimental results
on the benchmark datasets demonstrate that AMCNN achieves better performance
than state-of-the-art methods. In addition, the visualization results verify
the semantic richness of multichannel representations.
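To make the pipeline above concrete, here is a minimal PyTorch sketch of an AMCNN-style model: a BiLSTM encoder, a scalar-attention channel (word-level weights) and a vectorial-attention channel (feature-level weights) stacked as input channels, and a multi-filter CNN classifier on top. The layer sizes, the softmax/sigmoid choices for the two attentions, and the two-channel layout are illustrative assumptions; the abstract does not specify the paper's exact formulation, which may differ.

```python
# Minimal sketch of an AMCNN-style model as described in the abstract.
# All hyperparameters and the exact attention formulas are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AMCNNSketch(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=150,
                 num_filters=100, kernel_sizes=(3, 4, 5), num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # BiLSTM encodes the history (forward) and future (backward) context of each word.
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        enc_dim = 2 * hidden_dim
        # Scalar attention: one importance score per word (word-level).
        self.scalar_attn = nn.Linear(enc_dim, 1)
        # Vectorial attention: one importance score per feature (feature-level).
        self.vector_attn = nn.Linear(enc_dim, enc_dim)
        # CNN over the stacked attention channels instead of a weighted sum.
        self.convs = nn.ModuleList([
            nn.Conv2d(in_channels=2, out_channels=num_filters,
                      kernel_size=(k, enc_dim)) for k in kernel_sizes
        ])
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):                          # (batch, seq_len)
        h, _ = self.bilstm(self.embed(token_ids))          # (batch, seq_len, enc_dim)
        # Channel 1: each word vector scaled by its word-level weight.
        word_w = torch.softmax(self.scalar_attn(h), dim=1)    # (batch, seq_len, 1)
        chan_scalar = h * word_w
        # Channel 2: each feature dimension gated by its feature-level weight.
        feat_w = torch.sigmoid(self.vector_attn(h))           # (batch, seq_len, enc_dim)
        chan_vector = h * feat_w
        # Stack the two channels and extract n-gram features with the CNN.
        x = torch.stack([chan_scalar, chan_vector], dim=1)    # (batch, 2, seq_len, enc_dim)
        pooled = []
        for conv in self.convs:
            c = F.relu(conv(x)).squeeze(3)                    # (batch, num_filters, L')
            pooled.append(F.max_pool1d(c, c.size(2)).squeeze(2))
        return self.fc(torch.cat(pooled, dim=1))              # class logits


# Usage with hypothetical shapes: a batch of 8 sentences, 40 tokens each.
model = AMCNNSketch(vocab_size=20000)
logits = model(torch.randint(0, 20000, (8, 40)))              # (8, num_classes)
```

Stacking the two channels, rather than collapsing each into a weighted sum, is what lets the convolutional filters read word-level and feature-level importance patterns jointly as n-gram features.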
Related papers
- CNN2GNN: How to Bridge CNN with GNN [59.42117676779735]
We propose a novel CNN2GNN framework to unify CNN and GNN together via distillation.
The performance of the distilled "boosted" two-layer GNN on Mini-ImageNet is much higher than that of CNNs containing dozens of layers, such as ResNet152.
arXiv Detail & Related papers (2024-04-23T08:19:08Z) - SECNN: Squeeze-and-Excitation Convolutional Neural Network for Sentence
Classification [0.0]
Convolutional neural networks (CNNs) have the ability to extract n-gram features through convolutional filters.
We propose a Squeeze-and-Excitation Convolutional neural Network (SECNN) for sentence classification.
arXiv Detail & Related papers (2023-12-11T03:26:36Z) - Large-Margin Representation Learning for Texture Classification [67.94823375350433]
This paper presents a novel approach combining convolutional layers (CLs) and large-margin metric learning for training supervised models on small datasets for texture classification.
The experimental results on texture and histopathologic image datasets have shown that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence when compared to equivalent CNNs.
arXiv Detail & Related papers (2022-06-17T04:07:45Z) - Variable Bitrate Neural Fields [75.24672452527795]
We present a dictionary method for compressing feature grids, reducing their memory consumption by up to 100x.
We formulate the dictionary optimization as a vector-quantized auto-decoder problem which lets us learn end-to-end discrete neural representations in a space where no direct supervision is available.
arXiv Detail & Related papers (2022-06-15T17:58:34Z) - The Mind's Eye: Visualizing Class-Agnostic Features of CNNs [92.39082696657874]
We propose an approach to visually interpret CNN features given a set of images by creating corresponding images that depict the most informative features of a specific layer.
Our method uses a dual-objective activation and distance loss, without requiring a generator network or modifications to the original model.
arXiv Detail & Related papers (2021-01-29T07:46:39Z) - Block-term Tensor Neural Networks [29.442026567710435]
We show that block-term tensor layers (BT-layers) can be easily adapted to neural network models, such as CNNs and RNNs.
BT-layers in CNNs and RNNs can achieve a very large compression ratio on the number of parameters while preserving or improving the representation power of the original DNNs.
arXiv Detail & Related papers (2020-10-10T09:58:43Z) - Text Classification based on Multi-granularity Attention Hybrid Neural
Network [4.718408602093766]
We propose a hybrid architecture based on a novel hierarchical multi-granularity attention mechanism, named Multi-granularity Attention-based Hybrid Neural Network (MahNN).
The attention mechanism assigns different weights to different parts of the input sequence to increase the computational efficiency and performance of neural models.
arXiv Detail & Related papers (2020-08-12T13:02:48Z) - Combine Convolution with Recurrent Networks for Text Classification [12.92202472766078]
We propose a novel method to keep the strengths of the two networks to a great extent.
In the proposed model, a convolutional neural network is applied to learn a 2D weight matrix where each row reflects the importance of each word from different aspects.
We use a bi-directional RNN to process each word and employ a neural tensor layer that fuses forward and backward hidden states to get word representations.
arXiv Detail & Related papers (2020-06-29T03:36:04Z) - Visual Commonsense R-CNN [102.5061122013483]
We present a novel unsupervised feature representation learning method, Visual Commonsense Region-based Convolutional Neural Network (VC R-CNN).
VC R-CNN serves as an improved visual region encoder for high-level tasks such as captioning and VQA.
We extensively apply VC R-CNN features in prevailing models of three popular tasks: Image Captioning, VQA, and VCR, and observe consistent performance boosts across them.
arXiv Detail & Related papers (2020-02-27T15:51:19Z) - Approximation and Non-parametric Estimation of ResNet-type Convolutional
Neural Networks [52.972605601174955]
We show that a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.