Text Classification based on Multi-granularity Attention Hybrid Neural
Network
- URL: http://arxiv.org/abs/2008.05282v1
- Date: Wed, 12 Aug 2020 13:02:48 GMT
- Title: Text Classification based on Multi-granularity Attention Hybrid Neural
Network
- Authors: Zhenyu Liu, Chaohong Lu, Haiwei Huang, Shengfei Lyu, Zhenchao Tao
- Abstract summary: We propose a hybrid architecture based on a novel hierarchical multi-granularity attention mechanism, named the Multi-granularity Attention-based Hybrid Neural Network (MahNN).
The attention mechanism assigns different weights to different parts of the input sequence to improve the computational efficiency and performance of neural models.
- Score: 4.718408602093766
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural network-based approaches have become the driving forces for Natural
Language Processing (NLP) tasks. Conventionally, there are two mainstream
neural architectures for NLP tasks: the recurrent neural network (RNN) and the
convolutional neural network (ConvNet). RNNs are good at modeling long-term
dependencies over input texts but preclude parallel computation. ConvNets lack
memory capability and have to model sequential data as unordered features;
they therefore fail to learn sequential dependencies over the input texts, but
they are able to carry out highly efficient parallel computation. As each
neural architecture, such as the RNN and the ConvNet, has its own pros and
cons, integrating different architectures is assumed to enrich the semantic
representation of texts and thus enhance the performance of NLP tasks.
However, few investigations have explored the reconciliation of these
seemingly incompatible architectures. To address this issue, we propose a
hybrid architecture based on a novel hierarchical multi-granularity attention
mechanism, named the Multi-granularity Attention-based Hybrid Neural Network
(MahNN). The attention mechanism assigns different weights to different parts
of the input sequence to improve the computational efficiency and performance
of neural models. In MahNN, two types of attention are introduced: the
syntactical attention and the semantical attention. The syntactical attention
computes the importance of syntactic elements (such as words or sentences) at
the lower symbolic level, and the semantical attention computes the importance
of the embedded space dimensions corresponding to the upper latent semantics.
We adopt text classification as an exemplifying task to illustrate the ability
of MahNN to understand texts.
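As a rough illustration of the two attention granularities described above, the following is a minimal NumPy sketch, not the authors' implementation: the weight matrices, toy dimensions, and sigmoid gating are illustrative assumptions. Syntactical attention softmax-weights the token positions of an encoded sequence, while semantical attention re-weights the embedding dimensions of the pooled representation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
seq_len, hidden = 6, 8                  # toy sizes: 6 tokens, 8-dim hidden states

# H stands in for hidden states produced by a hybrid RNN/ConvNet encoder.
H = rng.normal(size=(seq_len, hidden))

# Syntactical attention: one importance weight per token (lower symbolic level).
w_syn = rng.normal(size=(hidden,))
alpha = softmax(H @ w_syn)              # (seq_len,) weights over tokens
sentence_vec = alpha @ H                # attention-weighted sum -> (hidden,)

# Semantical attention: one gate per embedding dimension (upper latent level).
W_sem = rng.normal(size=(hidden, hidden))
beta = sigmoid(W_sem @ sentence_vec)    # (hidden,) weights over dimensions
text_repr = beta * sentence_vec         # dimension-wise re-weighted representation

# A linear classifier over the re-weighted representation, as in text classification.
W_out = rng.normal(size=(3, hidden))    # 3 hypothetical classes
print(softmax(W_out @ text_repr))       # class probabilities
```

The point of the sketch is the split in granularity: `alpha` weights whole syntactic units (tokens), whereas `beta` weights individual dimensions of the latent representation.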
Related papers
- Unveiling the Power of Sparse Neural Networks for Feature Selection [60.50319755984697]
Sparse Neural Networks (SNNs) have emerged as powerful tools for efficient feature selection.
We show that feature selection with SNNs trained with dynamic sparse training (DST) algorithms can achieve, on average, more than 50% memory and 55% FLOPs reduction.
arXiv Detail & Related papers (2024-08-08T16:48:33Z) - Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with an attention mechanism, we can effectively boost performance without a large computational overhead.
We evaluate our approach on various image- and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
arXiv Detail & Related papers (2023-10-17T01:05:28Z) - Set-based Neural Network Encoding Without Weight Tying [91.37161634310819]
We propose a neural network weight encoding method for network property prediction.
Our approach is capable of encoding neural networks in a model zoo of mixed architectures.
We introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture.
arXiv Detail & Related papers (2023-05-26T04:34:28Z) - A Hybrid Neural Coding Approach for Pattern Recognition with Spiking
Neural Networks [53.31941519245432]
Brain-inspired spiking neural networks (SNNs) have demonstrated promising capabilities in solving pattern recognition tasks.
These SNNs are grounded on homogeneous neurons that utilize a uniform neural coding for information representation.
In this study, we argue that SNN architectures should be holistically designed to incorporate heterogeneous coding schemes.
arXiv Detail & Related papers (2023-05-26T02:52:12Z) - Split-Et-Impera: A Framework for the Design of Distributed Deep Learning
Applications [8.434224141580758]
Split-Et-Impera determines the set of the best-split points of a neural network based on deep network interpretability principles.
It performs a communication-aware simulation for the rapid evaluation of different neural network rearrangements.
It suggests the best match between the quality-of-service requirements of the application and the performance in terms of accuracy and latency.
arXiv Detail & Related papers (2023-03-22T13:00:00Z) - SA-CNN: Application to text categorization issues using simulated
annealing-based convolutional neural network optimization [0.0]
Convolutional neural networks (CNNs) are a representative class of deep learning algorithms.
We introduce SA-CNN, a convolutional neural network for text classification tasks built on Text-CNN and optimized with simulated annealing.
arXiv Detail & Related papers (2023-03-13T14:27:34Z) - Seeking Interpretability and Explainability in Binary Activated Neural Networks [2.828173677501078]
We study the use of binary activated neural networks as interpretable and explainable predictors in the context of regression tasks.
We present an approach based on the efficient computation of SHAP values for quantifying the relative importance of the features, hidden neurons and even weights.
arXiv Detail & Related papers (2022-09-07T20:11:17Z) - Modeling Structure with Undirected Neural Networks [20.506232306308977]
We propose undirected neural networks, a flexible framework for specifying computations that can be performed in any order.
We demonstrate the effectiveness of undirected neural architectures, both unstructured and structured, on a range of tasks.
arXiv Detail & Related papers (2022-02-08T10:06:51Z) - Connecting Weighted Automata, Tensor Networks and Recurrent Neural
Networks through Spectral Learning [58.14930566993063]
We present connections between three models used in different research fields: weighted finite automata (WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks.
We introduce the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors.
arXiv Detail & Related papers (2020-10-19T15:28:00Z) - Progressive Tandem Learning for Pattern Recognition with Deep Spiking
Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z) - Multichannel CNN with Attention for Text Classification [5.1545224296246275]
This paper proposes Attention-based Multichannel Convolutional Neural Network (AMCNN) for text classification.
AMCNN uses a bi-directional long short-term memory to encode the history and future information of words into high-dimensional representations.
The experimental results on the benchmark datasets demonstrate that AMCNN achieves better performance than state-of-the-art methods.
arXiv Detail & Related papers (2020-06-29T16:37:51Z)