A Multi-Size Neural Network with Attention Mechanism for Answer
Selection
- URL: http://arxiv.org/abs/2105.03278v1
- Date: Sat, 24 Apr 2021 02:13:26 GMT
- Title: A Multi-Size Neural Network with Attention Mechanism for Answer
Selection
- Authors: Jie Huang
- Abstract summary: An effective architecture, the multi-size neural network with attention mechanism (AM-MSNN), is introduced for the answer selection task.
It captures more levels of language granularity in parallel than single-layer and multi-layer CNNs, owing to its filters of various sizes.
It extends the sentence representations with an attention mechanism, so they carry more information for different types of questions.
- Score: 3.310455595316906
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Semantic matching is of central significance to the answer selection task,
which aims to select correct answers for a given question from a candidate
answer pool. A useful method is to employ neural networks with attention to
generate sentence representations in a way that information from paired
sentences can mutually influence the computation of the representations. In
this work, an effective architecture, the multi-size neural network with
attention mechanism (AM-MSNN), is introduced for the answer selection task.
This architecture captures more levels of language granularity in parallel
than single-layer and multi-layer CNNs, owing to its filters of various sizes.
Meanwhile, it extends the sentence representations with an attention
mechanism, so that they carry more information for different types of
questions. An empirical study on three benchmark answer selection tasks
demonstrates the efficacy of the proposed model on all the benchmarks and its
superiority over competitors. The experimental results show that (1) the
multi-size neural network (MSNN) captures abstract features at different
levels of granularity more effectively than single/multi-layer CNNs; (2) the
attention mechanism (AM) is a better strategy for deriving more informative
representations; and (3) AM-MSNN is currently a better architecture for the
answer selection task.
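As a rough illustration (not the authors' code), the core idea of the multi-size network with attention can be sketched in NumPy: filter banks of several widths run in parallel over a sentence's embedding matrix, their max-pooled outputs are concatenated into one sentence vector, and a toy attention step re-weights the answer representation against the question. The filter widths, dimensions, and the elementwise attention scoring are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d_maxpool(X, W):
    """Valid 1D convolution over the token axis, then max-over-time pooling.
    X: (seq_len, emb_dim); W: (k, emb_dim, n_filters)."""
    k, _, n_filters = W.shape
    seq_len = X.shape[0]
    feats = np.empty((seq_len - k + 1, n_filters))
    for i in range(seq_len - k + 1):
        # each window of k tokens yields n_filters feature activations
        feats[i] = np.tanh(np.tensordot(X[i:i + k], W, axes=([0, 1], [0, 1])))
    return feats.max(axis=0)  # max-over-time pooling

def msnn_encode(X, filter_banks):
    """Multi-size encoding: parallel filter banks of different widths,
    pooled outputs concatenated into a single sentence vector."""
    return np.concatenate([conv1d_maxpool(X, W) for W in filter_banks])

def attend(q_vec, a_vec):
    """Toy attention: softmax over elementwise question-answer interactions,
    used to re-weight the answer features (an illustrative simplification)."""
    scores = q_vec * a_vec
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return a_vec * weights

emb_dim, n_filters = 8, 4
sizes = [2, 3, 5]  # hypothetical filter widths capturing different granularities
banks = [rng.standard_normal((k, emb_dim, n_filters)) * 0.1 for k in sizes]

question = rng.standard_normal((10, emb_dim))  # 10 tokens, pre-embedded
answer = rng.standard_normal((14, emb_dim))    # 14 tokens, pre-embedded
q = msnn_encode(question, banks)
a = attend(q, msnn_encode(answer, banks))
score = q @ a / (np.linalg.norm(q) * np.linalg.norm(a) + 1e-9)
print(q.shape, a.shape)  # (12,) (12,)
```

Each of the three filter banks contributes 4 pooled features, so both sentences end up as 12-dimensional vectors that can be compared with cosine similarity to rank candidate answers.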
Related papers
- Unveiling the Power of Sparse Neural Networks for Feature Selection [60.50319755984697]
Sparse Neural Networks (SNNs) have emerged as powerful tools for efficient feature selection.
We show that SNNs trained with dynamic sparse training (DST) algorithms can achieve, on average, more than 50% memory and 55% FLOPs reduction.
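To make the dynamic-sparse-training idea concrete, here is a minimal NumPy sketch (an assumption-laden simplification, not the paper's algorithm) of one prune-and-regrow step in the style of SET/RigL: drop the smallest-magnitude active weights, then regrow the same number of connections at random inactive positions, keeping the sparsity level fixed.

```python
import numpy as np

rng = np.random.default_rng(2)

def prune_and_regrow(W, mask, drop_frac=0.3):
    """One dynamic-sparse-training update: drop the smallest-magnitude
    active weights, then regrow the same number at random inactive positions."""
    active = np.flatnonzero(mask.ravel())
    n_drop = int(drop_frac * active.size)
    # drop: the n_drop smallest |w| among active connections
    drop = active[np.argsort(np.abs(W.ravel()[active]))[:n_drop]]
    new_mask = mask.ravel().copy()
    new_mask[drop] = False
    # regrow: random positions among the currently inactive ones
    inactive = np.flatnonzero(~new_mask)
    grow = rng.choice(inactive, size=n_drop, replace=False)
    new_mask[grow] = True
    return new_mask.reshape(mask.shape)

W = rng.standard_normal((10, 10))
mask = rng.random((10, 10)) < 0.2  # ~80% sparse layer
new_mask = prune_and_regrow(W, mask)
print(new_mask.sum() == mask.sum())  # True: sparsity level is preserved
```

Because the drop and regrow counts match, the overall connection budget stays constant while the topology adapts during training, which is where the memory and FLOPs savings come from.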
arXiv Detail & Related papers (2024-08-08T16:48:33Z) - Effective Subset Selection Through The Lens of Neural Network Pruning [31.43307762723943]
It is important to select the data to be annotated wisely, which is known as the subset selection problem.
We investigate the relationship between subset selection and neural network pruning, which is more widely studied.
We propose utilizing the norm criterion of neural network features to improve subset selection methods.
arXiv Detail & Related papers (2024-06-03T08:12:32Z) - CNN2GNN: How to Bridge CNN with GNN [59.42117676779735]
We propose a novel CNN2GNN framework to unify CNN and GNN together via distillation.
The distilled "boosted" two-layer GNN achieves much higher performance on Mini-ImageNet than CNNs containing dozens of layers, such as ResNet152.
arXiv Detail & Related papers (2024-04-23T08:19:08Z) - SECNN: Squeeze-and-Excitation Convolutional Neural Network for Sentence
Classification [0.0]
A convolutional neural network (CNN) can extract n-gram features through its convolutional filters.
We propose a Squeeze-and-Excitation Convolutional neural Network (SECNN) for sentence classification.
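For context, the squeeze-and-excitation mechanism that SECNN applies to n-gram feature maps can be sketched in a few lines of NumPy (a generic illustration of the SE block, not the paper's implementation; the dimensions and reduction ratio are assumptions): global-average-pool each channel, pass the channel statistics through a small bottleneck with a sigmoid gate, and rescale the channels by the resulting weights.

```python
import numpy as np

rng = np.random.default_rng(1)

def squeeze_excite(feature_maps, W1, W2):
    """Squeeze-and-Excitation gate over convolutional channels.
    feature_maps: (n_channels, length), e.g. n-gram feature maps from a conv layer."""
    z = feature_maps.mean(axis=1)  # squeeze: one summary statistic per channel
    # excitation: FC -> ReLU -> FC -> sigmoid, producing per-channel gates in (0, 1)
    s = 1.0 / (1.0 + np.exp(-(W2 @ np.maximum(W1 @ z, 0.0))))
    return feature_maps * s[:, None]  # recalibrate: scale each channel by its gate

n_channels, length, reduction = 8, 20, 4
W1 = rng.standard_normal((n_channels // reduction, n_channels)) * 0.1
W2 = rng.standard_normal((n_channels, n_channels // reduction)) * 0.1
maps = rng.standard_normal((n_channels, length))
out = squeeze_excite(maps, W1, W2)
print(out.shape)  # (8, 20)
```

The bottleneck (reduction ratio 4 here) keeps the gating cheap, and because each gate lies strictly between 0 and 1, the block can only attenuate channels, emphasizing the most informative filters.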
arXiv Detail & Related papers (2023-12-11T03:26:36Z) - Deception Detection from Linguistic and Physiological Data Streams Using Bimodal Convolutional Neural Networks [19.639533220155965]
This paper explores the application of convolutional neural networks for the purpose of multimodal deception detection.
We use a dataset built by interviewing 104 subjects about two topics, with one truthful and one falsified response from each subject about each topic.
arXiv Detail & Related papers (2023-11-18T02:44:33Z) - Large-Margin Representation Learning for Texture Classification [67.94823375350433]
This paper presents a novel approach combining convolutional layers (CLs) and large-margin metric learning for training supervised models on small datasets for texture classification.
The experimental results on texture and histopathologic image datasets have shown that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence when compared to equivalent CNNs.
arXiv Detail & Related papers (2022-06-17T04:07:45Z) - Classification of diffraction patterns using a convolutional neural
network in single particle imaging experiments performed at X-ray
free-electron lasers [53.65540150901678]
Single particle imaging (SPI) at X-ray free electron lasers (XFELs) is particularly well suited to determine the 3D structure of particles in their native environment.
For a successful reconstruction, diffraction patterns originating from a single hit must be isolated from a large number of acquired patterns.
We propose to formulate this task as an image classification problem and solve it using convolutional neural network (CNN) architectures.
arXiv Detail & Related papers (2021-12-16T17:03:14Z) - The Mind's Eye: Visualizing Class-Agnostic Features of CNNs [92.39082696657874]
We propose an approach to visually interpret CNN features given a set of images by creating corresponding images that depict the most informative features of a specific layer.
Our method uses a dual-objective activation and distance loss, without requiring a generator network nor modifications to the original model.
arXiv Detail & Related papers (2021-01-29T07:46:39Z) - Multichannel CNN with Attention for Text Classification [5.1545224296246275]
This paper proposes Attention-based Multichannel Convolutional Neural Network (AMCNN) for text classification.
AMCNN uses a bi-directional long short-term memory network to encode each word's past and future context into high-dimensional representations.
The experimental results on the benchmark datasets demonstrate that AMCNN achieves better performance than state-of-the-art methods.
arXiv Detail & Related papers (2020-06-29T16:37:51Z) - Analyzing Neural Networks Based on Random Graphs [77.34726150561087]
We perform a massive evaluation of neural networks with architectures corresponding to random graphs of various types.
We find that no classical numerical graph invariant by itself allows us to single out the best networks.
We also find that networks with primarily short-range connections perform better than networks which allow for many long-range connections.
arXiv Detail & Related papers (2020-02-19T11:04:49Z) - Inferring Convolutional Neural Networks' accuracies from their
architectural characterizations [0.0]
We study the relationships between a CNN's architecture and its performance.
We show that the attributes can be predictive of the networks' performance in two specific computer vision-based physics problems.
We use machine learning models to predict whether a network can perform better than a certain threshold accuracy before training.
arXiv Detail & Related papers (2020-01-07T16:41:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.