Question Type Classification Methods Comparison
- URL: http://arxiv.org/abs/2001.00571v1
- Date: Fri, 3 Jan 2020 00:16:46 GMT
- Title: Question Type Classification Methods Comparison
- Authors: Tamirlan Seidakhmetov
- Abstract summary: The paper presents a comparative study of state-of-the-art approaches to the question classification task: Logistic Regression, Convolutional Neural Networks (CNN), Long Short-Term Memory networks (LSTM), and Quasi-Recurrent Neural Networks (QRNN).
All models use pre-trained GloVe word embeddings and are trained on human-labeled data.
The best accuracy is achieved by a CNN model with five convolutional layers of varying kernel sizes stacked in parallel, followed by one fully connected layer.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The paper presents a comparative study of state-of-the-art approaches to
the question classification task: Logistic Regression, Convolutional Neural
Networks (CNN), Long Short-Term Memory networks (LSTM), and Quasi-Recurrent
Neural Networks (QRNN). All models use pre-trained GloVe word embeddings and
are trained on human-labeled data. The best accuracy is achieved by a CNN model
with five convolutional layers of varying kernel sizes stacked in parallel,
followed by one fully connected layer. The model reached 90.7% accuracy on the
TREC 10 test set. All the model architectures in this paper were implemented
from scratch in PyTorch, in a few cases drawing on reliable open-source
implementations.
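As a rough illustration of the winning architecture, here is a minimal PyTorch sketch (not the author's code): five parallel 1-D convolution branches with different kernel sizes over frozen GloVe embeddings, each globally max-pooled, then concatenated into a single fully connected layer. The kernel sizes, filter count, and the six coarse TREC classes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ParallelCNNClassifier(nn.Module):
    def __init__(self, embeddings, num_classes=6,
                 kernel_sizes=(1, 2, 3, 4, 5), num_filters=100):
        super().__init__()
        # embeddings: pre-trained GloVe matrix, shape (vocab_size, emb_dim)
        self.embed = nn.Embedding.from_pretrained(embeddings, freeze=True)
        emb_dim = embeddings.size(1)
        # Five 1-D convolutions with different kernel sizes, in parallel.
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, num_filters, k) for k in kernel_sizes)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):                  # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)  # (batch, emb_dim, seq_len)
        # Each branch: conv -> ReLU -> global max-pool over time.
        pooled = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.fc(torch.cat(pooled, dim=1))   # (batch, num_classes)

# Smoke test with random stand-ins for GloVe vectors and token ids.
glove = torch.randn(5000, 300)
model = ParallelCNNClassifier(glove)
logits = model(torch.randint(0, 5000, (8, 20)))    # 8 questions, 20 tokens
```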
Related papers
- [Re] Network Deconvolution [3.2149341556907256]
"Network deconvolution" is used to remove pixel-wise and channel-wise correlations before data is fed into each layer.
We successfully reproduce the results reported in Tables 1 and 2 of the original paper.
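The core operation, as described, can be sketched as whitening the im2col patches of the input so their covariance is approximately the identity before the convolution is applied. This is an assumption-level illustration of the idea, not the reproduction's code; `deconv_conv2d` is a hypothetical helper.

```python
import torch
import torch.nn.functional as F

def deconv_conv2d(x, weight, eps=1e-3):
    # x: (B, C, H, W); weight: (out_ch, C, k, k), k odd, stride 1.
    out_ch, c, k, _ = weight.shape
    patches = F.unfold(x, k, padding=k // 2)        # (B, C*k*k, H*W)
    flat = patches.transpose(1, 2).reshape(-1, c * k * k)
    flat = flat - flat.mean(dim=0)                  # center the patches
    cov = flat.t() @ flat / flat.size(0) + eps * torch.eye(c * k * k)
    evals, evecs = torch.linalg.eigh(cov)           # symmetric eigendecomposition
    inv_sqrt = evecs @ torch.diag(evals.clamp_min(eps).rsqrt()) @ evecs.t()
    white = flat @ inv_sqrt                         # decorrelated patches
    white = white.reshape(x.size(0), -1, c * k * k).transpose(1, 2)
    out = weight.reshape(out_ch, -1) @ white        # convolution as a matmul
    return out.reshape(x.size(0), out_ch, x.size(2), x.size(3))

y = deconv_conv2d(torch.randn(2, 3, 8, 8), torch.randn(16, 3, 3, 3))
```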
arXiv Detail & Related papers (2024-10-02T02:48:13Z)
- A model for multi-attack classification to improve intrusion detection performance using deep learning approaches [0.0]
The objective here is to create a reliable intrusion detection mechanism to help identify malicious attacks.
A deep-learning-based solution framework consisting of three approaches is developed.
The first approach is a Long Short-Term Memory recurrent neural network (LSTM-RNN) trained with seven optimizers: Adamax, SGD, Adagrad, Adam, RMSprop, Nadam, and Adadelta.
The models learn the features themselves and classify attacks in a multi-attack classification setting.
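A minimal sketch of that first approach follows, comparing the seven listed optimizers on one LSTM classifier. The 41 input features (echoing NSL-KDD-style records), five attack classes, and all hyperparameters are assumptions; `NAdam` requires a reasonably recent PyTorch.

```python
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, num_features=41, hidden=64, num_classes=5):
        super().__init__()
        self.lstm = nn.LSTM(num_features, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, x):              # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.fc(out[:, -1])     # classify from the last time step

optimizers = {
    "adamax": torch.optim.Adamax, "sgd": torch.optim.SGD,
    "adagrad": torch.optim.Adagrad, "adam": torch.optim.Adam,
    "rmsprop": torch.optim.RMSprop, "nadam": torch.optim.NAdam,
    "adadelta": torch.optim.Adadelta,
}
x, y = torch.randn(32, 10, 41), torch.randint(0, 5, (32,))
for name, opt_cls in optimizers.items():
    model = LSTMClassifier()
    opt = opt_cls(model.parameters(), lr=1e-3)
    loss = nn.functional.cross_entropy(model(x), y)  # one training step
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"{name}: first-step loss {loss.item():.3f}")
```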
arXiv Detail & Related papers (2023-10-25T05:38:44Z)
- Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregular observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
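One common way to realize this idea, sketched below under our own assumptions (not necessarily the paper's exact CTRNN), is to decay the hidden state exponentially over the irregular gap between observations and output a mean and log-variance for a probabilistic forecast.

```python
import torch
import torch.nn as nn

class DecayGRU(nn.Module):
    def __init__(self, num_features=1, hidden=32):
        super().__init__()
        self.cell = nn.GRUCell(num_features, hidden)
        self.log_rate = nn.Parameter(torch.zeros(hidden))  # learned decay rates
        self.head = nn.Linear(hidden, 2)       # forecast mean and log-variance

    def forward(self, values, times):          # values: (B, T, F); times: (B, T)
        h = values.new_zeros(values.size(0), self.log_rate.numel())
        prev_t = times[:, 0]
        for t in range(values.size(1)):
            dt = (times[:, t] - prev_t).clamp_min(0).unsqueeze(1)
            h = h * torch.exp(-dt * torch.exp(self.log_rate))  # decay over gap
            h = self.cell(values[:, t], h)     # update at the observation
            prev_t = times[:, t]
        mean, log_var = self.head(h).chunk(2, dim=1)
        return mean, log_var                   # parameters of a Gaussian forecast

model = DecayGRU()
mean, log_var = model(torch.randn(4, 12, 1), torch.rand(4, 12).cumsum(dim=1))
```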
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
- CoV-TI-Net: Transferred Initialization with Modified End Layer for COVID-19 Diagnosis [5.546855806629448]
Transfer learning is a relatively new learning method that has been employed in many sectors to achieve good performance with less computation.
In this research, the PyTorch pre-trained models (VGG19_bn and WideResNet-101) are applied to the MNIST dataset.
The proposed model is developed and verified in a Kaggle notebook, and it reached 99.77% accuracy without requiring substantial computation time.
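The modified-end-layer recipe can be sketched as follows, assuming a recent torchvision. Only the replaced final layer is left trainable, and the 1-channel 28x28 MNIST inputs are expanded to the 3-channel 224x224 format the pretrained backbone expects.

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.vgg19_bn(weights=models.VGG19_BN_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False                     # freeze transferred weights
model.classifier[6] = nn.Linear(4096, 10)       # new trainable end layer

# MNIST is 1-channel 28x28; repeat channels and resize for the backbone.
x = torch.randn(4, 1, 28, 28)
x = nn.functional.interpolate(x.repeat(1, 3, 1, 1), size=224)
logits = model(x)                               # (4, 10)
```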
arXiv Detail & Related papers (2022-09-20T08:52:52Z)
- Do We Really Need a Learnable Classifier at the End of Deep Neural Network? [118.18554882199676]
We study the potential of learning a neural network for classification with the classifier randomly initialized as an equiangular tight frame (ETF) and kept fixed during training.
Our experimental results show that this method achieves similar performance on image classification for balanced datasets.
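A minimal sketch of such a fixed classifier follows, using the standard simplex-ETF construction; the backbone and dimensions are placeholders.

```python
import torch
import torch.nn as nn

def simplex_etf(feat_dim, num_classes):
    # Requires feat_dim >= num_classes; U has orthonormal columns.
    u, _ = torch.linalg.qr(torch.randn(feat_dim, num_classes))
    center = torch.eye(num_classes) - torch.full((num_classes, num_classes),
                                                 1.0 / num_classes)
    scale = (num_classes / (num_classes - 1)) ** 0.5
    return scale * u @ center                   # (feat_dim, num_classes)

class ETFClassifier(nn.Module):
    def __init__(self, backbone, feat_dim, num_classes):
        super().__init__()
        self.backbone = backbone                # any feature extractor
        self.register_buffer("etf", simplex_etf(feat_dim, num_classes))

    def forward(self, x):
        return self.backbone(x) @ self.etf      # classifier is never trained

clf = ETFClassifier(nn.Sequential(nn.Flatten(), nn.Linear(784, 128)), 128, 10)
logits = clf(torch.randn(4, 1, 28, 28))
```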
arXiv Detail & Related papers (2022-03-17T04:34:28Z)
- Towards Disentangling Information Paths with Coded ResNeXt [11.884259630414515]
We take a novel approach to enhancing the transparency of the whole network's function.
We propose a neural network architecture for classification, in which the information that is relevant to each class flows through specific paths.
arXiv Detail & Related papers (2022-02-10T21:45:49Z)
- Neural Capacitance: A New Perspective of Neural Network Selection via Edge Dynamics [85.31710759801705]
Current practice requires computationally expensive model training for performance prediction.
We propose a novel framework for neural network selection by analyzing the governing dynamics over synaptic connections (edges) during training.
Our framework is built on the fact that back-propagation during neural network training is equivalent to the dynamical evolution of synaptic connections.
arXiv Detail & Related papers (2022-01-11T20:53:15Z)
- Animal Behavior Classification via Accelerometry Data and Recurrent Neural Networks [11.099308746733028]
We study the classification of animal behavior using accelerometry data through various recurrent neural network (RNN) models.
We evaluate the classification performance and complexity of the considered models.
We also include two state-of-the-art convolutional neural network (CNN)-based time-series classification models in the evaluations.
arXiv Detail & Related papers (2021-11-24T23:28:25Z)
- Effective Model Sparsification by Scheduled Grow-and-Prune Methods [73.03533268740605]
We propose a novel scheduled grow-and-prune (GaP) methodology without pre-training the dense models.
Experiments have shown that such models can match or beat the quality of highly optimized dense models at 80% sparsity on a variety of tasks.
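A much-simplified sketch of one grow-and-prune cycle using PyTorch's pruning utilities follows; the real method schedules growing and pruning across model partitions, and the amounts and cycle counts here are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
layers = [m for m in model.modules() if isinstance(m, nn.Linear)]

for cycle in range(3):
    for layer in layers:                    # prune: zero the smallest 80%
        prune.l1_unstructured(layer, name="weight", amount=0.8)
    # ... train the sparse model for some epochs here ...
    for layer in layers:
        # Bake the zeros into `weight` and drop the mask; the zeroed
        # entries stay trainable, so further training can regrow them.
        prune.remove(layer, "weight")
    # ... train densely for a short "grow" phase before the next prune ...

for layer in layers:                        # final prune to ship at 80% sparsity
    prune.l1_unstructured(layer, name="weight", amount=0.8)
```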
arXiv Detail & Related papers (2021-06-18T01:03:13Z)
- Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers [54.47911829539919]
We develop a novel top-down training method which can be viewed as an algorithm for searching for high-quality classifiers.
We tested this method on automatic speech recognition (ASR) tasks and language modelling tasks.
The proposed method consistently improves recurrent neural network ASR models on Wall Street Journal, self-attention ASR models on Switchboard, and AWD-LSTM language models on WikiText-2.
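One way to read this, sketched below as an interpretation of the abstract rather than the authors' algorithm: train the classifier head first with everything below frozen, then progressively unfreeze blocks from the top down.

```python
import torch
import torch.nn as nn

blocks = nn.ModuleList(nn.Sequential(nn.Linear(32, 32), nn.Tanh())
                       for _ in range(4))
head = nn.Linear(32, 10)                       # the classifier, trained first
x, y = torch.randn(64, 32), torch.randint(0, 10, (64,))

def forward(inp):
    for block in blocks:
        inp = block(inp)
    return head(inp)

# Stage k trains the head plus the top k blocks; lower blocks stay frozen.
for k in range(len(blocks) + 1):
    for i, block in enumerate(blocks):
        for p in block.parameters():
            p.requires_grad = i >= len(blocks) - k
    trainable = [p for p in list(blocks.parameters()) + list(head.parameters())
                 if p.requires_grad]
    opt = torch.optim.SGD(trainable, lr=0.1)
    loss = nn.functional.cross_entropy(forward(x), y)   # one step per stage
    opt.zero_grad()
    loss.backward()
    opt.step()
```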
arXiv Detail & Related papers (2021-02-09T08:19:49Z)
- Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
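As a simplified illustration of layer-wise fusion for a two-layer MLP: the sketch below uses a hard one-to-one neuron matching instead of the paper's soft optimal-transport couplings, aligning model B's hidden neurons to model A's before averaging. It requires SciPy; all names are placeholders.

```python
import torch
import torch.nn as nn
from scipy.optimize import linear_sum_assignment

def fuse_two_layer_mlps(a, b):
    # a, b: nn.Sequential(nn.Linear(d, h), nn.ReLU(), nn.Linear(h, c))
    wa, wb = a[0].weight.data, b[0].weight.data      # (h, d) each
    cost = torch.cdist(wa, wb)                       # neuron-to-neuron distances
    _, perm = linear_sum_assignment(cost.numpy())    # match B's neurons to A's
    perm = torch.as_tensor(perm)
    fused = nn.Sequential(nn.Linear(wa.size(1), wa.size(0)), nn.ReLU(),
                          nn.Linear(a[2].in_features, a[2].out_features))
    fused[0].weight.data = (wa + wb[perm]) / 2       # average aligned neurons
    fused[0].bias.data = (a[0].bias.data + b[0].bias.data[perm]) / 2
    fused[2].weight.data = (a[2].weight.data + b[2].weight.data[:, perm]) / 2
    fused[2].bias.data = (a[2].bias.data + b[2].bias.data) / 2
    return fused

net = lambda: nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3))
fused = fuse_two_layer_mlps(net(), net())
```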
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.