Light-Weighted CNN for Text Classification
- URL: http://arxiv.org/abs/2004.07922v1
- Date: Thu, 16 Apr 2020 20:23:52 GMT
- Title: Light-Weighted CNN for Text Classification
- Authors: Ritu Yadav
- Abstract summary: We introduce a new architecture based on separable convolution.
The idea of separable convolution already exists in the field of image classification.
With the help of this architecture, we can achieve a drastic reduction in trainable parameters.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For management purposes, documents are sorted into specific
categories, a task that most organizations still carry out with manual labor.
In today's automation era, manual effort on such a task is hard to justify,
and many software products on the market aim to replace it; efficiency and
minimal resource consumption have become the competitive focal points.
Machine categorization of such documents into specified classes therefore
provides excellent help. One such technique is text classification with a
convolutional neural network (TextCNN). TextCNN uses filters of multiple
sizes, much like the inception module introduced in GoogLeNet. The network
achieves good accuracy but consumes a lot of memory due to its large number
of trainable parameters. As a solution to this problem, we introduce a new
architecture based on separable convolution. The idea of separable
convolution already exists in image classification but has not yet been
applied to text classification tasks. With this architecture, we achieve a
drastic reduction in trainable parameters.
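As a minimal sketch of the contrast the abstract draws, the snippet below builds the same multi-filter TextCNN branch structure twice, once with standard convolutions and once with depthwise-separable ones. It assumes Keras/TensorFlow, and all hyperparameters (vocabulary size, embedding dimension, filter sizes and counts) are illustrative; none of these names or values come from the paper.

```python
# Minimal sketch (assumed Keras/TensorFlow; not the paper's released code).
# Contrasts a standard multi-filter TextCNN with a depthwise-separable variant.
# All hyperparameters below are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers

VOCAB, EMB, SEQ_LEN, NUM_CLASSES = 20000, 128, 100, 10
FILTER_SIZES, FILTERS = (3, 4, 5), 100  # inception-style multi-size filters

def text_cnn(separable: bool) -> tf.keras.Model:
    inp = layers.Input(shape=(SEQ_LEN,))
    x = layers.Embedding(VOCAB, EMB)(inp)  # (SEQ_LEN, EMB) word vectors
    Conv = layers.SeparableConv1D if separable else layers.Conv1D
    branches = [
        layers.GlobalMaxPooling1D()(Conv(FILTERS, k, activation="relu")(x))
        for k in FILTER_SIZES  # one branch per n-gram size, as in TextCNN
    ]
    out = layers.Dense(NUM_CLASSES, activation="softmax")(
        layers.Concatenate()(branches))
    return tf.keras.Model(inp, out)

# Convolution parameters per branch (biases omitted), kernel size k:
#   standard Conv1D:   k * EMB * FILTERS       -> 3*128*100 = 38,400 for k=3
#   SeparableConv1D:   k * EMB + EMB * FILTERS -> 3*128 + 128*100 = 13,184
# i.e. roughly a k-fold reduction per branch when FILTERS >> k.
print(text_cnn(False).count_params(), "vs", text_cnn(True).count_params())
```

Both variants share the embedding and dense layers, so the headline totals differ less than the per-branch counts; the comment block shows where the roughly k-fold per-branch saving comes from, which is the "drastic reduction in trainable parameters" the abstract refers to.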
Related papers
- Dynamic Perceiver for Efficient Visual Recognition [87.08210214417309]
We propose Dynamic Perceiver (Dyn-Perceiver) to decouple the feature extraction procedure and the early classification task.
A feature branch serves to extract image features, while a classification branch processes a latent code assigned for classification tasks.
Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features.
arXiv Detail & Related papers (2023-06-20T03:00:22Z)
- Large-Margin Representation Learning for Texture Classification [67.94823375350433]
This paper presents a novel approach combining convolutional layers (CLs) and large-margin metric learning for training supervised models on small datasets for texture classification.
The experimental results on texture and histopathologic image datasets have shown that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence when compared to equivalent CNNs.
arXiv Detail & Related papers (2022-06-17T04:07:45Z)
- Few-Shot Learning with Siamese Networks and Label Tuning [5.006086647446482]
We show that with proper pre-training, Siamese Networks that embed texts and labels offer a competitive alternative.
We introduce label tuning, a simple and computationally efficient approach that adapts the models in a few-shot setup by changing only the label embeddings.
arXiv Detail & Related papers (2022-03-28T11:16:46Z)
- Towards Disentangling Information Paths with Coded ResNeXt [11.884259630414515]
We take a novel approach to enhance the transparency of the function of the whole network.
We propose a neural network architecture for classification, in which the information that is relevant to each class flows through specific paths.
arXiv Detail & Related papers (2022-02-10T21:45:49Z)
- CvS: Classification via Segmentation For Small Datasets [52.821178654631254]
This paper presents CvS, a cost-effective classifier for small datasets that derives the classification labels from predicting the segmentation maps.
We evaluate the effectiveness of our framework on diverse problems, showing that CvS achieves much higher classification accuracy than previous methods when given only a handful of examples.
arXiv Detail & Related papers (2021-10-29T18:41:15Z)
- Deep Learning for Technical Document Classification [6.787004826008753]
This paper describes a novel multimodal deep learning architecture, called TechDoc, for technical document classification.
The trained model can potentially be scaled to millions of real-world technical documents with both text and figures.
arXiv Detail & Related papers (2021-06-27T16:12:47Z)
- Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers [54.47911829539919]
We develop a novel top-down training method which can be viewed as an algorithm for searching for high-quality classifiers.
We tested this method on automatic speech recognition (ASR) tasks and language modelling tasks.
The proposed method consistently improves recurrent neural network ASR models on Wall Street Journal, self-attention ASR models on Switchboard, and AWD-LSTM language models on WikiText-2.
arXiv Detail & Related papers (2021-02-09T08:19:49Z)
- Does a Hybrid Neural Network based Feature Selection Model Improve Text Classification? [9.23545668304066]
We propose a hybrid feature selection method for obtaining relevant features.
We then present three ways of implementing a feature selection and neural network pipeline.
We also observed a slight increase in accuracy on some datasets.
arXiv Detail & Related papers (2021-01-22T09:12:19Z)
- Adaptive Hierarchical Decomposition of Large Deep Networks [4.272649614101117]
As datasets get larger, a natural question is whether existing deep learning architectures can be extended to handle the 50K+ classes thought to be perceptible by a typical human.
This paper introduces a framework that automatically analyzes and configures a family of smaller deep networks as a replacement to a singular, larger network.
The resulting smaller networks are highly scalable, parallel and more practical to train, and achieve higher classification accuracy.
arXiv Detail & Related papers (2020-07-17T21:04:50Z)
- Text Classification with Few Examples using Controlled Generalization [58.971750512415134]
Current practice relies on pre-trained word embeddings to map words unseen in training to similar seen ones.
Our alternative begins with sparse pre-trained representations derived from unlabeled parsed corpora.
We show that a feed-forward network over these vectors is especially effective in low-data scenarios.
arXiv Detail & Related papers (2020-05-18T06:04:58Z)
- Semantic Drift Compensation for Class-Incremental Learning [48.749630494026086]
Class-incremental learning of deep networks sequentially increases the number of classes to be classified.
We propose a new method to estimate the drift, called semantic drift, of features and compensate for it without the need of any exemplars.
arXiv Detail & Related papers (2020-04-01T13:31:19Z)