Dual-State Capsule Networks for Text Classification
- URL: http://arxiv.org/abs/2109.04762v1
- Date: Fri, 10 Sep 2021 09:59:55 GMT
- Title: Dual-State Capsule Networks for Text Classification
- Authors: Piyumal Demotte, Surangika Ranathunga
- Abstract summary: This paper presents a novel Dual-State Capsule (DS-Caps) network-based technique for text classification.
Two varieties of states, namely sentence-level and word-level, are integrated with capsule layers to capture deeper context-level information.
The DS-Caps networks outperform the existing capsule network architectures for multiple datasets.
- Score: 2.0305676256390934
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text classification systems based on contextual embeddings are not viable
options for many low-resource languages. On the other hand, recently
introduced capsule networks have shown performance on par with these text
classification models. Thus, they could be considered a viable alternative
for text classification in languages that do not have pre-trained contextual
embedding models. However, current capsule networks depend upon spatial
patterns without considering the sequential features of the text. They are also
sub-optimal at capturing context-level information in longer sequences.
This paper presents a novel Dual-State Capsule (DS-Caps) network-based
technique for text classification, which is optimized to mitigate these issues.
Two varieties of states, namely sentence-level and word-level, are integrated
with capsule layers to capture deeper context-level information for language
modeling. The dynamic routing process among capsules was also optimized using
the context-level information obtained through sentence-level states. The
DS-Caps networks outperform the existing capsule network architectures for
multiple datasets, particularly for tasks with longer sequences of text. We
also demonstrate the superiority of DS-Caps in text classification for a
low-resource language.
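A minimal sketch of the dual-state idea described in the abstract: per-token BiLSTM outputs serve as word-level states, a pooled sentence-level state is derived from them, and that sentence state initialises the dynamic-routing logits between word capsules and class capsules. All layer sizes, the mean-pooling step, and the `ctx_gate` fusion rule are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch only: layer sizes, mean-pooling for the sentence
# state, and the ctx_gate routing bias are assumptions, not the paper's
# published implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

def squash(s, dim=-1, eps=1e-8):
    # Standard capsule squashing non-linearity.
    sq_norm = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + eps)

class DualStateCapsuleSketch(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid=64,
                 n_caps=10, cap_dim=16, n_iters=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # Word-level states: per-token BiLSTM outputs.
        self.lstm = nn.LSTM(emb_dim, hid, batch_first=True, bidirectional=True)
        # Each word state votes for every class capsule.
        self.vote = nn.Linear(2 * hid, n_caps * cap_dim)
        # Sentence-level state -> initial bias on the routing logits.
        self.ctx_gate = nn.Linear(2 * hid, n_caps)
        self.n_caps, self.cap_dim, self.n_iters = n_caps, cap_dim, n_iters

    def forward(self, tokens):                       # tokens: (B, T)
        h, _ = self.lstm(self.emb(tokens))           # word states: (B, T, 2H)
        sent = h.mean(dim=1)                         # sentence state: (B, 2H)
        B, T, _ = h.shape
        u = self.vote(h).view(B, T, self.n_caps, self.cap_dim)
        # Dynamic routing with logits initialised from the sentence state
        # (one plausible way to inject context into routing).
        b = self.ctx_gate(sent).unsqueeze(1).expand(B, T, self.n_caps)
        for _ in range(self.n_iters):
            c = F.softmax(b, dim=-1)                      # coupling coefficients
            v = squash((c.unsqueeze(-1) * u).sum(dim=1))  # (B, n_caps, cap_dim)
            b = b + (u * v.unsqueeze(1)).sum(dim=-1)      # agreement update
        return v.norm(dim=-1)                        # capsule lengths as scores
```

Usage: `DualStateCapsuleSketch(30000)(torch.randint(0, 30000, (4, 32)))` returns a (4, 10) tensor of per-class capsule lengths.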
Related papers
- Transfer-Free Data-Efficient Multilingual Slot Labeling [82.02076369811402]
Slot labeling is a core component of task-oriented dialogue (ToD) systems.
To mitigate the inherent data scarcity issue, current research on multilingual ToD assumes that sufficient English-language annotated data are always available.
We propose a two-stage slot labeling approach (termed TWOSL) which transforms standard multilingual sentence encoders into effective slot labelers.
arXiv Detail & Related papers (2023-05-22T22:47:32Z)
- Like a Good Nearest Neighbor: Practical Content Moderation and Text Classification [66.02091763340094]
Like a Good Nearest Neighbor (LaGoNN) is a modification to SetFit that introduces no learnable parameters but alters the input text with information from its nearest neighbor (a rough sketch of this idea follows the list below).
LaGoNN is effective at flagging undesirable content and at text classification, and improves the performance of SetFit.
arXiv Detail & Related papers (2023-02-17T15:43:29Z)
- MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding [53.03978356918377]
Spatial hierarchical relationships between content at different levels of granularity are crucial for document image understanding tasks.
Existing methods learn features from either word-level or region-level but fail to consider both simultaneously.
We propose MGDoc, a new multi-modal multi-granular pre-training framework that encodes page-level, region-level, and word-level information at the same time.
arXiv Detail & Related papers (2022-11-27T22:47:37Z)
- Attend, Memorize and Generate: Towards Faithful Table-to-Text Generation in Few Shots [58.404516361586325]
Few-shot table-to-text generation is a task of composing fluent and faithful sentences to convey table content using limited data.
This paper proposes a novel approach, Attend, Memorize and Generate (AMG), inspired by the text generation process of humans.
arXiv Detail & Related papers (2022-03-01T20:37:20Z)
- Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets [56.018551958004814]
This paper addresses the task of generating fluent descriptions by training on a non-uniform combination of data sources.
Large-scale datasets with noisy image-text pairs provide a sub-optimal source of supervision.
We propose to leverage and separate semantics and descriptive style through the incorporation of a style token and keywords extracted through a retrieval component.
arXiv Detail & Related papers (2021-11-24T19:00:05Z)
- Hierarchical Text Classification of Urdu News using Deep Neural Network [0.0]
This paper proposes a deep learning model for hierarchical text classification of news in Urdu language.
The dataset consists of 51,325 sentences from eight online news websites belonging to the following genres: Sports, Technology, and Entertainment.
arXiv Detail & Related papers (2021-07-07T11:06:11Z)
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z)
- Cascaded Semantic and Positional Self-Attention Network for Document Classification [9.292885582770092]
We propose a new architecture to aggregate the two sources of information using a cascaded semantic and positional self-attention network (CSPAN).
The CSPAN uses a semantic self-attention layer cascaded with a Bi-LSTM to process the semantic and positional information in a sequential manner, and then adaptively combines them through a residual connection (a minimal sketch of this cascade follows the list below).
We evaluate the CSPAN model on several benchmark datasets for document classification with careful ablation studies, and demonstrate encouraging results compared with the state of the art.
arXiv Detail & Related papers (2020-09-15T15:02:28Z)
- A Novel BGCapsule Network for Text Classification [5.010425616264462]
We propose a novel hybrid architecture, BGCapsule, a capsule model preceded by an ensemble of Bidirectional Gated Recurrent Units (BiGRUs), for several text classification tasks.
BGCapsule achieves better accuracy than existing methods without the help of any external linguistic knowledge.
arXiv Detail & Related papers (2020-07-02T06:07:29Z)
- VGCN-BERT: Augmenting BERT with Graph Embedding for Text Classification [21.96079052962283]
The VGCN-BERT model combines the capability of BERT with a Vocabulary Graph Convolutional Network (VGCN).
In our experiments on several text classification datasets, our approach outperforms BERT and GCN alone.
arXiv Detail & Related papers (2020-04-12T22:02:33Z)
- Adapting Deep Learning for Sentiment Classification of Code-Switched Informal Short Text [1.6752182911522517]
We present a labeled dataset called MultiSenti for sentiment classification of code-switched informal short text.
We propose a deep learning-based model for sentiment classification of code-switched informal short text.
arXiv Detail & Related papers (2020-01-04T06:31:15Z)
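The LaGoNN entry above is summarised as altering input text with information from its nearest neighbour while adding no learnable parameters. A rough sketch of that idea, assuming a sentence-transformers encoder and a simple bracketed augmentation template (both illustrative choices, not taken from the paper):

```python
# Illustrative only: the encoder checkpoint and the augmentation template
# are assumptions; the paper pairs this idea with SetFit.
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def augment_with_neighbor(queries, train_texts, train_labels):
    # Embed queries and training texts; with normalised embeddings,
    # the dot product equals cosine similarity.
    q = encoder.encode(queries, normalize_embeddings=True)
    t = encoder.encode(train_texts, normalize_embeddings=True)
    nn_idx = (q @ t.T).argmax(axis=1)  # index of each query's nearest neighbour
    # Append the neighbour's text and gold label to the query before it is
    # passed to the downstream (e.g. SetFit) classifier.
    return [f"{text} [NN: {train_texts[i]} ({train_labels[i]})]"
            for text, i in zip(queries, nn_idx)]
```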
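For the CSPAN entry, a minimal sketch of the cascade as summarised above: semantic self-attention over token embeddings, a Bi-LSTM pass over the attended sequence, and a residual combination of the two. Layer sizes, the attention setup, and mean-pooling are assumptions rather than the paper's exact configuration.

```python
# Illustrative sketch of the cascaded semantic + positional design.
import torch
import torch.nn as nn

class CSPANSketch(nn.Module):
    def __init__(self, vocab_size, dim=128, heads=4, n_classes=5):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.lstm = nn.LSTM(dim, dim // 2, batch_first=True, bidirectional=True)
        self.cls = nn.Linear(dim, n_classes)

    def forward(self, tokens):                  # tokens: (B, T)
        x = self.emb(tokens)
        sem, _ = self.attn(x, x, x)             # semantic self-attention
        pos, _ = self.lstm(sem)                 # sequential/positional pass
        fused = sem + pos                       # residual combination
        return self.cls(fused.mean(dim=1))      # mean-pool, then classify
```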