Depth-Adaptive Graph Recurrent Network for Text Classification
- URL: http://arxiv.org/abs/2003.00166v1
- Date: Sat, 29 Feb 2020 03:09:55 GMT
- Title: Depth-Adaptive Graph Recurrent Network for Text Classification
- Authors: Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu and Jie Zhou
- Abstract summary: The Sentence-State LSTM (S-LSTM) is a powerful and highly efficient graph recurrent network.
We propose a depth-adaptive mechanism for the S-LSTM, which allows the model to learn how many computational steps to conduct for different words as required.
- Score: 71.20237659479703
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Sentence-State LSTM (S-LSTM) is a powerful and highly efficient graph
recurrent network, which views words as nodes and performs layer-wise recurrent
steps between them simultaneously. Despite its successes on text
representations, the S-LSTM still suffers from two drawbacks. Firstly, given a
sentence, certain words are usually more ambiguous than others, and thus more
computation steps need to be taken for these difficult words and vice versa.
However, the S-LSTM takes fixed computation steps for all words, irrespective
of their difficulty. The second drawback stems from the lack of sequential
information (e.g., word order) that is inherently important for natural
language. In this paper, we try to address these issues and propose a
depth-adaptive mechanism for the S-LSTM, which allows the model to learn how
many computational steps to conduct for different words as required. In
addition, we integrate an extra RNN layer to inject sequential information,
which also serves as an input feature for the decision of adaptive depths.
Results on the classic text classification task (24 datasets of various sizes
and domains) show that our model brings significant improvements over the
conventional S-LSTM and other high-performance models (e.g., the Transformer),
while achieving a good accuracy-speed trade-off.
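Below is a minimal PyTorch sketch of the idea described in the abstract; it is not the authors' code. The class and parameter names, the GRU stand-ins for the S-LSTM node update, and the ACT-style cumulative halting threshold are all illustrative assumptions. The sketch shows the two pieces the abstract names: an RNN layer that injects word order and feeds the depth decision, and a per-word halting unit that lets ambiguous words take more recurrent steps than easy ones.

```python
# Illustrative sketch only: a depth-adaptive, per-word recurrent update.
# The GRU / GRUCell stand in for the S-LSTM layer; names are assumptions.
import torch
import torch.nn as nn


class DepthAdaptiveNodeUpdater(nn.Module):
    def __init__(self, hidden_size: int, max_steps: int = 6, threshold: float = 0.99):
        super().__init__()
        self.max_steps = max_steps
        self.threshold = threshold
        # Sequential encoder: injects word order and feeds the halting decision.
        self.seq_rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)
        # One recurrent "graph" step per word node (stand-in for an S-LSTM layer).
        self.node_cell = nn.GRUCell(hidden_size, hidden_size)
        # Halting unit: decides, per word, whether to take another step.
        self.halt_linear = nn.Linear(hidden_size, 1)

    def forward(self, word_embeds: torch.Tensor) -> torch.Tensor:
        # word_embeds: (batch, seq_len, hidden)
        batch, seq_len, hidden = word_embeds.shape
        seq_feats, _ = self.seq_rnn(word_embeds)            # order-aware features
        state = seq_feats.reshape(batch * seq_len, hidden)  # per-word node states
        inputs = word_embeds.reshape(batch * seq_len, hidden)

        halted = torch.zeros(batch * seq_len, dtype=torch.bool, device=state.device)
        cum_halt = torch.zeros(batch * seq_len, device=state.device)
        for _ in range(self.max_steps):
            p_halt = torch.sigmoid(self.halt_linear(state)).squeeze(-1)
            cum_halt = cum_halt + p_halt * (~halted).float()
            new_state = self.node_cell(inputs, state)
            # Words that have halted keep their state; the rest take another step,
            # so "harder" words receive deeper computation.
            state = torch.where(halted.unsqueeze(-1), state, new_state)
            halted = halted | (cum_halt >= self.threshold)
            if bool(halted.all()):
                break
        return state.reshape(batch, seq_len, hidden)


# Usage: updater = DepthAdaptiveNodeUpdater(128); out = updater(torch.randn(2, 10, 128))
```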
Related papers
- Spatial Semantic Recurrent Mining for Referring Image Segmentation [63.34997546393106]
We propose S²RM to achieve high-quality cross-modality fusion.
It follows a three-stage working strategy: distributing language features, spatial semantic recurrent co-parsing, and parsed-semantic balancing.
Our proposed method performs favorably against other state-of-the-art algorithms.
arXiv Detail & Related papers (2024-05-15T00:17:48Z) - MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning [63.80739044622555]
We introduce MuSR, a dataset for evaluating language models on soft reasoning tasks specified in a natural language narrative.
This dataset has two crucial features. First, it is created through a novel neurosymbolic synthetic-to-natural generation algorithm.
Second, our dataset instances are free text narratives corresponding to real-world domains of reasoning.
arXiv Detail & Related papers (2023-10-24T17:59:20Z) - Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks.
We conducted experiments using the Llama2-7b-chat model on nine different languages from the MuST-C dataset.
The results show that LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z) - SLTUNET: A Simple Unified Model for Sign Language Translation [40.93099095994472]
We propose a simple unified neural model designed to support multiple sign-to-gloss, gloss-to-text and sign-to-text translation tasks.
Jointly modeling different tasks endows SLTUNET with the capability to explore the cross-task relatedness that could help narrow the modality gap.
We show in experiments that SLTUNET achieves competitive and even state-of-the-art performance on PHOENIX-2014T and CSL-Daily.
arXiv Detail & Related papers (2023-05-02T20:41:59Z) - Co-Driven Recognition of Semantic Consistency via the Fusion of Transformer and HowNet Sememes Knowledge [6.184249194474601]
This paper proposes a co-driven semantic consistency recognition method based on the fusion of Transformer and HowNet sememes knowledge.
BiLSTM is exploited to encode the conceptual semantic information and infer the semantic consistency.
arXiv Detail & Related papers (2023-02-21T09:53:19Z) - Sentence-Level Sign Language Recognition Framework [0.0]
Sentence-level SLR requires mapping videos of sign language sentences to sequences of gloss labels.
CTC is used to avoid pre-segmenting the sentences into individual words (a loss-usage sketch follows this list).
We evaluate the performance of the proposed models on RWTH-PHOENIX-Weather.
arXiv Detail & Related papers (2022-11-13T01:45:41Z) - Predictive Representation Learning for Language Modeling [33.08232449211759]
Correlates of secondary information appear in LSTM representations even though they are not part of an explicitly supervised prediction task.
We propose Predictive Representation Learning (PRL), which explicitly constrains LSTMs to encode specific predictions.
arXiv Detail & Related papers (2021-05-29T05:03:47Z) - Bidirectional LSTM-CRF Attention-based Model for Chinese Word Segmentation [2.3991565023534087]
We propose a Bidirectional LSTM-CRF Attention-based Model for Chinese word segmentation.
Our model performs better than baseline methods built on other neural networks.
arXiv Detail & Related papers (2021-05-20T11:46:53Z) - A journey in ESN and LSTM visualisations on a language task [77.34726150561087]
We trained ESNs and LSTMs on a Cross-Situational Learning (CSL) task.
The results are of three kinds: performance comparison, internal dynamics analyses and visualization of latent space.
arXiv Detail & Related papers (2020-12-03T08:32:01Z) - Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
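Regarding the sentence-level sign language recognition entry above, here is a minimal sketch (an assumption, not that paper's code) of how a CTC loss maps frame-level gloss scores to an unsegmented gloss label sequence, which is what lets the model skip pre-segmenting sentences into words. All shapes, sizes, and names are illustrative.

```python
# Illustrative CTC usage: frame-level scores -> shorter gloss sequence, no segmentation.
import torch
import torch.nn as nn

T, N, C = 50, 2, 30          # frames, batch size, gloss vocabulary size (index 0 = blank)
S = 8                         # max gloss labels per sentence

frame_scores = torch.randn(T, N, C, requires_grad=True)  # e.g. output of a video encoder
log_probs = frame_scores.log_softmax(dim=-1)              # CTC expects log-probabilities

targets = torch.randint(1, C, (N, S))                     # gloss label ids (0 reserved for blank)
input_lengths = torch.full((N,), T, dtype=torch.long)     # frames per video
target_lengths = torch.tensor([8, 5])                     # true gloss counts per sentence

ctc = nn.CTCLoss(blank=0, zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()  # in a real model, gradients flow back into the video encoder
```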