Recurrent Neural Networks with Mixed Hierarchical Structures for Natural
Language Processing
- URL: http://arxiv.org/abs/2106.02562v1
- Date: Fri, 4 Jun 2021 15:50:42 GMT
- Title: Recurrent Neural Networks with Mixed Hierarchical Structures for Natural
Language Processing
- Authors: Zhaoxin Luo and Michael Zhu
- Abstract summary: Hierarchical structures exist in both linguistics and Natural Language Processing (NLP) tasks.
How to design RNNs to learn hierarchical representations of natural languages remains a long-standing challenge.
In this paper, we define two different types of boundaries referred to as static and dynamic boundaries, respectively, and then use them to construct a multi-layer hierarchical structure for document classification tasks.
- Score: 13.960152426268767
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hierarchical structures exist in both linguistics and Natural Language
Processing (NLP) tasks. How to design RNNs to learn hierarchical
representations of natural languages remains a long-standing challenge. In this
paper, we define two different types of boundaries referred to as static and
dynamic boundaries, respectively, and then use them to construct a multi-layer
hierarchical structure for document classification tasks. In particular, we
focus on a three-layer hierarchical structure with static word- and sentence-layers and a dynamic phrase-layer. LSTM cells and two boundary detectors are
used to implement the proposed structure, and the resulting network is called
the {\em Recurrent Neural Network with Mixed Hierarchical Structures}
(MHS-RNN). We further add three layers of attention mechanisms to the MHS-RNN
model. Incorporating attention mechanisms allows our model to use more
important content to construct document representation and enhance its
performance on document classification tasks. Experiments on five different
datasets show that the proposed architecture outperforms previous methods on
all five tasks.
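The three-layer flow described in the abstract (static word- and sentence-layers plus a dynamic phrase-layer) can be sketched in miniature. The sketch below is not the authors' MHS-RNN: the LSTM cells are replaced by mean-pooling, the learned boundary detectors by a hand-written distance threshold, and all names, vectors, and thresholds are illustrative. It only shows how word vectors roll up through dynamic phrase boundaries and static sentence boundaries into a document representation.

```python
def encode(vectors):
    """Stand-in for an LSTM layer: mean-pool a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def phrase_boundaries(word_vecs, threshold=0.5):
    """Stand-in dynamic boundary detector: cut where adjacent words differ a lot."""
    cuts = []
    for i in range(1, len(word_vecs)):
        diff = sum(abs(a - b) for a, b in zip(word_vecs[i - 1], word_vecs[i]))
        if diff > threshold:
            cuts.append(i)
    return cuts

def segment(seq, cuts):
    """Split seq at the given cut indices."""
    out, prev = [], 0
    for c in cuts + [len(seq)]:
        out.append(seq[prev:c])
        prev = c
    return [s for s in out if s]

def document_vector(sentences):
    """sentences: list of sentences, each a list of word vectors (static boundaries)."""
    sent_vecs = []
    for words in sentences:
        phrases = segment(words, phrase_boundaries(words))  # dynamic phrase layer
        phrase_vecs = [encode(p) for p in phrases]          # phrase-level encoding
        sent_vecs.append(encode(phrase_vecs))               # sentence-level encoding
    return encode(sent_vecs)                                # document-level encoding

# Toy document: sentence splits are given (static); phrase splits are detected.
doc = [
    [[0.0, 0.0], [0.1, 0.1], [2.0, 2.0]],  # sentence 1: a phrase cut before the last word
    [[1.0, 1.0], [1.0, 1.0]],              # sentence 2: a single phrase
]
print(document_vector(doc))
```

In the actual model each `encode` call would be an LSTM and each boundary rule a learned detector, with the three attention layers weighting the words, phrases, and sentences before pooling.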
Related papers
- SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding [55.48936731641802]
We present the SRFUND, a hierarchically structured multi-task form understanding benchmark.
SRFUND provides refined annotations on top of the original FUNSD and XFUND datasets.
The dataset covers eight languages: English, Chinese, Japanese, German, French, Spanish, Italian, and Portuguese.
arXiv Detail & Related papers (2024-06-13T02:35:55Z)
- A BiRGAT Model for Multi-intent Spoken Language Understanding with Hierarchical Semantic Frames [30.200413352223347]
We first propose a Multi-Intent dataset, called MIVS, collected from a realistic in-vehicle dialogue system.
The target semantic frame is organized in a 3-layer hierarchical structure to tackle the alignment and assignment problems in multi-intent cases.
We devise a BiRGAT model to encode the hierarchy of items, the backbone of which is a dual relational graph attention network.
arXiv Detail & Related papers (2024-02-28T11:39:26Z)
- Implant Global and Local Hierarchy Information to Sequence based Code Representation Models [25.776540440893257]
We analyze how the complete hierarchical structure influences the tokens in code sequences and abstract this influence as a property of code tokens called hierarchical embedding.
We propose the Hierarchy Transformer (HiT), a simple but effective sequence model to incorporate the complete hierarchical embeddings of source code into a Transformer model.
arXiv Detail & Related papers (2023-03-14T12:01:39Z)
- Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach that models structures as sequences of actions generated autoregressively with pretrained language models (PLMs).
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
arXiv Detail & Related papers (2022-10-26T13:27:26Z)
- Deep Hierarchical Semantic Segmentation [76.40565872257709]
Hierarchical semantic segmentation (HSS) aims at structured, pixel-wise description of visual observation in terms of a class hierarchy.
HSSN casts HSS as a pixel-wise multi-label classification task, only bringing minimal architecture change to current segmentation models.
With hierarchy-induced margin constraints, HSSN reshapes the pixel embedding space, so as to generate well-structured pixel representations.
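The HSS formulation above treats each pixel as a multi-label instance whose target includes every ancestor of its leaf class. Below is a minimal sketch of that label expansion only, with a made-up two-level hierarchy; HSSN's margin constraints and embedding-space reshaping are not reproduced.

```python
# Hypothetical two-level class hierarchy (child -> parent, None = root).
PARENT = {"rider": "human", "pedestrian": "human", "car": "vehicle",
          "human": None, "vehicle": None}

def multilabel_target(leaf):
    """Return the multi-label target for one pixel: the leaf plus all ancestors."""
    labels = set()
    node = leaf
    while node is not None:
        labels.add(node)
        node = PARENT[node]
    return labels

print(multilabel_target("rider"))  # a "rider" pixel is also a "human" pixel
```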
arXiv Detail & Related papers (2022-03-27T15:47:44Z)
- Recurrent Neural Networks with Mixed Hierarchical Structures and EM Algorithm for Natural Language Processing [9.645196221785694]
We develop an approach called the latent indicator layer to identify and learn implicit hierarchical information.
We also develop an EM algorithm to handle the latent indicator layer in training.
We show that the EM-HRNN model with bootstrap training outperforms other RNN-based models in document classification tasks.
arXiv Detail & Related papers (2022-01-21T23:08:33Z)
- HS3: Learning with Proper Task Complexity in Hierarchically Supervised Semantic Segmentation [81.87943324048756]
We propose Hierarchically Supervised Semantic Segmentation (HS3), a training scheme that supervises intermediate layers in a segmentation network to learn meaningful representations by varying task complexity.
Our proposed HS3-Fuse framework further improves segmentation predictions and achieves state-of-the-art results on two large segmentation benchmarks: NYUD-v2 and Cityscapes.
arXiv Detail & Related papers (2021-11-03T16:33:29Z)
- Nested and Balanced Entity Recognition using Multi-Task Learning [0.0]
This paper introduces a partly-layered network architecture that deals with the complexity of overlapping and nested cases.
We train and evaluate this architecture to recognise two kinds of entities: Concepts (CR) and Named Entities (NER).
Our approach achieves state-of-the-art NER performances, while it outperforms previous CR approaches.
arXiv Detail & Related papers (2021-06-11T07:52:32Z)
- BiTe-GCN: A New GCN Architecture via Bidirectional Convolution of Topology and Features on Text-Rich Networks [44.74164340799386]
BiTe-GCN is a novel GCN architecture with bidirectional convolution of both topology and features on text-rich networks.
Our new architecture outperforms the state of the art by a substantial margin.
It can also be applied to several e-commerce search scenarios, such as JD search.
arXiv Detail & Related papers (2020-10-23T04:38:30Z)
- Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior [80.5637175255349]
We propose a new enriched prior based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net.
To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical structure-constrained neural network.
Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z)
- Tree-structured Attention with Hierarchical Accumulation [103.47584968330325]
"Hierarchical Accumulation" encodes parse tree structures into self-attention at constant time complexity.
Our approach outperforms SOTA methods in four IWSLT translation tasks and the WMT'14 English-German translation task.
arXiv Detail & Related papers (2020-02-19T08:17:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.