Neural Contract Element Extraction Revisited: Letters from Sesame Street
- URL: http://arxiv.org/abs/2101.04355v2
- Date: Mon, 22 Feb 2021 13:55:41 GMT
- Title: Neural Contract Element Extraction Revisited: Letters from Sesame Street
- Authors: Ilias Chalkidis, Manos Fergadiotis, Prodromos Malakasiotis, Ion
Androutsopoulos
- Abstract summary: LSTM-based encoders perform better than dilated CNNs, Transformers, and BERT in this task.
Domain-specific WORD2VEC embeddings outperform generic pre-trained GLOVE embeddings.
Morpho-syntactic features in the form of POS tag and token shape embeddings, as well as context-aware ELMO embeddings, do not improve performance.
- Score: 13.389396129468745
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We investigate contract element extraction. We show that LSTM-based encoders
perform better than dilated CNNs, Transformers, and BERT in this task. We also
find that domain-specific WORD2VEC embeddings outperform generic pre-trained
GLOVE embeddings. Morpho-syntactic features in the form of POS tag and token
shape embeddings, as well as context-aware ELMO embeddings, do not improve
performance. Several of these observations contradict choices or findings of
previous work on contract element extraction and generic sequence labeling
tasks, indicating that contract element extraction requires careful
task-specific choices. Analyzing the results of (i) plain TRANSFORMER-based and
(ii) BERT-based models, we find that in the examined task, where the entities
are highly context-sensitive, the lack of recurrence in TRANSFORMERs greatly
affects their performance.
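To make the comparison concrete, here is a minimal sketch of an LSTM-based sequence labeler with pretrained word embeddings, the kind of configuration the abstract favors; the layer sizes, label set, and framework (PyTorch) are illustrative assumptions, not the authors' exact setup.

```python
# Minimal sketch (PyTorch): BiLSTM sequence labeler with pretrained word embeddings.
# Layer sizes, label set, and the pretrained-embedding source are illustrative
# assumptions; they do not reproduce the paper's exact configuration.
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, embedding_matrix: torch.Tensor, num_labels: int, hidden: int = 300):
        super().__init__()
        # Initialize from pretrained vectors (e.g., domain-specific word2vec);
        # freeze=False lets them be fine-tuned on the contract corpus.
        self.embed = nn.Embedding.from_pretrained(embedding_matrix, freeze=False)
        self.lstm = nn.LSTM(embedding_matrix.size(1), hidden,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, num_labels)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        x = self.embed(token_ids)          # (batch, seq_len, emb_dim)
        h, _ = self.lstm(x)                # (batch, seq_len, 2 * hidden)
        return self.classifier(h)          # per-token logits over BIO labels

# Toy usage: 1000-word vocabulary, 200-dim vectors, 7 BIO labels.
vocab_vectors = torch.randn(1000, 200)
model = BiLSTMTagger(vocab_vectors, num_labels=7)
logits = model(torch.randint(0, 1000, (2, 50)))   # two 50-token windows
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 7),
                             torch.randint(0, 7, (2 * 50,)))
```

Swapping the embedding matrix between generic GLOVE vectors and contract-specific WORD2VEC vectors is the kind of controlled comparison the abstract describes.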
Related papers
- HeterRec: Heterogeneous Information Transformer for Scalable Sequential Recommendation [21.435064492654494]
HeterRec is a sequential recommendation model that integrates item-side heterogeneous features.
HeterRec incorporates a Heterogeneous Token Flatten Layer (HTFL) and a Hierarchical Causal Transformer Layer (HCT).
Extensive experiments on both offline and online datasets show that the HeterRec model achieves superior performance.
arXiv Detail & Related papers (2025-03-03T12:23:54Z)
- Bring Adaptive Binding Prototypes to Generalized Referring Expression Segmentation [18.806738617249426]
Generalized Referring Expression Segmentation introduces new challenges by allowing expressions to describe multiple objects or lack specific object references.
Existing RES methods usually rely on sophisticated encoder-decoder architectures and feature fusion modules.
We propose a novel Model with Adaptive Binding Prototypes (MABP) that adaptively binds queries to object features in the corresponding region.
arXiv Detail & Related papers (2024-05-24T03:07:38Z)
- Revisiting Sparse Retrieval for Few-shot Entity Linking [33.15662306409253]
We propose an ELECTRA-based keyword extractor to denoise the mention context and construct a better query expression.
For training the extractor, we propose a distant supervision method to automatically generate training data based on overlapping tokens between mention contexts and entity descriptions.
Experimental results on the ZESHEL dataset demonstrate that the proposed method outperforms state-of-the-art models by a significant margin across all test domains.
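A minimal sketch of the overlap-based distant supervision idea: mention-context tokens that also appear in the entity description are treated as positive keyword labels for training the extractor. The tokenization, stopword filtering, and example data below are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch of overlap-based distant supervision for keyword extraction:
# mention-context tokens that also occur in the entity description are
# weakly labeled as keywords (1), all others as non-keywords (0).
# Tokenization and filtering choices here are illustrative assumptions.
import re

STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is"}

def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())

def weak_keyword_labels(mention_context: str, entity_description: str) -> list[tuple[str, int]]:
    desc_tokens = set(tokenize(entity_description)) - STOPWORDS
    return [(tok, int(tok in desc_tokens)) for tok in tokenize(mention_context)]

# The overlapping tokens become positive training signal for a keyword
# extractor (an ELECTRA-style token classifier in the summarized paper).
labels = weak_keyword_labels(
    "The striker joined Arsenal on a free transfer in 2003.",
    "Arsenal Football Club is a professional football club based in London.",
)
print(labels)  # e.g. [('the', 0), ('striker', 0), ..., ('arsenal', 1), ...]
```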
arXiv Detail & Related papers (2023-10-19T03:51:10Z)
- BERM: Training the Balanced and Extractable Representation for Matching to Improve Generalization Ability of Dense Retrieval [54.66399120084227]
We propose BERM, a novel method that improves the generalization of dense retrieval by capturing the matching signal.
Dense retrieval has shown promise in the first-stage retrieval process when trained on in-domain labeled datasets.
arXiv Detail & Related papers (2023-05-18T15:43:09Z)
- GPT-NER: Named Entity Recognition via Large Language Models [58.609582116612934]
GPT-NER transforms the sequence labeling task into a generation task that can be easily adapted by large language models.
We find that GPT-NER exhibits a greater ability in low-resource and few-shot setups, where the amount of training data is extremely scarce.
This demonstrates the capabilities of GPT-NER in real-world NER applications where the number of labeled examples is limited.
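A minimal sketch of the labeling-as-generation idea: the model is prompted to rewrite the sentence with entity spans wrapped in marker tokens, and the markers are parsed back into labeled spans. The marker style follows the general GPT-NER recipe, but the exact prompt wording and parsing below are assumptions.

```python
# Sketch of recasting sequence labeling as generation (GPT-NER style):
# the model is asked to copy the sentence and wrap entity spans in
# @@ ... ## markers, which are then parsed back into labeled spans.
# The prompt text and parsing below are illustrative assumptions.
import re

def build_prompt(sentence: str, entity_type: str) -> str:
    return (
        f"Mark every {entity_type} entity in the sentence by surrounding it "
        f"with @@ and ##.\n"
        f"Sentence: {sentence}\nOutput:"
    )

def parse_marked_output(generated: str, entity_type: str) -> list[tuple[str, str]]:
    # Extract the spans the model wrapped in @@ ... ## markers.
    return [(span.strip(), entity_type) for span in re.findall(r"@@(.+?)##", generated)]

prompt = build_prompt("Barack Obama visited Athens in 2016.", "person")
# A (hypothetical) model completion might look like:
completion = "@@Barack Obama## visited Athens in 2016."
print(parse_marked_output(completion, "person"))  # [('Barack Obama', 'person')]
```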
arXiv Detail & Related papers (2023-04-20T16:17:26Z)
- Be Your Own Neighborhood: Detecting Adversarial Example by the Neighborhood Relations Built on Self-Supervised Learning [64.78972193105443]
This paper presents a novel adversarial example (AE) detection framework for trustworthy predictions.
It performs detection by distinguishing an AE's abnormal relations with its augmented versions.
An off-the-shelf Self-Supervised Learning (SSL) model is used to extract the representation and predict the label.
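A minimal sketch of the neighborhood idea: an input's SSL representation is compared with the representations of its augmented copies, and an unusually low average similarity flags a likely adversarial example. The encoder, augmentations, similarity measure, and threshold below are illustrative assumptions, not the paper's exact setup.

```python
# Sketch of detecting adversarial examples via neighborhood relations:
# compare an input's representation with representations of its augmented
# copies from a frozen self-supervised encoder; unusually low similarity
# flags the input as suspicious. Encoder, augmentations, and the threshold
# are illustrative assumptions.
import torch
import torch.nn.functional as F

def is_suspicious(x: torch.Tensor,
                  encoder,                 # any frozen SSL encoder: image -> embedding
                  augment,                 # callable producing a randomly augmented copy
                  n_neighbors: int = 8,
                  threshold: float = 0.8) -> bool:
    with torch.no_grad():
        anchor = F.normalize(encoder(x.unsqueeze(0)), dim=-1)
        sims = []
        for _ in range(n_neighbors):
            neigh = F.normalize(encoder(augment(x).unsqueeze(0)), dim=-1)
            sims.append(F.cosine_similarity(anchor, neigh).item())
    # Benign inputs tend to stay close to their augmented neighbors;
    # adversarial ones drift, so a low mean similarity is flagged.
    return sum(sims) / len(sims) < threshold
```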
arXiv Detail & Related papers (2022-08-31T08:18:44Z)
- Task-guided Disentangled Tuning for Pretrained Language Models [16.429787408467703]
We propose Task-guided Disentangled Tuning (TDT) for pretrained language models (PLMs).
TDT enhances the generalization of representations by disentangling task-relevant signals from entangled representations.
Experimental results on GLUE and CLUE benchmarks show that TDT gives consistently better results than fine-tuning with different PLMs.
arXiv Detail & Related papers (2022-03-22T03:11:39Z)
- Cross-Domain Contract Element Extraction with a Bi-directional Feedback Clause-Element Relation Network [70.00960496773938]
Bi-directional Feedback cLause-Element relaTion network (Bi-FLEET) is proposed for the cross-domain contract element extraction task.
Bi-FLEET has three main components: (1) a context encoder, (2) a clause-element relation encoder, and (3) an inference layer.
The experimental results over both cross-domain NER and CEE tasks show that Bi-FLEET significantly outperforms state-of-the-art baselines.
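As a rough illustration only, the skeleton below wires together the three named components; every internal layer is a placeholder assumption and does not reflect the actual Bi-FLEET architecture.

```python
# Structural sketch only: the three components named in the summary
# (context encoder, clause-element relation encoder, inference layer)
# wired together. Every internal layer here is a placeholder assumption;
# it is not the actual Bi-FLEET architecture.
import torch
import torch.nn as nn

class BiFLEETSkeleton(nn.Module):
    def __init__(self, vocab_size: int, num_clause_types: int, num_element_types: int,
                 dim: int = 256):
        super().__init__()
        # (1) Context encoder: turns tokens into contextual representations.
        self.embed = nn.Embedding(vocab_size, dim)
        self.context_encoder = nn.LSTM(dim, dim // 2, batch_first=True, bidirectional=True)
        # (2) Clause-element relation encoder: placeholder that mixes token
        # states with learned clause-type and element-type embeddings.
        self.clause_types = nn.Embedding(num_clause_types, dim)
        self.element_types = nn.Embedding(num_element_types, dim)
        self.relation_encoder = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        # (3) Inference layer: per-token element-type predictions.
        self.inference = nn.Linear(dim, num_element_types)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        h, _ = self.context_encoder(self.embed(token_ids))
        types = torch.cat([self.clause_types.weight, self.element_types.weight]).unsqueeze(0)
        types = types.expand(token_ids.size(0), -1, -1)
        h, _ = self.relation_encoder(h, types, types)   # tokens attend to type embeddings
        return self.inference(h)
```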
arXiv Detail & Related papers (2021-05-13T05:14:36Z)
- GRIT: Generative Role-filler Transformers for Document-level Event Entity Extraction [134.5580003327839]
We introduce a generative transformer-based encoder-decoder framework (GRIT) to model context at the document level.
We evaluate our approach on the MUC-4 dataset, and show that our model performs substantially better than prior work.
arXiv Detail & Related papers (2020-08-21T01:07:36Z)
- Do Syntax Trees Help Pre-trained Transformers Extract Information? [8.133145094593502]
We study the utility of incorporating dependency trees into pre-trained transformers on information extraction tasks.
We propose and investigate two distinct strategies for incorporating dependency structure.
We find that their performance gains are highly contingent on the availability of human-annotated dependency parses.
arXiv Detail & Related papers (2020-08-20T17:17:38Z)
- Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction [80.38130122127882]
We introduce 14 probing tasks targeting linguistic properties relevant to neural relation extraction (RE).
We use them to study representations learned by more than 40 different combinations of encoder architectures and linguistic features trained on two datasets.
We find that the biases induced by the architecture and by the inclusion of linguistic features are clearly expressed in the probing task performance.
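A minimal sketch of the probing methodology behind such studies: a lightweight classifier is trained on frozen sentence representations to predict a linguistic property, and its accuracy is read as a measure of how much of that property the encoder exposes. The encoder, property, and classifier choices below are illustrative assumptions.

```python
# Sketch of a probing task: train a simple classifier on frozen sentence
# representations to predict a linguistic property (here, a made-up
# 3-way property). Encoder, property, and data are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def run_probe(train_reprs: np.ndarray, train_labels: np.ndarray,
              test_reprs: np.ndarray, test_labels: np.ndarray) -> float:
    # The encoder stays frozen; only this linear probe is trained, so its
    # accuracy reflects what the representations already encode.
    probe = LogisticRegression(max_iter=1000)
    probe.fit(train_reprs, train_labels)
    return accuracy_score(test_labels, probe.predict(test_reprs))

# Toy usage with random "representations"; in practice these would come
# from the sentence encoder of a relation extraction model.
rng = np.random.default_rng(0)
X_train, X_test = rng.normal(size=(200, 768)), rng.normal(size=(50, 768))
y_train, y_test = rng.integers(0, 3, 200), rng.integers(0, 3, 50)
print(f"probe accuracy: {run_probe(X_train, y_train, X_test, y_test):.2f}")
```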
arXiv Detail & Related papers (2020-04-17T09:17:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.