End-to-end Neural Information Status Classification
- URL: http://arxiv.org/abs/2109.02753v1
- Date: Mon, 6 Sep 2021 21:44:11 GMT
- Title: End-to-end Neural Information Status Classification
- Authors: Yufang Hou
- Abstract summary: We propose an end-to-end neural approach for information status classification.
At inference time, our system takes raw text as input and generates mentions together with their information status.
Our system achieves competitive results on bridging anaphora recognition compared to the previous state-of-the-art system.
- Score: 17.976752792350933
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most previous studies on information status (IS) classification and bridging
anaphora recognition assume that the gold mention or syntactic tree information
is given (Hou et al., 2013; Roesiger et al., 2018; Hou, 2020; Yu and Poesio,
2020). In this paper, we propose an end-to-end neural approach for information
status classification. Our approach consists of a mention extraction component
and an information status assignment component. At inference time, our
system takes raw text as input and generates mentions together with their
information status. On the ISNotes corpus (Markert et al., 2012), we show that
our information status assignment component achieves new state-of-the-art
results on fine-grained IS classification based on gold mentions. Furthermore,
our system performs significantly better than other baselines for both mention
extraction and fine-grained IS classification in the end-to-end setting.
Finally, we apply our system to BASHI (Roesiger, 2018) and SciCorp (Roesiger,
2016) to recognize referential bridging anaphora. We find that our end-to-end
system trained on ISNotes achieves competitive results on bridging anaphora
recognition compared to the previous state-of-the-art system that relies on
syntactic information and is trained on the in-domain datasets (Yu and Poesio,
2020).
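The two-component data flow described in the abstract (mention extraction followed by information status assignment) can be sketched as below. This is only an illustration of the interface, not the paper's neural models: the names `Mention`, `extract_mentions`, `assign_status`, and `classify_raw_text` are hypothetical, the regex extractor and the seen-before heuristic are toy stand-ins, and the labels `"old"` and `"new"` are placeholders rather than the fine-grained IS categories used on ISNotes.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Mention:
    start: int        # character offset, inclusive
    end: int          # character offset, exclusive
    text: str
    status: str = ""  # filled in by the second stage


# Toy stand-in for the mention extraction component (an assumption):
# a determiner followed by a single word.
_NP = re.compile(r"\b(?:the|a|an)\s+[a-z]+\b", re.IGNORECASE)


def extract_mentions(text: str) -> List[Mention]:
    """Stage 1: propose mention spans from raw text."""
    return [Mention(m.start(), m.end(), m.group()) for m in _NP.finditer(text)]


def assign_status(mentions: List[Mention]) -> List[Mention]:
    """Stage 2: label each mention (toy rule: head word seen before -> 'old')."""
    seen = set()
    labelled = []
    for m in mentions:
        head = m.text.split()[-1].lower()
        status = "old" if head in seen else "new"
        seen.add(head)
        labelled.append(Mention(m.start, m.end, m.text, status))
    return labelled


def classify_raw_text(text: str) -> List[Mention]:
    """End-to-end: raw text in, mentions with a status label out."""
    return assign_status(extract_mentions(text))
```

Calling `classify_raw_text("A dog chased a cat. The dog barked.")` returns three mentions, with the second occurrence of "dog" labelled differently from the first. In a real system each stage would be replaced by a trained model while the raw-text-in, labelled-mentions-out interface stays the same.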
Related papers
- Like a Good Nearest Neighbor: Practical Content Moderation and Text Classification [66.02091763340094]
Like a Good Nearest Neighbor (LaGoNN) is a modification to SetFit that introduces no learnable parameters but alters input text with information from its nearest neighbor.
LaGoNN is effective at flagging undesirable content and text classification, and improves the performance of SetFit.
arXiv Detail & Related papers (2023-02-17T15:43:29Z)
- Distant finetuning with discourse relations for stance classification [55.131676584455306]
We propose a new method to extract data with silver labels from raw text to finetune a model for stance classification.
We also propose a 3-stage training framework in which the noise level of the finetuning data decreases from stage to stage.
Our approach ranks 1st among 26 competing teams in the stance classification track of the NLPCC 2021 shared task Argumentative Text Understanding for AI Debater.
arXiv Detail & Related papers (2022-04-27T04:24:35Z)
- Incorporating Constituent Syntax for Coreference Resolution [50.71868417008133]
We propose a graph-based method to incorporate constituent syntactic structures.
We also explore using higher-order neighbourhood information to encode rich structures in constituent trees.
Experiments on the English and Chinese portions of OntoNotes 5.0 benchmark show that our proposed model either beats a strong baseline or achieves new state-of-the-art performance.
arXiv Detail & Related papers (2022-02-22T07:40:42Z)
- CIM: Class-Irrelevant Mapping for Few-Shot Classification [58.02773394658623]
Few-shot classification (FSC) has been one of the most actively studied topics in recent years.
How to appraise the pre-trained FEM is the most crucial question in the FSC community.
We propose a simple, flexible method dubbed Class-Irrelevant Mapping (CIM).
arXiv Detail & Related papers (2021-09-07T03:26:24Z)
- Self-Supervised Detection of Contextual Synonyms in a Multi-Class Setting: Phenotype Annotation Use Case [11.912581294872767]
Contextualised word embeddings are a powerful tool for detecting contextual synonyms.
We propose a self-supervised pre-training approach that can detect contextual synonyms of concepts when trained on data created by shallow matching.
arXiv Detail & Related papers (2021-09-04T21:35:01Z)
- OntoGUM: Evaluating Contextualized SOTA Coreference Resolution on 12 More Genres [3.5420134832331325]
This paper provides a dataset and comprehensive evaluation showing that the latest neural LM based end-to-end systems degrade very substantially out of domain.
We make an OntoNotes-like coreference dataset called OntoGUM publicly available, converted from GUM, an English corpus covering 12 genres, using deterministic rules, which we evaluate.
arXiv Detail & Related papers (2021-06-02T04:42:51Z)
- A Benchmark of Rule-Based and Neural Coreference Resolution in Dutch Novels and News [4.695687634290403]
The results provide insight into the relative strengths of data-driven and knowledge-driven systems.
The neural system performs best on news/Wikipedia text, while the rule-based system performs best on literature.
arXiv Detail & Related papers (2020-11-03T10:52:00Z)
- Fine-grained Information Status Classification Using Discourse Context-Aware BERT [10.81197069967052]
We propose a simple discourse context-aware BERT model for fine-grained information status classification.
Our model achieves new state-of-the-art performance on fine-grained IS classification.
We also show an improvement of 10.5 F1 points for bridging anaphora recognition.
arXiv Detail & Related papers (2020-10-26T22:30:17Z)
- Neural Networks Enhancement with Logical Knowledge [83.9217787335878]
We propose an extension of KENN for relational data.
The results show that KENN is capable of increasing the performance of the underlying neural network even in the presence of relational data.
arXiv Detail & Related papers (2020-09-13T21:12:20Z)
- Rank over Class: The Untapped Potential of Ranking in Natural Language Processing [8.637110868126546]
We argue that many tasks which are currently addressed using classification are in fact being shoehorned into a classification mould.
We propose a novel end-to-end ranking approach consisting of a Transformer network responsible for producing representations for a pair of text sequences.
In an experiment on a heavily-skewed sentiment analysis dataset, converting ranking results to classification labels yields an approximately 22% improvement over state-of-the-art text classification.
arXiv Detail & Related papers (2020-09-10T22:18:57Z)
- Overview of the TREC 2019 Fair Ranking Track [65.15263872493799]
The goal of the TREC Fair Ranking track was to develop a benchmark for evaluating retrieval systems in terms of fairness to different content providers.
This paper presents an overview of the track, including the task definition, descriptions of the data and the annotation process.
arXiv Detail & Related papers (2020-03-25T21:34:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.