Full-Text Argumentation Mining on Scientific Publications
- URL: http://arxiv.org/abs/2210.13084v1
- Date: Mon, 24 Oct 2022 10:05:30 GMT
- Title: Full-Text Argumentation Mining on Scientific Publications
- Authors: Arne Binder, Bhuvanesh Verma, Leonhard Hennig
- Abstract summary: We introduce a sequential pipeline model combining ADUR and ARE for full-text SAM.
We provide a first analysis of the performance of pretrained language models (PLMs) on both subtasks.
Our detailed error analysis reveals that non-contiguous ADUs as well as the interpretation of discourse connectors pose major challenges.
- Score: 3.8754200816873787
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Scholarly Argumentation Mining (SAM) has recently gained attention due to its
potential to help scholars with the rapid growth of published scientific
literature. It comprises two subtasks: argumentative discourse unit recognition
(ADUR) and argumentative relation extraction (ARE), both of which are
challenging since they require e.g. the integration of domain knowledge, the
detection of implicit statements, and the disambiguation of argument structure.
While previous work focused on dataset construction and baseline methods for
specific document sections, such as abstract or results, full-text scholarly
argumentation mining has seen little progress. In this work, we introduce a
sequential pipeline model combining ADUR and ARE for full-text SAM, and provide
a first analysis of the performance of pretrained language models (PLMs) on
both subtasks. We establish a new SotA for ADUR on the Sci-Arg corpus,
outperforming the previous best reported result by a large margin (+7% F1). We
also present the first results for ARE, and thus for the full AM pipeline, on
this benchmark dataset. Our detailed error analysis reveals that non-contiguous
ADUs as well as the interpretation of discourse connectors pose major
challenges and that data annotation needs to be more consistent.
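The sequential pipeline described in the abstract (ADUR first, then ARE over the recognized units) can be illustrated with a minimal sketch. This is not the authors' PLM-based model: the functions `toy_adur`, `toy_are`, and `run_pipeline`, the labels `claim`, `data`, and `supports`, and the sentence-splitting and digit heuristics are all hypothetical stand-ins chosen only to show how the output of the first stage feeds the second.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class ADU:
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    label: str  # hypothetical label, e.g. "claim" or "data"


@dataclass(frozen=True)
class Relation:
    head: ADU
    tail: ADU
    label: str  # hypothetical label, e.g. "supports"


def toy_adur(text: str) -> List[ADU]:
    """Stand-in for ADUR: one ADU per sentence, labeled by a toy heuristic."""
    adus: List[ADU] = []
    pos = 0
    for sentence in text.split(". "):
        stripped = sentence.strip()
        if not stripped:
            continue
        start = text.index(stripped, pos)
        end = start + len(stripped)
        # Assumption for illustration only: sentences with digits are "data".
        label = "data" if any(ch.isdigit() for ch in stripped) else "claim"
        adus.append(ADU(start, end, label))
        pos = end
    return adus


def toy_are(text: str, adus: List[ADU]) -> List[Relation]:
    """Stand-in for ARE: link every "data" ADU to every "claim" ADU."""
    return [
        Relation(head, tail, "supports")
        for head, tail in permutations(adus, 2)
        if head.label == "data" and tail.label == "claim"
    ]


def run_pipeline(
    text: str,
    adur: Callable[[str], List[ADU]] = toy_adur,
    are: Callable[[str, List[ADU]], List[Relation]] = toy_are,
) -> Tuple[List[ADU], List[Relation]]:
    """Sequential pipeline: ARE only sees the ADUs that ADUR produced."""
    adus = adur(text)
    relations = are(text, adus)
    return adus, relations


if __name__ == "__main__":
    sample = "Our method is faster. It runs in 3 ms on average."
    units, links = run_pipeline(sample)
    for unit in units:
        print(unit.label, repr(sample[unit.start:unit.end]))
    for link in links:
        print(link.label, link.head.start, "->", link.tail.start)
```

Because the stages are chained, any ADU that the first stage misses or mislabels cannot be recovered by the second stage, which is one reason end-to-end pipeline results are typically lower than per-task results.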
Related papers
- GEGA: Graph Convolutional Networks and Evidence Retrieval Guided Attention for Enhanced Document-level Relation Extraction [15.246183329778656]
Document-level relation extraction (DocRE) aims to extract relations between entities from unstructured document text.
To overcome these challenges, we propose GEGA, a novel model for DocRE.
We evaluate the GEGA model on three widely used benchmark datasets: DocRED, Re-DocRED, and Revisit-DocRED.
arXiv Detail & Related papers (2024-07-31T07:15:33Z)
- Scalable and Domain-General Abstractive Proposition Segmentation [20.532804009152255]
We focus on the task of abstractive proposition segmentation (APS): transforming text into simple, self-contained, well-formed sentences.
We first introduce evaluation metrics for the task to measure several dimensions of quality.
We then propose a scalable, yet accurate, proposition segmentation model.
arXiv Detail & Related papers (2024-06-28T10:24:31Z)
- SOUL: Towards Sentiment and Opinion Understanding of Language [96.74878032417054]
We propose a new task called Sentiment and Opinion Understanding of Language (SOUL).
SOUL aims to evaluate sentiment understanding through two subtasks: Review Comprehension (RC) and Justification Generation (JG).
arXiv Detail & Related papers (2023-10-27T06:48:48Z)
- Enhancing Argument Structure Extraction with Efficient Leverage of Contextual Information [79.06082391992545]
We propose an Efficient Context-aware model (ECASE) that fully exploits contextual information.
We introduce a sequence-attention module and distance-weighted similarity loss to aggregate contextual information and argumentative information.
Our experiments on five datasets from various domains demonstrate that our model achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-10-08T08:47:10Z)
- Enriching Relation Extraction with OpenIE [70.52564277675056]
Relation extraction (RE) is a sub-discipline of information extraction (IE).
In this work, we explore how recent approaches for open information extraction (OpenIE) may help to improve the task of RE.
Our experiments over two annotated corpora, KnowledgeNet and FewRel, demonstrate the improved accuracy of our enriched models.
arXiv Detail & Related papers (2022-12-19T11:26:23Z)
- MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text Generation [102.20036684996248]
We propose MURMUR, a neuro-symbolic modular approach to text generation from semi-structured data with multi-step reasoning.
We conduct experiments on two data-to-text generation tasks, WebNLG and LogicNLG.
arXiv Detail & Related papers (2022-12-16T17:36:23Z)
- A Data-driven Latent Semantic Analysis for Automatic Text Summarization using LDA Topic Modelling [0.0]
This study presents the Latent Dirichlet Allocation (LDA) approach used to perform topic modelling.
The visualisation provides an overarching view of the main topics while attributing deeper meaning to the prevalence of individual topics.
The results suggest that terms are ranked purely by their probability of topic prevalence within the processed document.
arXiv Detail & Related papers (2022-07-23T11:04:03Z)
- IAM: A Comprehensive and Large-Scale Dataset for Integrated Argument Mining Tasks [59.457948080207174]
In this work, we introduce a comprehensive and large dataset named IAM, which can be applied to a series of argument mining tasks.
Nearly 70k sentences in the dataset are fully annotated based on their argument properties.
We propose two new integrated argument mining tasks associated with the debate preparation process: (1) claim extraction with stance classification (CESC) and (2) claim-evidence pair extraction (CEPE).
arXiv Detail & Related papers (2022-03-23T08:07:32Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
- Entity and Evidence Guided Relation Extraction for DocRED [33.69481141963074]
We propose a joint training framework, E2GRE (Entity and Evidence Guided Relation Extraction), for this task.
We introduce entity-guided sequences as inputs to a pre-trained language model (e.g. BERT, RoBERTa).
These entity-guided sequences help a pre-trained language model (LM) to focus on areas of the document related to the entity.
We evaluate our E2GRE approach on DocRED, a recently released large-scale dataset for relation extraction.
arXiv Detail & Related papers (2020-08-27T17:41:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.