IAM: A Comprehensive and Large-Scale Dataset for Integrated Argument
Mining Tasks
- URL: http://arxiv.org/abs/2203.12257v2
- Date: Thu, 24 Mar 2022 03:27:52 GMT
- Authors: Liying Cheng, Lidong Bing, Ruidan He, Qian Yu, Yan Zhang, Luo Si
- Abstract summary: In this work, we introduce a comprehensive and large dataset named IAM, which can be applied to a series of argument mining tasks.
Nearly 70k sentences in the dataset are fully annotated based on their argument properties.
We propose two new integrated argument mining tasks associated with the debate preparation process: (1) claim extraction with stance classification (CESC) and (2) claim-evidence pair extraction (CEPE).
- Score: 59.457948080207174
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Traditionally, a debate usually requires a manual preparation process,
including reading plenty of articles, selecting the claims, identifying the
stances of the claims, seeking the evidence for the claims, etc. As AI
debate has attracted more attention in recent years, it is worth exploring methods
to automate the tedious process involved in the debating system. In this work,
we introduce a comprehensive and large dataset named IAM, which can be applied
to a series of argument mining tasks, including claim extraction, stance
classification, evidence extraction, etc. Our dataset is collected from over 1k
articles related to 123 topics. Nearly 70k sentences in the dataset are fully
annotated based on their argument properties (e.g., claims, stances, evidence,
etc.). We further propose two new integrated argument mining tasks associated
with the debate preparation process: (1) claim extraction with stance
classification (CESC) and (2) claim-evidence pair extraction (CEPE). We adopt a
pipeline approach and an end-to-end method for each integrated task separately.
Promising experimental results are reported to show the value and challenges
of our proposed tasks, and motivate future research on argument mining.
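The pipeline approach to CESC mentioned in the abstract can be pictured as two stages: first decide whether a sentence is a claim for the given topic, then classify the stance of each detected claim. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the function names, the label strings ("non-claim", "support", "contest"), and the keyword-based stand-in classifiers are all hypothetical; in practice each stage would be a trained model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Prediction:
    sentence: str
    label: str  # "non-claim", "support", or "contest" (label names are assumptions)


def cesc_pipeline(
    topic: str,
    sentences: List[str],
    is_claim: Callable[[str, str], bool],
    stance_of: Callable[[str, str], str],
) -> List[Prediction]:
    """Two-stage pipeline: claim extraction, then stance classification."""
    results = []
    for sentence in sentences:
        if not is_claim(topic, sentence):
            # Stage 1 rejected the sentence, so stage 2 is never run on it.
            results.append(Prediction(sentence, "non-claim"))
        else:
            # Stage 2 assigns a stance only to sentences kept by stage 1.
            results.append(Prediction(sentence, stance_of(topic, sentence)))
    return results


# Hypothetical keyword heuristics standing in for trained classifiers.
def toy_is_claim(topic: str, sentence: str) -> bool:
    return "should" in sentence.lower()


def toy_stance_of(topic: str, sentence: str) -> str:
    return "contest" if " not " in f" {sentence.lower()} " else "support"


if __name__ == "__main__":
    topic = "Should school uniforms be mandatory?"
    sentences = [
        "Schools should require uniforms.",
        "Schools should not require uniforms.",
        "The school was founded in 1950.",
    ]
    for p in cesc_pipeline(topic, sentences, toy_is_claim, toy_stance_of):
        print(f"{p.label:10s} {p.sentence}")
```

An end-to-end method would instead predict one of the three labels in a single step, which avoids passing stage-1 errors on to stage 2.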
Related papers
- OpenDebateEvidence: A Massive-Scale Argument Mining and Summarization Dataset [10.385189302526246]
OpenDebateEvidence is a comprehensive dataset for argument mining and summarization sourced from the American Competitive Debate community.
This dataset includes over 3.5 million documents with rich metadata, making it one of the most extensive collections of debate evidence.
arXiv Detail & Related papers (2024-06-20T18:22:59Z)
- Which Side Are You On? A Multi-task Dataset for End-to-End Argument Summarisation and Evaluation [13.205613282888676]
We introduce an argument mining dataset that captures the end-to-end process of preparing an argumentative essay for a debate.
Our dataset contains 14k examples of claims that are fully annotated with the various properties supporting the aforementioned tasks.
We find that, while they show promising results on individual tasks in our benchmark, their end-to-end performance on all four tasks in succession deteriorates significantly.
arXiv Detail & Related papers (2024-06-05T11:15:45Z)
- MAVEN-Arg: Completing the Puzzle of All-in-One Event Understanding Dataset with Event Argument Annotation [104.6065882758648]
MAVEN-Arg is the first all-in-one dataset supporting event detection, event argument extraction, and event relation extraction.
As an EAE benchmark, MAVEN-Arg offers three main advantages: (1) a comprehensive schema covering 162 event types and 612 argument roles, all with expert-written definitions and examples; (2) a large data scale, containing 98,591 events and 290,613 arguments obtained with laborious human annotation; and (3) the exhaustive annotation supporting all task variants of EAE.
arXiv Detail & Related papers (2023-11-15T16:52:14Z)
- Aspect-based Meeting Transcript Summarization: A Two-Stage Approach with Weak Supervision on Sentence Classification [91.13086984529706]
Aspect-based meeting transcript summarization aims to produce multiple summaries.
Traditional summarization methods produce one summary mixing information of all aspects.
We propose a two-stage method for aspect-based meeting transcript summarization.
arXiv Detail & Related papers (2023-11-07T19:06:31Z)
- AQE: Argument Quadruplet Extraction via a Quad-Tagging Augmented Generative Approach [40.510976649949576]
We propose a challenging argument quadruplet extraction task (AQE).
AQE can provide an all-in-one extraction of four argumentative components, i.e., claims, evidence, evidence types, and stances.
We propose a novel quad-tagging augmented generative approach, which leverages a quadruplet tagging module to augment the training of the generative framework.
arXiv Detail & Related papers (2023-05-31T14:35:53Z)
- Full-Text Argumentation Mining on Scientific Publications [3.8754200816873787]
We introduce a sequential pipeline model combining ADUR and ARE for full-text SAM.
We provide a first analysis of the performance of pretrained language models (PLMs) on both subtasks.
Our detailed error analysis reveals that non-contiguous ADUs as well as the interpretation of discourse connectors pose major challenges.
arXiv Detail & Related papers (2022-10-24T10:05:30Z)
- ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering [70.6359636116848]
We propose a new large-scale dataset, ConvFinQA, to study the chain of numerical reasoning in conversational question answering.
Our dataset poses great challenges in modeling long-range, complex numerical reasoning paths in real-world conversations.
arXiv Detail & Related papers (2022-10-07T23:48:50Z)
- RuArg-2022: Argument Mining Evaluation [69.87149207721035]
This paper is the organizers' report on the first competition of argumentation analysis systems dealing with Russian-language texts.
A corpus containing 9,550 sentences (comments on social media posts) on three topics related to the COVID-19 pandemic was prepared.
The system that won first place in both tasks used the NLI (Natural Language Inference) variant of the BERT architecture.
arXiv Detail & Related papers (2022-06-18T17:13:37Z)
- Diversity Over Size: On the Effect of Sample and Topic Sizes for Topic-Dependent Argument Mining Datasets [49.65208986436848]
We investigate the effect of Argument Mining dataset composition in few- and zero-shot settings.
Our findings show that, while fine-tuning is mandatory to achieve acceptable model performance, using carefully composed training samples and reducing the training sample size by up to almost 90% can still yield 95% of the maximum performance.
arXiv Detail & Related papers (2022-05-23T17:14:32Z)
- Aspect-Based Argument Mining [2.3148470932285665]
We present the task of Aspect-Based Argument Mining (ABAM) with the essential subtasks of Aspect Term Extraction (ATE) and Nested Segmentation (NS).
We consider aspects as the main point(s) that argument units address.
This information is important for further downstream tasks such as argument ranking, argument summarization and generation, as well as the search for counter-arguments on the aspect-level.
arXiv Detail & Related papers (2020-11-01T21:57:51Z)