DebateSum: A large-scale argument mining and summarization dataset
- URL: http://arxiv.org/abs/2011.07251v1
- Date: Sat, 14 Nov 2020 10:06:57 GMT
- Authors: Allen Roush and Arvind Balaji
- Abstract summary: DebateSum consists of 187,386 unique pieces of evidence with corresponding argument and extractive summaries.
We train several transformer summarization models to benchmark summarization performance on DebateSum.
We present a search engine for this dataset which is utilized extensively by members of the National Speech and Debate Association.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prior work in Argument Mining frequently alludes to its potential
applications in automatic debating systems. Despite this focus, almost no
datasets or models exist which apply natural language processing techniques to
problems found within competitive formal debate. To remedy this, we present the
DebateSum dataset. DebateSum consists of 187,386 unique pieces of evidence with
corresponding argument and extractive summaries. DebateSum was made using data
compiled by competitors within the National Speech and Debate Association over
a 7-year period. We train several transformer summarization models to benchmark
summarization performance on DebateSum. We also introduce a set of fasttext
word-vectors trained on DebateSum called debate2vec. Finally, we present a
search engine for this dataset which is utilized extensively by members of the
National Speech and Debate Association today. The DebateSum search engine is
available to the public here: http://www.debate.cards
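The abstract says each piece of evidence comes with a corresponding argument and an extractive summary. The following is a minimal sketch of how such a record could be represented and sanity-checked. The class name `EvidenceRecord`, its field names, and the helper functions are hypothetical and chosen for illustration; they are not the dataset's actual schema or the authors' code. The example text is invented.

```python
from dataclasses import dataclass


@dataclass
class EvidenceRecord:
    # All field names are assumptions made for this sketch.
    document: str  # full text of the piece of evidence
    argument: str  # short argument summarizing the evidence
    extract: str   # extractive summary: words taken from the document


def is_extractive(record: EvidenceRecord) -> bool:
    """Return True if the extract's words occur in the document in order."""
    doc_words = iter(record.document.lower().split())
    # `word in doc_words` consumes the iterator, so this is an
    # ordered-subsequence check rather than a bag-of-words check.
    return all(word in doc_words for word in record.extract.lower().split())


def compression_ratio(record: EvidenceRecord) -> float:
    """Ratio of extract length to document length, measured in words."""
    doc_len = len(record.document.split())
    return len(record.extract.split()) / doc_len if doc_len else 0.0


# Invented example record, not taken from DebateSum.
example = EvidenceRecord(
    document="Carbon taxes reduce emissions by raising the price of fossil fuels",
    argument="Carbon taxes cut emissions",
    extract="Carbon taxes reduce emissions",
)
```

Whitespace tokenization keeps the sketch dependency-free; a real pipeline would likely need proper tokenization and punctuation handling.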
Related papers
- Can LLMs Beat Humans in Debating? A Dynamic Multi-agent Framework for Competitive Debate [22.813887723656023]
Agent for Debate (Agent4Debate) is a dynamic multi-agent framework based on Large Language Models (LLMs).
The evaluation employs the Debatrix automatic scoring system and professional human reviewers based on the established Debatrix-Elo and Human-Elo ranking.
Experimental results indicate that the state-of-the-art Agent4Debate exhibits capabilities comparable to those of humans.
arXiv Detail & Related papers (2024-08-08T14:02:45Z)
- OpenDebateEvidence: A Massive-Scale Argument Mining and Summarization Dataset [10.385189302526246]
OpenDebateEvidence is a comprehensive dataset for argument mining and summarization sourced from the American competitive debate community.
This dataset includes over 3.5 million documents with rich metadata, making it one of the most extensive collections of debate evidence.
arXiv Detail & Related papers (2024-06-20T18:22:59Z)
- Debatrix: Multi-dimensional Debate Judge with Iterative Chronological Analysis Based on LLM [51.43102092480804]
Debatrix is an automated debate judge based on Large Language Models (LLMs).
To align with real-world debate scenarios, we introduced the PanelBench benchmark, comparing our system's performance to actual debate outcomes.
The findings indicate a notable enhancement over directly using LLMs for debate evaluation.
arXiv Detail & Related papers (2024-03-12T18:19:47Z)
- Argue with Me Tersely: Towards Sentence-Level Counter-Argument Generation [62.069374456021016]
We present the ArgTersely benchmark for sentence-level counter-argument generation.
We also propose Arg-LlaMA for generating high-quality counter-arguments.
arXiv Detail & Related papers (2023-12-21T06:51:34Z)
- DebateKG: Automatic Policy Debate Case Creation with Semantic Knowledge Graphs [0.0]
We show that effective debate cases can be constructed using constrained shortest path traversals on Argumentative Semantic Knowledge Graphs.
We significantly improve upon DebateSum by introducing 53180 new examples.
We create a unique method for evaluating which knowledge graphs are better in the context of producing policy debate cases.
arXiv Detail & Related papers (2023-07-09T04:19:19Z)
- IAM: A Comprehensive and Large-Scale Dataset for Integrated Argument Mining Tasks [59.457948080207174]
In this work, we introduce a comprehensive and large dataset named IAM, which can be applied to a series of argument mining tasks.
Nearly 70k sentences in the dataset are fully annotated based on their argument properties.
We propose two new integrated argument mining tasks associated with the debate preparation process: (1) claim extraction with stance classification (CESC) and (2) claim-evidence pair extraction (CEPE).
arXiv Detail & Related papers (2022-03-23T08:07:32Z)
- DEBACER: a method for slicing moderated debates [55.705662163385966]
Partitioning debates into blocks with the same subject is essential for understanding.
We propose a new algorithm, DEBACER, which partitions moderated debates.
arXiv Detail & Related papers (2021-12-10T10:39:07Z)
- ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive Summarization with Argument Mining [61.82562838486632]
We crowdsource four new datasets on diverse online conversation forms of news comments, discussion forums, community question answering forums, and email threads.
We benchmark state-of-the-art models on our datasets and analyze characteristics associated with the data.
arXiv Detail & Related papers (2021-06-01T22:17:13Z)
- High Quality Real-Time Structured Debate Generation [0.0]
We define debate trees and paths for generating debates while enforcing a high level structure and grammar.
We leverage a large corpus of tree-structured debates that have metadata associated with each argument.
Our results demonstrate the ability to generate debates in real-time on complex topics at a quality that is close to humans.
arXiv Detail & Related papers (2020-12-01T01:39:38Z)
- Aspect-Controlled Neural Argument Generation [65.91772010586605]
We train a language model for argument generation that can be controlled on a fine-grained level to generate sentence-level arguments for a given topic, stance, and aspect.
Our evaluation shows that our generation model is able to generate high-quality, aspect-specific arguments.
These arguments can be used to improve the performance of stance detection models via data augmentation and to generate counter-arguments.
arXiv Detail & Related papers (2020-04-30T20:17:22Z)