Related papers: DebateSum: A large-scale argument mining and summarization dataset

URL: http://arxiv.org/abs/2011.07251v1
Date: Sat, 14 Nov 2020 10:06:57 GMT
Title: DebateSum: A large-scale argument mining and summarization dataset
Authors: Allen Roush and Arvind Balaji
Abstract summary: DebateSum consists of 187,386 unique pieces of evidence with corresponding argument and extractive summaries. We train several transformer summarization models to benchmark summarization performance on DebateSum. We present a search engine for this dataset which is utilized extensively by members of the National Speech and Debate Association.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Prior work in Argument Mining frequently alludes to its potential applications in automatic debating systems. Despite this focus, almost no datasets or models exist which apply natural language processing techniques to problems found within competitive formal debate. To remedy this, we present the DebateSum dataset. DebateSum consists of 187,386 unique pieces of evidence with corresponding argument and extractive summaries. DebateSum was made using data compiled by competitors within the National Speech and Debate Association over a 7-year period. We train several transformer summarization models to benchmark summarization performance on DebateSum. We also introduce a set of fasttext word-vectors trained on DebateSum called debate2vec. Finally, we present a search engine for this dataset which is utilized extensively by members of the National Speech and Debate Association today. The DebateSum search engine is available to the public here: http://www.debate.cards

Related papers

DS@GT at Touché: Large Language Models for Retrieval-Augmented Debate [0.0]
We deploy six leading publicly available models for the Retrieval-Augmented Debate and Evaluation. The evaluation is performed by measuring four key metrics: Quality, Quantity, Manner, and Relation. Although LLMs perform well in debates when given related arguments, they tend to be verbose in responses yet consistent in evaluation.
arXiv Detail & Related papers (2025-07-12T00:20:00Z)
DebateBench: A Challenging Long Context Reasoning Benchmark For Large Language Models [1.8197265299982013]
We introduce DebateBench, a novel dataset consisting of an extensive collection of transcripts and metadata from some of the world's most prestigious competitive debates. The dataset consists of British Parliamentary debates from prestigious debating tournaments on diverse topics, annotated with detailed speech-level scores and house rankings sourced from official adjudication data. We curate 256 speeches across 32 debates with each debate being over 1 hour long with each input being an average of 32,000 tokens.
arXiv Detail & Related papers (2025-02-10T09:23:03Z)
Can LLMs Beat Humans in Debating? A Dynamic Multi-agent Framework for Competitive Debate [22.813887723656023]
Agent for Debate (Agent4Debate) is a dynamic multi-agent framework based on Large Language Models (LLMs). The evaluation employs the Debatrix automatic scoring system and professional human reviewers based on the established Debatrix-Elo and Human-Elo ranking. Experimental results indicate that the state-of-the-art Agent4Debate exhibits capabilities comparable to those of humans.
arXiv Detail & Related papers (2024-08-08T14:02:45Z)
OpenDebateEvidence: A Massive-Scale Argument Mining and Summarization Dataset [10.385189302526246]
OpenDebateEvidence is a comprehensive dataset for argument mining and summarization sourced from the American competitive debate community. This dataset includes over 3.5 million documents with rich metadata, making it one of the most extensive collections of debate evidence.
arXiv Detail & Related papers (2024-06-20T18:22:59Z)
Debatrix: Multi-dimensional Debate Judge with Iterative Chronological Analysis Based on LLM [51.43102092480804]
Debatrix is an automated debate judge based on Large Language Models (LLMs). To align with real-world debate scenarios, we introduced the PanelBench benchmark, comparing our system's performance to actual debate outcomes. The findings indicate a notable enhancement over directly using LLMs for debate evaluation.
arXiv Detail & Related papers (2024-03-12T18:19:47Z)
Argue with Me Tersely: Towards Sentence-Level Counter-Argument Generation [62.069374456021016]
We present the ArgTersely benchmark for sentence-level counter-argument generation. We also propose Arg-LlaMA for generating high-quality counter-arguments.
arXiv Detail & Related papers (2023-12-21T06:51:34Z)
DebateKG: Automatic Policy Debate Case Creation with Semantic Knowledge Graphs [0.0]
We show that effective debate cases can be constructed using constrained shortest path traversals on Argumentative Semantic Knowledge Graphs. We significantly improve upon DebateSum by introducing 53,180 new examples. We create a unique method for evaluating which knowledge graphs are better in the context of producing policy debate cases.
arXiv Detail & Related papers (2023-07-09T04:19:19Z)
IAM: A Comprehensive and Large-Scale Dataset for Integrated Argument Mining Tasks [59.457948080207174]
In this work, we introduce a comprehensive and large dataset named IAM, which can be applied to a series of argument mining tasks. Nearly 70k sentences in the dataset are fully annotated based on their argument properties. We propose two new integrated argument mining tasks associated with the debate preparation process: (1) claim extraction with stance classification (CESC) and (2) claim-evidence pair extraction (CEPE).
arXiv Detail & Related papers (2022-03-23T08:07:32Z)
DEBACER: a method for slicing moderated debates [55.705662163385966]
Partitioning debates into blocks with the same subject is essential for understanding. We propose a new algorithm, DEBACER, which partitions moderated debates.
arXiv Detail & Related papers (2021-12-10T10:39:07Z)
ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive Summarization with Argument Mining [61.82562838486632]
We crowdsource four new datasets on diverse online conversation forms of news comments, discussion forums, community question answering forums, and email threads. We benchmark state-of-the-art models on our datasets and analyze characteristics associated with the data.
arXiv Detail & Related papers (2021-06-01T22:17:13Z)
High Quality Real-Time Structured Debate Generation [0.0]
We define debate trees and paths for generating debates while enforcing a high level structure and grammar. We leverage a large corpus of tree-structured debates that have metadata associated with each argument. Our results demonstrate the ability to generate debates in real-time on complex topics at a quality that is close to humans.
arXiv Detail & Related papers (2020-12-01T01:39:38Z)
Aspect-Controlled Neural Argument Generation [65.91772010586605]
We train a language model for argument generation that can be controlled on a fine-grained level to generate sentence-level arguments for a given topic, stance, and aspect. Our evaluation shows that our generation model is able to generate high-quality, aspect-specific arguments. These arguments can be used to improve the performance of stance detection models via data augmentation and to generate counter-arguments.
arXiv Detail & Related papers (2020-04-30T20:17:22Z)

This list is automatically generated from the titles and abstracts of the papers on this site.