Related papers: DEBISS: a Corpus of Individual, Semi-structured and Spoken Debates

URL: http://arxiv.org/abs/2603.05459v1
Date: Thu, 05 Mar 2026 18:30:10 GMT
Title: DEBISS: a Corpus of Individual, Semi-structured and Spoken Debates
Authors: Klaywert Danillo Ferreira de Souza, David Eduardo Pereira, Cláudio E. C. Campelo, Larissa Lucena Vasconcelos
Abstract summary: The DEBISS corpus is a collection of spoken and individual debates with semi-structured features. It carries a broad range of NLP task annotations, such as speech-to-text, speaker diarization, argument mining, and debater quality assessment.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The process of debating is essential in our daily lives, whether in studying, work activities, simple everyday discussions, political debates on TV, or online discussions on social networks. The range of uses for debates is broad. Due to the diverse applications, structures, and formats of debates, developing corpora that account for these variations can be challenging, and the scarcity of debate corpora in the state of the art is notable. For this reason, the current research proposes the DEBISS corpus: a collection of spoken and individual debates with semi-structured features, with a broad range of NLP task annotations, such as speech-to-text, speaker diarization, argument mining, and debater quality assessment.

Related papers

DebateBench: A Challenging Long Context Reasoning Benchmark For Large Language Models [1.8197265299982013]
We introduce DebateBench, a novel dataset consisting of an extensive collection of transcripts and metadata from some of the world's most prestigious competitive debates. The dataset consists of British Parliamentary debates from prestigious debating tournaments on diverse topics, annotated with detailed speech-level scores and house rankings sourced from official adjudication data. We curate 256 speeches across 32 debates with each debate being over 1 hour long with each input being an average of 32,000 tokens.
arXiv Detail & Related papers (2025-02-10T09:23:03Z)
WavChat: A Survey of Spoken Dialogue Models [66.82775211793547]
Recent advancements in spoken dialogue models, exemplified by systems like GPT-4o, have captured significant attention in the speech domain. These advanced spoken dialogue models not only comprehend audio, music, and other speech-related features, but also capture stylistic and timbral characteristics in speech. Despite the progress in spoken dialogue systems, there is a lack of comprehensive surveys that systematically organize and analyze these systems.
arXiv Detail & Related papers (2024-11-15T04:16:45Z)
Debatrix: Multi-dimensional Debate Judge with Iterative Chronological Analysis Based on LLM [51.43102092480804]
Debatrix is an automated debate judge based on Large Language Models (LLMs). To align with real-world debate scenarios, we introduced the PanelBench benchmark, comparing our system's performance to actual debate outcomes. The findings indicate a notable enhancement over directly using LLMs for debate evaluation.
arXiv Detail & Related papers (2024-03-12T18:19:47Z)
SpokenWOZ: A Large-Scale Speech-Text Benchmark for Spoken Task-Oriented Dialogue Agents [70.08842857515141]
SpokenWOZ is a large-scale speech-text dataset for spoken TOD. Cross-turn slot and reasoning slot detection are new challenges for SpokenWOZ.
arXiv Detail & Related papers (2023-05-22T13:47:51Z)
DEBACER: a method for slicing moderated debates [55.705662163385966]
Partitioning debates into blocks with the same subject is essential for understanding. We propose a new algorithm, DEBACER, which partitions moderated debates.
arXiv Detail & Related papers (2021-12-10T10:39:07Z)
Who Responded to Whom: The Joint Effects of Latent Topics and Discourse in Conversation Structure [53.77234444565652]
We identify the responding relations in the conversation discourse, which link response utterances to their initiations. We propose a model to learn latent topics and discourse in word distributions, and predict pairwise initiation-response links. Experimental results on both English and Chinese conversations show that our model significantly outperforms the previous state of the art.
arXiv Detail & Related papers (2021-04-17T17:46:00Z)
High Quality Real-Time Structured Debate Generation [0.0]
We define debate trees and paths for generating debates while enforcing a high level structure and grammar. We leverage a large corpus of tree-structured debates that have metadata associated with each argument. Our results demonstrate the ability to generate debates in real-time on complex topics at a quality that is close to humans.
arXiv Detail & Related papers (2020-12-01T01:39:38Z)
DebateSum: A large-scale argument mining and summarization dataset [0.0]
DebateSum consists of 187,386 unique pieces of evidence with corresponding argument and extractive summaries. We train several transformer summarization models to benchmark summarization performance on DebateSum. We present a search engine for this dataset which is utilized extensively by members of the National Speech and Debate Association.
arXiv Detail & Related papers (2020-11-14T10:06:57Z)
Exploring the Role of Argument Structure in Online Debate Persuasion [39.74040217761505]
We investigate the role that the discourse structure of arguments from online debates plays in their persuasiveness. We find that argument structure features play an essential role in achieving better predictive performance.
arXiv Detail & Related papers (2020-10-07T17:34:50Z)
KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation [66.99734491847076]
We propose a Chinese multi-domain knowledge-driven conversation dataset, KdConv, which grounds the topics in multi-turn conversations to knowledge graphs. Our corpus contains 4.5K conversations from three domains (film, music, and travel), and 86K utterances with an average turn number of 19.0.
arXiv Detail & Related papers (2020-04-08T16:25:39Z)
What Changed Your Mind: The Roles of Dynamic Topics and Discourse in Argumentation Process [78.4766663287415]
This paper presents a study that automatically analyzes the key factors in argument persuasiveness. We propose a novel neural model that is able to track the changes of latent topics and discourse in argumentative conversations.
arXiv Detail & Related papers (2020-02-10T04:27:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.