Controllable Conversational Theme Detection Track at DSTC 12
- URL: http://arxiv.org/abs/2508.18783v1
- Date: Tue, 26 Aug 2025 08:10:01 GMT
- Title: Controllable Conversational Theme Detection Track at DSTC 12
- Authors: Igor Shalyminov, Hang Su, Jake Vincent, Siffi Singh, Jason Cai, James Gung, Raphael Shu, Saab Mansour,
- Abstract summary: We introduce Theme Detection as a critical task in conversational analytics.<n>Unlike traditional dialog intent detection, themes are intended as a direct, user-facing summary of the conversation's core inquiry.<n>We pose Controllable Conversational Theme Detection problem as a public competition track at Dialog System Technology Challenge 12.
- Score: 24.160077192565087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conversational analytics has been on the forefront of transformation driven by the advances in Speech and Natural Language Processing techniques. Rapid adoption of Large Language Models (LLMs) in the analytics field has taken the problems that can be automated to a new level of complexity and scale. In this paper, we introduce Theme Detection as a critical task in conversational analytics, aimed at automatically identifying and categorizing topics within conversations. This process can significantly reduce the manual effort involved in analyzing expansive dialogs, particularly in domains like customer support or sales. Unlike traditional dialog intent detection, which often relies on a fixed set of intents for downstream system logic, themes are intended as a direct, user-facing summary of the conversation's core inquiry. This distinction allows for greater flexibility in theme surface forms and user-specific customizations. We pose Controllable Conversational Theme Detection problem as a public competition track at Dialog System Technology Challenge (DSTC) 12 -- it is framed as joint clustering and theme labeling of dialog utterances, with the distinctive aspect being controllability of the resulting theme clusters' granularity achieved via the provided user preference data. We give an overview of the problem, the associated dataset and the evaluation metrics, both automatic and human. Finally, we discuss the participant teams' submissions and provide insights from those. The track materials (data and code) are openly available in the GitHub repository.
Related papers
- CATCH: A Controllable Theme Detection Framework with Contextualized Clustering and Hierarchical Generation [33.065240934374586]
We propose a unified framework that integrates three core components: context-aware topic representation, preference-guided topic clustering, and a hierarchical theme generation mechanism.<n>Experiments on a multi-domain customer dialogue benchmark (DSTC-12) demonstrate the effectiveness of CATCH with 8B LLM in both theme clustering and topic generation quality.
arXiv Detail & Related papers (2025-12-25T15:33:25Z) - DTECT: Dynamic Topic Explorer & Context Tracker [0.8962460460173959]
We introduce DTECT (Dynamic Topic Explorer & Context Tracker), an end-to-end system that bridges the gap between raw textual data and meaningful temporal insights.<n>DTECT provides a unified workflow that supports data preprocessing, multiple model architectures, and dedicated evaluation metrics to analyze the topic quality of temporal topic models.<n>It significantly enhances interpretability by introducing LLM-driven automatic topic labeling, trend analysis via temporally salient words, interactive visualizations with document-level summarization, and a natural language chat interface for intuitive data querying.
arXiv Detail & Related papers (2025-07-10T16:44:33Z) - Personalized Topic Selection Model for Topic-Grounded Dialogue [24.74527189182273]
Current models tend to predict user-uninteresting and contextually irrelevant topics.
We propose a textbfPersonalized topic stextbfElection model for textbfTopic-grounded textbfDialogue, named textbfPETD.
Our proposed method can generate engaging and diverse responses, outperforming state-of-the-art baselines.
arXiv Detail & Related papers (2024-06-04T06:09:49Z) - Multi-turn Dialogue Comprehension from a Topic-aware Perspective [70.37126956655985]
This paper proposes to model multi-turn dialogues from a topic-aware perspective.
We use a dialogue segmentation algorithm to split a dialogue passage into topic-concentrated fragments in an unsupervised way.
We also present a novel model, Topic-Aware Dual-Attention Matching (TADAM) Network, which takes topic segments as processing elements.
arXiv Detail & Related papers (2023-09-18T11:03:55Z) - Multi-Granularity Prompts for Topic Shift Detection in Dialogue [13.739991183173494]
The goal of dialogue topic shift detection is to identify whether the current topic in a conversation has changed or needs to change.
Previous work focused on detecting topic shifts using pre-trained models to encode the utterance.
We take a prompt-based approach to fully extract topic information from dialogues at multiple-granularity, i.e., label, turn, and topic.
arXiv Detail & Related papers (2023-05-23T12:35:49Z) - SuperDialseg: A Large-scale Dataset for Supervised Dialogue Segmentation [55.82577086422923]
We provide a feasible definition of dialogue segmentation points with the help of document-grounded dialogues.
We release a large-scale supervised dataset called SuperDialseg, containing 9,478 dialogues.
We also provide a benchmark including 18 models across five categories for the dialogue segmentation task.
arXiv Detail & Related papers (2023-05-15T06:08:01Z) - Unsupervised Summarization for Chat Logs with Topic-Oriented Ranking and
Context-Aware Auto-Encoders [59.038157066874255]
We propose a novel framework called RankAE to perform chat summarization without employing manually labeled data.
RankAE consists of a topic-oriented ranking strategy that selects topic utterances according to centrality and diversity simultaneously.
A denoising auto-encoder is designed to generate succinct but context-informative summaries based on the selected utterances.
arXiv Detail & Related papers (2020-12-14T07:31:17Z) - Response Selection for Multi-Party Conversations with Dynamic Topic
Tracking [63.15158355071206]
We frame response selection as a dynamic topic tracking task to match the topic between the response and relevant conversation context.
We propose a novel multi-task learning framework that supports efficient encoding through large pretrained models.
Experimental results on the DSTC-8 Ubuntu IRC dataset show state-of-the-art results in response selection and topic disentanglement tasks.
arXiv Detail & Related papers (2020-10-15T14:21:38Z) - Topic-Aware Multi-turn Dialogue Modeling [91.52820664879432]
This paper presents a novel solution for multi-turn dialogue modeling, which segments and extracts topic-aware utterances in an unsupervised way.
Our topic-aware modeling is implemented by a newly proposed unsupervised topic-aware segmentation algorithm and Topic-Aware Dual-attention Matching (TADAM) Network.
arXiv Detail & Related papers (2020-09-26T08:43:06Z) - Detecting and Classifying Malevolent Dialogue Responses: Taxonomy, Data
and Methodology [68.8836704199096]
Corpus-based conversational interfaces are able to generate more diverse and natural responses than template-based or retrieval-based agents.
With their increased generative capacity of corpusbased conversational agents comes the need to classify and filter out malevolent responses.
Previous studies on the topic of recognizing and classifying inappropriate content are mostly focused on a certain category of malevolence.
arXiv Detail & Related papers (2020-08-21T22:43:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.