Related papers: Topical Segmentation of Spoken Narratives: A Test Case on Holocaust Survivor Testimonies

Topical Segmentation of Spoken Narratives: A Test Case on Holocaust Survivor Testimonies

URL: http://arxiv.org/abs/2210.13783v1
Date: Tue, 25 Oct 2022 06:02:28 GMT
Title: Topical Segmentation of Spoken Narratives: A Test Case on Holocaust Survivor Testimonies
Authors: Eitan Wagner, Renana Keydar, Amit Pinchevski, Omri Abend
Abstract summary: We tackle the task of segmenting running (spoken) narratives. As a test case, we address Holocaust survivor testimonies, given in English.
Score: 18.80663780272046
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The task of topical segmentation is well studied, but previous work has mostly addressed it in the context of structured, well-defined segments, such as segmentation into paragraphs, chapters, or segmenting text that originated from multiple sources. We tackle the task of segmenting running (spoken) narratives, which poses hitherto unaddressed challenges. As a test case, we address Holocaust survivor testimonies, given in English. Other than the importance of studying these testimonies for Holocaust research, we argue that they provide an interesting test case for topical segmentation, due to their unstructured surface level, relative abundance (tens of thousands of such testimonies were collected), and the relatively confined domain that they cover. We hypothesize that boundary points between segments correspond to low mutual information between the sentences proceeding and following the boundary. Based on this hypothesis, we explore a range of algorithmic approaches to the task, building on previous work on segmentation that uses generative Bayesian modeling and state-of-the-art neural machinery. Compared to manually annotated references, we find that the developed approaches show considerable improvements over previous work.

Related papers

A Bottom-Up Approach to Class-Agnostic Image Segmentation [4.086366531569003]
We present a novel bottom-up formulation for addressing the class-agnostic segmentation problem. We supervise our network directly on the projective sphere of its feature space. Our bottom-up formulation exhibits exceptional generalization capability, even when trained on datasets designed for class-based segmentation.
arXiv Detail & Related papers (2024-09-20T17:56:02Z)
Image Segmentation in Foundation Model Era: A Survey [99.19456390358211]
Current research in image segmentation lacks a detailed analysis of distinct characteristics, challenges, and solutions associated with these advancements. This survey seeks to fill this gap by providing a thorough review of cutting-edge research centered around FM-driven image segmentation. An exhaustive overview of over 300 segmentation approaches is provided to encapsulate the breadth of current research efforts.
arXiv Detail & Related papers (2024-08-23T10:07:59Z)
From Text Segmentation to Smart Chaptering: A Novel Benchmark for Structuring Video Transcriptions [63.11097464396147]
We introduce a novel benchmark YTSeg focusing on spoken content that is inherently more unstructured and both topically and structurally diverse. We also introduce an efficient hierarchical segmentation model MiniSeg, that outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-27T15:59:37Z)
Towards Open Vocabulary Learning: A Survey [146.90188069113213]
Deep neural networks have made impressive advancements in various core tasks like segmentation, tracking, and detection. Recently, open vocabulary settings were proposed due to the rapid progress of vision language pre-training. This paper provides a thorough review of open vocabulary learning, summarizing and analyzing recent developments in the field.
arXiv Detail & Related papers (2023-06-28T02:33:06Z)
Topics in the Haystack: Extracting and Evaluating Topics beyond Coherence [0.0]
We propose a method that incorporates a deeper understanding of both sentence and document themes. This allows our model to detect latent topics that may include uncommon words or neologisms. We present correlation coefficients with human identification of intruder words and achieve near-human level results at the word-intrusion task.
arXiv Detail & Related papers (2023-03-30T12:24:25Z)
Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation [80.48979302400868]
We focus on open vocabulary instance segmentation to expand a segmentation model to classify and segment instance-level novel categories. Previous approaches have relied on massive caption datasets and complex pipelines to establish one-to-one mappings between image regions and captions in nouns. We devise a joint textbfCaption Grounding and Generation (CGG) framework, which incorporates a novel grounding loss that only focuses on matching object to improve learning efficiency.
arXiv Detail & Related papers (2023-01-02T18:52:12Z)
Toward Unifying Text Segmentation and Long Document Summarization [31.084738269628748]
We study the role that section segmentation plays in extractive summarization of written and spoken documents. Our approach learns robust sentence representations by performing summarization and segmentation simultaneously. Our findings suggest that the model can not only achieve state-of-the-art performance on publicly available benchmarks, but demonstrate better cross-genre transferability.
arXiv Detail & Related papers (2022-10-28T22:07:10Z)
A Survey on Label-efficient Deep Segmentation: Bridging the Gap between Weak Supervision and Dense Prediction [115.9169213834476]
This paper offers a comprehensive review on label-efficient segmentation methods. We first develop a taxonomy to organize these methods according to the supervision provided by different types of weak labels. Next, we summarize the existing label-efficient segmentation methods from a unified perspective.
arXiv Detail & Related papers (2022-07-04T06:21:01Z)
Neural Sequence Segmentation as Determining the Leftmost Segments [25.378188980430256]
We propose a novel framework that incrementally segments natural language sentences at segment level. For every step in segmentation, it recognizes the leftmost segment of the remaining sequence. We have conducted extensive experiments on syntactic chunking and Chinese part-of-speech tagging across 3 datasets.
arXiv Detail & Related papers (2021-04-15T03:35:03Z)
Positioning yourself in the maze of Neural Text Generation: A Task-Agnostic Survey [54.34370423151014]
This paper surveys the components of modeling approaches relaying task impacts across various generation tasks such as storytelling, summarization, translation etc. We present an abstraction of the imperative techniques with respect to learning paradigms, pretraining, modeling approaches, decoding and the key challenges outstanding in the field in each of them.
arXiv Detail & Related papers (2020-10-14T17:54:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.