Structured Summarization: Unified Text Segmentation and Segment Labeling as a Generation Task
- URL: http://arxiv.org/abs/2209.13759v1
- Date: Wed, 28 Sep 2022 01:08:50 GMT
- Title: Structured Summarization: Unified Text Segmentation and Segment Labeling as a Generation Task
- Authors: Hakan Inan, Rashi Rungta, Yashar Mehdad
- Abstract summary: We propose a single encoder-decoder neural network that can handle long documents and conversations.
We successfully show a way to solve the combined task as a pure generation task.
Our results establish a strong case for considering text segmentation and segment labeling as a whole.
- Score: 16.155438404910043
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text segmentation aims to divide text into contiguous, semantically coherent
segments, while segment labeling deals with producing labels for each segment.
Past work has shown success in tackling segmentation and labeling for documents
and conversations. This has been possible with a combination of task-specific
pipelines, supervised and unsupervised learning objectives. In this work, we
propose a single encoder-decoder neural network that can handle long documents
and conversations, trained simultaneously for both segmentation and segment
labeling using only standard supervision. We successfully show a way to solve
the combined task as a pure generation task, which we refer to as structured
summarization. We apply the same technique to both document and conversational
data, and we show state of the art performance across datasets for both
segmentation and labeling, under both high- and low-resource settings. Our
results establish a strong case for considering text segmentation and segment
labeling as a whole, and moving towards general-purpose techniques that don't
depend on domain expertise or task-specific components.
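To make the idea of treating segmentation and labeling as one generation task concrete, here is a minimal sketch of how segment boundaries and labels could be written as a single target string for an encoder-decoder model, and parsed back from generated text. The abstract does not specify the output format, so the `end_index : label | ...` layout and the names `encode_target` and `decode_target` are assumptions made purely for illustration, not the authors' method.

```python
from typing import List, Tuple

# Hypothetical representation: (index of the segment's last sentence, segment label).
Segment = Tuple[int, str]


def encode_target(segments: List[Segment]) -> str:
    """Serialize segments into one target string (assumed format, not from the paper)."""
    return " | ".join(f"{end} : {label}" for end, label in segments)


def decode_target(text: str) -> List[Segment]:
    """Parse generated text back into (end_index, label) pairs, skipping malformed parts."""
    segments: List[Segment] = []
    for part in text.split("|"):
        end, sep, label = part.partition(":")
        end = end.strip()
        if sep and end.isdecimal():
            segments.append((int(end), label.strip()))
    return segments


# A six-sentence document split into two labeled segments (sentences 0-2 and 3-5).
reference = [(2, "History"), (5, "Geography")]
target = encode_target(reference)
print(target)
print(decode_target(target))
```

Because a generative model can emit malformed text, the parser drops any part that lacks a numeric boundary instead of raising an error. This toy format would break on labels that contain `|`, so a real system would need escaping or a different delimiter.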
Related papers
- LESS: Label-Efficient and Single-Stage Referring 3D Segmentation [55.06002976797879]
Referring 3D segmentation is a visual-language task that segments all points of a specified object from a 3D point cloud, given a query sentence describing the object.
We propose a novel Referring 3D pipeline, Label-Efficient and Single-Stage, dubbed LESS, which is supervised only by efficient binary masks.
We achieve state-of-the-art performance on the ScanRefer dataset, surpassing previous methods by about 3.7% mIoU using only binary labels.
arXiv Detail & Related papers (2024-10-17T07:47:41Z)
- Scribbles for All: Benchmarking Scribble Supervised Segmentation Across Datasets [51.74296438621836]
We introduce Scribbles for All, a label and training data generation algorithm for semantic segmentation trained on scribble labels.
The main limitation of scribbles as a source of weak supervision is the lack of challenging datasets for scribble segmentation.
Scribbles for All provides scribble labels for several popular segmentation datasets and provides an algorithm to automatically generate scribble labels for any dataset with dense annotations.
arXiv Detail & Related papers (2024-08-22T15:29:08Z)
- USE: Universal Segment Embeddings for Open-Vocabulary Image Segmentation [33.11010205890195]
The main challenge in open-vocabulary image segmentation now lies in accurately classifying these segments into text-defined categories.
We introduce the Universal Segment Embedding (USE) framework to address this challenge.
This framework is comprised of two key components: 1) a data pipeline designed to efficiently curate a large amount of segment-text pairs at various granularities, and 2) a universal segment embedding model that enables precise segment classification into a vast range of text-defined categories.
arXiv Detail & Related papers (2024-06-07T21:41:18Z)
- From Text Segmentation to Smart Chaptering: A Novel Benchmark for Structuring Video Transcriptions [63.11097464396147]
We introduce a novel benchmark, YTSeg, focusing on spoken content that is inherently more unstructured and both topically and structurally diverse.
We also introduce an efficient hierarchical segmentation model, MiniSeg, that outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-27T15:59:37Z)
- Segmenting Messy Text: Detecting Boundaries in Text Derived from Historical Newspaper Images [0.0]
We consider a challenging text segmentation task: dividing newspaper marriage announcement lists into units of one announcement each.
In many cases the information is not structured into sentences, and adjacent segments are not topically distinct from each other.
We present a novel deep learning-based model for segmenting such text and show that it significantly outperforms an existing state-of-the-art method on our task.
arXiv Detail & Related papers (2023-12-20T05:17:06Z)
- Segment Everything Everywhere All at Once [124.90835636901096]
We present SEEM, a promptable and interactive model for segmenting everything everywhere all at once in an image.
We propose a novel decoding mechanism that enables diverse prompting for all types of segmentation tasks.
We conduct a comprehensive empirical study to validate the effectiveness of SEEM across diverse segmentation tasks.
arXiv Detail & Related papers (2023-04-13T17:59:40Z)
- SegGPT: Segmenting Everything In Context [98.98487097934067]
We present SegGPT, a model for segmenting everything in context.
We unify various segmentation tasks into a generalist in-context learning framework.
SegGPT can perform arbitrary segmentation tasks in images or videos via in-context inference.
arXiv Detail & Related papers (2023-04-06T17:59:57Z)
- Open-world Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding [95.78002228538841]
We propose a new open-world semantic segmentation pipeline that makes the first attempt to learn to segment semantic objects of various open-world categories without any dense annotation effort.
Our method can directly segment objects of arbitrary categories, outperforming zero-shot segmentation methods that require data labeling on three benchmark datasets.
arXiv Detail & Related papers (2022-07-18T09:20:04Z)
- Learning Panoptic Segmentation from Instance Contours [9.347742071428918]
Panoptic segmentation aims to provide an understanding of background (stuff) and instances of objects (things) at a pixel level.
It combines the separate tasks of semantic segmentation (pixel-level classification) and instance segmentation to build a single unified scene understanding task.
We present a fully convolutional neural network that learns instance segmentation from semantic segmentation and instance contours.
arXiv Detail & Related papers (2020-10-16T03:05:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.