Text Segmentation by Cross Segment Attention
- URL: http://arxiv.org/abs/2004.14535v2
- Date: Mon, 7 Dec 2020 16:00:42 GMT
- Title: Text Segmentation by Cross Segment Attention
- Authors: Michal Lukasik, Boris Dadachev, Gonçalo Simões, Kishore Papineni
- Abstract summary: Document and discourse segmentation are two fundamental NLP tasks pertaining to breaking up text into constituents.
We establish a new state-of-the-art, in particular reducing error rates by a large margin in all cases.
- Score: 2.525236250247906
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Document and discourse segmentation are two fundamental NLP tasks pertaining
to breaking up text into constituents, which are commonly used to help
downstream tasks such as information retrieval or text summarization. In this
work, we propose three transformer-based architectures and provide
comprehensive comparisons with previously proposed approaches on three standard
datasets. We establish a new state-of-the-art, in particular reducing error
rates by a large margin in all cases. We further analyze model sizes and find
that we can build models with many fewer parameters while keeping good
performance, thus facilitating real-world applications.
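The cross-segment idea in the title can be pictured as a sentence-pair classifier: for each candidate break, the text immediately before and after the break is packed into one input and a binary head predicts whether a segment boundary occurs there. Below is a minimal sketch of that idea in Python using the Hugging Face transformers library; the checkpoint name, context-window size, and helper function are illustrative assumptions rather than the authors' released code, and the classification head would need fine-tuning on labeled boundaries before its scores are meaningful.

```python
# Hypothetical sketch of cross-segment boundary scoring (not the authors' code).
# Each candidate break is encoded as "[CLS] left-context [SEP] right-context [SEP]"
# and a binary head scores whether a segment boundary occurs at the break.
import torch
from transformers import BertForSequenceClassification, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # label 1 = "boundary"; needs fine-tuning
)
model.eval()

def score_break(sentences, i, k=128):
    """P(boundary) for the break after sentence i, using up to k tokens per side."""
    # Keep the tokens nearest the break: the tail of the left context
    # and the head of the right context.
    left = tokenizer.encode(" ".join(sentences[: i + 1]), add_special_tokens=False)[-k:]
    right = tokenizer.encode(" ".join(sentences[i + 1 :]), add_special_tokens=False)[:k]
    ids = [tokenizer.cls_token_id] + left + [tokenizer.sep_token_id] \
        + right + [tokenizer.sep_token_id]
    segs = [0] * (len(left) + 2) + [1] * (len(right) + 1)  # segment A vs. B
    with torch.no_grad():
        logits = model(
            input_ids=torch.tensor([ids]), token_type_ids=torch.tensor([segs])
        ).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

sentences = [
    "Cross segment attention looks at both sides of a candidate break.",
    "A binary classifier then decides whether a boundary occurs there.",
    "Meanwhile, stock markets closed higher on Friday.",
]
for i in range(len(sentences) - 1):
    print(f"break after sentence {i}: {score_break(sentences, i):.3f}")
```

At inference, every inter-sentence gap in a document is scored this way and gaps above a threshold become segment boundaries; the abstract's parameter-size analysis suggests the same scheme keeps good performance with much smaller encoders.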
Related papers
- Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation [9.703886326323644]
We introduce a new model, Segment any Text (SaT), to make sentence segmentation robust, efficient, and adaptable.
To enhance robustness, we propose a new pretraining scheme that ensures less reliance on punctuation.
To address adaptability, we introduce an extra stage of parameter-efficient fine-tuning, establishing state-of-the-art performance in distinct domains.
arXiv Detail & Related papers (2024-06-24T14:36:11Z)
- From Text Segmentation to Smart Chaptering: A Novel Benchmark for Structuring Video Transcriptions [63.11097464396147]
We introduce a novel benchmark YTSeg focusing on spoken content that is inherently more unstructured and both topically and structurally diverse.
We also introduce MiniSeg, an efficient hierarchical segmentation model that outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-27T15:59:37Z)
- Building blocks for complex tasks: Robust generative event extraction for radiology reports under domain shifts [11.845850292404768]
We show that multi-pass T5-based text-to-text generative models exhibit better generalization across exam modalities compared to approaches that employ BERT-based task-specific classification layers.
We then develop methods that reduce the inference cost of the model, making large-scale corpus processing more feasible for clinical applications.
arXiv Detail & Related papers (2023-06-15T23:16:58Z)
- TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z)
- ReSel: N-ary Relation Extraction from Scientific Text and Tables by Learning to Retrieve and Select [53.071352033539526]
We study the problem of extracting N-ary relations from scientific articles.
Our proposed method ReSel decomposes this task into a two-stage procedure.
Our experiments on three scientific information extraction datasets show that ReSel outperforms state-of-the-art baselines significantly.
arXiv Detail & Related papers (2022-10-26T02:28:02Z)
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models on two aspects.
Our framework assumes a hierarchical latent structure of a document, where the top level captures long-range dependencies.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attentions for extractive summarization.
Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in ROUGE F1.
arXiv Detail & Related papers (2021-10-12T22:42:31Z)
- Partially-Aligned Data-to-Text Generation with Distant Supervision [69.15410325679635]
We propose a new generation task called Partially-Aligned Data-to-Text Generation (PADTG).
It is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains.
Our framework outperforms all baseline models, and the results verify the feasibility of utilizing partially-aligned data.
arXiv Detail & Related papers (2020-10-03T03:18:52Z)
- BATS: A Spectral Biclustering Approach to Single Document Topic Modeling and Segmentation [17.003488045214972]
Existing topic modeling and text segmentation methodologies generally require large datasets for training, limiting their capabilities when only small collections of text are available.
In developing a methodology to handle single documents, we face two major challenges.
First is sparse information: with access to only one document, we cannot train traditional topic models or deep learning algorithms.
Second is significant noise: a considerable portion of words in any single document will produce only noise and not help discern topics or segments.
arXiv Detail & Related papers (2020-08-05T16:34:33Z)
- Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation [9.416757363901295]
We introduce a novel supervised model for text segmentation with simple but explicit coherence modeling.
Our model, a neural architecture consisting of two hierarchically connected Transformer networks, is a multi-task learning model that couples the sentence-level segmentation objective with a coherence objective that differentiates correct sequences of sentences from corrupt ones (a toy sketch of this objective appears after this list).
arXiv Detail & Related papers (2020-01-03T17:06:41Z)
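The two-level Transformer entry above pairs segmentation with a coherence objective learned from corrupted sentence orderings. The following toy sketch, with an invented encoder, scorer, and margin (none of it from that paper's code), shows one way such an auxiliary objective can be set up: shuffle a sentence sequence to make a negative example and train a scorer to rank the original above it.

```python
# Toy sketch of a coherence auxiliary objective (illustrative, not the paper's code):
# rank an original sentence sequence above a randomly shuffled corruption.
import torch
import torch.nn as nn

class CoherenceScorer(nn.Module):
    """Stand-in for an upper, sentence-level Transformer producing one score."""
    def __init__(self, dim=64):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.score = nn.Linear(dim, 1)

    def forward(self, sent_embs):            # (batch, n_sentences, dim)
        h = self.encoder(sent_embs)
        return self.score(h.mean(dim=1))     # one coherence score per sequence

def corrupt(sent_embs):
    """Negative example: the same sentences in a random order."""
    perm = torch.randperm(sent_embs.size(1))
    return sent_embs[:, perm, :]

scorer = CoherenceScorer()
sent_embs = torch.randn(8, 10, 64)           # pretend lower-level sentence embeddings
pos, neg = scorer(sent_embs), scorer(corrupt(sent_embs))
# Margin ranking loss: the correct order should outscore the shuffled order by >= 1.
coherence_loss = torch.clamp(1.0 - (pos - neg), min=0).mean()
coherence_loss.backward()  # in multi-task training, added to the segmentation loss
print(float(coherence_loss))
```

In the paper's multi-task setting, a loss of this kind is combined with the sentence-level segmentation loss so the shared encoder learns both objectives.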
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.