Unified Line and Paragraph Detection by Graph Convolutional Networks
- URL: http://arxiv.org/abs/2203.09638v1
- Date: Thu, 17 Mar 2022 22:27:12 GMT
- Title: Unified Line and Paragraph Detection by Graph Convolutional Networks
- Authors: Shuang Liu, Renshen Wang, Michalis Raptis, Yasuhisa Fujii
- Abstract summary: We formulate the task of detecting lines and paragraphs in a document into a unified two-level clustering problem.
We use a graph convolutional network to predict the relations between text detection boxes and then build both levels of clusters from these predictions.
Experimentally, we demonstrate that the unified approach can be highly efficient while still achieving state-of-the-art quality for detecting paragraphs in public benchmarks and real-world images.
- Score: 5.298581058536571
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We formulate the task of detecting lines and paragraphs in a document into a
unified two-level clustering problem. Given a set of text detection boxes that
roughly correspond to words, a text line is a cluster of boxes and a paragraph
is a cluster of lines. These clusters form a two-level tree that represents a
major part of the layout of a document. We use a graph convolutional network to
predict the relations between text detection boxes and then build both levels
of clusters from these predictions. Experimentally, we demonstrate that the
unified approach can be highly efficient while still achieving state-of-the-art
quality for detecting paragraphs in public benchmarks and real-world images.
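The two-level clustering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pairwise scores `same_line` and `same_paragraph` stand in for whatever relations the graph convolutional network predicts, and the union-find merging, the 0.5 threshold, and the function names `cluster` and `build_layout_tree` are assumptions made for illustration only.

```python
from collections import defaultdict


def cluster(num_items, links, threshold=0.5):
    """Group items 0..num_items-1, merging pairs whose score >= threshold."""
    parent = list(range(num_items))

    def find(i):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (a, b), score in links.items():
        if score >= threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)

    groups = defaultdict(list)
    for i in range(num_items):
        groups[find(i)].append(i)
    return sorted(groups.values())


def build_layout_tree(num_boxes, same_line, same_paragraph, threshold=0.5):
    """Return paragraphs -> lines -> box indices (hypothetical helper)."""
    # Level 1: boxes are clustered into lines.
    lines = cluster(num_boxes, same_line, threshold)
    box_to_line = {box: li for li, line in enumerate(lines) for box in line}

    # Lift box-level paragraph scores to line-level links (assumed: keep the max).
    line_links = {}
    for (a, b), score in same_paragraph.items():
        la, lb = box_to_line[a], box_to_line[b]
        if la != lb:
            key = (min(la, lb), max(la, lb))
            line_links[key] = max(score, line_links.get(key, 0.0))

    # Level 2: lines are clustered into paragraphs.
    paragraphs = cluster(len(lines), line_links, threshold)
    return [[lines[li] for li in para] for para in paragraphs]
```

Calling `build_layout_tree` with five boxes and a handful of made-up scores returns a nested list in which each paragraph is a list of lines and each line is a list of box indices, which mirrors the two-level tree the abstract describes.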
Related papers
- Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis [52.01356859448068]
HTS can recognize text in an image and identify its 4-level hierarchical structure: characters, words, lines, and paragraphs.
HTS achieves state-of-the-art results on multiple word-level text spotting benchmark datasets as well as geometric layout analysis tasks.
arXiv Detail & Related papers (2023-10-25T22:23:54Z)
- Text Reading Order in Uncontrolled Conditions by Sparse Graph Segmentation [71.40119152422295]
We propose a lightweight, scalable and generalizable approach to identify text reading order.
The model is language-agnostic and runs effectively across multi-language datasets.
It is small enough to be deployed on virtually any platform including mobile devices.
arXiv Detail & Related papers (2023-05-04T06:21:00Z)
- Towards End-to-End Unified Scene Text Detection and Layout Analysis [60.68100769639923]
We introduce the task of unified scene text detection and layout analysis.
The first hierarchical scene text dataset is introduced to enable this novel research task.
We also propose a novel method that is able to simultaneously detect scene text and form text clusters in a unified way.
arXiv Detail & Related papers (2022-03-28T23:35:45Z)
- Controversy Detection: a Text and Graph Neural Network Based Approach [0.0]
Controversial content refers to any content that attracts both positive and negative feedback.
Most of the existing approaches rely on the graph structure of a topic-discussion and/or the content of messages.
This paper proposes a controversy detection approach based on both graph structure of a discussion and text features.
arXiv Detail & Related papers (2021-12-03T09:06:46Z)
- StrokeNet: Stroke Assisted and Hierarchical Graph Reasoning Networks [31.76016966100244]
StrokeNet is proposed to detect text effectively by capturing fine-grained strokes.
Different from existing approaches that represent the text area by a series of points or rectangular boxes, we directly localize strokes of each text instance.
arXiv Detail & Related papers (2021-11-23T08:26:42Z)
- Vec2GC -- A Graph Based Clustering Method for Text Representations [0.0]
Vec2GC is an end-to-end pipeline to cluster terms or documents for any given text corpus.
The Vec2GC clustering algorithm is a density-based approach that also supports hierarchical clustering.
arXiv Detail & Related papers (2021-04-15T12:52:30Z)
- SOLD2: Self-supervised Occlusion-aware Line Description and Detection [95.8719432775724]
We introduce the first joint detection and description of line segments in a single deep network.
Our method does not require any annotated line labels and can therefore generalize to any dataset.
We evaluate our approach against previous line detection and description methods on several multi-view datasets.
arXiv Detail & Related papers (2021-04-07T19:27:17Z)
- Minimally-Supervised Structure-Rich Text Categorization via Learning on Text-Rich Networks [61.23408995934415]
We propose a novel framework for minimally supervised categorization by learning from the text-rich network.
Specifically, we jointly train two modules with different inductive biases -- a text analysis module for text understanding and a network learning module for class-discriminative, scalable network learning.
Our experiments show that given only three seed documents per category, our framework can achieve an accuracy of about 92%.
arXiv Detail & Related papers (2021-02-23T04:14:34Z)
- Graph-based Topic Extraction from Vector Embeddings of Text Documents: Application to a Corpus of News Articles [0.0]
We present an unsupervised framework that brings together powerful vector embeddings from natural language processing with tools from multiscale graph partitioning.
We show the advantages of graph-based clustering through end-to-end comparisons with other popular clustering and topic modelling methods.
This work is showcased through an analysis of a corpus of US news coverage during the presidential election year of 2016.
arXiv Detail & Related papers (2020-10-28T16:20:05Z)
- Heterogeneous Graph Neural Networks for Extractive Document Summarization [101.17980994606836]
Learning cross-sentence relations is a crucial step in extractive document summarization.
We present a graph-based neural network for extractive summarization (HeterSumGraph).
We introduce different types of nodes into graph-based neural networks for extractive document summarization.
arXiv Detail & Related papers (2020-04-26T14:38:11Z)
- ReLaText: Exploiting Visual Relationships for Arbitrary-Shaped Scene Text Detection with Graph Convolutional Networks [6.533254660400229]
We introduce a new arbitrary-shaped text detection approach named ReLaText.
To demonstrate the effectiveness of this new formulation, we start by using a "link" relationship to address the challenging text-line grouping problem.
Our GCN based text-line grouping approach can achieve better text detection accuracy than previous text-line grouping methods.
arXiv Detail & Related papers (2020-03-16T03:33:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.