Related papers: SepFormer: Coarse-to-fine Separator Regression Network for Table Structure Recognition

URL: http://arxiv.org/abs/2506.21920v1
Date: Fri, 27 Jun 2025 05:20:42 GMT
Title: SepFormer: Coarse-to-fine Separator Regression Network for Table Structure Recognition
Authors: Nam Quan Nguyen, Xuan Phong Pham, Tuan-Anh Tran
Abstract summary: We present SepFormer, which integrates the split-and-merge paradigm into a single step through separator regression with a DETR-style architecture. SepFormer can run on average at 25.6 FPS while achieving comparable performance with state-of-the-art methods on several benchmark datasets.
Score: 0.5120567378386615
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The automated reconstruction of the logical arrangement of tables from image data, termed Table Structure Recognition (TSR), is fundamental for semantic data extraction. Recently, researchers have explored a wide range of techniques to tackle this problem, demonstrating significant progress. Each table is a set of vertical and horizontal separators. Following this realization, we present SepFormer, which integrates the split-and-merge paradigm into a single step through separator regression with a DETR-style architecture, improving speed and robustness. SepFormer is a coarse-to-fine approach that predicts table separators from single-line to line-strip separators with a stack of two transformer decoders. In the coarse-grained stage, the model learns to gradually refine single-line segments through decoder layers with additional angle loss. At the end of the fine-grained stage, the model predicts line-strip separators by refining sampled points from each single-line segment. Our SepFormer can run on average at 25.6 FPS while achieving comparable performance with state-of-the-art methods on several benchmark datasets, including SciTSR, PubTabNet, WTW, and iFLYTAB.
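The fine-grained step described in the abstract (sampling points from a single-line segment and refining them into a line-strip separator) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sample_points` and `refine_to_line_strip`, the uniform sampling scheme, and the hand-written offsets are all assumptions made for illustration. In the paper, the refinement is predicted by a transformer decoder.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def sample_points(start: Point, end: Point, n: int) -> List[Point]:
    """Sample n evenly spaced points on the segment start -> end (endpoints included).

    Uniform spacing is an assumption; the abstract only says points are sampled.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    (x0, y0), (x1, y1) = start, end
    return [
        (x0 + (x1 - x0) * i / (n - 1), y0 + (y1 - y0) * i / (n - 1))
        for i in range(n)
    ]


def refine_to_line_strip(points: List[Point], offsets: List[Point]) -> List[Point]:
    """Shift each sampled point by its (dx, dy) offset to form a line-strip.

    In a real model the offsets would be regressed by the fine-grained decoder;
    here they are supplied by hand (hypothetical values).
    """
    if len(points) != len(offsets):
        raise ValueError("points and offsets must have the same length")
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(points, offsets)]


# Coarse stage output: one straight horizontal separator (hypothetical coordinates).
coarse = sample_points((0.0, 10.0), (100.0, 10.0), 5)

# Fine stage: hypothetical per-point offsets that bend the line to follow a curved table row.
offsets = [(0.0, 0.0), (0.0, 1.5), (0.0, 2.0), (0.0, 1.5), (0.0, 0.0)]
strip = refine_to_line_strip(coarse, offsets)
print(strip)
```

The straight segment gives the coarse location of a separator, and the per-point offsets let the resulting polyline follow distorted or curved table borders.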

Related papers

SEMv3: A Fast and Robust Approach to Table Separation Line Detection [48.75713662571455]
Table structure recognition (TSR) aims to parse the inherent structure of a table from its input image. The "split-and-merge" paradigm is a pivotal approach to parsing table structure, in which table separation line detection is crucial. We propose SEMv3 (SEM: Split, Embed and Merge), a method that is both fast and robust for detecting table separation lines.
arXiv Detail & Related papers (2024-05-20T08:13:46Z)
Robust Table Structure Recognition with Dynamic Queries Enhanced Detection Transformer [15.708108572696062]
We present a new table structure recognition approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images. With these new techniques, our TSRFormer achieves state-of-the-art performance on several benchmark datasets, including SciTSR, PubTabNet, WTW and FinTabNet.
arXiv Detail & Related papers (2023-03-21T06:20:49Z)
SEMv2: Table Separation Line Detection Based on Instance Segmentation [96.36188168694781]
We propose an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge). We address the table separation line instance-level discrimination problem and introduce a table separation line detection strategy based on conditional convolution. To comprehensively evaluate SEMv2, we also present a more challenging dataset for table structure recognition, dubbed iFLYTAB.
arXiv Detail & Related papers (2023-03-08T05:15:01Z)
TRUST: An Accurate and End-to-End Table structure Recognizer Using Splitting-based Transformers [56.56591337457137]
We propose an accurate and end-to-end transformer-based table structure recognition method, referred to as TRUST. Transformers are suitable for table structure recognition because of their global computations, perfect memory, and parallel computation. We conduct experiments on several popular benchmarks, including PubTabNet and SynthTable, where our method achieves new state-of-the-art results.
arXiv Detail & Related papers (2022-08-31T08:33:36Z)
TSRFormer: Table Structure Recognition with Transformers [15.708108572696064]
We present a new table structure recognition (TSR) approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images. We propose a new two-stage DETR based separator prediction approach, dubbed Separator REgression TRansformer (SepRETR). We achieve state-of-the-art performance on several benchmark datasets, including SciTSR, PubTabNet and WTW.
arXiv Detail & Related papers (2022-08-09T17:36:13Z)
Improving Video Instance Segmentation via Temporal Pyramid Routing [61.10753640148878]
Video Instance Segmentation (VIS) is a new and inherently multi-task problem, which aims to detect, segment and track each instance in a video sequence. We propose a Temporal Pyramid Routing (TPR) strategy to conditionally align and conduct pixel-level aggregation from a feature pyramid pair of two adjacent frames. Our approach is a plug-and-play module and can be easily applied to existing instance segmentation methods.
arXiv Detail & Related papers (2021-07-28T03:57:12Z)
SOLD2: Self-supervised Occlusion-aware Line Description and Detection [95.8719432775724]
We introduce the first joint detection and description of line segments in a single deep network. Our method does not require any annotated line labels and can therefore generalize to any dataset. We evaluate our approach against previous line detection and description methods on several multi-view datasets.
arXiv Detail & Related papers (2021-04-07T19:27:17Z)
Holistically-Attracted Wireframe Parsing [123.58263152571952]
This paper presents a fast and parsimonious parsing method to detect a vectorized wireframe in an input image with a single forward pass. The proposed method is end-to-end trainable, consisting of three components: (i) line segment and junction proposal generation, (ii) line segment and junction matching, and (iii) line segment and junction verification.
arXiv Detail & Related papers (2020-03-03T17:43:57Z)
