LORE: Logical Location Regression Network for Table Structure
Recognition
- URL: http://arxiv.org/abs/2303.03730v1
- Date: Tue, 7 Mar 2023 08:42:46 GMT
- Title: LORE: Logical Location Regression Network for Table Structure
Recognition
- Authors: Hangdi Xing, Feiyu Gao, Rujiao Long, Jiajun Bu, Qi Zheng, Liangcheng
Li, Cong Yao, Zhi Yu
- Abstract summary: Table structure recognition aims at extracting tables in images into machine-understandable formats.
Recent methods solve this problem by predicting the adjacency relations of detected cell boxes.
We propose a new TSR framework called LORE, standing for LOgical location REgression network.
- Score: 24.45544796305824
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Table structure recognition (TSR) aims at extracting tables in images into
machine-understandable formats. Recent methods solve this problem by predicting
the adjacency relations of detected cell boxes, or learning to generate the
corresponding markup sequences from the table images. However, they either
count on additional heuristic rules to recover the table structures, or require
a huge amount of training data and time-consuming sequential decoders. In this
paper, we propose an alternative paradigm. We model TSR as a logical location
regression problem and propose a new TSR framework called LORE, standing for
LOgical location REgression network, which for the first time combines logical
location regression together with spatial location regression of table cells.
Our proposed LORE is conceptually simpler, easier to train and more accurate
than previous TSR models of other paradigms. Experiments on standard benchmarks
demonstrate that LORE consistently outperforms prior arts. Code is available at
https://github.com/AlibabaResearch/AdvancedLiterateMachinery/tree/main/DocumentUnderstanding/LORE-TSR.
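To illustrate why predicting logical locations avoids heuristic structure recovery, here is a minimal sketch, not taken from the LORE code base. It assumes a logical location is four inclusive, 0-based indices (start row, end row, start column, end column). The `Cell` class and the `build_grid` and `to_html` helpers are hypothetical names introduced only for this example. Given such locations, the table grid and its markup follow directly, with no adjacency inference.

```python
from dataclasses import dataclass
from html import escape
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Cell:
    """A table cell with an assumed logical location (inclusive, 0-based)."""
    text: str
    row_start: int
    row_end: int
    col_start: int
    col_end: int


def build_grid(cells: List[Cell]) -> List[List[Optional[int]]]:
    """Map every grid position to the index of the cell that covers it."""
    n_rows = max(c.row_end for c in cells) + 1
    n_cols = max(c.col_end for c in cells) + 1
    grid: List[List[Optional[int]]] = [[None] * n_cols for _ in range(n_rows)]
    for idx, cell in enumerate(cells):
        for r in range(cell.row_start, cell.row_end + 1):
            for c in range(cell.col_start, cell.col_end + 1):
                if grid[r][c] is not None:
                    raise ValueError(f"cells {grid[r][c]} and {idx} overlap at ({r}, {c})")
                grid[r][c] = idx
    return grid


def to_html(cells: List[Cell]) -> str:
    """Emit HTML markup; spans are derived from the logical locations."""
    by_row: Dict[int, List[Cell]] = {}
    for cell in sorted(cells, key=lambda c: (c.row_start, c.col_start)):
        by_row.setdefault(cell.row_start, []).append(cell)
    n_rows = max(c.row_end for c in cells) + 1
    parts = ["<table>"]
    for r in range(n_rows):
        parts.append("<tr>")
        for cell in by_row.get(r, []):
            rowspan = cell.row_end - cell.row_start + 1
            colspan = cell.col_end - cell.col_start + 1
            attrs = ""
            if rowspan > 1:
                attrs += f' rowspan="{rowspan}"'
            if colspan > 1:
                attrs += f' colspan="{colspan}"'
            parts.append(f"<td{attrs}>{escape(cell.text)}</td>")
        parts.append("</tr>")
    parts.append("</table>")
    return "".join(parts)


if __name__ == "__main__":
    # A header spanning two columns above two ordinary cells.
    demo = [
        Cell("Header", 0, 0, 0, 1),
        Cell("a", 1, 1, 0, 0),
        Cell("b", 1, 1, 1, 1),
    ]
    print(build_grid(demo))
    print(to_html(demo))
```

The sketch covers only the post-processing side. How LORE regresses these indices from image features is described in the paper and is not reproduced here.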
Related papers
- TableRAG: Million-Token Table Understanding with Language Models [53.039560091592215]
TableRAG is a Retrieval-Augmented Generation (RAG) framework specifically designed for LM-based table understanding.
TableRAG leverages query expansion combined with schema and cell retrieval to pinpoint crucial information before providing it to the LMs.
Our results demonstrate that TableRAG achieves the highest retrieval quality, leading to the new state-of-the-art performance on large-scale table understanding.
arXiv Detail & Related papers (2024-10-07T04:15:02Z)
- LORE++: Logical Location Regression Network for Table Structure
Recognition with Pre-training [45.80561537971478]
Table structure recognition (TSR) aims at extracting tables in images into machine-understandable formats.
We model TSR as a logical location regression problem and propose a new TSR framework called LORE.
Our proposed LORE is conceptually simpler, easier to train, and more accurate than other paradigms of TSR.
arXiv Detail & Related papers (2024-01-03T03:14:55Z)
- Retrieval-Based Transformer for Table Augmentation [14.460363647772745]
We introduce a novel approach toward automatic data wrangling.
We aim to address table augmentation tasks, including row/column population and data imputation.
Our model consistently and substantially outperforms both supervised statistical methods and the current state-of-the-art transformer-based models.
arXiv Detail & Related papers (2023-06-20T18:51:21Z)
- Robust Table Structure Recognition with Dynamic Queries Enhanced
Detection Transformer [15.708108572696062]
We present a new table structure recognition approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images.
With these new techniques, our TSRFormer achieves state-of-the-art performance on several benchmark datasets, including SciTSR, PubTabNet, WTW and FinTabNet.
arXiv Detail & Related papers (2023-03-21T06:20:49Z)
- A Local-Pattern Related Look-Up Table [9.260657061050887]
A Relevance-Zone pattern table (RZT) can be used to replace a traditional transposition table.
RZS is the current state-of-the-art in solving L&D problems in Go.
In practice, the overhead of traversing the radix tree during lookup remains flat, logarithmic in the number of entries stored in the table.
arXiv Detail & Related papers (2022-12-22T06:02:13Z)
- Learning Cross-view Geo-localization Embeddings via Dynamic Weighted
Decorrelation Regularization [52.493240055559916]
Cross-view geo-localization aims to spot images of the same location shot from two platforms, e.g., the drone platform and the satellite platform.
Existing methods usually focus on optimizing the distance between one embedding and the others in the feature space.
In this paper, we argue that low redundancy is also important, which motivates the model to mine more diverse patterns.
arXiv Detail & Related papers (2022-11-10T02:13:10Z)
- TSRFormer: Table Structure Recognition with Transformers [15.708108572696064]
We present a new table structure recognition (TSR) approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images.
We propose a new two-stage DETR-based separator prediction approach, dubbed Separator REgression TRansformer (SepRETR).
We achieve state-of-the-art performance on several benchmark datasets, including SciTSR, PubTabNet and WTW.
arXiv Detail & Related papers (2022-08-09T17:36:13Z)
- Autoregressive Search Engines: Generating Substrings as Document
Identifiers [53.0729058170278]
Autoregressive language models are emerging as the de-facto standard for generating answers.
Previous work has explored ways to partition the search space into hierarchical structures.
In this work we propose an alternative that doesn't force any structure in the search space: using all ngrams in a passage as its possible identifiers.
arXiv Detail & Related papers (2022-04-22T10:45:01Z)
- TGRNet: A Table Graph Reconstruction Network for Table Structure
Recognition [76.06530816349763]
We propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition.
Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
arXiv Detail & Related papers (2021-06-20T01:57:05Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive
Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.