Deep learning for table detection and structure recognition: A survey
- URL: http://arxiv.org/abs/2211.08469v1
- Date: Tue, 15 Nov 2022 19:42:27 GMT
- Title: Deep learning for table detection and structure recognition: A survey
- Authors: Mahmoud Kasem, Abdelrahman Abdallah, Alexander Berendeyev, Ebrahem Elkady, Mahmoud Abdalla, Mohamed Mahmoud, Mohamed Hamada, Daniyar Nurseitov, Islam Taj-Eddin
- Abstract summary: The goal of this survey is to provide a profound comprehension of the major developments in the field of Table Detection.
We provide an analysis of both classic and new applications in the field.
The datasets and source code of the existing models are organized to provide the reader with a compass on this vast literature.
- Score: 49.09628624903334
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Tables are everywhere, from scientific journals, papers, websites, and
newspapers all the way to items we buy at the supermarket. Detecting them is
thus of utmost importance to automatically understanding the content of a
document. The performance of table detection has substantially increased thanks
to the rapid development of deep learning networks. The goals of this survey
are to provide a profound comprehension of the major developments in the field
of Table Detection, offer insight into the different methodologies, and provide
a systematic taxonomy of the different approaches. Furthermore, we provide an
analysis of both classic and new applications in the field. In addition, the
datasets and source code of the existing models are organized to provide the
reader with a compass on this vast literature. Finally, we go over the
architecture of utilizing various object detection and table structure
recognition methods to create an effective and efficient system, as well as a
set of development trends to keep up with state-of-the-art algorithms and
future research. We have also set up a public GitHub repository where we will
be updating the most recent publications, open data, and source code. The
GitHub repository is available at
https://github.com/abdoelsayed2016/table-detection-structure-recognition.
Related papers
- DiscoveryBench: Towards Data-Driven Discovery with Large Language Models [50.36636396660163]
We present DiscoveryBench, the first comprehensive benchmark that formalizes the multi-step process of data-driven discovery.
Our benchmark contains 264 tasks collected across 6 diverse domains, such as sociology and engineering.
Our benchmark, thus, illustrates the challenges in autonomous data-driven discovery and serves as a valuable resource for the community to make progress.
arXiv Detail & Related papers (2024-07-01T18:58:22Z)
- From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models [98.41645229835493]
Data visualization in the form of charts plays a pivotal role in data analysis, offering critical insights and aiding in informed decision-making.
Large foundation models, such as large language models, have revolutionized various natural language processing tasks.
This survey paper serves as a comprehensive resource for researchers and practitioners in the fields of natural language processing, computer vision, and data analysis.
arXiv Detail & Related papers (2024-03-18T17:57:09Z)
- FaKnow: A Unified Library for Fake News Detection [11.119667583594483]
FaKnow is a unified and comprehensive fake news detection algorithm library.
It covers the full spectrum of the model training and evaluation process.
It furnishes a series of auxiliary functionalities and tools, including visualization and logging.
arXiv Detail & Related papers (2024-01-27T13:29:17Z)
- Large Language Models for Generative Information Extraction: A Survey [89.71273968283616]
Large Language Models (LLMs) have demonstrated remarkable capabilities in text understanding and generation.
We present an extensive overview by categorizing these works in terms of various IE subtasks and techniques.
We empirically analyze the most advanced methods and discover the emerging trend of IE tasks with LLMs.
arXiv Detail & Related papers (2023-12-29T14:25:22Z)
- Source Code Data Augmentation for Deep Learning: A Survey [32.035973285175075]
We conduct a comprehensive survey of data augmentation for source code.
We highlight the general strategies and techniques to optimize the DA quality.
We outline the prevailing challenges and potential opportunities for future research.
arXiv Detail & Related papers (2023-05-31T14:47:44Z)
- ALBench: A Framework for Evaluating Active Learning in Object Detection [102.81795062493536]
This paper contributes an active learning benchmark framework named ALBench for evaluating active learning in object detection.
Developed on an automatic deep model training system, this ALBench framework is easy-to-use, compatible with different active learning algorithms, and ensures the same training and testing protocols.
arXiv Detail & Related papers (2022-07-27T07:46:23Z)
- A Survey of Deep Learning Models for Structural Code Understanding [21.66270320648155]
We present a comprehensive overview of the structures formed from code data.
We categorize the models for understanding code in recent years into two groups: sequence-based and graph-based models.
We also introduce metrics, datasets and the downstream tasks.
arXiv Detail & Related papers (2022-05-03T03:56:17Z)
- Autoregressive Search Engines: Generating Substrings as Document Identifiers [53.0729058170278]
Autoregressive language models are emerging as the de-facto standard for generating answers.
Previous work has explored ways to partition the search space into hierarchical structures.
In this work we propose an alternative that doesn't force any structure in the search space: using all ngrams in a passage as its possible identifiers.
arXiv Detail & Related papers (2022-04-22T10:45:01Z)
- Tell Me How to Survey: Literature Review Made Simple with Automatic Reading Path Generation [16.07200776251764]
Gleaning papers worth reading from the massive literature, whether to do a quick survey or to keep up with the latest advances on a specific research topic, has become a challenging task.
Existing academic search engines such as Google Scholar return relevant papers by individually calculating the relevance between each paper and query.
We introduce Reading Path Generation (RPG) which aims at automatically producing a path of papers to read for a given query.
arXiv Detail & Related papers (2021-10-12T20:58:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.