Related papers: Handling big tabular data of ICT supply chains: a multi-task, machine-interpretable approach

URL: http://arxiv.org/abs/2208.06031v1
Date: Thu, 11 Aug 2022 20:29:45 GMT
Title: Handling big tabular data of ICT supply chains: a multi-task, machine-interpretable approach
Authors: Bin Xiao, Murat Simsek, Burak Kantarci and Ala Abu Alkheir
Abstract summary: We define a Table Structure Recognition (TSR) task and a Table Cell Type Classification (CTC) task. Our proposed method can outperform state-of-the-art methods on ICDAR2013 and UNLV datasets.
Score: 13.976736586808308
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Due to the characteristics of Information and Communications Technology (ICT) products, the critical information of ICT devices is often summarized in big tabular data shared across supply chains. Therefore, it is critical to automatically interpret tabular structures with the surging amount of electronic assets. To transform the tabular data in electronic documents into a machine-interpretable format and provide layout and semantic information for information extraction and interpretation, we define a Table Structure Recognition (TSR) task and a Table Cell Type Classification (CTC) task. We use a graph to represent complex table structures for the TSR task. Meanwhile, table cells are categorized into three groups based on their functional roles for the CTC task, namely Header, Attribute, and Data. Subsequently, we propose a multi-task model to solve the defined two tasks simultaneously by using the text modal and image modal features. Our experimental results show that our proposed method can outperform state-of-the-art methods on ICDAR2013 and UNLV datasets.
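The graph representation and the three cell roles mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify how nodes and edges are defined, so everything below is an assumption made for illustration: cells are treated as nodes, an edge is added when two cells share a row or a column span, and the names `Cell`, `CellType`, and `build_table_graph` are hypothetical rather than taken from the paper. The reading of "Attribute" as a row label is also an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import combinations


class CellType(Enum):
    # The three functional roles named in the abstract.
    HEADER = "Header"
    ATTRIBUTE = "Attribute"
    DATA = "Data"


@dataclass(frozen=True)
class Cell:
    text: str
    row_start: int
    row_end: int    # inclusive
    col_start: int
    col_end: int    # inclusive
    cell_type: CellType


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if two inclusive integer ranges intersect."""
    return a_start <= b_end and b_start <= a_end


def build_table_graph(cells):
    """Map each related pair of cell indices to its set of relations.

    Assumed edge definition (not from the paper): two cells are related
    by "same_row" if their row spans overlap, and by "same_col" if their
    column spans overlap.
    """
    edges = {}
    for i, j in combinations(range(len(cells)), 2):
        a, b = cells[i], cells[j]
        relations = set()
        if overlaps(a.row_start, a.row_end, b.row_start, b.row_end):
            relations.add("same_row")
        if overlaps(a.col_start, a.col_end, b.col_start, b.col_end):
            relations.add("same_col")
        if relations:
            edges[(i, j)] = relations
    return edges


# A toy two-row table in which the "Ports" header spans columns 1 and 2.
cells = [
    Cell("Device", 0, 0, 0, 0, CellType.HEADER),
    Cell("Ports", 0, 0, 1, 2, CellType.HEADER),
    Cell("Router A", 1, 1, 0, 0, CellType.ATTRIBUTE),
    Cell("4", 1, 1, 1, 1, CellType.DATA),
    Cell("8", 1, 1, 2, 2, CellType.DATA),
]
edges = build_table_graph(cells)

for (i, j), relations in sorted(edges.items()):
    print(cells[i].text, "<->", cells[j].text, sorted(relations))
```

In this sketch the spanning header "Ports" ends up connected by a "same_col" edge to both data cells beneath it, which is the kind of structure a flat row-by-column grid cannot express directly.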

Related papers

Theme-Explanation Structure for Table Summarization using Large Language Models: A Case Study on Korean Tabular Data [1.0621665950143144]
This paper proposes the Theme-Explanation Structure-based Table Summarization pipeline (Tabular-TX). It generates summary sentences following a structured format, where the Theme Part appears as an adverbial phrase, and the Explanation Part follows as a predicative clause. Experimental results demonstrate that Tabular-TX significantly outperforms conventional fine-tuning-based methods.
arXiv Detail & Related papers (2025-01-17T08:42:49Z)
TART: An Open-Source Tool-Augmented Framework for Explainable Table-based Reasoning [61.14586098005874]
Current Large Language Models (LLMs) exhibit limited ability to understand table structures and to apply precise numerical reasoning. We introduce our Tool-Augmented Reasoning framework for Tables (TART), which integrates LLMs with specialized tools. TART contains three key components: a table formatter to ensure accurate data representation, a tool maker to develop specific computational tools, and an explanation generator to maintain explainability.
arXiv Detail & Related papers (2024-09-18T06:19:59Z)
Knowledge-Aware Reasoning over Multimodal Semi-structured Tables [85.24395216111462]
This study investigates whether current AI models can perform knowledge-aware reasoning on multimodal structured data. We introduce MMTabQA, a new dataset designed for this purpose. Our experiments highlight substantial challenges for current AI models in effectively integrating and interpreting multiple text and image inputs.
arXiv Detail & Related papers (2024-08-25T15:17:43Z)
TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy [81.76462101465354]
We present a novel large vision-language model, TabPedia, equipped with a concept synergy mechanism. This unified framework allows TabPedia to seamlessly integrate visual table understanding (VTU) tasks, such as table detection, table structure recognition, table querying, and table question answering. To better evaluate the VTU task in real-world scenarios, we establish a new and comprehensive table VQA benchmark, ComTQA.
arXiv Detail & Related papers (2024-06-03T13:54:05Z)
UniTable: Towards a Unified Framework for Table Recognition via Self-Supervised Pretraining [22.031699293366486]
We present UniTable, a training framework that unifies the training paradigm and training objective of table recognition. Our framework unifies the training objectives of all three TR tasks into a unified task-agnostic training objective: language modeling. UniTable's table parsing capability has surpassed both existing TR methods and general large vision-language models.
arXiv Detail & Related papers (2024-03-07T15:44:50Z)
Efficient Information Sharing in ICT Supply Chain Social Network via Table Structure Recognition [12.79419287446918]
Table Structure Recognition (TSR) aims to represent tables with complex structures in a machine-interpretable format. We implement our proposed method based on Faster-RCNN and achieve 94.79% mean Average Precision (AP).
arXiv Detail & Related papers (2022-11-03T20:03:07Z)
SubTab: Subsetting Features of Tabular Data for Self-Supervised Representation Learning [5.5616364225463055]
We introduce a new framework, Subsetting features of Tabular data (SubTab). We argue that reconstructing the data from the subset of its features rather than its corrupted version in an autoencoder setting can better capture its underlying representation.
arXiv Detail & Related papers (2021-10-08T20:11:09Z)
Multi-Type-TD-TSR -- Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition: from OCR to Structured Table Representations [63.98463053292982]
The recognition of tables consists of two main tasks, namely table detection and table structure recognition. Recent work shows a clear trend towards deep learning approaches coupled with the use of transfer learning for the task of table structure recognition. We present a multistage pipeline named Multi-Type-TD-TSR, which offers an end-to-end solution for the problem of table recognition.
arXiv Detail & Related papers (2021-05-23T21:17:18Z)
TCN: Table Convolutional Network for Web Table Interpretation [52.32515851633981]
We propose a novel table representation learning approach considering both the intra- and inter-table contextual information. Our method can outperform competitive baselines by +4.8% of F1 for column type prediction and by +4.1% of F1 for column pairwise relation prediction.
arXiv Detail & Related papers (2021-02-17T02:18:10Z)
A Graph Representation of Semi-structured Data for Web Question Answering [96.46484690047491]
We propose a novel graph representation of Web tables and lists based on a systematic categorization of the components in semi-structured data as well as their relations. Our method improves F1 score by 3.90 points over the state-of-the-art baselines.
arXiv Detail & Related papers (2020-10-14T04:01:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.