Data augmentation on graphs for table type classification
- URL: http://arxiv.org/abs/2208.11210v1
- Date: Tue, 23 Aug 2022 21:54:46 GMT
- Title: Data augmentation on graphs for table type classification
- Authors: Davide del Bimbo and Andrea Gemelli and Simone Marinai
- Abstract summary: We address the classification of tables using a Graph Neural Network, exploiting the table structure for the message passing algorithm in use.
We achieve promising preliminary results, proposing a data augmentation method suitable for graph-based table representation.
- Score: 1.1859913430860336
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Tables are widely used in documents because of their compact and structured
representation of information. In particular, in scientific papers, tables can
sum up novel discoveries and summarize experimental results, making the
research comparable and easily understandable by scholars. Since the layout of
tables is highly variable, it would be useful to interpret their content and
classify them into categories. This could be helpful to directly extract
information from scientific papers, for instance comparing performance of some
models given their paper result tables. In this work, we address the
classification of tables using a Graph Neural Network, exploiting the table
structure for the message passing algorithm in use. We evaluate our model on a
subset of the Tab2Know dataset. Since it contains few manually annotated
examples, we propose data augmentation techniques directly on the table graph
structures. We achieve promising preliminary results, proposing a data
augmentation method suitable for graph-based table representation.
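The abstract does not spell out the graph construction or the augmentation operations, so the following is only a minimal sketch under stated assumptions: a table is modelled as a grid graph (one node per cell, edges to the right and lower neighbours), and augmentation is illustrated by removing one random body row before rebuilding the graph. The names `table_to_graph` and `drop_random_row` are hypothetical and are not taken from the paper.

```python
import random
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (row index, column index)


def table_to_graph(table: List[List[str]]) -> Tuple[Dict[Cell, str], List[Tuple[Cell, Cell]]]:
    """Assumed representation: one node per cell, edges to right and lower neighbours."""
    nodes = {(r, c): text for r, row in enumerate(table) for c, text in enumerate(row)}
    edges = []
    for (r, c) in nodes:
        for neighbour in ((r, c + 1), (r + 1, c)):
            if neighbour in nodes:
                edges.append(((r, c), neighbour))
    return nodes, edges


def drop_random_row(table: List[List[str]], rng: random.Random,
                    keep_header: bool = True) -> List[List[str]]:
    """Hypothetical augmentation: return a copy of the table with one body row removed."""
    start = 1 if keep_header else 0
    # Leave very small tables unchanged so at least one body row always remains.
    if len(table) - start < 2:
        return [row[:] for row in table]
    idx = rng.randrange(start, len(table))
    return [row[:] for i, row in enumerate(table) if i != idx]


# Toy result table, invented purely for illustration.
table = [["Model", "F1"], ["A", "0.81"], ["B", "0.84"], ["C", "0.79"]]
rng = random.Random(0)
augmented = drop_random_row(table, rng)
nodes, edges = table_to_graph(augmented)
print(len(nodes), "nodes,", len(edges), "edges")
```

Because the augmentation acts on the table before the graph is built, the augmented graph keeps a valid grid structure; a GNN library would then need the node texts converted to feature vectors, which this sketch leaves out.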
Related papers
- TabGraphs: A Benchmark and Strong Baselines for Learning on Graphs with Tabular Node Features [17.277932238538302]
Tabular machine learning may benefit from graph machine learning methods.
Graph neural networks (GNNs) can indeed often bring gains in predictive performance.
Simple feature preprocessing enables them to compete with and even outperform GNNs.
arXiv Detail & Related papers (2024-09-22T15:53:19Z)
- UniTabNet: Bridging Vision and Language Models for Enhanced Table Structure Recognition [55.153629718464565]
We introduce UniTabNet, a novel framework for table structure parsing based on the image-to-text model.
UniTabNet employs a "divide-and-conquer" strategy, utilizing an image-to-text model to decouple table cells and integrating both physical and logical decoders to reconstruct the complete table structure.
arXiv Detail & Related papers (2024-09-20T01:26:32Z)
- StructChart: Perception, Structuring, Reasoning for Visual Chart Understanding [58.38480335579541]
Current chart-related tasks focus on either chart perception, which refers to extracting information from the visual charts, or performing reasoning given the extracted data.
In this paper, we aim to establish a unified and label-efficient learning paradigm for joint perception and reasoning tasks.
Experiments are conducted on various chart-related tasks, demonstrating the effectiveness and promising potential for a unified chart perception-reasoning paradigm.
arXiv Detail & Related papers (2023-09-20T12:51:13Z)
- TabGSL: Graph Structure Learning for Tabular Data Prediction [10.66048003460524]
We present a novel solution, Tabular Graph Structure Learning (TabGSL), to enhance tabular data prediction.
Experiments conducted on 30 benchmark datasets demonstrate that TabGSL markedly outperforms both tree-based models and recent deep learning-based models.
arXiv Detail & Related papers (2023-05-25T08:33:48Z)
- Doc2SoarGraph: Discrete Reasoning over Visually-Rich Table-Text Documents via Semantic-Oriented Hierarchical Graphs [79.0426838808629]
We propose TAT-DQA, i.e. to answer the question over a visually-rich table-text document.
Specifically, we propose a novel Doc2SoarGraph framework with enhanced discrete reasoning capability.
We conduct extensive experiments on the TAT-DQA dataset, and the results show that our proposed framework outperforms the best baseline model by 17.73% and 16.91% in terms of Exact Match (EM) and F1 score respectively on the test set.
arXiv Detail & Related papers (2023-05-03T07:30:32Z)
- Graph Neural Networks and Representation Embedding for Table Extraction in PDF Documents [1.1859913430860336]
The main contribution of this work is to tackle the problem of table extraction, exploiting Graph Neural Networks.
We experimentally evaluated the proposed approach on a new dataset obtained by merging the information provided in the PubLayNet and PubTables-1M datasets.
arXiv Detail & Related papers (2022-08-23T21:36:01Z)
- Table Retrieval May Not Necessitate Table-specific Model Design [83.27735758203089]
We focus on the task of table retrieval, and ask: "is table-specific model design necessary for table retrieval?"
Based on an analysis on a table-based portion of the Natural Questions dataset (NQ-table), we find that structure plays a negligible role in more than 70% of the cases.
We then experiment with three modules to explicitly encode table structures, namely auxiliary row/column embeddings, hard attention masks, and soft relation-based attention biases.
None of these yielded significant improvements, suggesting that table-specific model design may not be necessary for table retrieval.
arXiv Detail & Related papers (2022-05-19T20:35:23Z)
- Generating Table Vector Representations [11.092714216647245]
This paper is an evaluation of methods for table-to-class annotation.
We provide a formal definition for table classification as a machine learning task.
arXiv Detail & Related papers (2021-10-28T14:05:21Z)
- TGRNet: A Table Graph Reconstruction Network for Table Structure Recognition [76.06530816349763]
We propose an end-to-end trainable table graph reconstruction network (TGRNet) for table structure recognition.
Specifically, the proposed method has two main branches, a cell detection branch and a cell logical location branch, to jointly predict the spatial location and the logical location of different cells.
arXiv Detail & Related papers (2021-06-20T01:57:05Z)
- Retrieving Complex Tables with Multi-Granular Graph Representation Learning [20.72341939868327]
The task of natural language table retrieval seeks to retrieve semantically relevant tables based on natural language queries.
Existing learning systems treat tables as plain text based on the assumption that tables are structured as dataframes.
We propose Graph-based Table Retrieval (GTR), a generalizable NLTR framework with multi-granular graph representation learning.
arXiv Detail & Related papers (2021-05-04T20:19:03Z)
- A Graph Representation of Semi-structured Data for Web Question Answering [96.46484690047491]
We propose a novel graph representation of Web tables and lists based on a systematic categorization of the components in semi-structured data as well as their relations.
Our method improves F1 score by 3.90 points over the state-of-the-art baselines.
arXiv Detail & Related papers (2020-10-14T04:01:54Z)