Tab2KG: Semantic Table Interpretation with Lightweight Semantic Profiles
- URL: http://arxiv.org/abs/2302.01150v1
- Date: Thu, 2 Feb 2023 15:12:30 GMT
- Title: Tab2KG: Semantic Table Interpretation with Lightweight Semantic Profiles
- Authors: Simon Gottschalk, Elena Demidova
- Abstract summary: This article proposes Tab2KG, a novel method that targets the semantic interpretation of tables with previously unseen data.
We introduce original semantic profiles that enrich a domain's concepts and relations and represent domain and table characteristics.
In contrast to the existing semantic table interpretation approaches, Tab2KG relies on the semantic profiles only and does not require any instance lookup.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tabular data plays an essential role in many data analytics and machine
learning tasks. Typically, tabular data does not possess any machine-readable
semantics. In this context, semantic table interpretation is crucial for making
data analytics workflows more robust and explainable. This article proposes
Tab2KG - a novel method that targets the interpretation of tables with
previously unseen data and automatically infers their semantics to transform
them into semantic data graphs. We introduce original lightweight semantic
profiles that enrich a domain ontology's concepts and relations and represent
domain and table characteristics. We propose a one-shot learning approach that
relies on these profiles to map a tabular dataset containing previously unseen
instances to a domain ontology. In contrast to the existing semantic table
interpretation approaches, Tab2KG relies on the semantic profiles only and does
not require any instance lookup. This property makes Tab2KG particularly
suitable in the data analytics context, in which data tables typically contain
new instances. Our experimental evaluation on several real-world datasets from
different application domains demonstrates that Tab2KG outperforms
state-of-the-art semantic table interpretation baselines.
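To make the idea of matching by profile rather than by instance lookup concrete, here is a minimal sketch. It is not the Tab2KG implementation: the three statistics, the L1 distance, and the nearest-profile rule are assumptions standing in for the paper's semantic profiles and its learned one-shot model, and the property names `ex:temperature` and `ex:cityName` are hypothetical.

```python
from statistics import mean


def _is_number(text):
    """Return True if the string parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def column_profile(values):
    """Summarise a non-empty column with a few cheap statistics.

    The feature set is a hypothetical stand-in for a semantic profile.
    """
    texts = [str(v) for v in values]
    return {
        "numeric_ratio": sum(_is_number(t) for t in texts) / len(texts),
        "distinct_ratio": len(set(texts)) / len(texts),
        # Scale the mean string length into [0, 1] so no feature dominates.
        "mean_length": min(mean(len(t) for t in texts) / 20, 1.0),
    }


def profile_distance(a, b):
    """L1 distance between two profiles with identical keys (assumed metric)."""
    return sum(abs(a[key] - b[key]) for key in a)


def match_column(values, domain_profile):
    """Pick the ontology property whose profile is closest to the column's."""
    profile = column_profile(values)
    return min(
        domain_profile,
        key=lambda prop: profile_distance(profile, domain_profile[prop]),
    )


# Domain profile built once from example data of a (hypothetical) ontology.
domain_profile = {
    "ex:temperature": column_profile([21.5, 19.0, 23.1, 18.4]),
    "ex:cityName": column_profile(["Hannover", "Bonn", "Leipzig", "Bonn"]),
}

# Columns of a new table whose values never occurred in the domain data.
print(match_column([30.2, 28.9, 31.0], domain_profile))
print(match_column(["Paris", "Rome", "Oslo"], domain_profile))
```

None of the new values appear in the domain data, yet each column can still be assigned a property because only the profiles are compared. This is the property the abstract highlights for tables that contain new instances.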
Related papers
- UniTabNet: Bridging Vision and Language Models for Enhanced Table Structure Recognition [55.153629718464565]
We introduce UniTabNet, a novel framework for table structure parsing based on the image-to-text model.
UniTabNet employs a "divide-and-conquer" strategy, utilizing an image-to-text model to decouple table cells and integrating both physical and logical decoders to reconstruct the complete table structure.
arXiv Detail & Related papers (2024-09-20T01:26:32Z) - Training-Free Generalization on Heterogeneous Tabular Data via Meta-Representation [67.30538142519067]
We propose Tabular data Pre-Training via Meta-representation (TabPTM).
A deep neural network is then trained to associate these meta-representations with dataset-specific classification confidences.
Experiments validate that TabPTM achieves promising performance in new datasets, even under few-shot scenarios.
arXiv Detail & Related papers (2023-10-31T18:03:54Z) - SEMv2: Table Separation Line Detection Based on Instance Segmentation [96.36188168694781]
We propose an accurate table structure recognizer, termed SEMv2 (SEM: Split, Embed and Merge).
We address the table separation line instance-level discrimination problem and introduce a table separation line detection strategy based on conditional convolution.
To comprehensively evaluate the SEMv2, we also present a more challenging dataset for table structure recognition, dubbed iFLYTAB.
arXiv Detail & Related papers (2023-03-08T05:15:01Z) - Data augmentation on graphs for table type classification [1.1859913430860336]
We address the classification of tables using a Graph Neural Network, exploiting the table structure for the message passing algorithm in use.
We achieve promising preliminary results, proposing a data augmentation method suitable for graph-based table representation.
arXiv Detail & Related papers (2022-08-23T21:54:46Z) - Table Retrieval May Not Necessitate Table-specific Model Design [83.27735758203089]
We focus on the task of table retrieval, and ask: "is table-specific model design necessary for table retrieval?"
Based on an analysis of a table-based portion of the Natural Questions dataset (NQ-table), we find that structure plays a negligible role in more than 70% of the cases.
We then experiment with three modules to explicitly encode table structures, namely auxiliary row/column embeddings, hard attention masks, and soft relation-based attention biases.
None of these yielded significant improvements, suggesting that table-specific model design may not be necessary for table retrieval.
arXiv Detail & Related papers (2022-05-19T20:35:23Z) - TABBIE: Pretrained Representations of Tabular Data [22.444607481407633]
We devise a simple pretraining objective that learns exclusively from tabular data.
Unlike competing approaches, our model (TABBIE) provides embeddings of all table substructures.
A qualitative analysis of our model's learned cell, column, and row representations shows that it understands complex table semantics and numerical trends.
arXiv Detail & Related papers (2021-05-06T11:15:16Z) - Semantic Labeling Using a Deep Contextualized Language Model [9.719972529205101]
We propose a context-aware semantic labeling method using both the column values and context.
Our new method is based on a new setting for semantic labeling, where we sequentially predict labels for an input table with missing headers.
To our knowledge, we are the first to successfully apply BERT to solve the semantic labeling task.
arXiv Detail & Related papers (2020-10-30T03:04:22Z) - A Graph Representation of Semi-structured Data for Web Question Answering [96.46484690047491]
We propose a novel graph representation of Web tables and lists based on a systematic categorization of the components in semi-structured data as well as their relations.
Our method improves F1 score by 3.90 points over the state-of-the-art baselines.
arXiv Detail & Related papers (2020-10-14T04:01:54Z) - TabEAno: Table to Knowledge Graph Entity Annotation [7.451544182579802]
We propose a novel approach, namely TabEAno, to semantically annotate table rows toward knowledge graph entities.
We introduce a "two-cells" lookup strategy based on the assumption that a logical relation exists in the knowledge graph between two close cells in the same row of the table.
Despite the simplicity of the approach, TabEAno outperforms the state-of-the-art approaches on two standard datasets.
arXiv Detail & Related papers (2020-10-05T07:39:02Z) - GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z)