Unraveling the graph structure of tabular datasets through Bayesian and
spectral analysis
- URL: http://arxiv.org/abs/2110.01421v1
- Date: Mon, 4 Oct 2021 12:51:55 GMT
- Title: Unraveling the graph structure of tabular datasets through Bayesian and
spectral analysis
- Authors: Bruno Messias F. de Resende, Eric K. Tokuda, Luciano da Fontoura Costa
- Abstract summary: We show that the inference of the hierarchical modular structure obtained by the nested stochastic block model (nSBM) can help us identify the classes of features and unravel non-trivial relationships.
We analyzed a socioeconomic survey conducted with students in Brazil: the PeNSE survey.
- Score: 3.128267020893596
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In the big-data age, tabular datasets are being generated and analyzed
everywhere. As a consequence, finding and understanding the relationships
between the features of these datasets are of great relevance. Here, to
encompass these relationships we propose a methodology that maps an entire
tabular dataset or just an observation into a weighted directed graph using the
Shapley additive explanations technique. With this graph of relationships, we
show that the inference of the hierarchical modular structure obtained by the
nested stochastic block model (nSBM) as well as the study of the spectral space
of the magnetic Laplacian can help us identify the classes of features and
unravel non-trivial relationships. As a case study, we analyzed a socioeconomic
survey conducted with students in Brazil: the PeNSE survey. The spectral
embedding of the columns suggested that questions related to physical
activities form a separate group. The application of the nSBM approach
corroborated this and allowed complementary findings about the modular
structure: some groups of questions showed a high adherence to the divisions
qualitatively defined by the designers of the survey. However, questions from
the class Safety were partly grouped by our method in the class
Drugs. Surprisingly, by inspecting these questions, we observed that
they were related to both these topics, suggesting an alternative
interpretation of these questions. Our method can provide guidance for tabular
data analysis as well as the design of future surveys.
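
As a rough illustration of the spectral step mentioned in the abstract, the Python sketch below builds a magnetic Laplacian from a weighted directed adjacency matrix and takes its lowest eigenvectors as node coordinates. The abstract does not spell out the construction, so everything here is an assumption made for illustration and not the authors' implementation: the definition (symmetrized weights, a phase proportional to the weight asymmetry, charge parameter `q`), the function names, and the toy matrix `W`.

```python
import numpy as np


def magnetic_laplacian(W, q=0.25):
    """Unnormalized magnetic Laplacian of a weighted directed graph.

    W[i, j] is the weight of the edge i -> j. The charge parameter q
    (an assumed default of 0.25) controls how strongly edge direction
    is encoded in the complex phase.
    """
    W = np.asarray(W, dtype=float)
    W_sym = (W + W.T) / 2.0              # symmetrized edge weights
    theta = 2.0 * np.pi * q * (W - W.T)  # antisymmetric phase matrix
    H = W_sym * np.exp(1j * theta)       # Hermitian "magnetic" adjacency
    D = np.diag(W_sym.sum(axis=1))       # degrees of the symmetrized graph
    return D - H


def spectral_embedding(W, q=0.25, k=2):
    """Return the k smallest eigenvalues and their (complex) eigenvectors."""
    L = magnetic_laplacian(W, q)
    eigvals, eigvecs = np.linalg.eigh(L)  # eigh sorts eigenvalues ascending
    return eigvals[:k], eigvecs[:, :k]


# Hypothetical toy graph: a directed 3-cycle 0 -> 1 -> 2 -> 0.
W = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
eigvals, coords = spectral_embedding(W, q=0.25, k=2)
```

Each row of `coords` is the complex coordinate vector of one node. With `q = 0` the phase vanishes and the matrix reduces to the ordinary Laplacian of the symmetrized graph, so direction information is carried only when `q` is nonzero.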
Related papers
- Survey on Semantic Interpretation of Tabular Data: Challenges and Directions [2.324913904215885]
This survey aims to provide a comprehensive overview of the Semantic Table Interpretation landscape.
It starts by categorizing approaches using a taxonomy of 31 attributes, allowing for comparisons and evaluations.
It also examines available tools, assessing them based on 12 criteria.
arXiv Detail & Related papers (2024-11-07T14:28:56Z) - Dissecting embedding method: learning higher-order structures from data [0.0]
Geometric deep learning methods for data learning often include a set of assumptions on the geometry of the feature space.
These assumptions, together with the data being discrete and finite, can cause some generalisations that are likely to create wrong interpretations of the data and model outputs.
arXiv Detail & Related papers (2024-10-14T08:19:39Z) - Integrating Large Language Models with Graph-based Reasoning for Conversational Question Answering [58.17090503446995]
We focus on a conversational question answering task which combines the challenges of understanding questions in context and reasoning over evidence gathered from heterogeneous sources like text, knowledge graphs, tables, and infoboxes.
Our method uses a graph-structured representation to aggregate information about a question and its context.
arXiv Detail & Related papers (2024-06-14T13:28:03Z) - Geometric Relational Embeddings: A Survey [39.57716353191535]
We survey methods that underlie geometric relational embeddings and categorize them based on the embedding geometries that are used to represent the data.
We identify the desired properties (i.e., inductive biases) of each kind of embedding and discuss some potential future work.
arXiv Detail & Related papers (2023-04-24T09:33:30Z) - Modeling Relational Patterns for Logical Query Answering over Knowledge Graphs [29.47155614953955]
We develop a novel query embedding method, RoConE, that defines query regions as geometric cones and algebraic query operators by rotations in complex space.
Our experimental results on several benchmark datasets confirm the advantage of relational patterns for enhancing the logical query answering task.
arXiv Detail & Related papers (2023-03-21T13:59:15Z) - ACTIVE: Augmentation-Free Graph Contrastive Learning for Partial
Multi-View Clustering [52.491074276133325]
We propose an augmentation-free graph contrastive learning framework to solve the problem of partial multi-view clustering.
The proposed approach elevates instance-level contrastive learning and missing data inference to the cluster-level, effectively mitigating the impact of individual missing data on clustering.
arXiv Detail & Related papers (2022-03-01T02:32:25Z) - A Survey of Embedding Space Alignment Methods for Language and Knowledge
Graphs [77.34726150561087]
We survey the current research landscape on word, sentence and knowledge graph embedding algorithms.
We provide a classification of the relevant alignment techniques and discuss benchmark datasets used in this field of research.
arXiv Detail & Related papers (2020-10-26T16:08:13Z) - Weakly-Supervised Aspect-Based Sentiment Analysis via Joint
Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z) - INFOTABS: Inference on Tables as Semi-structured Data [39.84930221015755]
We introduce a new dataset called INFOTABS, consisting of human-written textual hypotheses based on premises that are tables extracted from Wikipedia info-boxes.
Our analysis shows that the semi-structured, multi-domain and heterogeneous nature of the premises admits complex, multi-faceted reasoning.
Experiments reveal that, while human annotators agree on the relationships between a table-hypothesis pair, several standard modeling strategies are unsuccessful at the task.
arXiv Detail & Related papers (2020-05-13T02:07:54Z) - A Revised Generative Evaluation of Visual Dialogue [80.17353102854405]
We propose a revised evaluation scheme for the VisDial dataset.
We measure consensus between answers generated by the model and a set of relevant answers.
We release these sets and code for the revised evaluation scheme as DenseVisDial.
arXiv Detail & Related papers (2020-04-20T13:26:45Z) - Relational Message Passing for Knowledge Graph Completion [78.47976646383222]
We propose a relational message passing method for knowledge graph completion.
It passes relational messages among edges iteratively to aggregate neighborhood information.
Results show our method outperforms state-of-the-art knowledge completion methods by a large margin.
arXiv Detail & Related papers (2020-02-17T03:33:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.