Pre and Post Counting for Scalable Statistical-Relational Model Discovery
- URL: http://arxiv.org/abs/2110.09767v1
- Date: Tue, 19 Oct 2021 07:03:35 GMT
- Title: Pre and Post Counting for Scalable Statistical-Relational Model Discovery
- Authors: Richard Mar and Oliver Schulte
- Abstract summary: Statistical-Relational Model Discovery aims to find statistically relevant patterns in relational data.
As with propositional (non-relational) graphical models, the major scalability bottleneck for model discovery is computing instantiation counts.
This paper takes a detailed look at the memory and speed trade-offs between pre-counting and post-counting strategies for relational learning.
- Score: 19.18886406228943
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Statistical-Relational Model Discovery aims to find statistically relevant
patterns in relational data. For example, a relational dependency pattern may
stipulate that a user's gender is associated with the gender of their friends.
As with propositional (non-relational) graphical models, the major scalability
bottleneck for model discovery is computing instantiation counts: the number of
times a relational pattern is instantiated in a database. Previous work on
propositional learning utilized pre-counting or post-counting to solve this
task. This paper takes a detailed look at the memory and speed trade-offs
between pre-counting and post-counting strategies for relational learning. A
pre-counting approach computes and caches instantiation counts for a large set
of relational patterns before model search. A post-counting approach computes
an instantiation count dynamically on-demand for each candidate pattern
generated during the model search. We describe a novel hybrid approach,
tailored to relational data, that achieves a sweet spot with pre-counting for
patterns involving positive relationships (e.g. pairs of users who are friends)
and post-counting for patterns involving negative relationships (e.g. pairs of
users who are not friends). Our hybrid approach scales model discovery to
millions of data facts.
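The split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy `GENDER` and `FRIEND` tables and all function names are hypothetical. It also assumes that a negative-relationship count can be obtained on demand as the number of all ordered user pairs minus the cached positive count, and that the Friend table contains no self-pairs.

```python
from collections import Counter
from itertools import product

# Hypothetical toy database: user -> gender, plus a directed Friend relation.
GENDER = {"ann": "F", "bob": "M", "cat": "F", "dan": "M"}
FRIEND = {("ann", "bob"), ("bob", "ann"), ("ann", "cat"), ("dan", "bob")}


def precount_positive(gender, friend):
    """Pre-counting: one pass over the Friend table caches the count of every
    (gender(A), gender(B)) pattern with Friend(A, B) = True."""
    return Counter((gender[a], gender[b]) for a, b in friend)


def postcount_negative(g1, g2, gender, positive_cache):
    """Post-counting: computed on demand for one candidate pattern.
    Counts ordered pairs A != B with the given genders and Friend(A, B) = False
    as (all ordered pairs) - (cached friend pairs). Assumes no self-pairs."""
    n = Counter(gender.values())
    total_pairs = n[g1] * n[g2] - (n[g1] if g1 == g2 else 0)
    return total_pairs - positive_cache[(g1, g2)]


def brute_force_negative(g1, g2, gender, friend):
    """Reference count that enumerates every ordered pair (slow, for checking)."""
    return sum(
        1
        for a, b in product(gender, repeat=2)
        if a != b
        and gender[a] == g1
        and gender[b] == g2
        and (a, b) not in friend
    )


# Cache positive counts once, before any model search.
positive_cache = precount_positive(GENDER, FRIEND)

# During search, a negative pattern is counted only when it is requested.
non_friends_f_m = postcount_negative("F", "M", GENDER, positive_cache)
```

The point of the sketch is the asymmetry in cost: the positive cache is bounded by the size of the Friend table, whereas materializing counts over non-friend pairs in advance would touch a number of pairs that grows quadratically with the number of users.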
Related papers
- Probabilistic Modeling for Sequences of Sets in Continuous-Time [14.423456635520084]
We develop a general framework for modeling set-valued data in continuous-time.
We also develop inference methods that can use such models to answer probabilistic queries.
arXiv Detail & Related papers (2023-12-22T20:16:10Z)
- A Federated Data Fusion-Based Prognostic Model for Applications with Multi-Stream Incomplete Signals [1.2277343096128712]
This article proposes a federated prognostic model that allows multiple users to jointly construct a failure time prediction model.
Numerical studies indicate that the performance of the proposed model is the same as that of classic non-federated prognostic models.
arXiv Detail & Related papers (2023-11-13T17:08:34Z)
- Single-Stage Visual Relationship Learning using Conditional Queries [60.90880759475021]
TraCQ is a new formulation for scene graph generation that avoids the multi-task learning problem and the entity pair distribution.
We employ a DETR-based encoder-decoder with conditional queries to significantly reduce the entity label space as well.
Experimental results show that TraCQ not only outperforms existing single-stage scene graph generation methods but also beats many state-of-the-art two-stage methods on the Visual Genome dataset.
arXiv Detail & Related papers (2023-06-09T06:02:01Z)
- A Graph-Enhanced Click Model for Web Search [67.27218481132185]
We propose a novel graph-enhanced click model (GraphCM) for web search.
We exploit both intra-session and inter-session information to address the sparsity and cold-start problems.
arXiv Detail & Related papers (2022-06-17T08:32:43Z)
- Relation Extraction as Open-book Examination: Retrieval-enhanced Prompt Tuning [109.7767515627765]
We propose a new semiparametric paradigm of retrieval-enhanced prompt tuning for relation extraction.
Our model infers relations through knowledge stored in the weights during training.
Our method can achieve state-of-the-art performance in both standard supervised and few-shot settings.
arXiv Detail & Related papers (2022-05-04T23:38:37Z)
- Can I see an Example? Active Learning the Long Tail of Attributes and Relations [64.50739983632006]
We introduce a novel incremental active learning framework that asks for attributes and relations in visual scenes.
While conventional active learning methods ask for labels of specific examples, we flip this framing to allow agents to ask for examples from specific categories.
Using this framing, we introduce an active sampling method that asks for examples from the tail of the data distribution and show that it outperforms classical active learning methods on Visual Genome.
arXiv Detail & Related papers (2022-03-11T19:28:19Z)
- Learning to Model and Ignore Dataset Bias with Mixed Capacity Ensembles [66.15398165275926]
We propose a method that can automatically detect and ignore dataset-specific patterns, which we call dataset biases.
Our method trains a lower capacity model in an ensemble with a higher capacity model.
We show improvement in all settings, including a 10 point gain on the visual question answering dataset.
arXiv Detail & Related papers (2020-11-07T22:20:03Z)
- Clustering-based Unsupervised Generative Relation Extraction [3.342376225738321]
We propose a Clustering-based Unsupervised generative Relation Extraction framework (CURE).
We use an "Encoder-Decoder" architecture to perform self-supervised learning so the encoder can extract relation information.
Our model performs better than state-of-the-art models on both New York Times (NYT) and United Nations Parallel Corpus (UNPC) standard datasets.
arXiv Detail & Related papers (2020-09-26T20:36:40Z)
- NASE: Learning Knowledge Graph Embedding for Link Prediction via Neural Architecture Search [9.634626241415916]
Link prediction is the task of predicting missing connections between entities in the knowledge graph (KG).
Previous work has tried to use Automated Machine Learning (AutoML) to search for the best model for a given dataset.
We propose a novel Neural Architecture Search (NAS) framework for the link prediction task.
arXiv Detail & Related papers (2020-08-18T03:34:09Z)
- Overcoming Statistical Shortcuts for Open-ended Visual Counting [54.858754825838865]
We aim to develop models that learn a proper mechanism of counting regardless of the output label.
First, we propose the Modifying Count Distribution protocol, which penalizes models that over-rely on statistical shortcuts.
Secondly, we introduce the Spatial Counting Network (SCN), which is dedicated to visual analysis and counting based on natural language questions.
arXiv Detail & Related papers (2020-06-17T18:02:01Z)
- Neural Relation Prediction for Simple Question Answering over Knowledge Graph [0.0]
We propose an instance-based method to capture the underlying relation of a question; to this end, we detect matching paraphrases of a new question.
Our experiments on the SimpleQuestions dataset show that the proposed model achieves better accuracy compared to the state-of-the-art relation extraction models.
arXiv Detail & Related papers (2020-02-18T16:41:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.