Related papers: Pre and Post Counting for Scalable Statistical-Relational Model Discovery

URL: http://arxiv.org/abs/2110.09767v1
Date: Tue, 19 Oct 2021 07:03:35 GMT
Title: Pre and Post Counting for Scalable Statistical-Relational Model Discovery
Authors: Richard Mar and Oliver Schulte
Abstract summary: Statistical-Relational Model Discovery aims to find statistically relevant patterns in relational data. As with propositional (non-relational) graphical models, the major scalability bottleneck for model discovery is computing instantiation counts. This paper takes a detailed look at the memory and speed trade-offs between pre-counting and post-counting strategies for relational learning.
Score: 19.18886406228943
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Statistical-Relational Model Discovery aims to find statistically relevant patterns in relational data. For example, a relational dependency pattern may stipulate that a user's gender is associated with the gender of their friends. As with propositional (non-relational) graphical models, the major scalability bottleneck for model discovery is computing instantiation counts: the number of times a relational pattern is instantiated in a database. Previous work on propositional learning utilized pre-counting or post-counting to solve this task. This paper takes a detailed look at the memory and speed trade-offs between pre-counting and post-counting strategies for relational learning. A pre-counting approach computes and caches instantiation counts for a large set of relational patterns before model search. A post-counting approach computes an instantiation count dynamically on-demand for each candidate pattern generated during the model search. We describe a novel hybrid approach, tailored to relational data, that achieves a sweet spot with pre-counting for patterns involving positive relationships (e.g. pairs of users who are friends) and post-counting for patterns involving negative relationships (e.g. pairs of users who are not friends). Our hybrid approach scales model discovery to millions of data facts.
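
Illustration (not from the paper): the pre-counting and post-counting strategies described in the abstract can be sketched on a toy friendship database. The data, the function names, and the subtraction used to obtain negative-relationship counts are all assumptions made for this sketch. They are not the authors' implementation.

```python
from collections import Counter

# Hypothetical toy data: user -> gender, plus directed friendship facts.
GENDER = {"ann": "F", "bob": "M", "cat": "F", "dan": "M"}
FRIEND = {("ann", "cat"), ("cat", "ann"), ("ann", "bob"), ("bob", "dan")}


def precount_positive(gender, friend):
    """Pre-counting: one pass over the friendship facts, cached before search."""
    return Counter((gender[a], gender[b]) for a, b in friend)


def postcount_negative(gender, positive_cache, g_a, g_b):
    """Post-counting: count non-friend ordered pairs (a != b) on demand."""
    n_a = sum(1 for g in gender.values() if g == g_a)
    n_b = sum(1 for g in gender.values() if g == g_b)
    # Ordered pairs of distinct users with the requested genders.
    total_pairs = n_a * n_b - (n_a if g_a == g_b else 0)
    # Assumption of this sketch: negatives = all pairs minus cached positives.
    return total_pairs - positive_cache[(g_a, g_b)]


positive_cache = precount_positive(GENDER, FRIEND)
print(positive_cache[("F", "F")])                            # 2
print(postcount_negative(GENDER, positive_cache, "F", "M"))  # 3
```

In this sketch the cache holds only patterns that occur among existing friendship facts, so its size is bounded by the number of stored facts. Counts for non-friend pairs, which can grow quadratically with the number of users, are computed only when a candidate pattern asks for them.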

Related papers

GenZ: Foundational models as latent variable generators within traditional statistical models [7.74887919885246]
We present GenZ, a hybrid model that bridges foundational models and statistical modeling through interpretable semantic features. Our approach addresses this by discovering semantic feature descriptions through an iterative process. For Netflix movie embeddings, our model predicts collaborative filtering representations with 0.59 cosine similarity purely from semantic descriptions.
arXiv Detail & Related papers (2025-12-31T12:56:01Z)
ARIES: Relation Assessment and Model Recommendation for Deep Time Series Forecasting [54.57031153712623]
ARIES is a framework for assessing relation between time series properties and modeling strategies. We propose the first deep forecasting model recommender, capable of providing interpretable suggestions for real-world time series.
arXiv Detail & Related papers (2025-09-07T13:57:14Z)
Intention-Conditioned Flow Occupancy Models [69.79049994662591]
Large-scale pre-training has fundamentally changed how machine learning research is done today. Applying this same framework to reinforcement learning is appealing because it offers compelling avenues for addressing core challenges in RL. Recent advances in generative AI have provided new tools for modeling highly complex distributions.
arXiv Detail & Related papers (2025-06-10T15:27:46Z)
Human Guided Learning of Transparent Regression Models [4.592493651895646]
We present a human-in-the-loop (HIL) approach to permutation regression. The model is a gradient boosted regression model that incorporates simple human-understandable constraints. The approach, HuGuR, lets a human explore the search space of such transparent regression models.
arXiv Detail & Related papers (2025-02-21T23:15:12Z)
Probabilistic Modeling for Sequences of Sets in Continuous-Time [14.423456635520084]
We develop a general framework for modeling set-valued data in continuous-time. We also develop inference methods that can use such models to answer probabilistic queries.
arXiv Detail & Related papers (2023-12-22T20:16:10Z)
A Federated Data Fusion-Based Prognostic Model for Applications with Multi-Stream Incomplete Signals [1.2277343096128712]
This article proposes a federated prognostic model that allows multiple users to jointly construct a failure time prediction model. Numerical studies indicate that the performance of the proposed model is the same as that of classic non-federated prognostic models.
arXiv Detail & Related papers (2023-11-13T17:08:34Z)
Single-Stage Visual Relationship Learning using Conditional Queries [60.90880759475021]
TraCQ is a new formulation for scene graph generation that avoids the multi-task learning problem and the entity pair distribution. We also employ DETR-based encoder-decoder conditional queries to significantly reduce the entity label space. Experimental results show that TraCQ not only outperforms existing single-stage scene graph generation methods, it also beats many state-of-the-art two-stage methods on the Visual Genome dataset.
arXiv Detail & Related papers (2023-06-09T06:02:01Z)
A Graph-Enhanced Click Model for Web Search [67.27218481132185]
We propose a novel graph-enhanced click model (GraphCM) for web search. We exploit both intra-session and inter-session information for the sparsity and cold-start problems.
arXiv Detail & Related papers (2022-06-17T08:32:43Z)
Relation Extraction as Open-book Examination: Retrieval-enhanced Prompt Tuning [109.7767515627765]
We propose a new semiparametric paradigm of retrieval-enhanced prompt tuning for relation extraction. Our model infers relation through knowledge stored in the weights during training. Our method can achieve state-of-the-art in both standard supervised and few-shot settings.
arXiv Detail & Related papers (2022-05-04T23:38:37Z)
Can I see an Example? Active Learning the Long Tail of Attributes and Relations [64.50739983632006]
We introduce a novel incremental active learning framework that asks for attributes and relations in visual scenes. While conventional active learning methods ask for labels of specific examples, we flip this framing to allow agents to ask for examples from specific categories. Using this framing, we introduce an active sampling method that asks for examples from the tail of the data distribution and show that it outperforms classical active learning methods on Visual Genome.
arXiv Detail & Related papers (2022-03-11T19:28:19Z)
Learning to Model and Ignore Dataset Bias with Mixed Capacity Ensembles [66.15398165275926]
We propose a method that can automatically detect and ignore dataset-specific patterns, which we call dataset biases. Our method trains a lower capacity model in an ensemble with a higher capacity model. We show improvement in all settings, including a 10 point gain on the visual question answering dataset.
arXiv Detail & Related papers (2020-11-07T22:20:03Z)
Clustering-based Unsupervised Generative Relation Extraction [3.342376225738321]
We propose a Clustering-based Unsupervised generative Relation Extraction framework (CURE). We use an "Encoder-Decoder" architecture to perform self-supervised learning so the encoder can extract relation information. Our model performs better than state-of-the-art models on both New York Times (NYT) and United Nations Parallel Corpus (UNPC) standard datasets.
arXiv Detail & Related papers (2020-09-26T20:36:40Z)
NASE: Learning Knowledge Graph Embedding for Link Prediction via Neural Architecture Search [9.634626241415916]
Link prediction is the task of predicting missing connections between entities in the knowledge graph (KG). Previous work has tried to use Automated Machine Learning (AutoML) to search for the best model for a given dataset. We propose a novel Neural Architecture Search (NAS) framework for the link prediction task.
arXiv Detail & Related papers (2020-08-18T03:34:09Z)
Overcoming Statistical Shortcuts for Open-ended Visual Counting [54.858754825838865]
We aim to develop models that learn a proper mechanism of counting regardless of the output label. First, we propose the Modifying Count Distribution protocol, which penalizes models that over-rely on statistical shortcuts. Secondly, we introduce the Spatial Counting Network (SCN), which is dedicated to visual analysis and counting based on natural language questions.
arXiv Detail & Related papers (2020-06-17T18:02:01Z)
Neural Relation Prediction for Simple Question Answering over Knowledge Graph [0.0]
We propose an instance-based method to capture the underlying relation of a question; to this end, we detect matching paraphrases of a new question. Our experiments on the SimpleQuestions dataset show that the proposed model achieves better accuracy than state-of-the-art relation extraction models.
arXiv Detail & Related papers (2020-02-18T16:41:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.