Neuro-Symbolic Entropy Regularization
- URL: http://arxiv.org/abs/2201.11250v1
- Date: Tue, 25 Jan 2022 06:23:10 GMT
- Title: Neuro-Symbolic Entropy Regularization
- Authors: Kareem Ahmed, Eric Wang, Kai-Wei Chang, Guy Van den Broeck
- Abstract summary: In structured prediction, the goal is to jointly predict many output variables that together encode a structured object.
One approach -- entropy regularization -- posits that decision boundaries should lie in low-probability regions.
We propose a loss, neuro-symbolic entropy regularization, that encourages the model to confidently predict a valid object.
- Score: 78.16196949641079
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In structured prediction, the goal is to jointly predict many output
variables that together encode a structured object -- a path in a graph, an
entity-relation triple, or an ordering of objects. Such a large output space
makes learning hard and requires vast amounts of labeled data. Different
approaches leverage alternate sources of supervision. One approach -- entropy
regularization -- posits that decision boundaries should lie in low-probability
regions. It extracts supervision from unlabeled examples, but remains agnostic
to the structure of the output space. Conversely, neuro-symbolic approaches
exploit the knowledge that not every prediction corresponds to a valid
structure in the output space. Yet, they do not further restrict the learned
output distribution. This paper introduces a framework that unifies both
approaches. We propose a loss, neuro-symbolic entropy regularization, that
encourages the model to confidently predict a valid object. It is obtained by
restricting entropy regularization to the distribution over only valid
structures. This loss is efficiently computed when the output constraint is
expressed as a tractable logic circuit. Moreover, it seamlessly integrates with
other neuro-symbolic losses that eliminate invalid predictions. We demonstrate
the efficacy of our approach on a series of semi-supervised and
fully-supervised structured-prediction experiments, where we find that it leads
to models whose predictions are more accurate and more likely to be valid.
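As a concrete illustration of the construction described in the abstract, below is a minimal sketch (not the authors' implementation) that computes the semantic loss and the entropy of the output distribution restricted to valid structures, for a toy exactly-one constraint. It assumes independent Bernoulli output marginals and uses brute-force enumeration of valid assignments in place of a tractable logic circuit; all function names and the entropy weight are hypothetical.

```python
# Minimal sketch of neuro-symbolic entropy regularization for a toy
# "exactly-one" constraint over n binary output variables. The paper computes
# these quantities with tractable logic circuits; here we simply enumerate the
# few valid assignments, which is only feasible for tiny constraints.
# All names and the 0.1 entropy weight below are hypothetical.
import torch

def valid_assignments_exactly_one(n):
    """The n assignments satisfying the exactly-one constraint (one-hot rows)."""
    return torch.eye(n)

def log_prob(probs, assignments):
    """log P(x) of each assignment under independent Bernoulli marginals."""
    # probs: (n,) predicted marginals; assignments: (m, n) with entries in {0, 1}
    logp = assignments * torch.log(probs) + (1 - assignments) * torch.log(1 - probs)
    return logp.sum(dim=-1)

def neuro_symbolic_losses(probs, valid):
    """Semantic loss -log P(alpha) and entropy of P(. | alpha), where alpha is the constraint."""
    logp = log_prob(probs, valid)               # log P(x) for each valid x
    log_p_alpha = torch.logsumexp(logp, dim=0)  # log P(alpha): probability of satisfying the constraint
    log_cond = logp - log_p_alpha               # log P(x | alpha)
    entropy = -(log_cond.exp() * log_cond).sum()
    return -log_p_alpha, entropy

probs = torch.tensor([0.7, 0.2, 0.1], requires_grad=True)
semantic_loss, entropy = neuro_symbolic_losses(probs, valid_assignments_exactly_one(3))
loss = semantic_loss + 0.1 * entropy            # encourage confident *and* valid predictions
loss.backward()
```

In this sketch, the semantic-loss term pushes probability mass onto the set of valid structures as a whole, while the restricted-entropy term pushes that mass toward a single valid structure, matching the abstract's goal of confidently predicting a valid object.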
Related papers
- Learning When the Concept Shifts: Confounding, Invariance, and Dimension Reduction [5.38274042816001]
In observational data, the distribution shift is often driven by unobserved confounding factors.
This motivates us to study the domain adaptation problem with observational data.
We show that a model using the learned lower-dimensional subspace can incur a nearly ideal gap between target and source risk.
arXiv Detail & Related papers (2024-06-22T17:43:08Z)
- Learning Latent Graph Structures and their Uncertainty [63.95971478893842]
Graph Neural Networks (GNNs) use relational information as an inductive bias to enhance the model's accuracy.
As task-relevant relations might be unknown, graph structure learning approaches have been proposed to learn them while solving the downstream prediction task.
arXiv Detail & Related papers (2024-05-30T10:49:22Z)
- Semantic Loss Functions for Neuro-Symbolic Structured Prediction [74.18322585177832]
We discuss the semantic loss, which injects knowledge about such structure, defined symbolically, into training.
It is agnostic to the arrangement of the symbols, and depends only on the semantics expressed thereby.
It can be combined with both discriminative and generative neural models.
arXiv Detail & Related papers (2024-05-12T22:18:25Z)
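To illustrate the point above that the semantic loss "depends only on the semantics expressed thereby", here is a small, hypothetical sketch showing that two syntactically different but logically equivalent constraints yield the same loss, since the loss is a function of the set of satisfying assignments only (computed here by brute-force enumeration):

```python
# Hypothetical sketch: the semantic loss -log P(formula) depends only on which
# assignments satisfy the formula, not on how the formula is written.
import itertools
import math

def semantic_loss(probs, formula):
    """-log P(formula) under independent Bernoulli marginals, by enumeration."""
    total = 0.0
    for bits in itertools.product([0, 1], repeat=len(probs)):
        if formula(bits):
            p = 1.0
            for x, q in zip(bits, probs):
                p *= q if x else (1.0 - q)
            total += p
    return -math.log(total)

probs = [0.6, 0.3, 0.8]

# Two syntactically different but logically equivalent constraints over (a, b, c):
f1 = lambda v: (v[0] or v[1]) and ((not v[0]) or v[2])                      # CNF form
f2 = lambda v: (v[0] and v[2]) or ((not v[0]) and v[1]) or (v[1] and v[2])  # equivalent DNF

print(semantic_loss(probs, f1))  # identical values: the loss only sees
print(semantic_loss(probs, f2))  # the set of satisfying assignments
```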
- A Pseudo-Semantic Loss for Autoregressive Models with Logical Constraints [87.08677547257733]
Neuro-symbolic AI bridges the gap between purely symbolic and neural approaches to learning.
We show how to maximize the likelihood of a symbolic constraint with respect to the neural network's output distribution.
We also evaluate our approach on Sudoku and shortest-path prediction cast as autoregressive generation.
arXiv Detail & Related papers (2023-12-06T20:58:07Z)
- Uncovering the Missing Pattern: Unified Framework Towards Trajectory Imputation and Prediction [60.60223171143206]
Trajectory prediction is a crucial undertaking in understanding entity movement or human behavior from observed sequences.
Current methods often assume that the observed sequences are complete while ignoring the potential for missing values.
This paper presents a unified framework, the Graph-based Conditional Variational Recurrent Neural Network (GC-VRNN), which can perform trajectory imputation and prediction simultaneously.
arXiv Detail & Related papers (2023-03-28T14:27:27Z)
- Unifying supervised learning and VAEs -- coverage, systematics and goodness-of-fit in normalizing-flow based neural network models for astro-particle reconstructions [0.0]
Statistical uncertainties, coverage, systematic uncertainties or a goodness-of-fit measure are often not calculated.
We show that a KL-divergence objective on the joint distribution of data and labels makes it possible to unify supervised learning and variational autoencoders.
We discuss how to calculate coverage probabilities without numerical integration for specific "base-ordered" contours.
arXiv Detail & Related papers (2020-08-13T11:28:57Z)
- Learning Output Embeddings in Structured Prediction [73.99064151691597]
A powerful and flexible approach to structured prediction consists in embedding the structured objects to be predicted into a feature space of possibly infinite dimension.
A prediction in the original space is computed by solving a pre-image problem.
In this work, we propose to jointly learn a finite approximation of the output embedding and the regression function into the new feature space.
arXiv Detail & Related papers (2020-07-29T09:32:53Z)
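As a schematic illustration of the embed-regress-decode recipe described in the last entry (and not that paper's joint learning procedure), the following hypothetical sketch uses a fixed finite output embedding, ridge regression from inputs into the embedding space, and a nearest-candidate search as the pre-image step:

```python
# Schematic sketch of the generic output-embedding recipe: embed candidate
# structures, regress inputs into the embedding space, decode by a
# nearest-candidate pre-image search. The embedding here is fixed and the
# candidate set is given; the cited paper instead learns a finite approximation
# of the embedding jointly with the regressor. All names are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: inputs x in R^5, structured outputs represented by a fixed embedding psi(y) in R^4.
n_train, d_in, d_emb = 50, 5, 4
candidates = rng.standard_normal((10, d_emb))   # psi(y) for 10 candidate structures
labels = rng.integers(0, 10, size=n_train)      # index of the true structure per example
X = rng.standard_normal((n_train, d_in))
Y_emb = candidates[labels]                      # regression targets in embedding space

# Ridge regression from inputs to the output-embedding space.
lam = 1e-2
W = np.linalg.solve(X.T @ X + lam * np.eye(d_in), X.T @ Y_emb)   # (d_in, d_emb)

def predict(x):
    """Pre-image step: map x into the embedding space, return the nearest candidate."""
    z = x @ W                                    # predicted embedding
    dists = np.linalg.norm(candidates - z, axis=1)
    return int(np.argmin(dists))                 # index of the decoded structure

print(predict(X[0]), labels[0])
```

A richer candidate set or a learned embedding would replace the fixed `candidates` matrix; the pre-image step remains a search for the structure whose embedding is closest to the regression output.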
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.