Related papers: A Relative Ignorability Framework for Decision-Relevant Observability in Control Theory and Reinforcement Learning

Supervised Optimism Correction: Be Confident When LLMs Are Sure [91.7459076316849]
We establish a novel theoretical connection between supervised fine-tuning and offline reinforcement learning. We show that the widely used beam search method suffers from unacceptable over-optimism. We propose Supervised Optimism Correction, which introduces a simple yet effective auxiliary loss for token-level $Q$-value estimations.
arXiv Detail & Related papers (2025-04-10T07:50:03Z)
Uncertainty quantification for Markov chains with application to temporal difference learning [63.49764856675643]
We develop novel high-dimensional concentration inequalities and Berry-Esseen bounds for vector- and matrix-valued functions of Markov chains. We analyze the TD learning algorithm, a widely used method for policy evaluation in reinforcement learning.
arXiv Detail & Related papers (2025-02-19T15:33:55Z)
Identifiability Guarantees for Causal Disentanglement from Purely Observational Data [10.482728002416348]
Causal disentanglement aims to learn about latent causal factors behind data. Recent advances establish identifiability results assuming that interventions on (single) latent factors are available. We provide a precise characterization of latent factors that can be identified in nonlinear causal models.
arXiv Detail & Related papers (2024-10-31T04:18:29Z)
Bounds on the Generalization Error in Active Learning [0.0]
We establish empirical risk principles for active learning by deriving a family of upper bounds on the generalization error. We systematically link diverse active learning scenarios, characterized by their loss functions and hypothesis classes, to their corresponding upper bounds. Our results show that regularization techniques used to constrain the complexity of various hypothesis classes are sufficient conditions to ensure the validity of the bounds.
arXiv Detail & Related papers (2024-09-10T08:08:09Z)
Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning [26.34622544479565]
Causal dynamics learning is a promising approach to enhancing robustness in reinforcement learning. We propose a novel model that infers fine-grained causal structures and employs them for prediction.
arXiv Detail & Related papers (2024-06-05T13:13:58Z)
Learning Latent Graph Structures and their Uncertainty [63.95971478893842]
Graph Neural Networks (GNNs) use relational information as an inductive bias to enhance the model's accuracy. As task-relevant relations might be unknown, graph structure learning approaches have been proposed to learn them while solving the downstream prediction task.
arXiv Detail & Related papers (2024-05-30T10:49:22Z)
Explainability through uncertainty: Trustworthy decision-making with neural networks [1.104960878651584]
Uncertainty is a key feature of any machine learning model. It is particularly important in neural networks, which tend to be overconfident. Uncertainty as XAI improves the model's trustworthiness in downstream decision-making tasks.
arXiv Detail & Related papers (2024-03-15T10:22:48Z)
Extending Complex Logical Queries on Uncertain Knowledge Graphs [50.360531130930646]
The study of machine learning-based logical query answering enables reasoning with large-scale and incomplete knowledge graphs. We propose a neural symbolic approach that incorporates both forward inference and backward calibration to answer soft queries on large-scale, incomplete, and uncertain knowledge graphs.
arXiv Detail & Related papers (2024-03-03T13:13:53Z)
Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning [83.41487567765871]
Skipper is a model-based reinforcement learning framework. It automatically decomposes the given task into smaller, more manageable subtasks. It enables sparse decision-making and focused abstractions on the relevant parts of the environment.
arXiv Detail & Related papers (2023-09-30T02:25:18Z)
Topology-aware Robust Optimization for Out-of-distribution Generalization [18.436575017126323]
Out-of-distribution (OOD) generalization is a challenging machine learning problem, yet highly desirable in many high-stakes applications. We propose topology-aware robust optimization (TRO), which seamlessly integrates distributional topology in a principled optimization framework. We theoretically demonstrate the effectiveness of our approach and empirically show that it significantly outperforms the state of the art in a wide range of tasks including classification, regression, and semantic segmentation.
arXiv Detail & Related papers (2023-07-26T03:48:37Z)
Seeing is not Believing: Robust Reinforcement Learning against Spurious Correlation [57.351098530477124]
We consider one critical type of robustness against spurious correlation, where different portions of the state do not have causality but have correlations induced by unobserved confounders. A model that learns such useless or even harmful correlation could catastrophically fail when the confounder in the test case deviates from the training one. Existing robust algorithms that assume simple and unstructured uncertainty sets are therefore inadequate to address this challenge.
arXiv Detail & Related papers (2023-07-15T23:53:37Z)
Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks. The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data. Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
Inconsistency, Instability, and Generalization Gap of Deep Neural Network Training [14.871738070617491]
We show that inconsistency is a more reliable indicator of generalization gap than the sharpness of the loss landscape. The results also provide a theoretical basis for existing methods such as co-distillation and ensemble.
arXiv Detail & Related papers (2023-05-31T20:28:13Z)
Fine-grained analysis of non-parametric estimation for pairwise learning [9.676007573960383]
We are concerned with the generalization performance of non-parametric estimation for pairwise learning. Our results can be used to handle a wide range of pairwise learning problems including ranking, AUC, pairwise regression and metric and similarity learning.
arXiv Detail & Related papers (2023-05-31T08:13:14Z)
Robust Fitted-Q-Evaluation and Iteration under Sequentially Exogenous Unobserved Confounders [16.193776814471768]
We study robust policy evaluation and policy optimization in the presence of sequentially exogenous unobserved confounders. We provide sample complexity bounds and insights, and show effectiveness both in simulations and on real-world longitudinal healthcare data on treating sepsis.
arXiv Detail & Related papers (2023-02-01T18:40:53Z)
HiURE: Hierarchical Exemplar Contrastive Learning for Unsupervised Relation Extraction [60.80849503639896]
Unsupervised relation extraction aims to extract the relationship between entities from natural language sentences without prior information on relational scope or distribution. We propose a novel contrastive learning framework named HiURE, which has the capability to derive hierarchical signals from relational feature space using cross hierarchy attention. Experimental results on two public datasets demonstrate the advanced effectiveness and robustness of HiURE on unsupervised relation extraction when compared with state-of-the-art models.
arXiv Detail & Related papers (2022-05-04T17:56:48Z)
Pessimistic Q-Learning for Offline Reinforcement Learning: Towards Optimal Sample Complexity [51.476337785345436]
We study a pessimistic variant of Q-learning in the context of finite-horizon Markov decision processes. A variance-reduced pessimistic Q-learning algorithm is proposed to achieve near-optimal sample complexity.
arXiv Detail & Related papers (2022-02-28T15:39:36Z)
Optimal variance-reduced stochastic approximation in Banach spaces [114.8734960258221]
We study the problem of estimating the fixed point of a contractive operator defined on a separable Banach space. We establish non-asymptotic bounds for both the operator defect and the estimation error.
arXiv Detail & Related papers (2022-01-21T02:46:57Z)
Leveraging Unlabeled Data for Entity-Relation Extraction through Probabilistic Constraint Satisfaction [54.06292969184476]
We study the problem of entity-relation extraction in the presence of symbolic domain knowledge. Our approach employs semantic loss which captures the precise meaning of a logical sentence. With a focus on low-data regimes, we show that semantic loss outperforms the baselines by a wide margin.
arXiv Detail & Related papers (2021-03-20T00:16:29Z)
Disentangling Observed Causal Effects from Latent Confounders using Method of Moments [67.27068846108047]
We provide guarantees on identifiability and learnability under mild assumptions. We develop efficient algorithms based on coupled tensor decomposition with linear constraints to obtain scalable and guaranteed solutions.
arXiv Detail & Related papers (2021-01-17T07:48:45Z)
Robust Unsupervised Learning via L-Statistic Minimization [38.49191945141759]
We present a general approach to this problem focusing on unsupervised learning. The key assumption is that the perturbing distribution is characterized by larger losses relative to a given class of admissible models. We prove uniform convergence bounds with respect to the proposed criterion for several popular models in unsupervised learning.
arXiv Detail & Related papers (2020-12-14T10:36:06Z)
Uncertainty as a Form of Transparency: Measuring, Communicating, and Using Uncertainty [66.17147341354577]
We argue for considering a complementary form of transparency by estimating and communicating the uncertainty associated with model predictions. We describe how uncertainty can be used to mitigate model unfairness, augment decision-making, and build trustworthy systems. This work constitutes an interdisciplinary review drawn from literature spanning machine learning, visualization/HCI, design, decision-making, and fairness.
arXiv Detail & Related papers (2020-11-15T17:26:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.