Related papers: Towards the Causal Complete Cause of Multi-Modal Representation Learning

Related papers

A Relative-Budget Theory for Reinforcement Learning with Verifiable Rewards in Large Language Model Reasoning [48.70183357021465]
Reinforcement learning (RL) is a dominant paradigm for improving the reasoning abilities of large language models.<n>We propose a emphrelative-budget theory explaining this variation through a single quantity called relative budget $:= H/mathbbE[T]$.<n>We show that $$ determines sample efficiency by controlling reward variance and the likelihood of informative trajectories.
arXiv Detail & Related papers (2026-02-02T01:31:52Z)
Reliable Use of Lemmas via Eligibility Reasoning and Section$-$Aware Reinforcement Learning [27.01879432423409]
Recent large language models often misapply lemmas, importing conclusions without validating assumptions.<n>We present RULES, which encodes this specification via a two$-$section output and trains with reinforcement learning.<n>Training and evaluation draw on diverse natural language and formal proof corpora.
arXiv Detail & Related papers (2026-02-01T03:34:30Z)
Truth in the Few: High-Value Data Selection for Efficient Multi-Modal Reasoning [71.3533541927459]
We propose a novel data selection paradigm termed Activation Reasoning Potential (RAP)<n>RAP identifies cognitive samples by estimating each sample's potential to stimulate genuine multi-modal reasoning.<n>Our RAP method consistently achieves superior performance using only 9.3% of the training data, while reducing computational costs by over 43%.
arXiv Detail & Related papers (2025-06-05T08:40:24Z)
Outcome-Based Online Reinforcement Learning: Algorithms and Fundamental Limits [58.63897489864948]
Reinforcement learning with outcome-based feedback faces a fundamental challenge.<n>How do we assign credit to the right actions?<n>This paper provides the first comprehensive analysis of this problem in online RL with general function approximation.
arXiv Detail & Related papers (2025-05-26T17:44:08Z)
Provably Sample-Efficient Robust Reinforcement Learning with Average Reward [4.530028899565083]
We propose a new algorithm designed for robust Markov Decision Processes (MDPs) with transition uncertainty characterized by $ell_p$-norm and contamination models.<n>Our algorithm operates without requiring any prior knowledge of the robust MDP.<n>Our work provides essential theoretical understanding of sample efficiency of robust average reward RL.
arXiv Detail & Related papers (2025-05-18T15:34:45Z)
Exogenous Isomorphism for Counterfactual Identifiability [2.209921757303168]
We introduce isomorphism and propose $sim_mathrmEI$-identifiability, reflecting the strength of model identifiability required for $sim_mathcalL_3$-identifiability.<n>Our results unify and generalize existing theories, providing theoretical guarantees for practical applications.
arXiv Detail & Related papers (2025-05-04T18:24:15Z)
Towards Understanding the Benefit of Multitask Representation Learning in Decision Process [47.57497888077687]
Multitask Representation Learning (MRL) has emerged as a prevalent technique to improve sample efficiency in Reinforcement Learning (RL)<n>Our work tries to fill the gap by extending the analysis to textitunknown non-linear representations, giving a comprehensive analysis for its mechanism in online and transfer learning setting.<n>We formally prove that this approach yields a regret upper bound that outperforms the lower bound associated with learning $M$ separate tasks, marking the first demonstration of MRL's efficacy in a general function class.
arXiv Detail & Related papers (2025-03-01T04:29:22Z)
Computational-Statistical Tradeoffs at the Next-Token Prediction Barrier: Autoregressive and Imitation Learning under Misspecification [50.717692060500696]
Next-token prediction with the logarithmic loss is a cornerstone of autoregressive sequence modeling.<n>Next-token prediction can be made robust so as to achieve $C=tilde O(H)$, representing moderate error amplification.<n>No computationally efficient algorithm can achieve sub-polynomial approximation factor $C=e(log H)1-Omega(1)$.
arXiv Detail & Related papers (2025-02-18T02:52:00Z)
Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective. The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning. The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z)
Partial Identifiability and Misspecification in Inverse Reinforcement Learning [64.13583792391783]
The aim of Inverse Reinforcement Learning is to infer a reward function $R$ from a policy $pi$. This paper provides a comprehensive analysis of partial identifiability and misspecification in IRL.
arXiv Detail & Related papers (2024-11-24T18:35:46Z)
FLARE: Faithful Logic-Aided Reasoning and Exploration [50.9814063216852]
We introduce a novel approach for traversing the problem space using task decompositions.<n>We use the Large Language Models to plan a solution, soft-formalise the query into facts and predicates using a logic programming code.<n>Our method allows us to compute the faithfulness of the reasoning process w.r.t. the generated code and analyse the steps of the multi-hop search without relying on external solvers.
arXiv Detail & Related papers (2024-10-14T19:39:11Z)
Theoretical limits of descending $\ell_0$ sparse-regression ML algorithms [0.0]
We develop a generic analytical program for studying performance of the emphmaximum-likelihood (ML) decoding. Key ML performance parameter, the residual emphroot mean square error ($textbfRMSE$) is uncovered to exhibit the so-called emphphase-transition (PT) phenomenon. Concrete implementation and practical relevance of the Fl RDT typically rely on the ability to conduct a sizeable set of the underlying numerical evaluations.
arXiv Detail & Related papers (2024-10-10T06:33:41Z)
Identifiable Causal Representation Learning: Unsupervised, Multi-View, and Multi-Environment [10.814585613336778]
Causal representation learning aims to combine the core strengths of machine learning and causality. This thesis investigates what is possible for CRL without direct supervision, and thus contributes to its theoretical foundations.
arXiv Detail & Related papers (2024-06-19T09:14:40Z)
What Are the Odds? Language Models Are Capable of Probabilistic Reasoning [23.487484744911995]
We focus on evaluating the probabilistic reasoning capabilities of language models (LMs) using idealized and real-world statistical distributions. We perform a systematic evaluation of state-of-the-art LMs on three tasks: estimating percentiles, drawing samples, and calculating probabilities.
arXiv Detail & Related papers (2024-06-18T17:51:24Z)
Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption [60.958746600254884]
This study tackles the challenges of adversarial corruption in model-based reinforcement learning (RL) We introduce an algorithm called corruption-robust optimistic MLE (CR-OMLE), which leverages total-variation (TV)-based information ratios as uncertainty weights for MLE. We extend our weighting technique to the offline setting, and propose an algorithm named corruption-robust pessimistic MLE (CR-PMLE)
arXiv Detail & Related papers (2024-02-14T07:27:30Z)
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning [54.585428241509234]
We propose R$3$: Learning Reasoning through Reverse Curriculum Reinforcement Learning (RL) RL employs only outcome supervision to achieve the benefits of process supervision for large language models.
arXiv Detail & Related papers (2024-02-08T16:46:26Z)
Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval [139.21955930418815]
Cross-modal Retrieval methods build similarity relations between vision and language modalities by jointly learning a common representation space. However, the predictions are often unreliable due to the Aleatoric uncertainty, which is induced by low-quality data, e.g., corrupt images, fast-paced videos, and non-detailed texts. We propose a novel Prototype-based Aleatoric Uncertainty Quantification (PAU) framework to provide trustworthy predictions by quantifying the uncertainty arisen from the inherent data ambiguity.
arXiv Detail & Related papers (2023-09-29T09:41:19Z)
On Learning Latent Models with Multi-Instance Weak Supervision [57.18649648182171]
We consider a weakly supervised learning scenario where the supervision signal is generated by a transition function $sigma$ labels associated with multiple input instances. Our problem is met in different fields, including latent structural learning and neuro-symbolic integration.
arXiv Detail & Related papers (2023-06-23T22:05:08Z)
Towards Characterizing Domain Counterfactuals For Invertible Latent Causal Models [15.817239008727789]
In this work, we analyze a specific type of causal query called domain counterfactuals, which hypothesizes what a sample would have looked like if it had been generated in a different domain. We show that recovering the latent Structural Causal Model (SCM) is unnecessary for estimating domain counterfactuals. We also develop a theoretically grounded practical algorithm that simplifies the modeling process to generative model estimation.
arXiv Detail & Related papers (2023-06-20T04:19:06Z)
Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks. The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data. Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables. Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z)
On the Provable Advantage of Unsupervised Pretraining [26.065736182939222]
Unsupervised pretraining is a critical component of modern large-scale machine learning systems. This paper studies a generic framework, where the unsupervised representation learning task is specified by an abstract class of latent variable models. Under a mild ''informative'' condition, our algorithm achieves an excess risk of $tildemathcalO(sqrtmathcalC_Phi/m + sqrtmathcalC_Psi/n)$ for downstream tasks.
arXiv Detail & Related papers (2023-03-02T20:42:05Z)
On the Stability and Generalization of Triplet Learning [55.75784102837832]
Triplet learning, i.e. learning from triplet data, has attracted much attention in computer vision tasks. This paper investigates the generalization guarantees of triplet learning by leveraging the stability analysis.
arXiv Detail & Related papers (2023-02-20T07:32:50Z)
Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues. We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders. We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
Measuring Model Fairness under Noisy Covariates: A Theoretical Perspective [26.704446184314506]
We study the problem of measuring the fairness of a machine learning model under noisy information. We present a theoretical analysis that aims to characterize weaker conditions under which accurate fairness evaluation is possible.
arXiv Detail & Related papers (2021-05-20T18:36:28Z)
Modal Uncertainty Estimation via Discrete Latent Representation [4.246061945756033]
We introduce a deep learning framework that learns the one-to-many mappings between the inputs and outputs, together with faithful uncertainty measures. Our framework demonstrates significantly more accurate uncertainty estimation than the current state-of-the-art methods.
arXiv Detail & Related papers (2020-07-25T05:29:34Z)
The Information Bottleneck Problem and Its Applications in Machine Learning [53.57797720793437]
Inference capabilities of machine learning systems skyrocketed in recent years, now playing a pivotal role in various aspect of society. The information bottleneck (IB) theory emerged as a bold information-theoretic paradigm for analyzing deep learning (DL) systems. In this tutorial we survey the information-theoretic origins of this abstract principle, and its recent impact on DL.
arXiv Detail & Related papers (2020-04-30T16:48:51Z)
Weighted Empirical Risk Minimization: Sample Selection Bias Correction based on Importance Sampling [2.599882743586164]
We consider statistical learning problems when the distribution $P'$ of the training observations $Z'_i$ differs from the distribution $P'$ involved in the risk one seeks to minimize. We show that, in various situations frequently encountered in practice, it takes a simple form and can be directly estimated from the $Phi(z)$. We then prove that the capacity generalization of the approach aforementioned is preserved when plugging the resulting estimates of the $Phi(Z'_i)$'s into the weighted empirical risk.
arXiv Detail & Related papers (2020-02-12T18:42:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.