Related papers: Probability Distribution Learning and Its Application in Deep Learning

URL: http://arxiv.org/abs/2406.05666v9
Date: Thu, 19 Dec 2024 12:13:26 GMT
Title: Probability Distribution Learning and Its Application in Deep Learning
Authors: Binchuan Qi
Abstract summary: This paper introduces a novel theoretical learning framework, termed probability distribution learning (PD learning). PD learning focuses on learning the underlying probability distribution, which is modeled as a random variable within the probability simplex.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper introduces a novel theoretical learning framework, termed probability distribution learning (PD learning). Departing from the traditional statistical learning framework, PD learning focuses on learning the underlying probability distribution, which is modeled as a random variable within the probability simplex. In this framework, the optimization objective is the learning error, which quantifies the posterior expected discrepancy between the model's predicted distribution and the underlying true distribution, given available sample data and prior knowledge. To optimize the learning error, this paper proposes the necessary conditions for loss functions, models, and optimization algorithms, ensuring that these conditions are met in real-world machine learning scenarios. Based on these conditions, the non-convex optimization mechanism corresponding to model training can be theoretically resolved. Moreover, this paper provides model-dependent and model-independent bounds on learning error, offering new insights into the model's fitting and generalization capabilities. Furthermore, the paper applies the PD learning framework to elucidate the mechanisms by which various techniques, including random parameter initialization, over-parameterization, and dropout, influence deep model training. Finally, the paper substantiates the key conclusions of the proposed framework through experimental results.

Related papers

Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective. The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning. The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z)
On uniqueness in structured model learning [0.542249320079018]
This paper addresses the problem of uniqueness in learning physical laws for systems of partial differential equations (PDEs). It considers a framework of structured model learning, where existing, approximately correct physical models are augmented with components that are learned from data. The uniqueness result shows that, in the idealized setting of full, noiseless measurements, a unique identification of the unknown model components is possible.
arXiv Detail & Related papers (2024-10-29T12:56:39Z)
A Probabilistic Perspective on Unlearning and Alignment for Large Language Models [48.96686419141881]
We introduce the first formal probabilistic evaluation framework in Large Language Models (LLMs). We derive novel metrics with high-probability guarantees concerning the output distribution of a model. Our metrics are application-independent and allow practitioners to make more reliable estimates about model capabilities before deployment.
arXiv Detail & Related papers (2024-10-04T15:44:23Z)
Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation of Prediction Rationale [53.152460508207184]
Source-Free Unsupervised Domain Adaptation (SFUDA) is a challenging task where a model needs to be adapted to a new domain without access to target domain labels or source domain data. This paper proposes a novel approach that considers multiple prediction hypotheses for each sample and investigates the rationale behind each hypothesis. To achieve the optimal performance, we propose a three-step adaptation process: model pre-adaptation, hypothesis consolidation, and semi-supervised learning.
arXiv Detail & Related papers (2024-02-02T05:53:22Z)
Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data. One key challenge in federated learning is to handle non-identically distributed data across the clients. We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime. We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
A Mathematical Framework for Learning Probability Distributions [0.0]
Generative modeling and density estimation have become immensely popular subjects in recent years. This paper provides a mathematical framework such that all the well-known models can be derived based on simple principles. In particular, we prove that these models enjoy implicit regularization during training, so that the generalization error at early-stopping avoids the curse of dimensionality.
arXiv Detail & Related papers (2022-12-22T04:41:45Z)
Correcting Model Bias with Sparse Implicit Processes [0.9187159782788579]
We show that Sparse Implicit Processes (SIP) is capable of correcting model bias when the data generating mechanism differs strongly from the one implied by the model. We use synthetic datasets to show that SIP is capable of providing predictive distributions that reflect the data better than the exact predictions of the initial but wrongly assumed model.
arXiv Detail & Related papers (2022-07-21T18:00:01Z)
Learning Neural Models for Natural Language Processing in the Face of Distributional Shift [10.990447273771592]
The dominating NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications. It builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. It is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime.
arXiv Detail & Related papers (2021-09-03T14:29:20Z)
MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice. One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio. We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
arXiv Detail & Related papers (2021-06-03T12:59:16Z)
Transferring model structure in Bayesian transfer learning for Gaussian process regression [1.370633147306388]
This paper defines the task of conditioning a target probability distribution on a transferred source distribution. Fully probabilistic design is adopted to solve this optimal decision-making problem in the target. By successfully transferring higher moments of the source, the target can reject unreliable source knowledge.
arXiv Detail & Related papers (2021-01-18T05:28:02Z)
Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning. These measures should account for the wide variety of models used in practice. The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference. We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
Bootstrapped model learning and error correction for planning with uncertainty in model-based RL [1.370633147306388]
A natural aim is to learn a model that reflects accurately the dynamics of the environment. This paper explores the problem of model misspecification through uncertainty-aware reinforcement learning agents. We propose a bootstrapped multi-headed neural network that learns the distribution of future states and rewards.
arXiv Detail & Related papers (2020-04-15T15:41:21Z)
A comprehensive study on the prediction reliability of graph neural networks for virtual screening [0.0]
We investigate the effects of model architectures, regularization methods, and loss functions on the prediction performance and reliability of classification results. Our results highlight that the correct choice of regularization and inference methods is important for achieving a high success rate.
arXiv Detail & Related papers (2020-03-17T10:13:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.