Full-Span Log-Linear Model and Fast Learning Algorithm
- URL: http://arxiv.org/abs/2202.08472v1
- Date: Thu, 17 Feb 2022 06:51:59 GMT
- Title: Full-Span Log-Linear Model and Fast Learning Algorithm
- Authors: Kazuya Takabatake, Shotaro Akaho
- Abstract summary: The full-span log-linear (FSLL) model can be regarded as an $n$-th order Boltzmann machine.
The FSLL model has $|X|-1$ parameters and can represent arbitrary positive distributions of $X$.
Experiments demonstrated that the FSLL successfully learned six training datasets such that $|X|=2^{20}$ within one minute on a laptop PC.
- Score: 1.9798034349981162
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The full-span log-linear (FSLL) model introduced in this paper can be regarded as
an $n$-th order Boltzmann machine, where $n$ is the number of all variables in
the target system. Let $X=(X_0,\dots,X_{n-1})$ be a tuple of finite discrete random
variables that can take $|X|=|X_0|\cdots|X_{n-1}|$ different values. The FSLL
model has $|X|-1$ parameters and can represent arbitrary positive distributions
of $X$. The FSLL model is a "highest-order" Boltzmann machine; nevertheless, we
can compute the dual parameters of the model distribution, which play
important roles in exponential families, in $O(|X|\log|X|)$ time. Furthermore,
using properties of the dual parameters of the FSLL model, we can construct an
efficient learning algorithm. The FSLL model is limited to small probabilistic
models up to $|X|\approx2^{25}$; however, in this problem domain, the FSLL
model flexibly fits various true distributions underlying the training data
without any hyperparameter tuning. The experiments demonstrated that the FSLL
successfully learned six training datasets such that $|X|=2^{20}$ within one
minute on a laptop PC.
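As a concrete illustration of the $O(|X|\log|X|)$ claim, consider the binary case where every $|X_i|=2$ and the variables take values in $\{-1,+1\}$. The basis functions of the full-span model are then the Walsh functions $\phi_S(x)=\prod_{i\in S}x_i$, so mapping the natural parameters to log-probabilities and mapping the probability vector to the dual (expectation) parameters are both Hadamard transforms, each computable by the fast Walsh-Hadamard transform. The sketch below is a minimal illustration under these assumptions, not the paper's implementation; the names `fwht` and `dual_parameters` are ours.

```python
import numpy as np

def fwht(a):
    """Fast Walsh-Hadamard transform; len(a) must be a power of two.
    Cost is O(|X| log |X|) for |X| = len(a)."""
    a = a.astype(float).copy()
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            x = a[i:i + h].copy()
            y = a[i + h:i + 2 * h].copy()
            a[i:i + h] = x + y          # butterfly: pairwise sums
            a[i + h:i + 2 * h] = x - y  # butterfly: pairwise differences
        h *= 2
    return a

def dual_parameters(theta):
    """Map natural parameters theta (indexed by subsets S of variables,
    encoded as bitmasks) to dual parameters eta_S = E[phi_S(X)].
    theta[0] (the empty subset) only shifts the normalization, which is
    why the model has |X| - 1 free parameters."""
    log_p = fwht(theta)     # H @ theta: unnormalized log-probabilities
    log_p -= log_p.max()    # stabilize before exponentiating
    p = np.exp(log_p)
    p /= p.sum()            # normalize to the model distribution
    return fwht(p)          # H @ p: expectation parameters

# n = 20 binary variables gives |X| = 2**20, the size used in the experiments.
# (A vectorized or compiled transform would be used in practice.)
n = 20
rng = np.random.default_rng(0)
theta = rng.normal(scale=0.01, size=2 ** n)
eta = dual_parameters(theta)
assert abs(eta[0] - 1.0) < 1e-9  # eta for the empty subset is E[1] = 1
```

For larger (but fixed-size) alphabets, the Walsh functions would presumably be replaced by a tensor product of per-variable orthonormal bases; the same divide-and-conquer structure keeps the $O(|X|\log|X|)$ cost stated in the abstract.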
Related papers
- IT$^3$: Idempotent Test-Time Training [95.78053599609044]
This paper introduces Idempotent Test-Time Training (IT$^3$), a novel approach to addressing the challenge of distribution shift.
IT$^3$ is based on the universal property of idempotence.
We demonstrate the versatility of our approach across various tasks, including corrupted image classification.
arXiv Detail & Related papers (2024-10-05T15:39:51Z)
- Transformer In-Context Learning for Categorical Data [51.23121284812406]
We extend research on understanding Transformers through the lens of in-context learning with functional data by considering categorical outcomes, nonlinear underlying models, and nonlinear attention.
We present what is believed to be the first real-world demonstration of this few-shot-learning methodology, using the ImageNet dataset.
arXiv Detail & Related papers (2024-05-27T15:03:21Z)
- Agnostic Active Learning of Single Index Models with Linear Sample Complexity [27.065175036001246]
We study active learning methods for single index models of the form $F(\mathbf{x}) = f(\langle \mathbf{w}, \mathbf{x} \rangle)$.
arXiv Detail & Related papers (2024-05-15T13:11:28Z)
- Agnostically Learning Multi-index Models with Queries [54.290489524576756]
We study the power of query access for the task of agnostic learning under the Gaussian distribution.
We show that query access gives significant runtime improvements over random examples for agnostically learning MIMs.
arXiv Detail & Related papers (2023-12-27T15:50:47Z)
- Efficient Sampling of Stochastic Differential Equations with Positive Semi-Definite Models [91.22420505636006]
This paper deals with the problem of efficient sampling from a differential equation, given the drift function and the diffusion matrix.
It is possible to obtain independent and identically distributed (i.i.d.) samples at precision $\varepsilon$ with a cost that is $m^2 d \log(1/\varepsilon)$.
Our results suggest that as the true solution gets smoother, we can circumvent the curse of dimensionality without requiring any sort of convexity.
arXiv Detail & Related papers (2023-03-30T02:50:49Z)
- Stochastic Approximation Approaches to Group Distributionally Robust Optimization [96.26317627118912]
We study group distributionally robust optimization (GDRO).
Online learning techniques reduce the number of samples required in each round from $m$ to $1$ while keeping the same sample complexity.
A novel formulation of weighted GDRO allows us to derive distribution-dependent convergence rates.
arXiv Detail & Related papers (2023-02-18T09:24:15Z)
- PFGM++: Unlocking the Potential of Physics-Inspired Generative Models [14.708385906024546]
We introduce a new family of physics-inspired generative models termed PFGM++.
These models realize generative trajectories for $N$ dimensional data by embedding paths in $N+D$ dimensional space.
We show that models with finite $D$ can be superior to previous state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-02-08T18:58:02Z)
- Tight Bounds on the Hardness of Learning Simple Nonparametric Mixtures [9.053430799456587]
We study the problem of learning nonparametric distributions in a finite mixture.
We establish tight bounds on the sample complexity for learning the component distributions in such models.
arXiv Detail & Related papers (2022-03-28T23:53:48Z)
- TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d$ polynomial.
We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
arXiv Detail & Related papers (2022-02-15T03:49:28Z)
- Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model [3.749193647980305]
This paper considers a Markov decision process (MDP) that admits a set of state-action features.
We show that a model-based approach (resp. $Q$-learning) provably learns an $\varepsilon$-optimal policy with high probability.
arXiv Detail & Related papers (2021-05-28T17:49:39Z)
- An Algorithm for Learning Smaller Representations of Models With Scarce Data [0.0]
We present a greedy algorithm for solving binary classification problems in situations where the dataset is too small or not fully representative.
It relies on a trained model with loose accuracy constraints, an iterative hyperparameter pruning procedure, and a function used to generate new data.
arXiv Detail & Related papers (2020-10-15T19:17:51Z)