Dual Optimization for Kolmogorov Model Learning Using Enhanced Gradient
Descent
- URL: http://arxiv.org/abs/2107.05011v1
- Date: Sun, 11 Jul 2021 10:33:02 GMT
- Title: Dual Optimization for Kolmogorov Model Learning Using Enhanced Gradient
Descent
- Authors: Qiyou Duan and Hadi Ghauch and Taejoon Kim
- Abstract summary: Kolmogorov model (KM) is an interpretable and predictable representation approach to learning the underlying probabilistic structure of a set of random variables.
We propose a computationally scalable KM learning algorithm, based on the regularized dual optimization combined with enhanced gradient descent (GD) method.
It is shown that the accuracy of logical relation mining for interpretability by using the proposed KM learning algorithm exceeds $80\%$.
- Score: 8.714458129632158
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data representation techniques have made a substantial contribution to
advancing data processing and machine learning (ML). Previous representation
techniques focused on improving predictive power but, unfortunately, perform
rather poorly in terms of interpretability, i.e., extracting the underlying
insights of the data. Recently, the Kolmogorov model (KM) was studied, which is an
interpretable and predictable representation approach to learning the
underlying probabilistic structure of a set of random variables. The existing
KM learning algorithms using semi-definite relaxation with randomization
(SDRwR) or discrete monotonic optimization (DMO) have, however, limited utility
in big data applications because they do not scale well computationally. In
this paper, we propose a computationally scalable KM learning algorithm, based
on the regularized dual optimization combined with enhanced gradient descent
(GD) method. To make our method more scalable to large-dimensional problems, we
propose two acceleration schemes, namely, eigenvalue decomposition (EVD)
elimination strategy and proximal EVD algorithm. Furthermore, a thresholding
technique, which exploits the approximation error analysis and leverages the
normalized Minkowski $\ell_1$-norm and its bounds, is provided for selecting
the number of iterations of the proximal EVD algorithm. When applied to big
data applications, the proposed method is demonstrated to achieve comparable
training/prediction performance with significantly reduced computational
complexity: roughly two orders of magnitude improvement in time overhead
compared to the existing KM learning algorithms.
Furthermore, it is shown that the accuracy of logical relation mining for
interpretability by using the proposed KM learning algorithm exceeds $80\%$.
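
To make the pattern described in the abstract concrete, below is a minimal sketch (not the paper's implementation): gradient descent on a dual variable where every iteration calls an eigenvalue decomposition (EVD), with a truncated decomposition available as a cheaper stand-in for a proximal-EVD-style step. The toy objective $\tfrac{1}{2}\|X - C\|_F^2$, the step size, and the rank parameter are assumptions for illustration; the actual KM dual objective, the enhanced-GD update, the EVD-elimination strategy, and the Minkowski $\ell_1$-norm thresholding rule are not reproduced here.

```python
# Minimal sketch (illustrative only): projected gradient descent on a
# PSD-constrained dual variable, where the projection step requires an EVD.
# `psd_projection_prox` is a cheap truncated stand-in for a proximal-EVD step;
# the paper's KM dual objective and enhanced-GD rules are NOT implemented.
import numpy as np
from scipy.sparse.linalg import eigsh

def psd_projection_full(S):
    """Exact projection onto the PSD cone via a full EVD."""
    w, V = np.linalg.eigh((S + S.T) / 2.0)
    w = np.maximum(w, 0.0)
    return (V * w) @ V.T

def psd_projection_prox(S, k=5):
    """Approximate projection keeping only the k largest eigenpairs,
    a cheap stand-in for a proximal-EVD-style truncated decomposition."""
    S = (S + S.T) / 2.0
    w, V = eigsh(S, k=min(k, S.shape[0] - 1), which="LA")
    w = np.maximum(w, 0.0)
    return (V * w) @ V.T

def dual_gd(grad, X0, step=0.1, iters=300, approx=False):
    """Plain projected gradient descent on a PSD-constrained dual variable."""
    X = X0.copy()
    for _ in range(iters):
        X = X - step * grad(X)                       # gradient step on the dual
        X = psd_projection_prox(X) if approx else psd_projection_full(X)
    return X

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((30, 30))
    C = (A + A.T) / 2.0                              # hypothetical problem data
    grad = lambda X: X - C                           # gradient of 0.5*||X - C||_F^2
    X_star = dual_gd(grad, np.zeros_like(C))
    print(np.linalg.norm(X_star - psd_projection_full(C)))  # ~0: recovers PSD part of C
```

The `approx` switch is where the claimed scalability would come from: a full EVD costs roughly $O(d^3)$ per iteration, while a rank-$k$ truncated decomposition costs roughly $O(k d^2)$, so replacing the former with the latter (with the number of inner iterations chosen by a thresholding rule, as the abstract describes) is what makes large-dimensional problems tractable.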
Related papers
- Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU.
As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
arXiv Detail & Related papers (2024-11-01T21:11:48Z)
- Randomized Dimension Reduction with Statistical Guarantees [0.27195102129095]
This thesis explores several such algorithms for fast execution and efficient data utilization.
We focus on learning algorithms that incorporate data augmentation in various ways and provably improve generalization and distributional robustness.
Specifically, Chapter 4 presents a sample complexity analysis for data augmentation consistency regularization.
arXiv Detail & Related papers (2023-10-03T02:01:39Z)
- Learning the hub graphical Lasso model with the structured sparsity via an efficient algorithm [1.0923877073891446]
We introduce a two-phase algorithm to estimate hub graphical models.
The proposed algorithm first generates a good initial point via a dual alternating direction method of multipliers.
It then warm-starts a semismooth Newton (SSN) based augmented Lagrangian method (ALM) to compute a solution that is accurate enough for practical tasks.
arXiv Detail & Related papers (2023-08-17T08:24:28Z)
- Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple, yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics based on minimizing the population loss, which are better suited to active learning than the metric used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Scalable computation of prediction intervals for neural networks via matrix sketching [79.44177623781043]
Existing algorithms for uncertainty estimation require modifying the model architecture and training procedure.
This work proposes a new algorithm that can be applied to a given trained neural network and produces approximate prediction intervals.
arXiv Detail & Related papers (2022-05-06T13:18:31Z)
- Adaptive First- and Second-Order Algorithms for Large-Scale Machine Learning [3.0204520109309843]
We consider first- and second-order techniques to address continuous optimization problems in machine learning.
In the first-order case, we propose a framework of transition from semi-deterministic to quadratic regularization methods.
In the second-order case, we propose a novel first-order algorithm with adaptive sampling and adaptive step size.
arXiv Detail & Related papers (2021-11-29T18:10:00Z)
- Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT).
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
- A Robust Matching Pursuit Algorithm Using Information Theoretic Learning [37.968665739578185]
A new OMP algorithm is developed based on information theoretic learning (ITL).
The experimental results on both simulated and real-world data consistently demonstrate the superiority of the proposed OMP algorithm in data recovery, image reconstruction, and classification.
arXiv Detail & Related papers (2020-05-10T01:36:00Z)
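
Since the entry above only names the algorithm family, here is a short sketch of plain orthogonal matching pursuit (OMP) as a reference point. The dictionary, sparsity level, and test signal are made up for illustration, and the ITL (correntropy-style) robust selection rule that distinguishes the paper's variant is not implemented here.

```python
# Minimal sketch of standard orthogonal matching pursuit (OMP); the
# information-theoretic-learning robust variant is NOT implemented here.
import numpy as np

def omp(D, y, n_nonzero):
    """Greedy sparse coding: pick the atom most correlated with the residual,
    refit all selected atoms by least squares, and repeat."""
    n, m = D.shape
    support = []
    residual = y.copy()
    x = np.zeros(m)
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        j = int(np.argmax(np.abs(D.T @ residual)))        # best-matching atom
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef               # residual after refit
    x[support] = coef
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = rng.standard_normal((64, 128))
    D /= np.linalg.norm(D, axis=0)                        # unit-norm atoms (assumed)
    x_true = np.zeros(128)
    x_true[[3, 40, 99]] = [1.5, -2.0, 0.7]
    y = D @ x_true                                        # noiseless toy signal
    x_hat = omp(D, y, n_nonzero=3)
    print(np.sort(np.nonzero(x_hat)[0]))                  # expected: [ 3 40 99]
```

The greedy structure (correlate, select, refit by least squares) is the part the ITL-based variant modifies; swapping the correlation/least-squares criteria for a robust, correntropy-style loss is exactly what this sketch leaves out.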