Scalable Optimal Margin Distribution Machine
- URL: http://arxiv.org/abs/2305.04837v4
- Date: Sun, 11 Jun 2023 05:55:54 GMT
- Title: Scalable Optimal Margin Distribution Machine
- Authors: Yilin Wang, Nan Cao, Teng Zhang, Xuanhua Shi and Hai Jin
- Abstract summary: Optimal margin Distribution Machine (ODM) is a newly proposed statistical learning framework rooted in the novel margin theory.
This paper proposes a scalable ODM, which achieves nearly a tenfold speedup over the original ODM training method.
- Score: 50.281535710689795
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimal margin Distribution Machine (ODM) is a newly proposed statistical
learning framework rooted in the novel margin theory, which demonstrates better
generalization performance than the traditional large-margin-based counterparts.
Nonetheless, like other kernel methods, it suffers from the ubiquitous scalability
problem in both computation time and memory. This paper proposes a scalable ODM,
which achieves nearly a tenfold speedup over the original ODM training method. For
nonlinear kernels, we propose a novel distribution-aware partition method so that
the local ODM trained on each partition stays close to, and converges quickly to,
the global one. When a linear kernel is applied, we extend a communication-efficient
SVRG method to further accelerate training. Extensive empirical studies validate
that the proposed method is highly computationally efficient and almost never
worsens generalization.
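The abstract names its two ingredients, a distribution-aware partition for nonlinear kernels and a communication-efficient SVRG extension for linear kernels, without giving the algorithms. As a rough illustration of the linear-kernel path, here is a minimal single-machine SVRG loop on a regularized linear model; the squared-hinge surrogate, step size, and epoch length are assumptions standing in for the actual ODM objective and the paper's communication-efficient variant.

```python
import numpy as np

def svrg_linear(X, y, loss_grad, lr=0.1, epochs=10, reg=1e-4, rng=None):
    """Minimal SVRG loop for a regularized linear model (illustrative only)."""
    rng = rng or np.random.default_rng(0)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        # Snapshot point and its full-batch gradient.
        w_snap = w.copy()
        full_grad = np.mean([loss_grad(w_snap, X[i], y[i]) for i in range(n)], axis=0) + reg * w_snap
        for _ in range(n):  # one inner pass per snapshot
            i = rng.integers(n)
            g_w = loss_grad(w, X[i], y[i]) + reg * w
            g_snap = loss_grad(w_snap, X[i], y[i]) + reg * w_snap
            # Variance-reduced stochastic gradient: unbiased, but with smaller variance.
            w -= lr * (g_w - g_snap + full_grad)
    return w

def squared_hinge_grad(w, x, y):
    """Gradient of a squared-hinge surrogate, standing in for the ODM loss."""
    m = y * (w @ x)
    return -2.0 * max(0.0, 1.0 - m) * y * x
```

The key point is that each inner step subtracts the snapshot's stochastic gradient and adds back the full snapshot gradient, which keeps the update unbiased while shrinking its variance; the communication-efficient variant mentioned in the abstract presumably also controls how often such information is exchanged across machines.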
Related papers
- The Stochastic Conjugate Subgradient Algorithm For Kernel Support Vector Machines [1.738375118265695]
This paper proposes an innovative method specifically designed for kernel support vector machines (SVMs).
It not only makes faster progress per iteration but also exhibits enhanced convergence compared to conventional SFO techniques.
Our experimental results demonstrate that the proposed algorithm not only maintains but potentially exceeds the scalability of SFO methods.
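The summary above does not spell out the update rule, so the sketch below shows a generic kernelized stochastic subgradient step in the Pegasos style rather than the paper's stochastic conjugate subgradient method; the RBF kernel, step-size schedule, and regularization constant are assumptions.

```python
import numpy as np

def rbf(x, z, gamma=0.5):
    """RBF kernel, an assumed choice for illustration."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

def kernel_svm_subgradient(X, y, lam=0.01, T=1000, kernel=rbf, rng=None):
    """Pegasos-style stochastic subgradient training for a kernel SVM
    (a generic baseline, not the paper's conjugate subgradient method)."""
    rng = rng or np.random.default_rng(0)
    n = len(y)
    alpha = np.zeros(n)  # counts how often each point violated the margin
    for t in range(1, T + 1):
        i = rng.integers(n)
        # Decision value of the current iterate at x_i.
        f_i = sum(alpha[j] * y[j] * kernel(X[j], X[i]) for j in range(n)) / (lam * t)
        if y[i] * f_i < 1.0:  # hinge-loss subgradient is nonzero
            alpha[i] += 1
    return alpha
```

A new point x would then be scored as (1 / (lam * T)) * sum_j alpha[j] * y[j] * kernel(X[j], x).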
arXiv Detail & Related papers (2024-07-30T17:03:19Z) - Blind Super-Resolution via Meta-learning and Markov Chain Monte Carlo Simulation [46.5310645609264]
We propose a Meta-learning and Markov Chain Monte Carlo based SISR approach to learn kernel priors from organized randomness.
A lightweight network is adopted as kernel generator, and is optimized via learning from the MCMC simulation on random Gaussian distributions.
A meta-learning-based alternating optimization procedure is proposed to optimize the kernel generator and image restorer.
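As context for the MCMC component only, here is a generic random-walk Metropolis sampler; the kernel generator, image restorer, and meta-learning loop from the paper are not modeled, and the standard-Gaussian target used in the example is purely illustrative.

```python
import numpy as np

def metropolis(log_prob, x0, steps=1000, prop_std=0.5, rng=None):
    """Generic random-walk Metropolis sampler (not the paper's pipeline)."""
    rng = rng or np.random.default_rng(0)
    x, lp = np.asarray(x0, dtype=float), log_prob(x0)
    samples = []
    for _ in range(steps):
        cand = x + prop_std * rng.normal(size=x.shape)  # random-walk proposal
        lp_cand = log_prob(cand)
        if np.log(rng.uniform()) < lp_cand - lp:        # accept with prob min(1, ratio)
            x, lp = cand, lp_cand
        samples.append(x.copy())
    return np.array(samples)

# Example: sampling from a 2-D standard Gaussian as a stand-in target.
draws = metropolis(lambda t: -0.5 * np.sum(np.asarray(t) ** 2), np.zeros(2))
```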
arXiv Detail & Related papers (2024-06-13T07:50:15Z) - Sparsity-Aware Distributed Learning for Gaussian Processes with Linear
Multiple Kernel [22.23550794664218]
This paper presents a novel GP linear multiple kernel (LMK) and a generic sparsity-aware distributed learning framework.
The framework incorporates a quantized alternating direction method of multipliers (ADMM) for collaborative learning among multiple agents.
Experiments on diverse datasets demonstrate the superior prediction performance and efficiency of our proposed methods.
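The summary names a quantized ADMM without further detail, so the following is a generic consensus-ADMM skeleton with a uniform quantizer applied to the exchanged messages; the local subproblem solver is left abstract, and the GP linear-multiple-kernel objective from the paper is not reproduced.

```python
import numpy as np

def quantize(v, step=0.05):
    """Uniform quantizer standing in for the paper's quantization scheme."""
    return step * np.round(v / step)

def consensus_admm(local_solve, n_agents, dim, rho=1.0, iters=50):
    """Generic quantized consensus ADMM: each agent solves its local subproblem,
    then quantized messages are averaged to update the global variable z."""
    theta = np.zeros((n_agents, dim))
    u = np.zeros((n_agents, dim))   # scaled dual variables
    z = np.zeros(dim)               # global consensus variable
    for _ in range(iters):
        for a in range(n_agents):
            # Local update: argmin_theta f_a(theta) + (rho / 2) * ||theta - (z - u_a)||^2
            theta[a] = local_solve(a, z - u[a], rho)
        # Agents transmit quantized messages; the coordinator averages them.
        z = np.mean(quantize(theta + u), axis=0)
        u += theta - z
    return z
```

Each agent would plug its own data-fit term into local_solve; quantizing the transmitted vectors is what trades a little accuracy for lower communication cost.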
arXiv Detail & Related papers (2023-09-15T07:05:33Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - Distributed Adversarial Training to Robustify Deep Neural Networks at
Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach known as adversarial training (AT) has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
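To make the AT idea concrete, here is a minimal single-machine adversarial training loop for a linear logistic model with an FGSM-style perturbation; the large-batch, multi-machine framework of the paper is not reproduced, and the model, attack, and hyperparameters are assumptions.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fgsm_linear(w, x, y, eps=0.1):
    """FGSM perturbation for the logistic loss of a linear model,
    a simple stand-in for the stronger attacks used in practice."""
    grad_x = -y * sigmoid(-y * (w @ x)) * w      # d loss / d x
    return x + eps * np.sign(grad_x)

def adversarial_training(X, y, eps=0.1, lr=0.05, epochs=20):
    """Minimal single-machine AT loop: perturb each input, then train on it."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in range(n):
            x_adv = fgsm_linear(w, X[i], y[i], eps)                 # craft adversarial input
            grad_w = -y[i] * sigmoid(-y[i] * (w @ x_adv)) * x_adv   # d loss / d w at x_adv
            w -= lr * grad_w
    return w
```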
arXiv Detail & Related papers (2022-06-13T15:39:43Z) - Distributionally Robust Federated Averaging [19.875176871167966]
We present communication-efficient distributed algorithms for distributionally robust learning that perform periodic averaging with adaptive sampling.
We give corroborating experimental evidence for our theoretical results in federated learning settings.
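The sketch below shows the plain local-SGD-with-periodic-averaging skeleton that such algorithms build on; the adaptive client sampling and the distributionally robust (min-max) objective of the paper are not modeled, and all names and hyperparameters are illustrative.

```python
import numpy as np

def periodic_averaging(client_data, grad_fn, dim, rounds=20, local_steps=5, lr=0.1, rng=None):
    """Local SGD with periodic averaging: clients take a few local steps,
    then the server averages their models (one communication round)."""
    rng = rng or np.random.default_rng(0)
    w_global = np.zeros(dim)
    for _ in range(rounds):
        local_models = []
        for X, y in client_data:                      # the paper would sample clients adaptively
            w = w_global.copy()
            for _ in range(local_steps):
                i = rng.integers(len(y))
                w -= lr * grad_fn(w, X[i], y[i])      # one local SGD step
            local_models.append(w)
        w_global = np.mean(local_models, axis=0)      # periodic averaging
    return w_global
```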
arXiv Detail & Related papers (2021-02-25T03:32:09Z) - Adaptive Serverless Learning [114.36410688552579]
We propose a novel adaptive decentralized training approach, which can compute the learning rate from data dynamically.
Our theoretical results reveal that the proposed algorithm can achieve linear speedup with respect to the number of workers.
To reduce the communication overhead, we further propose a communication-efficient adaptive decentralized training approach.
arXiv Detail & Related papers (2020-08-24T13:23:02Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study a distributed algorithm for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires a much smaller number of communication rounds while retaining theoretical guarantees.
Our experiments on several datasets demonstrate the effectiveness of the proposed method and corroborate the theory.
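For orientation, the snippet below shows only the basic pairwise squared surrogate that AUC maximization optimizes, evaluated with a linear scorer on synthetic data; the paper's distributed min-max reformulation for deep networks is not reproduced, and all data and names are illustrative.

```python
import numpy as np

def pairwise_auc_loss(scores_pos, scores_neg):
    """Squared pairwise surrogate for AUC: penalizes negatives scored
    within a unit margin of positives."""
    diffs = scores_pos[:, None] - scores_neg[None, :]   # all positive-negative pairs
    return np.mean(np.maximum(0.0, 1.0 - diffs) ** 2)

# Toy example with a linear scorer w @ x on synthetic data.
rng = np.random.default_rng(0)
w = rng.normal(size=5)
X_pos, X_neg = rng.normal(1.0, 1.0, (8, 5)), rng.normal(-1.0, 1.0, (10, 5))
print(pairwise_auc_loss(X_pos @ w, X_neg @ w))
```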
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - Large Batch Training Does Not Need Warmup [111.07680619360528]
Training deep neural networks using a large batch size has shown promising results and benefits many real-world applications.
In this paper, we propose a novel Complete Layer-wise Adaptive Rate Scaling (CLARS) algorithm for large-batch training.
Based on our analysis, we bridge the gap and illustrate the theoretical insights for three popular large-batch training techniques.
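CLARS itself is not specified in this summary, so the sketch shows the generic layer-wise trust-ratio scaling (as in LARS) that layer-wise adaptive-rate methods build on; the trust coefficient and the exact update form are assumptions.

```python
import numpy as np

def layerwise_scaled_update(params, grads, base_lr=0.01, trust_coef=0.001, eps=1e-8):
    """Layer-wise adaptive rate scaling (LARS-style): each layer's step is scaled
    by ||w|| / ||g|| so large-batch updates stay proportionate per layer."""
    new_params = []
    for w, g in zip(params, grads):
        w_norm, g_norm = np.linalg.norm(w), np.linalg.norm(g)
        local_lr = trust_coef * w_norm / (g_norm + eps) if w_norm > 0 else 1.0
        new_params.append(w - base_lr * local_lr * g)
    return new_params
```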
arXiv Detail & Related papers (2020-02-04T23:03:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.