Local Adaptivity of Gradient Boosting in Histogram Transform Ensemble
Learning
- URL: http://arxiv.org/abs/2112.02589v1
- Date: Sun, 5 Dec 2021 14:56:56 GMT
- Title: Local Adaptivity of Gradient Boosting in Histogram Transform Ensemble
Learning
- Authors: Hanyuan Hang
- Abstract summary: We propose a gradient boosting algorithm called \textit{adaptive boosting histogram transform} (\textit{ABHT}) for regression.
We show that our ABHT can filter out the regions with different orders of smoothness.
- Score: 5.241402683680909
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a gradient boosting algorithm called
\textit{adaptive boosting histogram transform} (\textit{ABHT}) for regression
to illustrate the local adaptivity of gradient boosting algorithms in histogram
transform ensemble learning. From the theoretical perspective, when the target
function lies in a locally H\"older continuous space, we show that our ABHT can
filter out the regions with different orders of smoothness. Consequently, we
are able to prove that the upper bound of the convergence rates of ABHT is
strictly smaller than the lower bound of \textit{parallel ensemble histogram
transform} (\textit{PEHT}). In the experiments, both synthetic and real-world
data experiments empirically validate the theoretical results, which
demonstrates the advantageous performance and local adaptivity of our ABHT.
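As a rough illustration of the mechanism (not the paper's exact construction), the sketch below boosts piecewise-constant fits over randomly rotated and shifted histogram grids under squared loss, where the negative gradient is simply the residual; the rotation, bin width, and shrinkage choices are illustrative assumptions.

```python
# Minimal sketch: gradient boosting with random-histogram-transform base
# learners, in the spirit of ABHT. Squared loss => each round fits residuals.
# Rotation, shift, bin width, and shrinkage are illustrative assumptions.
import numpy as np


class HistogramTransformRegressor:
    """Piecewise-constant regressor on a randomly rotated, shifted grid."""

    def __init__(self, bin_width, rng):
        self.bin_width = bin_width
        self.rng = rng

    def fit(self, X, y):
        d = X.shape[1]
        # Random rotation: the Q factor of a Gaussian matrix is orthogonal.
        self.rotation, _ = np.linalg.qr(self.rng.normal(size=(d, d)))
        self.shift = self.rng.uniform(0.0, self.bin_width, size=d)
        sums, counts = {}, {}
        for key, target in zip(self._keys(X), y):
            sums[key] = sums.get(key, 0.0) + target
            counts[key] = counts.get(key, 0) + 1
        self.cell_means = {k: sums[k] / counts[k] for k in sums}
        self.fallback = float(np.mean(y))  # for cells empty at test time
        return self

    def predict(self, X):
        return np.array([self.cell_means.get(k, self.fallback)
                         for k in self._keys(X)])

    def _keys(self, X):
        Z = np.floor((X @ self.rotation + self.shift) / self.bin_width)
        return [tuple(row) for row in Z.astype(int)]


def boost(X, y, n_rounds=100, lr=0.1, bin_width=0.25, seed=0):
    """Gradient boosting under squared loss: each round fits the residuals."""
    rng = np.random.default_rng(seed)
    offset = float(np.mean(y))
    pred = np.full(len(y), offset)
    learners = []
    for _ in range(n_rounds):
        h = HistogramTransformRegressor(bin_width, rng).fit(X, y - pred)
        pred += lr * h.predict(X)
        learners.append(h)

    def predict(X_new):
        out = np.full(len(X_new), offset)
        for h in learners:
            out += lr * h.predict(X_new)
        return out

    return predict
```

On data whose smoothness varies across regions, later rounds concentrate their corrections where residuals remain large, which loosely mirrors the local-adaptivity effect the paper formalizes.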
Related papers
- Enabling Tensor Decomposition for Time-Series Classification via A Simple Pseudo-Laplacian Contrast [26.28414569796961]
We propose a novel Pseudo-Laplacian Contrast (PLC) tensor decomposition framework.
It integrates data augmentation and a cross-view Laplacian to enable the extraction of class-aware representations.
Experiments on various datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-09-23T16:48:13Z)
- Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}\big(\ln(T) / T^{1-\frac{1}{\alpha}}\big)$.
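For reference, a minimal centralized AdaGrad update is sketched below; the over-the-air aggregation and channel-noise modeling that drive the paper's $\alpha$-dependent rate are omitted.

```python
# Minimal AdaGrad sketch (centralized; the paper's over-the-air aggregation
# and channel-noise modeling are omitted here).
import numpy as np

def adagrad(grad_fn, x0, lr=0.1, eps=1e-8, steps=1000):
    x = np.asarray(x0, dtype=float).copy()
    accum = np.zeros_like(x)              # running sum of squared gradients
    for _ in range(steps):
        g = grad_fn(x)
        accum += g * g
        x -= lr * g / (np.sqrt(accum) + eps)  # per-coordinate step size
    return x

# Example: minimize f(x) = ||x||^2 / 2, whose gradient is x.
x_star = adagrad(lambda x: x, x0=np.ones(5))
```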
arXiv Detail & Related papers (2024-03-11T09:10:37Z)
- Entropy Transformer Networks: A Learning Approach via Tangent Bundle Data Manifold [8.893886200299228]
This paper focuses on an accurate and fast approach for image transformation employed in the design of CNN architectures.
A novel Entropy STN (ESTN) is proposed that interpolates on the data manifold distributions.
Experiments on challenging benchmarks show that the proposed ESTN can improve predictive accuracy over a range of computer vision tasks.
arXiv Detail & Related papers (2023-07-24T04:21:51Z) - Gradient Boosted Binary Histogram Ensemble for Large-scale Regression [60.16351608335641]
We propose a gradient boosting algorithm for large-scale regression problems called textitGradient Boosted Binary Histogram Ensemble (GBBHE) based on binary histogram partition and ensemble learning.
In the experiments, compared with other state-of-the-art algorithms such as gradient boosted regression tree (GBRT), our GBBHE algorithm shows promising performance with less running time on large-scale datasets.
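The boosting loop has the same shape as the histogram-transform sketch above; the difference is the binary histogram partition used as the base learner. A minimal sketch of such a partition follows, with the midpoint split rule and depth cap as illustrative assumptions.

```python
# Minimal sketch of a binary-histogram-partition base learner: recursively
# split a cell at the midpoint of a randomly chosen coordinate, then predict
# the mean response of the leaf cell. Depth and split rule are illustrative.
import numpy as np

def fit_binary_histogram(X, y, depth, rng):
    if depth == 0 or len(y) <= 1:
        return float(np.mean(y))                  # leaf: cell mean
    j = rng.integers(X.shape[1])                  # random coordinate
    t = (X[:, j].min() + X[:, j].max()) / 2.0     # midpoint split
    left = X[:, j] <= t
    if left.all() or not left.any():              # degenerate split: stop
        return float(np.mean(y))
    return (j, t,
            fit_binary_histogram(X[left], y[left], depth - 1, rng),
            fit_binary_histogram(X[~left], y[~left], depth - 1, rng))

def predict_one(node, x):
    while isinstance(node, tuple):                # walk down to a leaf
        j, t, lo, hi = node
        node = lo if x[j] <= t else hi
    return node
```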
arXiv Detail & Related papers (2021-06-03T17:05:40Z)
- Efficient Semi-Implicit Variational Inference [65.07058307271329]
We propose an efficient and scalable method for semi-implicit variational inference (SIVI).
Our method optimizes a rigorous lower bound on SIVI's evidence.
arXiv Detail & Related papers (2021-01-15T11:39:09Z)
- Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of iteration complexity and function query cost.
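A two-point zeroth-order gradient estimator with coordinate importance sampling can be sketched as follows; the sampling distribution `probs` and the step sizes are placeholders for the paper's more careful design.

```python
# Minimal sketch of a coordinate-wise two-point zeroth-order gradient
# estimator. `probs` plays the role of coordinate importance sampling;
# the uniform default and the smoothing parameter mu are illustrative.
import numpy as np

def zo_gradient(f, x, mu=1e-4, n_samples=5, probs=None, rng=None):
    rng = rng or np.random.default_rng()
    d = len(x)
    probs = np.full(d, 1.0 / d) if probs is None else probs
    g = np.zeros(d)
    for j in rng.choice(d, size=n_samples, replace=True, p=probs):
        e = np.zeros(d); e[j] = 1.0
        # Two function queries per sampled coordinate (finite difference).
        partial = (f(x + mu * e) - f(x - mu * e)) / (2 * mu)
        # Importance reweighting keeps the estimate unbiased for the
        # finite-difference gradient.
        g[j] += partial / (n_samples * probs[j])
    return g

def zo_descent(f, x0, lr=0.05, steps=200):
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x -= lr * zo_gradient(f, x)
    return x
```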
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
- Feature Whitening via Gradient Transformation for Improved Convergence [3.5579740292581]
We address the complexity drawbacks of feature whitening.
We derive an equivalent method that replaces the sample transformations with a transformation applied to the weight gradients of every batch of B samples.
We exemplify the proposed algorithms with ResNet-based networks for image classification on the CIFAR and ImageNet datasets.
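For a linear layer, whitening the input samples is equivalent (up to reparameterization) to right-preconditioning the weight gradient with the inverse input covariance; the sketch below shows that gradient-side transformation, with the ridge term as an illustrative stabilizer rather than part of the paper's method.

```python
# Minimal sketch: instead of whitening the B input samples, precondition the
# weight gradient of a linear layer with the (regularized) inverse input
# covariance computed on the batch. The ridge term eps is an illustrative
# assumption for numerical stability.
import numpy as np

def whitened_gradient(grad_w, X_batch, eps=1e-3):
    """grad_w: (out, in) weight gradient; X_batch: (B, in) batch inputs."""
    Xc = X_batch - X_batch.mean(axis=0, keepdims=True)
    cov = Xc.T @ Xc / len(X_batch) + eps * np.eye(X_batch.shape[1])
    return grad_w @ np.linalg.inv(cov)   # right-precondition by cov^{-1}

# SGD step using the transformed gradient:
# W -= lr * whitened_gradient(grad_w, X_batch)
```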
arXiv Detail & Related papers (2020-10-04T11:30:20Z)
- Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to solve problems with one variable under the sparsity constraint.
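A minimal sketch of this coupling on the bilinear objective $\min_{A,x} \frac{1}{2}\|y - Ax\|_2^2 + \lambda\|x\|_1$: both variables are updated in the same iteration through the shared residual, and a soft-threshold step enforces the sparsity constraint on $x$. Step sizes and the proximal choice are illustrative, not the paper's exact CoGD projection.

```python
# Minimal sketch of synchronous gradient updates for the bilinear problem
#   min_{A,x} 0.5 * ||y - A x||^2 + lam * ||x||_1 .
# Step sizes and the soft-threshold proximal step are illustrative choices.
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def cogd_like(y, m, n, lam=0.01, lr=1e-2, steps=2000, seed=0):
    rng = np.random.default_rng(seed)
    A = rng.normal(scale=0.1, size=(m, n))
    x = rng.normal(scale=0.1, size=n)
    for _ in range(steps):
        r = A @ x - y                  # shared residual couples A and x
        grad_A = np.outer(r, x)        # d/dA of 0.5 * ||r||^2
        grad_x = A.T @ r               # d/dx of 0.5 * ||r||^2
        A -= lr * grad_A               # synchronous update of both variables
        x = soft_threshold(x - lr * grad_x, lr * lam)  # prox step for ||x||_1
    return A, x
```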
arXiv Detail & Related papers (2020-06-16T13:41:54Z)
- Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of the Optimistic Adagrad algorithm for nonconcave min-max problems.
Our experiments show that the advantage of adaptive over non-adaptive gradient algorithms in GAN training can be observed empirically.
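For context, an optimistic gradient step extrapolates with the previous gradient; the minimal non-adaptive sketch below shows the update on $\min_x \max_y f(x,y)$, with the Adagrad-style per-coordinate scaling analyzed in the paper omitted.

```python
# Minimal sketch of optimistic gradient descent-ascent on min_x max_y f(x, y);
# the extrapolation term (2*g_t - g_{t-1}) is the "optimistic" correction.
# The Adagrad-style per-coordinate scaling from the paper is omitted.
import numpy as np

def ogda(grad_x, grad_y, x, y, lr=0.05, steps=500):
    x, y = np.asarray(x, float).copy(), np.asarray(y, float).copy()
    gx_prev, gy_prev = grad_x(x, y), grad_y(x, y)
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x -= lr * (2 * gx - gx_prev)   # descent on x with optimism
        y += lr * (2 * gy - gy_prev)   # ascent on y with optimism
        gx_prev, gy_prev = gx, gy
    return x, y

# Example: the bilinear game f(x, y) = x . y, where plain GDA cycles
# but OGDA converges toward the equilibrium (0, 0).
x, y = ogda(lambda x, y: y, lambda x, y: x, x=np.ones(2), y=np.ones(2))
```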
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.