Condensed Gradient Boosting
- URL: http://arxiv.org/abs/2211.14599v2
- Date: Tue, 14 May 2024 07:51:38 GMT
- Title: Condensed Gradient Boosting
- Authors: Seyedsaman Emami, Gonzalo Martínez-Muñoz
- Abstract summary: We propose the use of multi-output regressors as base models to handle the multi-class problem as a single task.
An extensive comparison with other multi-output based gradient boosting methods is carried out in terms of generalization and computational efficiency.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper presents a computationally efficient variant of gradient boosting for multi-class classification and multi-output regression tasks. Standard gradient boosting uses a one-vs-all strategy for classification tasks with more than two classes, which requires training one tree per class at every iteration. In this work, we propose the use of multi-output regressors as base models to handle the multi-class problem as a single task. In addition, the proposed modification allows the model to learn multi-output regression problems. An extensive comparison with other multi-output based gradient boosting methods is carried out in terms of generalization and computational efficiency. The proposed method showed the best trade-off between generalization ability and training and prediction speeds.
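To make the approach concrete, here is a minimal sketch of condensed gradient boosting for multi-class classification, written from the abstract alone. It is not the authors' implementation: the class name and hyperparameters are illustrative, and scikit-learn's DecisionTreeRegressor stands in as the multi-output base regressor because a single tree can fit all class columns at once.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


class CondensedGB:
    """Sketch: one multi-output tree per boosting iteration (not one per class)."""

    def __init__(self, n_estimators=100, learning_rate=0.1, max_depth=3):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.max_depth = max_depth
        self.trees = []

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        K = len(self.classes_)
        Y = np.eye(K)[np.searchsorted(self.classes_, y)]  # one-hot targets
        F = np.zeros((len(X), K))                         # raw scores
        for _ in range(self.n_estimators):
            P = np.exp(F - F.max(axis=1, keepdims=True))
            P /= P.sum(axis=1, keepdims=True)             # softmax probabilities
            # Negative gradient of the multi-class cross-entropy loss; a single
            # multi-output tree fits all K residual columns jointly (leaf-value
            # line search is omitted to keep the sketch short).
            tree = DecisionTreeRegressor(max_depth=self.max_depth)
            tree.fit(X, Y - P)
            F += self.learning_rate * tree.predict(X)
            self.trees.append(tree)
        return self

    def predict(self, X):
        F = sum(self.learning_rate * t.predict(X) for t in self.trees)
        return self.classes_[np.argmax(F, axis=1)]
```

A standard gradient boosting classifier would instead fit K separate regression trees to those K residual columns at every iteration; fitting one multi-output tree per iteration is the condensation, and it is where the reported training and prediction speedups come from.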
Related papers
- Achieving More with Less: A Tensor-Optimization-Powered Ensemble Method [53.170053108447455]
Ensemble learning is a method that leverages weak learners to produce a strong learner.
We design a smooth and convex objective function that leverages the concept of margin, making the strong learner more discriminative.
We then compare our algorithm with random forests of ten times the size and other classical methods across numerous datasets.
arXiv Detail & Related papers (2024-08-06T03:42:38Z)
- Accelerating Inference in Large Language Models with a Unified Layer Skipping Strategy [67.45518210171024]
Dynamic computation methods have shown notable acceleration for Large Language Models (LLMs) by skipping several layers of computations.
We propose a Unified Layer Skipping strategy, which selects the number of layers to skip computation based solely on the target speedup ratio.
Experimental results on two common tasks, i.e., machine translation and text summarization, indicate that given a target speedup ratio, the Unified Layer Skipping strategy significantly enhances both the inference performance and the actual model throughput.
arXiv Detail & Related papers (2024-04-10T12:12:07Z)
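As a rough illustration of the Unified Layer Skipping entry above (a sketch from the summary only, not the paper's algorithm: keeping the first and last layers and spacing the kept layers uniformly are assumptions of this sketch):

```python
import numpy as np


def layers_to_keep(num_layers: int, speedup: float) -> list[int]:
    """Choose a static set of layers to execute for a target speedup ratio."""
    # Keep roughly num_layers / speedup layers; every input then passes
    # through the same reduced stack, unlike per-token dynamic skipping.
    n_keep = max(2, round(num_layers / speedup))
    idx = np.linspace(0, num_layers - 1, n_keep).round().astype(int)
    return sorted(set(idx.tolist()))


# Example: a 32-layer model with a 2x target speedup executes ~16 layers.
print(layers_to_keep(32, 2.0))  # [0, 2, 4, ..., 31]
```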
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that requires no prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Adapting tree-based multiple imputation methods for multi-level data? A simulation study [0.0]
This simulation study evaluates the effectiveness of multiple imputation techniques for multilevel data.
It compares the performance of traditional Multiple Imputation by Chained Equations (MICE) with tree-based methods.
arXiv Detail & Related papers (2024-01-25T13:12:50Z)
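For flavor, a hedged scikit-learn analogue of the comparison in the simulation study above (this ignores the multilevel structure entirely; IterativeImputer with its default linear estimator plays the role of MICE, and an ExtraTreesRegressor estimator plays the role of the tree-based methods):

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 0] += X[:, 1]                  # correlate columns so imputation can work
X_miss = X.copy()
mask = rng.random(X.shape) < 0.2    # delete ~20% of values at random
X_miss[mask] = np.nan

imputers = {
    "MICE-like (linear)": IterativeImputer(random_state=0),
    "tree-based": IterativeImputer(
        estimator=ExtraTreesRegressor(n_estimators=50, random_state=0),
        random_state=0,
    ),
}
for name, imp in imputers.items():
    X_hat = imp.fit_transform(X_miss)
    rmse = np.sqrt(np.mean((X_hat[mask] - X[mask]) ** 2))
    print(f"{name}: RMSE on the imputed entries = {rmse:.3f}")
```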
- Class Gradient Projection For Continual Learning [99.105266615448]
Catastrophic forgetting is one of the most critical challenges in Continual Learning (CL).
We propose Class Gradient Projection (CGP), which calculates the gradient subspace from individual classes rather than tasks.
arXiv Detail & Related papers (2023-11-25T02:45:56Z)
- Generalization for multiclass classification with overparameterized linear models [3.3434274586532515]
We show that multiclass classification behaves like binary classification in that, as long as there are not too many classes, it is possible to generalize well.
Besides various technical challenges, it turns out that the key difference from the binary classification setting is that there are relatively fewer positive training examples of each class in the multiclass setting as the number of classes increases.
arXiv Detail & Related papers (2022-06-03T05:52:43Z)
- GuideBP: Guiding Backpropagation Through Weaker Pathways of Parallel Logits [6.764324841419295]
The proposed approach guides the gradients of backpropagation along the weakest concept representations.
A weakness score defines the class-specific performance of individual pathways, which is then used to create a logit.
The proposed approach has been shown to perform better than traditional column merging techniques.
arXiv Detail & Related papers (2021-04-23T14:14:00Z)
- Prior Guided Feature Enrichment Network for Few-Shot Segmentation [64.91560451900125]
State-of-the-art semantic segmentation methods require sufficient labeled data to achieve good results.
Few-shot segmentation is proposed to tackle this problem by learning a model that quickly adapts to new classes with a few labeled support samples.
These frameworks still face the challenge of reduced generalization to unseen classes due to inappropriate use of high-level semantic information.
arXiv Detail & Related papers (2020-08-04T10:41:32Z)
- A Multilevel Approach to Training [0.0]
We propose a novel training method based on nonlinear multilevel techniques, commonly used for solving discretized large scale partial differential equations.
Our multilevel training method constructs a multilevel hierarchy by reducing the number of samples.
The training of the original model is then enhanced by internally training surrogate models constructed with fewer samples.
arXiv Detail & Related papers (2020-06-28T13:34:48Z)
- Variance Reduction with Sparse Gradients [82.41780420431205]
Variance reduction methods such as SVRG and SpiderBoost use a mixture of large and small batch gradients.
We introduce a new sparsity operator: the random-top-k operator.
Our algorithm consistently outperforms SpiderBoost on various tasks including image classification, natural language processing, and sparse matrix factorization.
arXiv Detail & Related papers (2020-01-27T08:23:58Z)
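The summary above names the random-top-k operator but not its definition; the sketch below shows one plausible reading (keep the k_top largest-magnitude gradient coordinates plus k_rand coordinates sampled at random from the rest; the paper's exact combination may differ):

```python
import numpy as np


def random_top_k(g: np.ndarray, k_top: int, k_rand: int, rng) -> np.ndarray:
    """Sparsify a 1-D gradient: largest entries plus a random sample."""
    out = np.zeros_like(g)
    top = np.argsort(np.abs(g))[-k_top:]         # largest-magnitude entries
    rest = np.setdiff1d(np.arange(g.size), top)  # everything else
    rand = rng.choice(rest, size=k_rand, replace=False)
    keep = np.concatenate([top, rand])
    out[keep] = g[keep]                          # all other entries stay zero
    return out


rng = np.random.default_rng(0)
g = rng.normal(size=20)
print(random_top_k(g, k_top=3, k_rand=2, rng=rng))
```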