Improve Deep Forest with Learnable Layerwise Augmentation Policy Schedule
- URL: http://arxiv.org/abs/2309.09030v1
- Date: Sat, 16 Sep 2023 15:54:25 GMT
- Title: Improve Deep Forest with Learnable Layerwise Augmentation Policy Schedule
- Authors: Hongyu Zhu, Sichu Liang, Wentao Hu, Fang-Qi Li, Yali Yuan, Shi-Lin
Wang, Guang Cheng
- Abstract summary: This paper presents an optimized Deep Forest, featuring learnable, layerwise data augmentation policy schedules.
We introduce the Cut Mix for Tabular data (CMT) augmentation technique to mitigate overfitting and develop a population-based search algorithm to tailor augmentation intensity for each layer.
Experimental results show that our method sets new state-of-the-art (SOTA) benchmarks in various classification tasks, outperforming shallow tree ensembles, deep forests, deep neural networks, and AutoML competitors.
- Score: 22.968268349995853
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a modern ensemble technique, Deep Forest (DF) employs a cascading
structure to construct deep models, providing stronger representational power
compared to traditional decision forests. However, its greedy multi-layer
learning procedure is prone to overfitting, limiting model effectiveness and
generalizability. This paper presents an optimized Deep Forest, featuring
learnable, layerwise data augmentation policy schedules. Specifically, we
introduce the Cut Mix for Tabular data (CMT) augmentation technique to mitigate
overfitting and develop a population-based search algorithm to tailor
augmentation intensity for each layer. Additionally, we propose to incorporate
outputs from intermediate layers into a checkpoint ensemble for more stable
performance. Experimental results show that our method sets new
state-of-the-art (SOTA) benchmarks in various tabular classification tasks,
outperforming shallow tree ensembles, deep forests, deep neural networks, and
AutoML competitors. The learned policies also transfer effectively to Deep
Forest variants, underscoring the approach's potential for enhancing
non-differentiable deep learning modules in tabular signal processing.
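The abstract does not detail CMT's mechanics; as a hedged sketch, a CutMix-style augmentation for tabular rows might replace a random subset of feature columns in each sample with the values of a randomly chosen donor row, with a swap-fraction knob standing in for the per-layer intensity that the population-based search would tune (all names here are illustrative, not the paper's API):

```python
import numpy as np

def cutmix_tabular(X, y, intensity=0.2, rng=None):
    """CutMix-style augmentation for tabular data (illustrative sketch):
    for each row, swap a random subset of feature columns with those of
    another randomly chosen donor row. `intensity` is the fraction of
    features replaced, a hypothetical stand-in for the per-layer
    augmentation intensity the paper's search algorithm learns."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    k = max(1, int(round(intensity * d)))  # features to swap per row
    partners = rng.integers(0, n, size=n)  # donor row for each sample
    X_aug = X.copy()
    for i in range(n):
        cols = rng.choice(d, size=k, replace=False)
        X_aug[i, cols] = X[partners[i], cols]
    return X_aug, y  # labels kept unchanged in this simplified version
```

Note that image CutMix also mixes the labels proportionally to the cut area; keeping labels unchanged here is a simplifying assumption, not necessarily what CMT does.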
Related papers
- Informed deep hierarchical classification: a non-standard analysis inspired approach [0.0]
It consists of a multi-output deep neural network equipped with specific projection operators placed before each output layer.
The design of such an architecture, called lexicographic hybrid deep neural network (LH-DNN), was made possible by combining tools from different and quite distant research fields.
To assess the efficacy of the approach, the resulting network is compared against the B-CNN, a convolutional neural network tailored for hierarchical classification tasks.
arXiv Detail & Related papers (2024-09-25T14:12:50Z)
- Modern Neighborhood Components Analysis: A Deep Tabular Baseline Two Decades Later [59.88557193062348]
We revisit the classic Neighborhood Component Analysis (NCA), designed to learn a linear projection that captures semantic similarities between instances.
We find that minor modifications, such as adjustments to the learning objectives and the integration of deep learning architectures, significantly enhance NCA's performance.
We also introduce a neighbor sampling strategy that improves both the efficiency and predictive accuracy of our proposed ModernNCA.
arXiv Detail & Related papers (2024-07-03T16:38:57Z)
- ForensicsForest Family: A Series of Multi-scale Hierarchical Cascade Forests for Detecting GAN-generated Faces [53.739014757621376]
We describe a simple and effective forest-based method set called ForensicsForest Family to detect GAN-generated faces.
ForensicsForest is a newly proposed Multi-scale Hierarchical Cascade Forest.
Hybrid ForensicsForest integrates CNN layers into the forest models.
Divide-and-Conquer ForensicsForest can construct a forest model using only a portion of the training samples.
arXiv Detail & Related papers (2023-08-02T06:41:19Z)
- Interpreting Deep Forest through Feature Contribution and MDI Feature Importance [6.475147482292634]
Deep forest is a non-differentiable deep model which has achieved impressive empirical success across a wide variety of applications.
Many of the application fields prefer explainable models, such as random forests with feature contributions that can provide local explanation for each prediction.
We propose feature contribution and MDI feature importance calculation tools for deep forest.
arXiv Detail & Related papers (2023-05-01T13:10:24Z)
- WLD-Reg: A Data-dependent Within-layer Diversity Regularizer [98.78384185493624]
Neural networks are composed of multiple layers arranged in a hierarchical structure jointly trained with a gradient-based optimization.
We propose to complement this traditional 'between-layer' feedback with additional 'within-layer' feedback to encourage the diversity of the activations within the same layer.
We present an extensive empirical study confirming that the proposed approach enhances the performance of several state-of-the-art neural network models in multiple tasks.
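As an illustration of "within-layer" feedback (a hypothetical penalty, not necessarily the paper's exact WLD-Reg formulation), one could penalize the mean pairwise cosine similarity between the activation patterns of units in the same layer:

```python
import numpy as np

def within_layer_diversity_penalty(H):
    """Hypothetical within-layer diversity regularizer: penalize the
    mean pairwise cosine similarity between the activation vectors of
    units in one layer. H has shape (units, batch); lower values mean
    more diverse (less redundant) units."""
    Hn = H / (np.linalg.norm(H, axis=1, keepdims=True) + 1e-12)
    S = Hn @ Hn.T                        # cosine similarity between units
    n = len(S)
    off = S[~np.eye(n, dtype=bool)]      # drop the self-similarity diagonal
    return off.mean()
```

Adding such a term to the task loss would push units within a layer toward decorrelated activations, which is the general idea the blurb describes.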
arXiv Detail & Related papers (2023-01-03T20:57:22Z)
- FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z)
- Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions, known as soft routing, rather than hard binary decisions.
Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve performance better than or comparable to [1] and [3].
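A minimal sketch of soft routing, assuming sigmoid split functions and a complete binary tree (the paper's actual parameterization may differ):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_tree_predict(x, weights, biases, leaf_dists):
    """Soft-routed complete binary tree: internal node i sends x right
    with probability sigmoid(w_i . x + b_i). Each leaf contributes its
    class distribution weighted by the product of routing probabilities
    along its root-to-leaf path. Nodes are stored breadth-first and
    leaves left-to-right (an illustrative layout, not the paper's)."""
    n_leaves = len(leaf_dists)
    depth = int(np.log2(n_leaves))
    probs = np.ones(n_leaves)
    for leaf in range(n_leaves):
        node = 0
        for level in range(depth):
            right = sigmoid(weights[node] @ x + biases[node])
            go_right = (leaf >> (depth - 1 - level)) & 1
            probs[leaf] *= right if go_right else (1.0 - right)
            node = 2 * node + 1 + go_right  # breadth-first child index
    return probs @ np.asarray(leaf_dists)   # mixture over leaf distributions
```

Because every routing decision is a probability rather than a hard threshold, the prediction is differentiable in the split parameters, which is what makes the soft-routed forest trainable end to end.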
arXiv Detail & Related papers (2020-12-29T18:05:05Z)
- Residual Likelihood Forests [19.97069303172077]
This paper presents a novel ensemble learning approach called Residual Likelihood Forests (RLF).
Our weak learners produce conditional likelihoods that are sequentially optimized using global loss in the context of previous learners.
When compared against several ensemble approaches including Random Forests and Gradient Boosted Trees, RLFs offer a significant improvement in performance.
arXiv Detail & Related papers (2020-11-04T00:59:41Z)
- Solving Sparse Linear Inverse Problems in Communication Systems: A Deep Learning Approach With Adaptive Depth [51.40441097625201]
We propose an end-to-end trainable deep learning architecture for sparse signal recovery problems.
The proposed method learns how many layers to execute to emit an output, and the network depth is dynamically adjusted for each task in the inference phase.
arXiv Detail & Related papers (2020-10-29T06:32:53Z)
- Streaming Active Deep Forest for Evolving Data Stream Classification [9.273077240506016]
Streaming Deep Forest (SDF) is a high-performance deep ensemble method specially adapted to stream classification.
We also present the Augmented Variable Uncertainty (AVU) active learning strategy to reduce the labeling cost in the streaming context.
arXiv Detail & Related papers (2020-02-26T22:00:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.