DTization: A New Method for Supervised Feature Scaling
- URL: http://arxiv.org/abs/2404.17937v1
- Date: Sat, 27 Apr 2024 15:25:03 GMT
- Title: DTization: A New Method for Supervised Feature Scaling
- Authors: Niful Islam
- Abstract summary: Feature scaling is a data pre-processing technique that improves the performance of machine learning algorithms.
We present a novel feature scaling technique named DTization that employs a decision tree and the robust scaler for supervised feature scaling.
The results show a noteworthy performance improvement compared to the traditional feature scaling methods.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Artificial intelligence is currently a dominant force in shaping various aspects of the world, and machine learning is a sub-field of artificial intelligence. Feature scaling is a data pre-processing technique that improves the performance of machine learning algorithms. Traditional feature scaling techniques are unsupervised: the dependent variable has no influence on the scaling process. In this paper, we present a novel feature scaling technique named DTization that employs a decision tree and the robust scaler for supervised feature scaling. The proposed method uses a decision tree to measure feature importance and, based on that importance, scales different features differently with the robust scaler algorithm. The method has been extensively evaluated on ten classification and regression datasets with various evaluation metrics, and the results show a noteworthy performance improvement over traditional feature scaling methods.
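As a rough illustration, a minimal scikit-learn sketch of the idea follows; the re-weighting rule used here (multiplying each robust-scaled column by its decision-tree importance) is an assumption for illustration, not the paper's exact formula.

```python
# Illustrative sketch of supervised, importance-weighted scaling in the spirit
# of DTization. The weighting rule (robust-scaled column times decision-tree
# importance) is an assumption; the paper's exact formula may differ.
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import RobustScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# 1. Supervised step: measure feature importance with a decision tree.
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
importance = tree.feature_importances_        # non-negative, sums to 1.0

# 2. Scale every feature with the robust scaler (median / IQR).
X_robust = RobustScaler().fit_transform(X)

# 3. Re-weight each scaled column by its importance, so more informative
#    features keep a larger spread after scaling.
X_dtized = X_robust * importance

print(X_dtized.shape, importance.round(3))
```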
Related papers
- Enhancing Feature Selection and Interpretability in AI Regression Tasks Through Feature Attribution [38.53065398127086]
This study investigates the potential of feature attribution methods to filter out uninformative features in input data for regression problems.
We introduce a feature selection pipeline that combines Integrated Gradients with k-means clustering to select an optimal set of variables from the initial data space.
To validate the effectiveness of this approach, we apply it to a real-world industrial problem - blade vibration analysis in the development process of turbo machinery.
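A hedged sketch of how such a pipeline might be assembled with Captum and scikit-learn; the model, the synthetic stand-in data, and the "keep the cluster with the larger mean attribution" rule are illustrative assumptions, not the paper's setup.

```python
# Hedged sketch of a feature-selection pipeline combining Integrated Gradients
# with k-means, loosely following the description above.
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients
from sklearn.cluster import KMeans

n_features = 20
model = nn.Sequential(nn.Linear(n_features, 16), nn.ReLU(), nn.Linear(16, 1))
X = torch.randn(256, n_features)  # stand-in for real regression inputs

# Attribute the model output to each input feature via Integrated Gradients.
ig = IntegratedGradients(model)
attr = ig.attribute(X, baselines=torch.zeros_like(X), target=0)

# Aggregate to one score per feature, then split features into two k-means
# clusters and keep the cluster with the larger mean attribution.
scores = attr.abs().mean(dim=0).detach().numpy().reshape(-1, 1)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scores)
keep = max(range(2), key=lambda c: scores[labels == c].mean())
selected = [i for i, lab in enumerate(labels) if lab == keep]
print("selected features:", selected)
```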
arXiv Detail & Related papers (2024-09-25T09:50:51Z)
- Adaptive Optimization Algorithms for Machine Learning [0.0]
Machine learning assumes a pivotal role in our data-driven world.
This thesis contributes novel insights, introduces new algorithms with improved convergence guarantees, and improves analyses of popular practical algorithms.
arXiv Detail & Related papers (2023-11-16T21:22:47Z)
- The choice of scaling technique matters for classification performance [6.745479230590518]
We compare the impact of 5 scaling techniques on the performance of 20 classification algorithms, covering both monolithic and ensemble models.
Results show that the performance difference between the best and the worst scaling technique is relevant and statistically significant in most cases.
We also show how the performance variation of an ensemble model, considering different scaling techniques, tends to be dictated by that of its base model.
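A hedged mini-version of this comparison follows (two classifiers and four scalers, not the paper's full 5-scaler / 20-classifier grid or its statistical tests):

```python
# Cross-validate two classifiers under several scaling techniques and compare.
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import (MaxAbsScaler, MinMaxScaler, RobustScaler,
                                   StandardScaler)

X, y = load_wine(return_X_y=True)
scalers = [None, StandardScaler(), MinMaxScaler(), RobustScaler(), MaxAbsScaler()]

for model in (LogisticRegression(max_iter=5000), KNeighborsClassifier()):
    for scaler in scalers:
        pipe = make_pipeline(scaler, model) if scaler else make_pipeline(model)
        score = cross_val_score(pipe, X, y, cv=5).mean()
        name = type(scaler).__name__ if scaler else "no scaling"
        print(f"{type(model).__name__:22s} {name:15s} {score:.3f}")
```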
arXiv Detail & Related papers (2022-12-23T13:51:45Z)
- Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
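For context, a minimal sketch of plain Q-learning with linear function approximation, the setting analysed here; the toy environment and random feature map are assumptions, and the paper's exploration variant is not reproduced.

```python
import numpy as np

n_states, n_actions, dim = 5, 2, 4
rng = np.random.default_rng(0)
phi = rng.normal(size=(n_states, n_actions, dim))  # fixed features phi(s, a)
w = np.zeros(dim)                                  # Q(s, a) = phi[s, a] @ w
alpha, gamma, eps = 0.1, 0.9, 0.1

s = 0
for _ in range(10_000):
    q = phi[s] @ w                                 # Q-values for all actions
    a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(q))
    s_next = rng.integers(n_states)                # toy random dynamics
    r = 1.0 if s_next == n_states - 1 else 0.0     # reward in the last state
    td_error = r + gamma * np.max(phi[s_next] @ w) - phi[s, a] @ w
    w += alpha * td_error * phi[s, a]              # TD(0) update on weights
    s = s_next

print("learned weights:", w.round(3))
```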
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
- MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework for Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
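To illustrate the problem setting only, here is a toy gradient-free counterfactual search: perturb an input at random until the model's prediction flips, keeping the closest flip found. This is a stand-in, not MACE's RL-based method.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=6, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X, y)

x = X[0]
target = 1 - clf.predict(x.reshape(1, -1))[0]   # the class we want to reach
rng = np.random.default_rng(0)

# Batch of random local perturbations around x.
cands = x + rng.normal(scale=0.5, size=(5000, x.size))
flips = clf.predict(cands) == target            # which candidates flip the label

if flips.any():
    dists = np.linalg.norm(cands[flips] - x, axis=1)  # prefer proximity
    print(f"closest counterfactual at distance {dists.min():.3f}")
else:
    print("no counterfactual found")
```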
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
- Adaptive Hierarchical Similarity Metric Learning with Noisy Labels [138.41576366096137]
We propose an Adaptive Hierarchical Similarity Metric Learning method.
It considers two types of noise-insensitive information, i.e., class-wise divergence and sample-wise consistency.
Our method achieves state-of-the-art performance compared with current deep metric learning approaches.
arXiv Detail & Related papers (2021-10-29T02:12:18Z)
- Powerpropagation: A sparsity inducing weight reparameterisation [65.85142037667065]
We introduce Powerpropagation, a new weight reparameterisation for neural networks that leads to inherently sparse models.
Models trained in this manner exhibit similar performance, but have a distribution with markedly higher density at zero, allowing more parameters to be pruned safely.
Here, we combine Powerpropagation with a traditional weight-pruning technique as well as recent state-of-the-art sparse-to-sparse algorithms, showing superior performance on the ImageNet benchmark.
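A hedged PyTorch sketch of the reparameterisation idea, under the commonly described sign-preserving power form w = phi * |phi|^(alpha - 1); the layer sizes and alpha = 2.0 are illustrative choices, not the paper's setup.

```python
# Train a latent parameter phi and use w = phi * |phi|**(alpha - 1) in the
# forward pass, so gradient magnitudes scale with |phi| and small weights
# drift towards exact zero.
import torch
import torch.nn as nn

class PowerpropLinear(nn.Module):
    def __init__(self, in_features, out_features, alpha=2.0):
        super().__init__()
        self.alpha = alpha
        self.phi = nn.Parameter(0.1 * torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        # Sign-preserving power reparameterisation of the effective weights.
        w = self.phi * self.phi.abs().pow(self.alpha - 1.0)
        return x @ w.t() + self.bias

layer = PowerpropLinear(8, 4)
out = layer(torch.randn(32, 8))
out.sum().backward()  # gradients flow through the reparameterisation
print("mean |grad| on phi:", layer.phi.grad.abs().mean().item())
```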
arXiv Detail & Related papers (2021-10-01T10:03:57Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Visualizing High-Dimensional Trajectories on the Loss-Landscape of ANNs [15.689418447376587]
Training artificial neural networks requires the optimization of highly non-convex loss functions in high-dimensional spaces.
Visualization tools have played a central role in uncovering key geometric characteristics of the loss landscape of ANNs.
We propose using a modern dimensionality reduction method that represents the SOTA in terms of both local and global structure.
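As a generic illustration of trajectory visualization (with PCA as a simple stand-in for the stronger reduction method the paper advocates), one can record flattened parameters during training and project the path to 2D:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
w = rng.normal(size=1000)          # stand-in for a network's flattened weights
trajectory = []
for _ in range(200):               # fake "training": noisy decay towards zero
    w = 0.99 * w + rng.normal(scale=0.01, size=w.shape)
    trajectory.append(w.copy())

coords = PCA(n_components=2).fit_transform(np.array(trajectory))
print(coords[:3])                  # a 2D path, ready for plotting
```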
arXiv Detail & Related papers (2021-01-31T16:30:50Z)
- Feature space approximation for kernel-based supervised learning [2.653409741248232]
The goal is to reduce the size of the training data, resulting in lower storage consumption and computational complexity.
We demonstrate significant improvements in comparison to the computation of data-driven predictions involving the full training data set.
The method is applied to classification and regression problems from different application areas such as image recognition, system identification, and oceanographic time series analysis.
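A generic illustration of the goal, using the standard Nyström approximation (a stand-in, not necessarily the paper's specific algorithm):

```python
# Approximate a kernel feature space with far fewer landmark points than
# training samples, so downstream training touches a much smaller basis.
from sklearn.datasets import load_digits
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)

# Approximate the RBF kernel feature space with 100 landmarks instead of
# carrying all ~1800 training samples through a kernel machine.
pipe = make_pipeline(
    Nystroem(kernel="rbf", gamma=0.02, n_components=100, random_state=0),
    LogisticRegression(max_iter=5000),
)
print(f"cv accuracy: {cross_val_score(pipe, X, y, cv=5).mean():.3f}")
```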
arXiv Detail & Related papers (2020-11-25T11:23:58Z)
- Dynamic Scale Training for Object Detection [111.33112051962514]
We propose a Dynamic Scale Training paradigm (abbreviated as DST) to mitigate the scale variation challenge in object detection.
Experimental results demonstrate the efficacy of our proposed DST towards scale variation handling.
It does not introduce inference overhead and could serve as a free lunch for general detection configurations.
arXiv Detail & Related papers (2020-04-26T16:48:17Z)