ABM: an automatic supervised feature engineering method for loss based
models based on group and fused lasso
- URL: http://arxiv.org/abs/2009.10498v1
- Date: Tue, 22 Sep 2020 12:42:22 GMT
- Title: ABM: an automatic supervised feature engineering method for loss based
models based on group and fused lasso
- Authors: Weijian Luo and Yongxian Long
- Abstract summary: A vital problem in solving classification or regression problems is applying feature engineering and variable selection to data before it is fed into models.
This paper proposes an end-to-end supervised cutting-point selection method based on group and fused lasso, together with an automatic variable selection effect.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A vital problem in solving classification or regression problems is applying
feature engineering and variable selection to data before it is fed into models. One
of the most popular feature engineering methods is to discretize a continuous
variable with some cutting points, which is referred to as binning. Good
cutting points are important for improving a model's ability, because good
binning can ignore noisy variance within a continuous variable's range while keeping
useful leveled information with well-ordered encodings. However, to the best of our
knowledge, the majority of cutting-point selection is done via researchers' domain
knowledge or naive methods such as equal-width or equal-frequency
cutting. In this paper we propose an end-to-end supervised cutting-point
selection method based on group and fused lasso, together with an automatic
variable selection effect. We name our method \textbf{ABM} (automatic binning
machine). We first cut each variable's range into fine grid bins and train the
model with our group and group fused lasso regularization on successive
bins. The method integrates feature engineering, variable selection, and
model training simultaneously. Another inspiring aspect is that the method is
flexible enough to be applied to a wide range of loss-function-based models,
including deep neural networks. We have also implemented the method in R and
open-sourced the code for other researchers. A Python version will reach the
community soon.
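
To make the binning-plus-fused-penalty idea concrete, here is a minimal sketch in Python (not the authors' released R package): it cuts one continuous variable into fine equal-width bins, one-hot encodes them, and fits a squared-loss model whose successive bin coefficients carry a fused-lasso penalty plus an L1 term, so adjacent bins with similar effects merge and the surviving jumps act as learned cut points. The function names, hyperparameters, and the plain subgradient solver are illustrative assumptions; a full ABM-style model would additionally place a group penalty over each variable's block of bins so that entire variables can be dropped.

```python
# Illustrative sketch only: fine-grid binning + fused-lasso regularization on
# successive bin coefficients. This is NOT the authors' ABM implementation;
# names, penalties, and the solver are simplified assumptions.
import numpy as np

def bin_one_hot(x, n_bins=50):
    """Cut a continuous variable into equal-width fine-grid bins and one-hot encode it."""
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    idx = np.clip(np.digitize(x, edges[1:-1]), 0, n_bins - 1)
    Z = np.zeros((len(x), n_bins))
    Z[np.arange(len(x)), idx] = 1.0
    return Z

def fit_fused_bins(Z, y, lam_fuse=0.01, lam_l1=0.001, lr=0.5, n_iter=5000):
    """Least squares on the bin indicators with a fused-lasso penalty on successive
    bin coefficients plus an L1 penalty, solved by plain subgradient descent."""
    n, p = Z.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = Z.T @ (Z @ beta - y) / n          # squared-loss gradient
        d = np.sign(np.diff(beta))               # subgradient of sum_j |beta_{j+1} - beta_j|
        fuse = np.zeros(p)
        fuse[:-1] -= d
        fuse[1:] += d
        beta -= lr * (grad + lam_fuse * fuse + lam_l1 * np.sign(beta))
    return beta

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 10.0, size=2000)
    # Piecewise-constant ground truth, so a small number of cut points suffices.
    y = np.where(x < 3, 0.0, np.where(x < 7, 1.0, 2.5)) + rng.normal(0.0, 0.2, size=2000)
    beta = fit_fused_bins(bin_one_hot(x, n_bins=50), y)
    # Adjacent bins whose coefficients fuse to (nearly) the same value merge;
    # the remaining large jumps indicate the learned cut points.
    print("approximate cut-point bin indices:", np.where(np.abs(np.diff(beta)) > 0.2)[0])
```

Running the script prints the bin indices at which the fitted coefficients jump, which correspond approximately to the breakpoints at x = 3 and x = 7 in the simulated data; bins between those jumps have been fused into common levels, which is the binning effect described in the abstract.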
Related papers
- Learning the Regularization Strength for Deep Fine-Tuning via a Data-Emphasized Variational Objective [4.453137996095194]
Grid search is computationally expensive, requires carving out a validation set, and requires practitioners to specify candidate values.
Our proposed technique overcomes all three disadvantages of grid search.
We demonstrate effectiveness on image classification tasks on several datasets, yielding heldout accuracy comparable to existing approaches.
arXiv Detail & Related papers (2024-10-25T16:32:11Z) - Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis [17.989809995141044]
We propose CCA Merge, which is based on Canonical Correlation Analysis.
We show that CCA Merge works significantly better than past methods when more than two models are merged.
arXiv Detail & Related papers (2024-07-07T14:21:04Z) - Feature Selection as Deep Sequential Generative Learning [50.00973409680637]
We develop a deep variational transformer model over a joint objective combining sequential reconstruction, variational, and performance evaluator losses.
Our model can distill feature selection knowledge and learn a continuous embedding space to map feature selection decision sequences into embedding vectors associated with utility scores.
arXiv Detail & Related papers (2024-03-06T16:31:56Z) - Merging by Matching Models in Task Parameter Subspaces [87.8712523378141]
Model merging aims to cheaply combine individual task-specific models into a single multitask model.
We formalize how this approach to model merging can be seen as solving a linear system of equations.
We show that using the conjugate gradient method can outperform closed-form solutions.
arXiv Detail & Related papers (2023-12-07T14:59:15Z) - Just One Byte (per gradient): A Note on Low-Bandwidth Decentralized
Language Model Finetuning Using Shared Randomness [86.61582747039053]
Language model training in distributed settings is limited by the communication cost of exchanging gradients.
We extend recent work using shared randomness to perform distributed fine-tuning with low bandwidth.
arXiv Detail & Related papers (2023-06-16T17:59:51Z) - Learning To Cut By Looking Ahead: Cutting Plane Selection via Imitation
Learning [80.45697245527019]
We show that a greedy selection rule explicitly looking ahead to select cuts that yield the best bound improvement delivers strong decisions for cut selection.
We propose a new neural architecture (NeuralCut) for imitation learning on the lookahead expert.
arXiv Detail & Related papers (2022-06-27T16:07:27Z) - A Framework and Benchmark for Deep Batch Active Learning for Regression [2.093287944284448]
We study active learning methods that adaptively select batches of unlabeled data for labeling.
We present a framework for constructing such methods out of (network-dependent) base kernels, kernel transformations, and selection methods.
Our proposed method outperforms the state-of-the-art on our benchmark, scales to large data sets, and works out-of-the-box without adjusting the network architecture or training code.
arXiv Detail & Related papers (2022-03-17T16:11:36Z) - A concise method for feature selection via normalized frequencies [0.0]
In this paper, a concise method is proposed for universal feature selection.
The proposed method uses a fusion of the filter method and the wrapper method, rather than a combination of them.
The evaluation results show that the proposed method outperformed several state-of-the-art related works in terms of accuracy, precision, recall, F-score and AUC.
arXiv Detail & Related papers (2021-06-10T15:29:54Z) - Embedded methods for feature selection in neural networks [0.0]
The black-box nature of models like neural networks negatively affects their interpretability, generalizability, and training time.
I propose two integrated approaches for feature selection that can be incorporated directly into the parameter learning.
I benchmarked both the methods against Permutation Feature Importance (PFI) - a general-purpose feature ranking method and a random baseline.
arXiv Detail & Related papers (2020-10-12T16:33:46Z) - Stepwise Model Selection for Sequence Prediction via Deep Kernel
Learning [100.83444258562263]
We propose a novel Bayesian optimization (BO) algorithm to tackle the challenge of model selection in this setting.
In order to solve the resulting multiple black-box function optimization problem jointly and efficiently, we exploit potential correlations among black-box functions.
We are the first to formulate the problem of stepwise model selection (SMS) for sequence prediction, and to design and demonstrate an efficient joint-learning algorithm for this purpose.
arXiv Detail & Related papers (2020-01-12T09:42:19Z) - Model Fusion via Optimal Transport [64.13185244219353]
We present a layer-wise model fusion algorithm for neural networks.
We show that this can successfully yield "one-shot" knowledge transfer between neural networks trained on heterogeneous non-i.i.d. data.
arXiv Detail & Related papers (2019-10-12T22:07:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.