Toward Efficient Automated Feature Engineering
- URL: http://arxiv.org/abs/2212.13152v1
- Date: Mon, 26 Dec 2022 13:18:51 GMT
- Title: Toward Efficient Automated Feature Engineering
- Authors: Kafeng Wang, Pengyang Wang, and Chengzhong Xu
- Abstract summary: Automated Feature Engineering (AFE) refers to automatically generating and selecting optimal feature sets for downstream tasks.
Current AFE methods mainly focus on improving the effectiveness of the produced features but ignore the low-efficiency issue for large-scale deployment.
We construct the AFE pipeline based on reinforcement learning setting, where each feature is assigned an agent to perform feature transformation.
We conduct comprehensive experiments on 36 datasets in terms of both classification and regression tasks.
- Score: 27.47868891738917
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated Feature Engineering (AFE) refers to automatically
generating and selecting optimal feature sets for downstream tasks, which
has achieved great success in real-world applications. Current AFE methods
mainly focus on improving the effectiveness of the produced features but
ignore the low-efficiency issue for large-scale deployment. Therefore, in
this work, we propose a generic framework to improve the efficiency of AFE.
Specifically, we construct the AFE pipeline in a reinforcement learning
setting, where each feature is assigned an agent to perform feature
transformation and selection, and the evaluation score of the produced
features on downstream tasks serves as the reward to update the policy. We
improve the efficiency of AFE from two perspectives. On the one hand, we
develop a Feature Pre-Evaluation (FPE) Model to reduce the sample size and
feature size, the two main factors undermining the efficiency of feature
evaluation. On the other hand, we devise a two-stage policy training
strategy that runs FPE on the pre-evaluation task to initialize the policy,
avoiding training the policy from scratch. We conduct comprehensive
experiments on 36 datasets covering both classification and regression
tasks. The results show $2.9\%$ higher performance on average and 2x higher
computational efficiency compared to state-of-the-art AFE methods.
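To make the pipeline concrete, below is a minimal, illustrative Python sketch of an RL-style AFE loop in the spirit of the abstract: one agent per feature chooses a transformation, a cheap pre-evaluation on a random subsample (a stand-in for the paper's FPE idea of shrinking sample size) scores the candidate feature set, and that score is fed back as the reward. The `FeatureAgent` class, the epsilon-greedy bandit update, and the transformation set are assumptions made for illustration, not the paper's actual implementation; the two-stage policy training is omitted.

```python
# Illustrative sketch only, not the paper's implementation: an RL-style
# AFE loop where each feature's agent picks a transformation and a cheap
# subsampled evaluation provides the reward signal.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# A toy transformation set; the real action space would be richer.
TRANSFORMS = {
    "identity": lambda x: x,
    "square":   lambda x: x ** 2,
    "log1p":    lambda x: np.log1p(np.abs(x)),
    "sqrt":     lambda x: np.sqrt(np.abs(x)),
}

class FeatureAgent:
    """One agent per feature: an epsilon-greedy bandit over transforms."""
    def __init__(self, n_actions, eps=0.2):
        self.values = np.zeros(n_actions)  # running mean reward per action
        self.counts = np.zeros(n_actions)
        self.eps = eps

    def act(self, rng):
        if rng.random() < self.eps:                # explore
            return int(rng.integers(len(self.values)))
        return int(np.argmax(self.values))         # exploit

    def update(self, action, reward):
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]

def pre_evaluate(X, y, rng, sample_frac=0.3):
    """Cheap proxy evaluation on a subsample, echoing the FPE idea of
    reducing sample size before the expensive downstream evaluation."""
    n = min(len(X), max(30, int(sample_frac * len(X))))
    idx = rng.choice(len(X), size=n, replace=False)
    model = RandomForestClassifier(n_estimators=20, random_state=0)
    return cross_val_score(model, X[idx], y[idx], cv=3).mean()

def afe_loop(X, y, n_iters=30, seed=0):
    rng = np.random.default_rng(seed)
    names = list(TRANSFORMS)
    agents = [FeatureAgent(len(names)) for _ in range(X.shape[1])]
    best_score, best_actions = -np.inf, [0] * X.shape[1]
    for _ in range(n_iters):
        actions = [agent.act(rng) for agent in agents]
        X_new = np.column_stack(
            [TRANSFORMS[names[a]](X[:, j]) for j, a in enumerate(actions)]
        )
        reward = pre_evaluate(X_new, y, rng)  # evaluation score as reward
        for agent, a in zip(agents, actions):
            agent.update(a, reward)
        if reward > best_score:
            best_score, best_actions = reward, actions
    return best_score, [names[a] for a in best_actions]
```

Calling `afe_loop(X, y)` on a numeric feature matrix returns the best proxy score found and one transformation per feature. In the paper's setting, the cheap pre-evaluation additionally serves to warm-start the policy before full-cost training, which this sketch does not model.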
Related papers
- Learning Reward for Robot Skills Using Large Language Models via Self-Alignment [11.639973274337274]
Large Language Models (LLMs) contain valuable task-related knowledge that can potentially aid in the learning of reward functions.
We propose a method to learn rewards more efficiently in the absence of humans.
arXiv Detail & Related papers (2024-05-12T04:57:43Z)
- Adaptive Testing Environment Generation for Connected and Automated Vehicles with Dense Reinforcement Learning [7.6589102528398065]
We develop an adaptive testing environment that bolsters evaluation robustness by incorporating multiple surrogate models.
We propose the dense reinforcement learning method and devise a new adaptive policy with high sample efficiency.
arXiv Detail & Related papers (2024-02-29T15:42:33Z)
- Sample Efficient Reinforcement Learning by Automatically Learning to Compose Subtasks [3.1594865504808944]
We propose an RL algorithm that automatically structures the reward function for sample efficiency, given a set of labels that signify subtasks.
We evaluate our algorithm in a variety of sparse-reward environments.
arXiv Detail & Related papers (2024-01-25T15:06:40Z)
- VeCAF: Vision-language Collaborative Active Finetuning with Training Objective Awareness [56.87603097348203]
VeCAF uses labels and natural language annotations to perform parametric data selection for PVM finetuning.
VeCAF incorporates the finetuning objective to select significant data points that effectively guide the PVM towards faster convergence.
On ImageNet, VeCAF uses up to 3.3x fewer training batches to reach the target performance compared to full finetuning.
arXiv Detail & Related papers (2024-01-15T17:28:37Z)
- Pre-Training and Fine-Tuning Generative Flow Networks [61.90529626590415]
We introduce a novel approach for reward-free pre-training of GFlowNets.
By framing the training as a self-supervised problem, we propose an outcome-conditioned GFlowNet that learns to explore the candidate space.
We show that the pre-trained OC-GFN model can allow for a direct extraction of a policy capable of sampling from any new reward functions in downstream tasks.
arXiv Detail & Related papers (2023-10-05T09:53:22Z)
- Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning [53.97273491846883]
We propose DPE: an RL algorithm that blends offline sequence modeling and offline reinforcement learning with Double Policy Estimation.
We validate our method in multiple tasks of OpenAI Gym with D4RL benchmarks.
arXiv Detail & Related papers (2023-08-28T20:46:07Z)
- FedDUAP: Federated Learning with Dynamic Update and Adaptive Pruning Using Shared Data on the Server [64.94942635929284]
Federated Learning (FL) suffers from two critical challenges, i.e., limited computational resources and low training efficiency.
We propose a novel FL framework, FedDUAP, to exploit the insensitive data on the server and the decentralized data in edge devices.
By integrating the two original techniques, our proposed FL model, FedDUAP, significantly outperforms baseline approaches in terms of accuracy (up to 4.8% higher), efficiency (up to 2.8 times faster), and computational cost (up to 61.9% smaller).
arXiv Detail & Related papers (2022-04-25T10:00:00Z)
- Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples.
We present an efficient pretrain-transfer framework (PTF) baseline with no computational increment.
We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights.
arXiv Detail & Related papers (2022-03-23T06:24:31Z)
- Efficient Reinforced Feature Selection via Early Stopping Traverse Strategy [36.890295071860166]
We propose a single-agent Monte Carlo based reinforced feature selection (MCRFS) method.
We also propose two efficiency improvement strategies, i.e., early stopping (ES) strategy and reward-level interactive (RI) strategy.
arXiv Detail & Related papers (2021-09-29T03:51:13Z)
- APS: Active Pretraining with Successor Features [96.24533716878055]
We show that by reinterpreting and combining successor features with nonparametric entropy maximization, the intractable mutual information can be efficiently optimized.
The proposed method, Active Pretraining with Successor Features (APS), explores the environment via nonparametric entropy maximization, and the explored data can be efficiently leveraged to learn behavior.
arXiv Detail & Related papers (2021-08-31T16:30:35Z)
- Model-based Adversarial Meta-Reinforcement Learning [38.28304764312512]
We propose Model-based Adversarial Meta-Reinforcement Learning (AdMRL).
AdMRL aims to minimize the worst-case sub-optimality gap across all tasks in a family of tasks.
We evaluate our approach on several continuous control benchmarks and demonstrate its efficacy in the worst-case performance over all tasks.
arXiv Detail & Related papers (2020-06-16T02:21:49Z)