IIFE: Interaction Information Based Automated Feature Engineering
- URL: http://arxiv.org/abs/2409.04665v1
- Date: Sat, 7 Sep 2024 00:34:26 GMT
- Title: IIFE: Interaction Information Based Automated Feature Engineering
- Authors: Tom Overman, Diego Klabjan, Jean Utke
- Abstract summary: We introduce a new AutoFE algorithm, IIFE, based on determining which feature pairs synergize well.
We show how interaction information can be used to improve existing AutoFE algorithms.
- Score: 11.866061471514582
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated feature engineering (AutoFE) is the process of automatically building and selecting new features that help improve downstream predictive performance. While traditional feature engineering requires significant domain expertise and time-consuming iterative testing, AutoFE strives to make feature engineering easy and accessible to all data science practitioners. We introduce a new AutoFE algorithm, IIFE, based on determining which feature pairs synergize well through an information-theoretic perspective called interaction information. We demonstrate the superior performance of IIFE over existing algorithms. We also show how interaction information can be used to improve existing AutoFE algorithms. Finally, we highlight several critical experimental setup issues in the existing AutoFE literature and their effects on performance.
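The abstract does not spell out how interaction information is computed, so the following is only a minimal sketch of the general quantity, not the authors' implementation. It assumes discrete features, plug-in (empirical) entropy estimates, and the sign convention in which I(X1, X2; Y) - I(X1; Y) - I(X2; Y) is positive for synergistic pairs. The function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Empirical Shannon entropy (in bits) of a sequence of hashable values."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def mutual_information(x, y):
    """Empirical I(X; Y) = H(X) + H(Y) - H(X, Y) for discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def interaction_information(x1, x2, y):
    """I(X1, X2; Y) - I(X1; Y) - I(X2; Y) under the assumed sign convention.

    Positive values suggest the pair is synergistic with respect to y;
    negative values suggest the two features are redundant.
    """
    joint = list(zip(x1, x2))
    return (
        mutual_information(joint, y)
        - mutual_information(x1, y)
        - mutual_information(x2, y)
    )


# Toy data: y is the XOR of two features, so neither feature alone is
# informative about y, but the pair determines it completely.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y_xor = [a ^ b for a, b in zip(x1, x2)]
xor_score = interaction_information(x1, x2, y_xor)

# Toy data: the second feature is a copy of the first, so it adds nothing.
redundant_score = interaction_information(x1, x1, x1)
```

On the XOR data the score is positive (the pair is synergistic), while the duplicated feature gives a negative score (the pair is redundant). Scores of this kind could be used to rank candidate feature pairs before combining them.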
Related papers
- Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate [105.86576388991713]
We introduce a normalized gradient difference (NGDiff) algorithm, enabling us to have better control over the trade-off between the objectives.
We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets.
arXiv Detail & Related papers (2024-10-29T14:41:44Z)
- Statistical Test for Auto Feature Engineering by Selective Inference [12.703556860454565]
Auto Feature Engineering (AFE) plays a crucial role in developing practical machine learning pipelines.
We propose a new statistical test for features generated by AFE algorithms, based on a framework called selective inference.
The proposed test can quantify the statistical significance of the generated features in the form of $p$-values, enabling theoretically guaranteed control of the risk of false findings.
arXiv Detail & Related papers (2024-10-13T12:26:51Z)
- AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving [68.73885845181242]
We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios.
We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
arXiv Detail & Related papers (2024-03-26T04:27:56Z)
- AUTONODE: A Neuro-Graphic Self-Learnable Engine for Cognitive GUI Automation [0.0]
AUTONODE stands for Autonomous User-interface Transformation through Online Neuro-graphic Operations and Deep Exploration.
Our engine empowers agents to comprehend and implement complex workflows, adapting to dynamic web environments with unparalleled efficiency.
The versatility and efficacy of AUTONODE are demonstrated through a series of experiments, highlighting its proficiency in managing a diverse array of web-based tasks.
arXiv Detail & Related papers (2024-03-15T10:27:17Z)
- AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning [54.47116888545878]
AutoAct is an automatic agent learning framework for QA.
It does not rely on large-scale annotated data or on synthetic planning trajectories from closed-source models.
arXiv Detail & Related papers (2024-01-10T16:57:24Z)
- AutoTransfer: AutoML with Knowledge Transfer -- An Application to Graph Neural Networks [75.11008617118908]
AutoML techniques consider each task independently from scratch, leading to high computational cost.
Here we propose AutoTransfer, an AutoML solution that improves search efficiency by transferring the prior architectural design knowledge to the novel task of interest.
arXiv Detail & Related papers (2023-03-14T07:23:16Z)
- Toward Efficient Automated Feature Engineering [27.47868891738917]
Automated Feature Engineering (AFE) refers to automatically generating and selecting optimal feature sets for downstream tasks.
Current AFE methods mainly focus on improving the effectiveness of the produced features but ignore the low efficiency that hinders large-scale deployment.
We construct the AFE pipeline in a reinforcement learning setting, where each feature is assigned an agent to perform feature transformation.
We conduct comprehensive experiments on 36 datasets in terms of both classification and regression tasks.
arXiv Detail & Related papers (2022-12-26T13:18:51Z)
- AEFE: Automatic Embedded Feature Engineering for Categorical Features [4.310748698480341]
We propose an automatic feature engineering framework for representing categorical features, which consists of various components including custom paradigm feature construction and multiple feature selection.
Experiments conducted on some typical e-commerce datasets indicate that our method outperforms the classical machine learning models and state-of-the-art deep learning models.
arXiv Detail & Related papers (2021-10-19T07:22:59Z)
- FLFE: A Communication-Efficient and Privacy-Preserving Federated Feature Engineering Framework [16.049161581014513]
We present a framework called FLFE to conduct privacy-preserving and communication-preserving multi-party feature transformations.
The framework pre-learns the pattern of the feature to directly judge the usefulness of the transformation on a feature.
arXiv Detail & Related papers (2020-09-05T16:08:54Z)
- Towards Automated Neural Interaction Discovery for Click-Through Rate Prediction [64.03526633651218]
Click-Through Rate (CTR) prediction is one of the most important machine learning tasks in recommender systems.
We propose an automated interaction architecture discovering framework for CTR prediction named AutoCTR.
arXiv Detail & Related papers (2020-06-29T04:33:01Z)
- AutoFIS: Automatic Feature Interaction Selection in Factorization Models for Click-Through Rate Prediction [75.16836697734995]
We propose a two-stage algorithm called Automatic Feature Interaction Selection (AutoFIS).
AutoFIS can automatically identify important feature interactions for factorization models at a computational cost equivalent to training the target model to convergence.
AutoFIS has been deployed onto the training platform of Huawei App Store recommendation service.
arXiv Detail & Related papers (2020-03-25T06:53:54Z)