AFS-BM: Enhancing Model Performance through Adaptive Feature Selection with Binary Masking
- URL: http://arxiv.org/abs/2401.11250v2
- Date: Mon, 17 Jun 2024 10:22:06 GMT
- Title: AFS-BM: Enhancing Model Performance through Adaptive Feature Selection with Binary Masking
- Authors: Mehmet Y. Turali, Mehmet E. Lorasdagi, Ali T. Koc, Suleyman S. Kozat
- Abstract summary: We introduce "Adaptive Feature Selection with Binary Masking" (AFS-BM).
We combine joint optimization with binary masking to continuously adapt the set of selected features and the model parameters during training.
Our results show that AFS-BM achieves significant accuracy improvements while requiring substantially less computation.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of feature selection in a general machine learning (ML) context, one of the most critical subjects in the field. Although many feature selection methods exist, they face challenges such as scalability, managing high-dimensional data, dealing with correlated features, adapting to variable feature importance, and integrating domain knowledge. To this end, we introduce "Adaptive Feature Selection with Binary Masking" (AFS-BM), which remedies these problems. AFS-BM achieves this through joint optimization for simultaneous feature selection and model training. In particular, we combine joint optimization with binary masking to continuously adapt the set of selected features and the model parameters during training. This approach leads to significant improvements in model accuracy and a reduction in computational requirements. We provide an extensive set of experiments comparing AFS-BM with established feature selection methods on well-known datasets from real-life competitions. Our results show that AFS-BM achieves significant accuracy improvements while requiring substantially less computation. This is due to AFS-BM's ability to dynamically adjust to the changing importance of features during training, which is an important contribution to the field. We openly share our code for the replicability of our results and to facilitate further research.
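The abstract describes a training loop in which a binary mask over the input features is adapted jointly with the model parameters. The following minimal sketch illustrates that general idea only; it is not the authors' exact procedure. It uses a scikit-learn gradient-boosting model as a stand-in learner, proposes dropping one randomly chosen active feature per round, and keeps the smaller mask only when validation loss does not worsen. The accept-if-no-worse rule, the random candidate choice, and the `n_rounds` budget are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error


def afs_bm_sketch(X_train, y_train, X_val, y_val, n_rounds=10, seed=0):
    """Illustrative sketch: alternate model fitting with binary-mask updates.

    Hypothetical simplification of the idea in the abstract, not the
    paper's actual algorithm.
    """
    rng = np.random.default_rng(seed)
    n_features = X_train.shape[1]
    mask = np.ones(n_features, dtype=bool)  # start with all features active

    def fit_and_score(m):
        # Fit a stand-in model on the currently active features only.
        model = GradientBoostingRegressor(random_state=seed)
        model.fit(X_train[:, m], y_train)
        return model, mean_squared_error(y_val, model.predict(X_val[:, m]))

    model, best_loss = fit_and_score(mask)
    for _ in range(n_rounds):
        active = np.flatnonzero(mask)
        if active.size <= 1:
            break
        # Propose deactivating one active feature at random.
        trial_mask = mask.copy()
        trial_mask[rng.choice(active)] = False
        trial_model, trial_loss = fit_and_score(trial_mask)
        # Accept the smaller feature set only if validation loss does not worsen.
        if trial_loss <= best_loss:
            mask, model, best_loss = trial_mask, trial_model, trial_loss
    return model, mask
```

Under these assumptions the returned boolean `mask` marks the features kept at the end of training, and the returned model is the one fitted on that reduced feature set; the actual method jointly optimizes the mask and model rather than retraining from scratch at each step.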
Related papers
- FedCAda: Adaptive Client-Side Optimization for Accelerated and Stable Federated Learning [57.38427653043984]
Federated learning (FL) has emerged as a prominent approach for collaborative training of machine learning models across distributed clients.
We introduce FedCAda, an innovative federated client-side adaptive algorithm designed to tackle the challenge of client-side adaptive optimization.
We demonstrate that FedCAda outperforms the state-of-the-art methods in terms of adaptability, convergence, stability, and overall performance.
arXiv Detail & Related papers (2024-05-20T06:12:33Z) - LESS: Selecting Influential Data for Targeted Instruction Tuning [64.78894228923619]
We propose LESS, an efficient algorithm to estimate data influences and perform Low-rank gradiEnt Similarity Search for instruction data selection.
We show that training on a LESS-selected 5% of the data can often outperform training on the full dataset across diverse downstream tasks.
Our method goes beyond surface form cues to identify data that exemplifies the necessary reasoning skills for the intended downstream application.
arXiv Detail & Related papers (2024-02-06T19:18:04Z) - Embedded feature selection in LSTM networks with multi-objective evolutionary ensemble learning for time series forecasting [49.1574468325115]
We present a novel feature selection method embedded in Long Short-Term Memory networks.
Our approach optimizes the weights and biases of the LSTM in a partitioned manner.
Experimental evaluations on air quality time series data from Italy and southeast Spain demonstrate that our method substantially improves the generalization ability of conventional LSTMs.
arXiv Detail & Related papers (2023-12-29T08:42:10Z) - FAStEN: An Efficient Adaptive Method for Feature Selection and Estimation in High-Dimensional Functional Regressions [7.674715791336311]
We propose a new, flexible and ultra-efficient approach to perform feature selection in a sparse function-on-function regression problem.
We show how to extend it to the scalar-on-function framework.
We present an application to brain fMRI data from the AOMIC PIOP1 study.
arXiv Detail & Related papers (2023-03-26T19:41:17Z) - Learning to Maximize Mutual Information for Dynamic Feature Selection [13.821253491768168]
We consider the dynamic feature selection (DFS) problem where a model sequentially queries features based on the presently available information.
We explore a simpler approach of greedily selecting features based on their conditional mutual information.
The proposed method is shown to recover the greedy policy when trained to optimality, and it outperforms numerous existing feature selection methods in our experiments.
arXiv Detail & Related papers (2023-01-02T08:31:56Z) - Traceable Automatic Feature Transformation via Cascading Actor-Critic Agents [25.139229855367088]
Feature transformation is an essential task to boost the effectiveness and interpretability of machine learning (ML) models.
We formulate the feature transformation task as an iterative, nested process of feature generation and selection.
We show 24.7% improvements in F1 scores compared with state-of-the-art methods, as well as robustness on high-dimensional data.
arXiv Detail & Related papers (2022-12-27T08:20:19Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - A User-Guided Bayesian Framework for Ensemble Feature Selection in Life Science Applications (UBayFS) [0.0]
We propose UBayFS, an ensemble feature selection technique, embedded in a Bayesian statistical framework.
Our approach enhances the feature selection process by considering two sources of information: data and domain knowledge.
A comparison with standard feature selectors underlines that UBayFS achieves competitive performance, while providing additional flexibility to incorporate domain knowledge.
arXiv Detail & Related papers (2021-04-30T06:51:33Z) - Optimization-driven Machine Learning for Intelligent Reflecting Surfaces Assisted Wireless Networks [82.33619654835348]
Intelligent reflecting surface (IRS) has been employed to reshape wireless channels by controlling the phase shifts of individual scattering elements.
Due to the large number of scattering elements, passive beamforming is typically challenged by high computational complexity.
In this article, we focus on machine learning (ML) approaches for performance maximization in IRS-assisted wireless networks.
arXiv Detail & Related papers (2020-08-29T08:39:43Z) - Deep Multi-task Multi-label CNN for Effective Facial Attribute Classification [53.58763562421771]
We propose a novel deep multi-task multi-label CNN, termed DMM-CNN, for effective Facial Attribute Classification (FAC).
Specifically, DMM-CNN jointly optimizes two closely-related tasks (i.e., facial landmark detection and FAC) to improve the performance of FAC by taking advantage of multi-task learning.
Two different network architectures are respectively designed to extract features for two groups of attributes, and a novel dynamic weighting scheme is proposed to automatically assign the loss weight to each facial attribute during training.
arXiv Detail & Related papers (2020-02-10T12:34:16Z) - Feature Selection Library (MATLAB Toolbox) [1.2058143465239939]
The Feature Selection Library (FSLib) introduces a comprehensive suite of feature selection (FS) algorithms.
FSLib addresses the curse of dimensionality, reduces computational load, and enhances model generalizability.
FSLib contributes to data interpretability by revealing important features, aiding in pattern recognition and understanding.
arXiv Detail & Related papers (2016-07-05T16:50:42Z)