Related papers: Fast Feature Selection with Fairness Constraints

Fast Feature Selection with Fairness Constraints

URL: http://arxiv.org/abs/2202.13718v2
Date: Fri, 3 Feb 2023 13:03:43 GMT
Title: Fast Feature Selection with Fairness Constraints
Authors: Francesco Quinzan, Rajiv Khanna, Moshik Hershcovitch, Sarel Cohen, Daniel G. Waddington, Tobias Friedrich, Michael W. Mahoney
Abstract summary: We study the fundamental problem of selecting optimal features for model construction. This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants. We extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions. The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work.
Score: 49.142308856826396
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: We study the fundamental problem of selecting optimal features for model construction. This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants. To address this challenge, we extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions. The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work. Furthermore, our extension allows the use of downward-closed constraints, which can be used to encode certain fairness criteria into the feature selection process. We prove strong approximation guarantees for the algorithm based on standard assumptions. These guarantees are applicable to many parametric models, including Generalized Linear Models. Finally, we demonstrate empirically that the proposed algorithm competes favorably with state-of-the-art techniques for feature selection, on real-world and synthetic datasets.

Related papers

Hybrid least squares for learning functions from highly noisy data [7.096701481970196]
We consider a least-squares function approximation problem with heavily polluted data.<n>Existing methods that are powerful in the small noise regime are suboptimal when large noise is present.<n>We show that the proposed algorithm enjoys appropriate optimality properties for both sample point generation and noise mollification.
arXiv Detail & Related papers (2025-07-03T00:31:29Z)
Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems. Such problems are encountered in medicine, physics, and machine learning. We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z)
An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting [53.36437745983783]
We first construct a max-margin optimization-based model to model potentially non-monotonic preferences. We devise information amount measurement methods and question selection strategies to pinpoint the most informative alternative in each iteration. Two incremental preference elicitation-based algorithms are developed to learn potentially non-monotonic preferences.
arXiv Detail & Related papers (2024-09-04T14:36:20Z)
Stop Relying on No-Choice and Do not Repeat the Moves: Optimal, Efficient and Practical Algorithms for Assortment Optimization [38.57171985309975]
We develop efficient algorithms for the problem of regret in assortment selection with emphPlackett Luce (PL) based user choices. Our methods are practical, provably optimal, and devoid of the aforementioned limitations of the existing methods.
arXiv Detail & Related papers (2024-02-29T07:17:04Z)
Best-Subset Selection in Generalized Linear Models: A Fast and Consistent Algorithm via Splicing Technique [0.6338047104436422]
Best subset section has been widely regarded as the Holy Grail of problems of this type. We proposed and illustrated an algorithm for best subset recovery in mild conditions. Our implementation achieves approximately a fourfold speedup compared to popular variable selection toolkits.
arXiv Detail & Related papers (2023-08-01T03:11:31Z)
Universal and data-adaptive algorithms for model selection in linear contextual bandits [52.47796554359261]
We consider the simplest non-trivial instance of model-selection: distinguishing a simple multi-armed bandit problem from a linear contextual bandit problem. We introduce new algorithms that explore in a data-adaptive manner and provide guarantees of the form $mathcalO(dalpha T1- alpha)$. Our approach extends to model selection among nested linear contextual bandits under some additional assumptions.
arXiv Detail & Related papers (2021-11-08T18:05:35Z)
Near Instance Optimal Model Selection for Pure Exploration Linear Bandits [20.67688737534517]
We study the model selection problem in the pure exploration linear bandit setting. Our goal is to automatically adapt to the instance-dependent complexity measure of the smallest hypothesis class. Our algorithms define a new optimization problem based on experimental design.
arXiv Detail & Related papers (2021-09-10T22:56:58Z)
Online Model Selection for Reinforcement Learning with Function Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $tildeO(L5/6 T2/3)$ regret. We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z)
Joint Adaptive Graph and Structured Sparsity Regularization for Unsupervised Feature Selection [6.41804410246642]
We propose a joint adaptive graph and structured sparsity regularization unsupervised feature selection (JASFS) method. A subset of optimal features will be selected in group, and the number of selected features will be determined automatically. Experimental results on eight benchmarks demonstrate the effectiveness and efficiency of the proposed method.
arXiv Detail & Related papers (2020-10-09T08:17:04Z)
Adaptive Sampling of Pareto Frontiers with Binary Constraints Using Regression and Classification [0.0]
We present a novel adaptive optimization algorithm for black-box multi-objective optimization problems with binary constraints. Our method is based on probabilistic regression and classification models, which act as a surrogate for the optimization goals. We also present a novel ellipsoid truncation method to speed up the expected hypervolume calculation.
arXiv Detail & Related papers (2020-08-27T09:15:02Z)
Extreme Algorithm Selection With Dyadic Feature Representation [78.13985819417974]
We propose the setting of extreme algorithm selection (XAS) where we consider fixed sets of thousands of candidate algorithms. We assess the applicability of state-of-the-art AS techniques to the XAS setting and propose approaches leveraging a dyadic feature representation.
arXiv Detail & Related papers (2020-01-29T09:40:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.