On the Adversarial Robustness of LASSO Based Feature Selection
- URL: http://arxiv.org/abs/2010.10045v1
- Date: Tue, 20 Oct 2020 05:51:26 GMT
- Title: On the Adversarial Robustness of LASSO Based Feature Selection
- Authors: Fuwei Li, Lifeng Lai, Shuguang Cui
- Abstract summary: In the considered model, there is a malicious adversary who can observe the whole dataset, and then will carefully modify the response values or the feature matrix.
We formulate the modification strategy of the adversary as a bi-level optimization problem.
Numerical examples with synthetic and real data illustrate that our method is efficient and effective.
- Score: 72.54211869067979
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate the adversarial robustness of feature selection
based on the $\ell_1$ regularized linear regression model, namely LASSO. In the
considered model, there is a malicious adversary who can observe the whole
dataset, and then will carefully modify the response values or the feature
matrix in order to manipulate the selected features. We formulate the
modification strategy of the adversary as a bi-level optimization problem.
Since the $\ell_1$ norm is non-differentiable at the zero point, we
reformulate the $\ell_1$ norm regularizer as linear inequality
constraints. We employ the interior-point method to solve this reformulated
LASSO problem and obtain the gradient information. Then we use the projected
gradient descent method to design the modification strategy. In addition, we
demonstrate that this method can be extended to other $\ell_1$ based feature
selection methods, such as group LASSO and sparse group LASSO. Numerical
examples with synthetic and real data illustrate that our method is efficient
and effective.
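For concreteness, the reformulation referred to in the abstract is, in its standard form, the replacement of the $\ell_1$ penalty by auxiliary bound variables and linear inequality constraints (a sketch of the idea; the paper's exact scaling and notation may differ):

$$\min_{\beta}\ \tfrac{1}{2}\|y - X\beta\|_2^2 + \lambda\|\beta\|_1 \quad\Longleftrightarrow\quad \min_{\beta,\,u}\ \tfrac{1}{2}\|y - X\beta\|_2^2 + \lambda\sum_{i=1}^{p} u_i \quad \text{s.t. } -u_i \le \beta_i \le u_i,\ i=1,\dots,p.$$

The right-hand program has a smooth objective and linear constraints, so an interior-point solver applies and the solution can be differentiated with respect to the data, which is exactly what the upper-level attack problem needs.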
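Below is a minimal sketch of the adversary's projected gradient descent loop, assuming an $\ell_2$ energy budget on the response perturbation and a hypothetical attack goal (suppressing one targeted coefficient); none of these names come from the paper. The paper derives exact gradients by differentiating through an interior-point LASSO solver, whereas this sketch substitutes scikit-learn's LASSO and finite-difference gradients for brevity.

```python
import numpy as np
from sklearn.linear_model import Lasso

def lasso_coef(X, y, lam):
    """Lower-level problem: argmin_beta (1/2n)||y - X beta||^2 + lam*||beta||_1."""
    return Lasso(alpha=lam, fit_intercept=False, max_iter=10000).fit(X, y).coef_

def attack_loss(X, y, lam, target):
    # Hypothetical adversarial goal: drive the targeted coefficient to zero
    # so that the corresponding feature drops out of the selected support.
    return lasso_coef(X, y, lam)[target] ** 2

def project_l2(delta, budget):
    # Projection onto the energy-budget ball {delta : ||delta||_2 <= budget}.
    n = np.linalg.norm(delta)
    return delta if n <= budget else delta * (budget / n)

def pgd_attack_response(X, y, lam, target, budget, lr=0.1, iters=100, eps=1e-4):
    """Projected gradient descent on a perturbation of the response values."""
    delta = np.zeros_like(y)
    for _ in range(iters):
        grad = np.zeros_like(y)
        for i in range(y.size):  # central finite differences (a simplification)
            e = np.zeros_like(y)
            e[i] = eps
            f_plus = attack_loss(X, y + delta + e, lam, target)
            f_minus = attack_loss(X, y + delta - e, lam, target)
            grad[i] = (f_plus - f_minus) / (2 * eps)
        delta = project_l2(delta - lr * grad, budget)
    return y + delta
```

For example, `y_adv = pgd_attack_response(X, y, lam=0.1, target=0, budget=1.0)` returns a perturbed response vector intended to push feature 0 out of the selected support while staying within the energy budget.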
Related papers
- Geometric-Averaged Preference Optimization for Soft Preference Labels [78.2746007085333]
Many algorithms for aligning LLMs with human preferences assume that human preferences are binary and deterministic.
In this work, we introduce the distributional soft preference labels and improve Direct Preference Optimization (DPO) with a weighted geometric average of the LLM output likelihood in the loss function.
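As a reference point only, a generic soft-label DPO-style loss can be sketched as below. This is a simplification that weights the two preference directions by a plain convex combination, not the paper's weighted geometric averaging; all names here are illustrative.

```python
import torch.nn.functional as F

def soft_label_dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, p_hat, beta=0.1):
    """Generic soft-label DPO-style loss (a simplification, not the paper's
    exact geometric-averaged objective). p_hat in [0, 1] is the soft
    probability that the nominally 'winning' response is preferred."""
    # Implicit-reward margin between the two responses.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # Soft-label binary cross-entropy over the two preference directions.
    return -(p_hat * F.logsigmoid(margin) + (1 - p_hat) * F.logsigmoid(-margin))
```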
arXiv Detail & Related papers (2024-09-10T17:54:28Z)
- Solution-Set Geometry and Regularization Path of a Nonconvexly Regularized Convex Sparse Model [8.586951231230596]
The generalized minimax concave (GMC) penalty is a nonconvex sparse regularizer that can preserve the overall convexity of the regularized least-squares problem.
We show that, similar to LASSO, the minimum $\ell_2$-norm regularization path is continuous and piecewise linear in $\lambda$.
arXiv Detail & Related papers (2023-11-30T10:39:47Z)
- Domain Generalization via Rationale Invariance [70.32415695574555]
This paper offers a new perspective to ease the challenge of domain generalization, which involves maintaining robust results even in unseen environments.
We propose treating the element-wise contributions to the final results as the rationale for making a decision and representing the rationale for each sample as a matrix.
Our experiments demonstrate that the proposed approach achieves competitive results across various datasets, despite its simplicity.
arXiv Detail & Related papers (2023-08-22T03:31:40Z)
- A model-free feature selection technique of feature screening and random forest based recursive feature elimination [0.0]
We propose a model-free feature selection method for ultra-high dimensional data with a massive number of features.
We show that the proposed method is selection consistent and $L_2$ consistent under weak regularity conditions.
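A generic sketch of the two-stage idea, marginal screening followed by random-forest-based recursive feature elimination, using scikit-learn; this illustrates the general technique, not the authors' exact procedure or screening statistic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE

def screen_then_rfe(X, y, keep=200, n_select=10):
    # Stage 1 (screening): rank features by absolute marginal correlation
    # with the response and keep only the top `keep` of them.
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = (y - y.mean()) / y.std()
    corr = np.abs(Xc.T @ yc) / len(y)
    screened = np.argsort(corr)[-keep:]
    # Stage 2 (RF-RFE): recursively drop the least important features
    # according to random-forest importances.
    rfe = RFE(RandomForestRegressor(n_estimators=200, random_state=0),
              n_features_to_select=n_select)
    rfe.fit(X[:, screened], y)
    return screened[rfe.support_]  # indices of the selected features
```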
arXiv Detail & Related papers (2023-02-15T03:39:16Z)
- A distribution-free mixed-integer optimization approach to hierarchical modelling of clustered and longitudinal data [0.0]
We introduce an innovative algorithm that evaluates cluster effects for new data points, thereby increasing the robustness and precision of this model.
The inferential and predictive efficacy of this approach is further illustrated through its application in student scoring and protein expression.
arXiv Detail & Related papers (2023-02-06T23:34:51Z)
- Random Manifold Sampling and Joint Sparse Regularization for Multi-label Feature Selection [0.0]
The model proposed in this paper obtains a small set of the most relevant features by solving joint constrained optimization problems with $\ell_{2,1}$ and $\ell_F$ regularization.
Comparative experiments on real-world data sets show that the proposed method outperforms other methods.
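For intuition about why the $\ell_{2,1}$ penalty selects features: it zeroes whole rows of the weight matrix, and its proximal operator has a closed form (row-wise soft thresholding). A minimal sketch, independent of the paper's manifold-sampling algorithm:

```python
import numpy as np

def prox_l21(W, t):
    """Proximal operator of t * ||W||_{2,1} (the sum of row l2 norms):
    shrinks each row of W toward zero and zeroes entire rows, which is
    what produces row-sparse, feature-selecting solutions."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return scale * W
```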
arXiv Detail & Related papers (2022-04-13T15:06:12Z)
- A Lagrangian Duality Approach to Active Learning [119.36233726867992]
We consider the batch active learning problem, where only a subset of the training data is labeled.
We formulate the learning problem using constrained optimization, where each constraint bounds the performance of the model on labeled samples.
We show, via numerical experiments, that our proposed approach performs similarly to or better than state-of-the-art active learning methods.
arXiv Detail & Related papers (2022-02-08T19:18:49Z)
- Universal and data-adaptive algorithms for model selection in linear contextual bandits [52.47796554359261]
We consider the simplest non-trivial instance of model selection: distinguishing a simple multi-armed bandit problem from a linear contextual bandit problem.
We introduce new algorithms that explore in a data-adaptive manner and provide guarantees of the form $\mathcal{O}(d^{\alpha} T^{1-\alpha})$.
Our approach extends to model selection among nested linear contextual bandits under some additional assumptions.
arXiv Detail & Related papers (2021-11-08T18:05:35Z)
- Ridge regression with adaptive additive rectangles and other piecewise functional templates [0.0]
We propose an $L_2$-based penalization algorithm for functional linear regression models.
We show how our algorithm alternates between approximating a suitable template and solving a convex ridge-like problem.
arXiv Detail & Related papers (2020-11-02T15:28:54Z)
- Optimal Feature Manipulation Attacks Against Linear Regression [64.54500628124511]
In this paper, we investigate how to manipulate the coefficients obtained via linear regression by adding carefully designed poisoning data points to the dataset or modifying the original data points.
Given an energy budget, we first provide the closed-form solution of the optimal poisoning data point when the goal is to modify one designated regression coefficient.
We then extend the analysis to the more challenging scenario where the attacker aims to change one particular regression coefficient while keeping the changes to the other coefficients as small as possible.
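To illustrate the kind of manipulation this companion paper studies, the effect of appending a single poisoning point $(x_0, y_0)$ on ordinary least-squares coefficients has a closed form via the Sherman-Morrison identity; the sketch below shows that update (the paper's optimal attack construction under an energy budget is more involved).

```python
import numpy as np

def ols_after_poisoning(X, y, x0, y0):
    """Closed-form OLS coefficients after appending one point (x0, y0),
    via the Sherman-Morrison rank-one update of (X^T X)^{-1}."""
    A_inv = np.linalg.inv(X.T @ X)
    beta = A_inv @ (X.T @ y)
    Ax0 = A_inv @ x0
    # beta_new = beta + A^{-1} x0 (y0 - x0^T beta) / (1 + x0^T A^{-1} x0)
    return beta + Ax0 * (y0 - x0 @ beta) / (1.0 + x0 @ Ax0)
```

Because the coefficient change is linear in the residual $y_0 - x_0^\top \beta$, an attacker can read off from this formula how much leverage a single point has on any designated coefficient.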
arXiv Detail & Related papers (2020-02-29T04:26:59Z)