Dynamic Sasvi: Strong Safe Screening for Norm-Regularized Least Squares
- URL: http://arxiv.org/abs/2102.04108v1
- Date: Mon, 8 Feb 2021 10:25:40 GMT
- Title: Dynamic Sasvi: Strong Safe Screening for Norm-Regularized Least Squares
- Authors: Hiroaki Yamada, Makoto Yamada
- Abstract summary: We propose a flexible framework for safe screening based on the Fenchel-Rockafellar duality.
We show that our screening rule can eliminate more features and increase the speed of the solver.
- Score: 21.42115891216612
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A recently introduced technique for a sparse optimization problem called
"safe screening" allows us to identify irrelevant variables in the early stage
of optimization. In this paper, we first propose a flexible framework for safe
screening based on the Fenchel-Rockafellar duality and then derive a strong
safe screening rule for norm-regularized least squares by the framework. We
call the proposed screening rule for norm-regularized least squares "dynamic
Sasvi" because it can be interpreted as a generalization of Sasvi. Unlike the
original Sasvi, it does not require the exact solution of a more strongly
regularized problem; hence, it works safely in practice. We show that our
screening rule can eliminate more features and increase the speed of the solver
in comparison with other screening rules both theoretically and experimentally.
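The abstract states the screening rule only at a high level. For orientation, here is a minimal numpy sketch of the standard duality-gap ("gap safe") sphere test for the Lasso, the best-known instance of norm-regularized least squares and the kind of dynamic rule the proposed framework generalizes and claims to tighten. This is not the dynamic Sasvi test itself, and all function and variable names are illustrative.

```python
import numpy as np

def gap_safe_screen(X, y, w, lam):
    """Duality-gap-based safe screening test for the Lasso
    min_w 0.5 * ||y - Xw||^2 + lam * ||w||_1.

    Returns a boolean mask; True means feature j is provably
    inactive (w_j = 0) at the optimum and can be discarded.
    """
    r = y - X @ w                               # residual at the current iterate
    corr = X.T @ r
    # Dual-feasible point obtained by rescaling the residual.
    theta = r / max(lam, np.abs(corr).max())
    # Primal and dual objectives give the duality gap.
    primal = 0.5 * (r @ r) + lam * np.abs(w).sum()
    dual = 0.5 * (y @ y) - 0.5 * lam**2 * np.sum((theta - y / lam) ** 2)
    gap = max(primal - dual, 0.0)
    # Radius of the safe sphere that must contain the dual optimum.
    radius = np.sqrt(2.0 * gap) / lam
    col_norms = np.linalg.norm(X, axis=0)
    # Feature j is safely inactive if |X_j^T theta| + R * ||X_j|| < 1.
    return np.abs(X.T @ theta) + radius * col_norms < 1.0
```

Because the test needs only a feasible dual point built from the current iterate, it can be re-run every few solver iterations ("dynamic" screening); the paper's claim is that its safe region is tighter than such sphere tests, which is how it eliminates more features.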
Related papers
- SASSHA: Sharpness-aware Adaptive Second-order Optimization with Stable Hessian Approximation [5.523554757946985]
Sassha is a novel second-order method designed to enhance generalization by explicitly reducing the sharpness of the solution.
We provide a comprehensive set of analyses including convergence, robustness, stability, efficiency, and cost.
arXiv Detail & Related papers (2025-02-25T12:35:05Z)
- SeWA: Selective Weight Average via Probabilistic Masking [51.015724517293236]
We show that only a few points are needed to achieve better and faster convergence.
We transform the discrete selection problem into a continuous subset optimization framework.
We derive stability bounds for SeWA that are sharper than those obtained under both convex and non-convex assumptions.
arXiv Detail & Related papers (2025-02-14T12:35:21Z)
- Methods with Local Steps and Random Reshuffling for Generally Smooth Non-Convex Federated Optimization [52.61737731453222]
Non-convex machine learning problems typically do not adhere to the standard smoothness assumption.
We propose and analyze new methods with local steps, partial participation of clients, and Random Reshuffling.
Our theory is consistent with the known results for standard smooth problems.
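The summary names the algorithmic ingredients but not the mechanism. Below is a minimal sketch, with entirely illustrative names and a least-squares local loss, of how local steps, partial client participation, and Random Reshuffling typically fit together in a FedAvg-style loop; it is not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_epoch(w, A, b, lr):
    """One local epoch with Random Reshuffling: visit every sample
    exactly once in a freshly shuffled order, one SGD step each."""
    for i in rng.permutation(len(b)):
        grad = (A[i] @ w - b[i]) * A[i]       # least-squares sample gradient
        w = w - lr * grad
    return w

def fed_rr(clients, dim, rounds=100, part=0.5, local_epochs=2, lr=0.01):
    """FedAvg-style loop: each round a random subset of clients runs
    local RR epochs from the server model, and the server averages.
    `clients` is a list of (A, b) local least-squares datasets."""
    w = np.zeros(dim)
    for _ in range(rounds):
        k = max(1, int(part * len(clients)))
        chosen = rng.choice(len(clients), size=k, replace=False)
        updates = []
        for c in chosen:
            A, b = clients[c]
            w_local = w.copy()
            for _ in range(local_epochs):
                w_local = local_epoch(w_local, A, b, lr)
            updates.append(w_local)
        w = np.mean(updates, axis=0)          # server aggregation
    return w
```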
arXiv Detail & Related papers (2024-12-03T19:20:56Z)
- Information-Theoretic Safe Bayesian Optimization [59.758009422067005]
We consider a sequential decision making task, where the goal is to optimize an unknown function without evaluating parameters that violate an unknown (safety) constraint.
Most current methods rely on a discretization of the domain and cannot be directly extended to the continuous case.
We propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate.
arXiv Detail & Related papers (2024-02-23T14:31:10Z)
- Leveraging Approximate Model-based Shielding for Probabilistic Safety Guarantees in Continuous Environments [63.053364805943026]
We extend the approximate model-based shielding framework to the continuous setting.
In particular, we use Safety Gym as our test-bed, allowing for a more direct comparison of AMBS with popular constrained RL algorithms.
arXiv Detail & Related papers (2024-02-01T17:55:08Z)
- Thompson Exploration with Best Challenger Rule in Best Arm Identification [66.33448474838342]
We study the fixed-confidence best arm identification problem in the bandit framework.
We propose a novel policy that combines Thompson sampling with a computationally efficient approach known as the best challenger rule.
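As a rough illustration of the base ingredient, here is a Beta-Bernoulli "top-two" Thompson sampling sketch for best arm identification: a leader is drawn from the posterior, and otherwise a challenger (the best arm in a posterior resample excluding the leader) is played. The paper's specific best challenger rule and stopping criterion are not reproduced here; all names and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def top_two_thompson(true_means, horizon=5000, beta=0.5):
    """Top-two Thompson sampling for Bernoulli best-arm identification."""
    k = len(true_means)
    wins = np.ones(k)                         # Beta(1, 1) priors
    losses = np.ones(k)
    for _ in range(horizon):
        leader = int(np.argmax(rng.beta(wins, losses)))
        if rng.random() < beta:
            arm = leader                      # exploit the sampled leader
        else:
            sample = rng.beta(wins, losses)
            sample[leader] = -np.inf          # force a different arm
            arm = int(np.argmax(sample))      # the challenger
        reward = rng.random() < true_means[arm]
        wins[arm] += reward
        losses[arm] += 1 - reward
    return int(np.argmax(wins / (wins + losses)))  # recommended arm
```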
arXiv Detail & Related papers (2023-10-01T01:37:02Z)
- Information-Theoretic Safe Exploration with Gaussian Processes [89.31922008981735]
We consider a sequential decision making task where we are not allowed to evaluate parameters that violate an unknown (safety) constraint.
Most current methods rely on a discretization of the domain and cannot be directly extended to the continuous case.
We propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate.
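The entry's information-theoretic criterion is not spelled out here; the sketch below shows only the simpler confidence-bound ingredient such methods build on: fit a GP posterior on the safety function and keep a candidate only if its lower confidence bound clears the threshold. The kernel choice, `beta`, and all names are assumptions, not the paper's acquisition rule.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def safe_set(X_obs, g_obs, X_cand, threshold=0.0, beta=2.0):
    """Conservative safe set from a GP posterior on the safety
    function g: keep a candidate only if its lower confidence
    bound mu - beta * sigma stays above the safety threshold."""
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-4)
    gp.fit(X_obs, g_obs)
    mu, sigma = gp.predict(X_cand, return_std=True)
    return X_cand[mu - beta * sigma >= threshold]
```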
arXiv Detail & Related papers (2022-12-09T15:23:58Z)
- On Kernelized Multi-Armed Bandits with Constraints [16.102401271318012]
We study a bandit problem with a general unknown reward function and a general unknown constraint function.
We propose a general framework for both algorithm design and performance analysis.
We demonstrate the superior performance of our proposed algorithms via numerical experiments.
arXiv Detail & Related papers (2022-03-29T14:02:03Z)
- Constrained Variational Policy Optimization for Safe Reinforcement Learning [40.38842532850959]
Safe reinforcement learning aims to learn policies that satisfy certain constraints before deploying to safety-critical applications.
Primal-dual methods, a prevalent constrained optimization framework, suffer from instability issues and lack optimality guarantees.
This paper overcomes these issues from a novel probabilistic inference perspective and proposes an Expectation-Maximization-style approach to learning safe policies.
arXiv Detail & Related papers (2022-01-28T04:24:09Z)
- The Hessian Screening Rule [5.076419064097734]
The Hessian Screening Rule uses second-order information from the model to provide more accurate screening.
We show that the rule outperforms all other alternatives in simulated experiments with high correlation.
arXiv Detail & Related papers (2021-04-27T07:55:29Z)
- Squared $\ell_2$ Norm as Consistency Loss for Leveraging Augmented Data to Learn Robust and Invariant Representations [76.85274970052762]
Regularizing distance between embeddings/representations of original samples and augmented counterparts is a popular technique for improving robustness of neural networks.
In this paper, we explore these various regularization choices, seeking to provide a general understanding of how we should regularize the embeddings.
We show that the generic approach we identified (squared $\ell_2$-norm regularized augmentation) outperforms several recent methods, which are each specially designed for one task.
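A minimal PyTorch sketch of the regularizer this entry describes, assuming an encoder `embed` and an augmented batch `x_aug` (both placeholders): the squared $\ell_2$ distance between the two embeddings is added to the task loss.

```python
import torch

def consistency_loss(embed, x, x_aug):
    """Squared l2 consistency regularizer: penalize the distance
    between embeddings of a sample and its augmented counterpart."""
    z, z_aug = embed(x), embed(x_aug)         # shape: (batch, dim)
    return ((z - z_aug) ** 2).sum(dim=1).mean()

# Training objective (sketch): task loss plus the weighted regularizer.
# loss = task_loss(model(x), y) + lam * consistency_loss(model.embed, x, x_aug)
```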
arXiv Detail & Related papers (2020-11-25T22:40:09Z)
- Fast OSCAR and OWL Regression via Safe Screening Rules [97.28167655721766]
Ordered Weighted $L_1$ (OWL) regularized regression is a new regression analysis for high-dimensional sparse learning.
Proximal gradient methods are used as standard approaches to solve OWL regression.
We propose the first safe screening rule for OWL regression by exploring the order of the primal solution with the unknown order structure.
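The screening rule itself is not given in this summary; as background for the "proximal gradient methods" remark, here is a sketch of the standard OWL/SLOPE proximal operator from the literature (sort, isotonic projection, clip, unsort) inside a plain ISTA loop. The weight vector `w_owl` is assumed non-negative and non-increasing; all names are illustrative.

```python
import numpy as np
from sklearn.isotonic import isotonic_regression

def prox_owl(v, w):
    """Proximal operator of the OWL norm with non-increasing weights w:
    sort |v| in decreasing order, project |v|_sorted - w onto the
    non-increasing cone, clip at zero, then undo the sort and signs."""
    order = np.argsort(-np.abs(v))
    z = isotonic_regression(np.abs(v)[order] - w, increasing=False)
    x = np.zeros_like(v)
    x[order] = np.maximum(z, 0.0)
    return np.sign(v) * x

def owl_ista(X, y, w_owl, steps=200):
    """Plain proximal gradient (ISTA) for OWL-regularized least squares."""
    L = np.linalg.norm(X, 2) ** 2             # Lipschitz constant of the gradient
    beta = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ beta - y)
        beta = prox_owl(beta - grad / L, w_owl / L)
    return beta
```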
arXiv Detail & Related papers (2020-06-29T23:35:53Z)
- Safe Screening Rules for $\ell_0$-Regression [3.04585143845864]
The screening rules can be computed from a convex relaxation solution in linear time.
Experiments on real and synthetic data indicate that, on average, 76% of the variables can be fixed to their optimal values.
arXiv Detail & Related papers (2020-04-19T06:07:09Z)
- Safe RuleFit: Learning Optimal Sparse Rule Model by Meta Safe Screening [19.524479577246336]
We consider the problem of learning a sparse rule model, a prediction model in the form of a sparse linear combination of rules.
Since the number of all possible such rules is extremely large, it has been computationally intractable to select the optimal set of active rules.
We propose Safe RuleFit (SRF), which is a non-trivial extension of well-known safe screening techniques.
arXiv Detail & Related papers (2018-10-03T10:55:08Z)