Adopting Robustness and Optimality in Fitting and Learning
- URL: http://arxiv.org/abs/1510.03826v4
- Date: Wed, 18 Oct 2023 06:14:51 GMT
- Title: Adopting Robustness and Optimality in Fitting and Learning
- Authors: Zhiguang Wang, Tim Oates, James Lo
- Abstract summary: We generalize a modified exponentialized estimator by pushing the robust-optimal (RO) index $\lambda$ to $-\infty$ for achieving robustness to outliers.
The robustness is realized and controlled adaptively by the RO index without any predefined threshold.
- Score: 4.511425319032815
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We generalized a modified exponentialized estimator by pushing the
robust-optimal (RO) index $\lambda$ to $-\infty$ for achieving robustness to
outliers by optimizing a quasi-Minimin function. The robustness is realized and
controlled adaptively by the RO index without any predefined threshold.
Optimality is guaranteed by expansion of the convexity region in the Hessian
matrix to largely avoid local optima. Detailed quantitative analyses of both
robustness and optimality are provided. The results of the proposed experiments on
fitting tasks for three noisy non-convex functions and the digits recognition
task on the MNIST dataset consolidate the conclusions.
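As a concrete illustration of the mechanism described above, the sketch below assumes the estimator takes a log-sum-exp form over per-sample squared errors scaled by the RO index $\lambda$ (this exact form is an assumption, in the spirit of risk-averting error criteria; the paper's own definition may differ). As $\lambda \to -\infty$ the criterion approaches the smallest per-sample error, i.e. a quasi-Minimin function, so a single outlier with a huge residual no longer dominates the fit; for $\lambda > 0$ the same expression emphasizes large errors, the regime associated with the expanded convexity region.

```python
import numpy as np

def ro_criterion(residuals, lam):
    """Hypothetical RO-index criterion: (1/lam) * log( mean( exp(lam * e_i^2) ) ).

    lam -> -inf : quasi-Minimin, approaches min_i e_i^2 (robust to outliers)
    lam -> 0    : ordinary mean squared error (limit)
    lam -> +inf : approaches max_i e_i^2 (risk-averting regime)
    This is an illustrative form, not necessarily the paper's exact estimator.
    """
    z = lam * residuals ** 2
    m = z.max()                      # shift for a numerically stable log-mean-exp
    return (m + np.log(np.mean(np.exp(z - m)))) / lam

residuals = np.array([0.1, 0.2, 0.15, 5.0])   # the last point is an outlier
for lam in (1.0, -1.0, -10.0, -100.0):
    print(f"lambda = {lam:7.1f}  criterion = {ro_criterion(residuals, lam):.4f}")
# As lambda decreases, the criterion approaches min(residuals**2) = 0.01,
# so the outlier's contribution fades instead of dominating the loss.
```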
Related papers
- Optimised Feature Subset Selection via Simulated Annealing [39.58317527488534]
We introduce SA-FDR, a novel algorithm for $\ell_0$-norm feature selection.
We show that SA-FDR consistently selects more compact feature subsets while achieving high predictive accuracy.
As a result, SA-FDR provides a flexible and effective solution for designing interpretable models in high-dimensional settings.
arXiv Detail & Related papers (2025-07-31T13:57:38Z) - Enhancing Distributional Robustness in Principal Component Analysis by Wasserstein Distances [7.695578200868269]
We consider the distributionally robust optimization (DRO) model of principal component analysis (PCA) to account for uncertainty in the underlying probability distribution.
The resulting formulation leads to a nonsmooth constrained min-max optimization problem, where the ambiguity set captures the distributional uncertainty by the type-$2$ Wasserstein distance.
This explicit characterization equivalently reformulates the original DRO model into a minimization problem on the Stiefel manifold with intricate nonsmooth terms.
arXiv Detail & Related papers (2025-03-04T11:00:08Z) - Optimal Rates for Robust Stochastic Convex Optimization [12.620782629498812]
We develop novel algorithms that achieve minimax-optimal excess risk (up to logarithmic factors) under the $\epsilon$-contamination model.
Our algorithms do not require stringent assumptions such as Lipschitz continuity and smoothness of individual sample functions.
We complement our algorithmic developments with a tight information-theoretic lower bound for robust SCO.
arXiv Detail & Related papers (2024-12-15T00:52:08Z) - Robust Entropy Search for Safe Efficient Bayesian Optimization [40.56709991743249]
We develop an efficient information-based acquisition function that we call Robust Entropy Search (RES)
RES reliably finds robust optima, outperforming state-of-the-art algorithms.
arXiv Detail & Related papers (2024-05-29T13:00:10Z) - Smoothed $f$-Divergence Distributionally Robust Optimization [5.50764401597583]
We argue that a special type of distributionally robust optimization (DRO) formulation offers theoretical advantages.
DRO uses an ambiguity set based on a Kullback-Leibler (KL) divergence smoothed by the Wasserstein or Lévy-Prokhorov (LP) distance.
arXiv Detail & Related papers (2023-06-24T19:22:01Z) - Robust Subset Selection by Greedy and Evolutionary Pareto Optimization [23.0838604893412]
Subset selection aims to select a subset from a ground set to maximize some objective function.
We show that a greedy algorithm can obtain an approximation ratio of $1 - e^{-\beta\gamma}$, where $\beta$ and $\gamma$ are the correlation and submodularity ratios of the objective functions (a generic greedy-selection sketch appears after this list).
arXiv Detail & Related papers (2022-05-03T11:00:54Z) - Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets [101.5329678997916]
We study episodic two-player zero-sum Markov games (MGs) in the offline setting.
The goal is to find an approximate Nash equilibrium (NE) policy pair based on a dataset collected a priori.
arXiv Detail & Related papers (2022-02-15T15:39:30Z) - Non-convex Distributionally Robust Optimization: Non-asymptotic Analysis [16.499651513178012]
Distributionally robust optimization (DRO) is a widely-used approach to learn models that are robust against distribution shift.
We provide non-asymptotic convergence guarantees for normalized gradient descent even though the objective function is possibly non-smooth.
arXiv Detail & Related papers (2021-10-24T14:56:38Z) - Outlier-Robust Sparse Estimation via Non-Convex Optimization [73.18654719887205]
We explore the connection between high-dimensional statistics and non-convex optimization in the presence of sparsity constraints.
We develop novel and simple optimization formulations for these problems.
As a corollary, we obtain that any first-order method that efficiently converges to stationary points yields an efficient algorithm for these tasks.
arXiv Detail & Related papers (2021-09-23T17:38:24Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with a better iteration complexity of $O(1/\epsilon^4)$ for finding an $\epsilon$-stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O(1/\epsilon^4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping (a minimal clipped-SGD sketch appears after this list).
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Bayesian Optimization of Risk Measures [7.799648230758491]
We consider Bayesian optimization of objective functions of the form $\rho[F(x, W)]$, where $F$ is a black-box expensive-to-evaluate function.
We propose a family of novel Bayesian optimization algorithms that exploit the structure of the objective function to substantially improve sampling efficiency.
arXiv Detail & Related papers (2020-07-10T18:20:46Z) - Distributional Robustness and Regularization in Reinforcement Learning [62.23012916708608]
We introduce a new regularizer for empirical value functions and show that it lower bounds the Wasserstein distributionally robust value function.
It suggests using regularization as a practical tool for dealing with external uncertainty in reinforcement learning.
arXiv Detail & Related papers (2020-03-05T19:56:23Z) - Distributionally Robust Bayesian Optimization [121.71766171427433]
We present a novel distributionally robust Bayesian optimization algorithm (DRBO) for zeroth-order, noisy optimization.
Our algorithm provably obtains sub-linear robust regret in various settings.
We demonstrate the robust performance of our method on both synthetic and real-world benchmarks.
arXiv Detail & Related papers (2020-02-20T22:04:30Z)
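The robust subset selection entry above quotes a $1 - e^{-\beta\gamma}$ guarantee for a greedy algorithm. The following is a generic greedy-subset-selection sketch for a monotone objective under a cardinality constraint; it illustrates only the greedy procedure to which such guarantees apply, not that paper's combined greedy and evolutionary Pareto method, and the coverage objective in the usage example is purely illustrative.

```python
from typing import Callable, Iterable, Set

def greedy_subset(ground: Iterable, f: Callable[[Set], float], k: int) -> Set:
    """Generic greedy selection: repeatedly add the element with the largest
    marginal gain f(S | {v}) - f(S) until k elements are chosen or no element
    improves the objective. Guarantees such as 1 - e^{-beta*gamma} are stated
    for this kind of procedure in terms of the correlation ratio beta and the
    submodularity ratio gamma of f."""
    remaining = set(ground)
    chosen: Set = set()
    while len(chosen) < k and remaining:
        best = max(remaining, key=lambda v: f(chosen | {v}) - f(chosen))
        if f(chosen | {best}) - f(chosen) <= 0:
            break  # no remaining element improves the objective
        chosen.add(best)
        remaining.remove(best)
    return chosen

# Toy usage: pick 2 sets that together cover as many elements as possible.
candidates = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4, 5})]
coverage = lambda S: len(set().union(*S)) if S else 0
print(greedy_subset(candidates, coverage, k=2))   # {frozenset({3, 4, 5}), frozenset({1, 2})}
```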
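The heavy-tailed-noise entry above mentions stepsize rules for methods with gradient clipping. Below is a minimal clipped-SGD sketch showing the clipping mechanism itself; the constant stepsize and clipping level are illustrative placeholders, not the stepsize rules proposed in that paper.

```python
import numpy as np

def clipped_sgd(grad_fn, x0, steps=500, lr=0.05, clip_level=1.0, seed=0):
    """SGD with gradient-norm clipping: gradients whose norm exceeds
    clip_level are rescaled, which tames occasional huge (heavy-tailed)
    stochastic gradients. Hyperparameters here are arbitrary illustrations."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        g = grad_fn(x, rng)
        norm = np.linalg.norm(g)
        if norm > clip_level:
            g = g * (clip_level / norm)   # shrink an outsized gradient to the clip level
        x -= lr * g
    return x

# Toy usage: minimize f(x) = ||x||^2 / 2 with heavy-tailed (Cauchy) gradient noise.
noisy_grad = lambda x, rng: x + rng.standard_cauchy(size=x.shape)
print(clipped_sgd(noisy_grad, x0=np.ones(3)))
```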
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.