Automatic Outlier Rectification via Optimal Transport
- URL: http://arxiv.org/abs/2403.14067v2
- Date: Thu, 11 Jul 2024 05:22:42 GMT
- Title: Automatic Outlier Rectification via Optimal Transport
- Authors: Jose Blanchet, Jiajin Li, Markus Pelger, Greg Zanotti
- Abstract summary: We propose a novel conceptual framework to detect outliers using optimal transport with a concave cost function.
We take the first step to utilize the optimal transport distance with a concave cost function to construct a rectification set.
Then, we select the best distribution within the rectification set to perform the estimation task.
- Score: 7.421153752627664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a novel conceptual framework to detect outliers using optimal transport with a concave cost function. Conventional outlier detection approaches typically use a two-stage procedure: first, outliers are detected and removed, and then estimation is performed on the cleaned data. However, this approach does not inform outlier removal with the estimation task, leaving room for improvement. To address this limitation, we propose an automatic outlier rectification mechanism that integrates rectification and estimation within a joint optimization framework. We take the first step to utilize the optimal transport distance with a concave cost function to construct a rectification set in the space of probability distributions. Then, we select the best distribution within the rectification set to perform the estimation task. Notably, the concave cost function we introduced in this paper is the key to making our estimator effectively identify the outlier during the optimization process. We demonstrate the effectiveness of our approach over conventional approaches in simulations and empirical analyses for mean estimation, least absolute regression, and the fitting of option implied volatility surfaces.
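The abstract credits the concave cost function with letting the estimator single out outliers. A minimal sketch of the intuition follows. It assumes a power cost c(d) = |d|^r on the displacement d, with r = 0.5 as the concave case and r = 2 as a convex comparison. Both the cost form and the exponents are illustrative assumptions, not taken from the abstract, and `transport_cost` is a hypothetical helper rather than the authors' code.

```python
import numpy as np


def transport_cost(displacements, r):
    """Total cost of moving points by the given distances under c(d) = |d|**r.

    Hypothetical helper for illustration; the power-cost form is an assumption.
    """
    return float(np.sum(np.abs(displacements) ** r))


# Two ways to spend the same total displacement of 10 units:
one_big_move = np.array([10.0])      # relocate a single far-away point
many_small_moves = np.full(10, 1.0)  # nudge ten points by one unit each

# Concave cost (r < 1): one long move is cheaper than many short ones.
concave_big = transport_cost(one_big_move, r=0.5)
concave_small = transport_cost(many_small_moves, r=0.5)

# Convex cost (r > 1): the ordering flips, so long moves are penalised heavily.
convex_big = transport_cost(one_big_move, r=2.0)
convex_small = transport_cost(many_small_moves, r=2.0)

print(f"concave: big={concave_big:.3f}, small={concave_small:.3f}")
print(f"convex:  big={convex_big:.3f}, small={convex_small:.3f}")
```

Under the concave cost, a fixed transport budget goes furthest when spent on relocating a few distant points rather than nudging every point slightly. This is consistent with the abstract's claim that the rectification step targets outliers.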
Related papers
- Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach [51.76826149868971]
Policy evaluation via Monte Carlo simulation is at the core of many MC Reinforcement Learning (RL) algorithms.
We propose as a quality index a surrogate of the mean squared error of a return estimator that uses trajectories of different lengths.
We present an adaptive algorithm called Robust and Iterative Data collection strategy Optimization (RIDO).
arXiv Detail & Related papers (2024-10-17T11:47:56Z) - Acquiring Better Load Estimates by Combining Anomaly and Change Point Detection in Power Grid Time-series Measurements [0.49478969093606673]
Our approach prioritizes interpretability while ensuring robust and generalizable performance on unseen data.
Results indicate the clear wasted potential when filtering is not applied.
Our methodology's interpretability makes it particularly suitable for critical infrastructure planning.
arXiv Detail & Related papers (2024-05-25T10:15:51Z) - Unifying Distributionally Robust Optimization via Optimal Transport Theory [13.19058156672392]
This paper introduces a novel approach that unifies these methods into a single framework based on optimal transport.
Our proposed approach makes it possible for optimal adversarial distributions to simultaneously perturb likelihood and outcomes.
The paper investigates several duality results and presents tractable reformulations that enhance the practical applicability of this unified framework.
arXiv Detail & Related papers (2023-08-10T08:17:55Z) - Consensus-Adaptive RANSAC [104.87576373187426]
We propose a new RANSAC framework that learns to explore the parameter space by considering the residuals seen so far via a novel attention layer.
The attention mechanism operates on a batch of point-to-model residuals, and updates a per-point estimation state to take into account the consensus found through a lightweight one-step transformer.
arXiv Detail & Related papers (2023-07-26T08:25:46Z) - Distributed Unconstrained Optimization with Time-varying Cost Functions [1.52292571922932]
The objective is to track the optimal trajectory that minimizes the total cost at each time instant.
Our approach consists of a two-stage dynamics, where the first one samples the first and second derivatives of the local costs to periodically construct an estimate of the descent direction towards the optimal trajectory.
To demonstrate the performance of the proposed method, a numerical example is conducted that studies tuning the algorithm's parameters and their effects on the convergence of local states to the optimal trajectory.
arXiv Detail & Related papers (2022-12-12T23:59:54Z) - Multi-objective robust optimization using adaptive surrogate models for problems with mixed continuous-categorical parameters [0.0]
Robust design optimization is traditionally considered when uncertainties are mainly affecting the objective function.
The resulting nested optimization problem may be solved using a general-purpose solver, herein the non-dominated sorting genetic algorithm (NSGA-II).
The proposed approach consists of sequentially carrying out NSGA-II while using an adaptively built Kriging model to estimate the quantiles.
arXiv Detail & Related papers (2022-03-03T20:23:18Z) - Outlier-Robust Sparse Estimation via Non-Convex Optimization [73.18654719887205]
We explore the connection between outlier-robust high-dimensional statistics and non-convex optimization in the presence of sparsity constraints.
We develop novel and simple optimization formulations for these problems.
As a corollary, we obtain that any first-order method that efficiently converges to stationarity yields an efficient algorithm for these tasks.
arXiv Detail & Related papers (2021-09-23T17:38:24Z) - Scalable Personalised Item Ranking through Parametric Density Estimation [53.44830012414444]
Learning from implicit feedback is challenging because of the difficult nature of the one-class problem.
Most conventional methods use a pairwise ranking approach and negative samplers to cope with the one-class problem.
We propose a learning-to-rank approach, which achieves convergence speed comparable to the pointwise counterpart.
arXiv Detail & Related papers (2021-05-11T03:38:16Z) - Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
arXiv Detail & Related papers (2020-09-01T19:12:11Z) - Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.