A Mathematical Optimization Approach to Multisphere Support Vector Data Description
- URL: http://arxiv.org/abs/2507.11106v1
- Date: Tue, 15 Jul 2025 08:57:27 GMT
- Title: A Mathematical Optimization Approach to Multisphere Support Vector Data Description
- Authors: Víctor Blanco, Inmaculada Espejo, Raúl Páez, Antonio M. Rodríguez-Chía
- Abstract summary: We provide a primal formulation, in the shape of a Mixed Integer Second Order Cone model, that constructs Euclidean hyperspheres to identify anomalous observations. We develop a dual model that enables the application of the kernel trick, thus allowing for the detection of outliers within complex, non-linear data structures.
- Score: 1.9499277906326784
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a novel mathematical optimization framework for outlier detection in multimodal datasets, extending Support Vector Data Description approaches. We provide a primal formulation, in the shape of a Mixed Integer Second Order Cone model, that constructs Euclidean hyperspheres to identify anomalous observations. Building on this, we develop a dual model that enables the application of the kernel trick, thus allowing for the detection of outliers within complex, non-linear data structures. An extensive computational study demonstrates the effectiveness of our exact method, showing clear advantages over existing heuristic techniques in terms of accuracy and robustness.
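As a rough illustration of the multisphere idea in the abstract, the sketch below flags an observation as anomalous when it lies outside every hypersphere. It assumes the centers and radii are already known; in the paper they are the output of the optimization model, which is not reproduced here. The function name `is_outlier` and the sample data are hypothetical.

```python
import math

def is_outlier(point, centers, radii):
    """Return True if `point` lies outside every sphere (assumed decision rule)."""
    for center, radius in zip(centers, radii):
        # Euclidean distance from the point to this sphere's center
        if math.dist(point, center) <= radius:
            return False  # covered by at least one sphere: treated as normal
    return True

# Hypothetical spheres and observations, chosen only for illustration
centers = [(0.0, 0.0), (5.0, 5.0)]
radii = [1.0, 1.5]
points = [(0.5, 0.0), (5.0, 6.0), (2.5, 2.5)]

flags = [is_outlier(p, centers, radii) for p in points]
print(flags)
```

Only the last point falls outside both spheres, so it is the only one flagged. Using several spheres rather than one lets each mode of a multimodal dataset be enclosed tightly.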
Related papers
- Going from a Representative Agent to Counterfactuals in Combinatorial Choice [2.9172603864294033]
We study decision-making problems where data comprises points from a collection of binary polytopes. We propose a nonparametric approach for counterfactual inference in this setting based on a representative agent model.
arXiv Detail & Related papers (2025-05-29T15:24:23Z) - Wrapped Gaussian on the manifold of Symmetric Positive Definite Matrices [6.7523635840772505]
Circular and non-flat data distributions are prevalent across diverse domains of data science. A principled approach to accounting for the underlying geometry of such data is pivotal. This work lays the groundwork for extending classical machine learning and statistical methods to more complex and structured data.
arXiv Detail & Related papers (2025-02-03T16:46:46Z) - Efficient Fairness-Performance Pareto Front Computation [51.558848491038916]
We show that optimal fair representations possess several useful structural properties.
We then show that these approximating problems can be solved efficiently via concave programming methods.
arXiv Detail & Related papers (2024-09-26T08:46:48Z) - Regularized Projection Matrix Approximation with Applications to Community Detection [1.3761665705201904]
This paper introduces a regularized projection matrix approximation framework designed to recover cluster information from the affinity matrix.
We investigate three distinct penalty functions, each specifically tailored to address bounded, positive, and sparse scenarios.
Numerical experiments conducted on both synthetic and real-world datasets reveal that our regularized projection matrix approximation approach significantly outperforms state-of-the-art methods in clustering performance.
arXiv Detail & Related papers (2024-05-26T15:18:22Z) - Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z) - Nonparametric Automatic Differentiation Variational Inference with
Spline Approximation [7.5620760132717795]
We develop a nonparametric approximation approach that enables flexible posterior approximation for distributions with complicated structures.
Compared with widely used nonparametric inference methods, the proposed method is easy to implement and adaptive to various data structures.
Experiments demonstrate the efficiency of the proposed method in approximating complex posterior distributions and improving the performance of generative models with incomplete data.
arXiv Detail & Related papers (2024-03-10T20:22:06Z) - Joint Distributional Learning via Cramer-Wold Distance [0.7614628596146602]
We introduce the Cramer-Wold distance regularization, which can be computed in closed form, to facilitate joint distributional learning for high-dimensional datasets.
We also introduce a two-step learning method to enable flexible prior modeling and improve the alignment between the aggregated posterior and the prior distribution.
arXiv Detail & Related papers (2023-10-25T05:24:23Z) - VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator implies that the latent space shows an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z) - Accelerated structured matrix factorization [0.0]
Matrix factorization exploits the idea that, in complex high-dimensional data, the actual signal typically lies in lower-dimensional structures.
By exploiting Bayesian shrinkage priors, we devise a computationally convenient approach for high-dimensional matrix factorization.
The dependence between row and column entities is modeled by inducing flexible sparse patterns within factors.
arXiv Detail & Related papers (2022-12-13T11:35:01Z) - Learning Graphical Factor Models with Riemannian Optimization [70.13748170371889]
This paper proposes a flexible algorithmic framework for graph learning under low-rank structural constraints.
The problem is expressed as penalized maximum likelihood estimation of an elliptical distribution.
We leverage geometries of positive definite matrices and positive semi-definite matrices of fixed rank that are well suited to elliptical models.
arXiv Detail & Related papers (2022-10-21T13:19:45Z) - Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z) - Two-Dimensional Semi-Nonnegative Matrix Factorization for Clustering [50.43424130281065]
We propose a new Semi-Nonnegative Matrix Factorization method for 2-dimensional (2D) data, named TS-NMF.
It overcomes the drawback of existing methods that seriously damage the spatial information of the data by converting 2D data to vectors in a preprocessing step.
arXiv Detail & Related papers (2020-05-19T05:54:14Z)