Efficient Solvers for SLOPE in R, Python, Julia, and C++
- URL: http://arxiv.org/abs/2511.02430v1
- Date: Tue, 04 Nov 2025 10:03:15 GMT
- Title: Efficient Solvers for SLOPE in R, Python, Julia, and C++
- Authors: Johan Larsson, Malgorzata Bogdan, Krystyna Grzesiak, Mathurin Massias, Jonas Wallin,
- Abstract summary: We present a suite of packages that efficiently solve the Sorted L-One Penalized Estimation problem. The packages feature a highly efficient hybrid coordinate descent algorithm that fits generalized linear models. Our implementation is designed to be fast, memory-efficient, and flexible.
- Score: 5.542449901887863
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a suite of packages in R, Python, Julia, and C++ that efficiently solve the Sorted L-One Penalized Estimation (SLOPE) problem. The packages feature a highly efficient hybrid coordinate descent algorithm that fits generalized linear models (GLMs) and supports a variety of loss functions, including Gaussian, binomial, Poisson, and multinomial logistic regression. Our implementation is designed to be fast, memory-efficient, and flexible. The packages support a variety of data structures (dense, sparse, and out-of-memory matrices) and are designed to efficiently fit the full SLOPE path as well as handle cross-validation of SLOPE models, including the relaxed SLOPE. We present examples of how to use the packages and benchmarks that demonstrate the performance of the packages on both real and simulated data and show that our packages outperform existing implementations of SLOPE in terms of speed.
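The SLOPE objective penalizes the sorted magnitudes of the coefficients with a nonincreasing weight sequence. As an illustration of the optimization problem the packages solve, here is a minimal numpy sketch that fits least-squares SLOPE by plain proximal gradient descent, using the stack-based pool-adjacent-violators algorithm for the sorted-L1 proximal operator. This is not the packages' hybrid coordinate descent method, just a compact reference implementation of the same problem.

```python
import numpy as np

def prox_sorted_l1(v, lam):
    """Prox of the sorted-L1 penalty J(b) = sum_i lam_i |b|_(i), with lam
    nonincreasing, via a stack-based pool-adjacent-violators pass."""
    sign = np.sign(v)
    u = np.abs(v)
    order = np.argsort(u)[::-1]          # sort magnitudes, largest first
    z = u[order] - lam
    starts, ends, sums = [], [], []      # blocks of the isotonic fit
    for i in range(len(z)):
        starts.append(i); ends.append(i); sums.append(z[i])
        # merge blocks while their averages violate the nonincreasing constraint
        while len(sums) > 1 and (sums[-1] / (ends[-1] - starts[-1] + 1)
                                 >= sums[-2] / (ends[-2] - starts[-2] + 1)):
            starts.pop(); e = ends.pop(); s = sums.pop()
            ends[-1] = e
            sums[-1] += s
    w = np.zeros(len(z))
    for s0, e0, t in zip(starts, ends, sums):
        w[s0:e0 + 1] = max(t / (e0 - s0 + 1), 0.0)   # blockwise average, clipped at zero
    out = np.empty_like(w)
    out[order] = w                       # undo the sort
    return sign * out

def slope_pgd(X, y, lam, n_iter=500):
    """Proximal gradient descent for (1/2n)||y - Xb||^2 + sum_i lam_i |b|_(i)."""
    n, p = X.shape
    L = np.linalg.norm(X, ord=2) ** 2 / n   # Lipschitz constant of the gradient
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        b = prox_sorted_l1(b - grad / L, lam / L)
    return b
```

Note the characteristic clustering behavior of SLOPE: coefficients whose magnitudes come close enough are averaged into a shared value by the prox, which is what the block-merging step implements.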
Related papers
- HiGP: A high-performance Python package for Gaussian Process [13.127443064937735]
HiGP is a high-performance Python package designed for efficient Gaussian Process regression (GPR) and classification (GPC). It implements various effective matrix-vector (MatVec) and matrix-matrix (MatMul) multiplication strategies specifically tailored for kernel matrices. With a user-friendly Python interface, HiGP integrates seamlessly with PyTorch and other Python packages, allowing easy incorporation into existing machine learning and data analysis pipelines.
arXiv Detail & Related papers (2025-03-04T04:17:36Z)
- Sketch 'n Solve: An Efficient Python Package for Large-Scale Least Squares Using Randomized Numerical Linear Algebra [0.0]
We present Sketch 'n Solve, an open-source Python package that implements efficient randomized numerical linear algebra techniques.
We show that our implementation achieves up to 50x speedup over traditional LSQR while maintaining high accuracy, even for ill-conditioned matrices.
The package shows particular promise for applications in machine learning optimization, signal processing, and scientific computing.
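The core randomized-NLA idea behind such packages can be shown in a few lines: multiply an overdetermined system by a short random sketch matrix and solve the much smaller sketched problem. The sketch below uses a dense Gaussian sketch for simplicity; it is an illustration of the sketch-and-solve paradigm, not the Sketch 'n Solve package's API.

```python
import numpy as np

def sketch_and_solve(A, b, sketch_rows, seed=None):
    """Gaussian sketch-and-solve: compress an n x d least-squares problem
    to sketch_rows equations, then solve the small system."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x
```

With sketch_rows moderately larger than d, the sketched solution approximates the full least-squares solution at a fraction of the cost; production implementations typically use structured sketches (e.g. subsampled randomized Hadamard transforms) to make the sketching step itself fast.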
arXiv Detail & Related papers (2024-09-22T04:29:51Z)
- Multimodal Learned Sparse Retrieval with Probabilistic Expansion Control [66.78146440275093]
Learned sparse retrieval (LSR) is a family of neural methods that encode queries and documents into sparse lexical vectors.
We explore the application of LSR to the multi-modal domain, with a focus on text-image retrieval.
Current approaches like LexLIP and STAIR require complex multi-step training on massive datasets.
Our proposed approach efficiently transforms dense vectors from a frozen dense model into sparse lexical vectors.
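A common pattern for turning a dense embedding into a sparse lexical vector is to project it onto a vocabulary space, keep only nonnegative term weights, and truncate to the largest entries. The following numpy sketch illustrates that pattern under assumed names; the projection matrix `W` stands in for a learned layer and is not the paper's actual model.

```python
import numpy as np

def to_sparse_lexical(dense, W, k=8):
    """Map a dense embedding to a sparse lexical vector: project onto a
    vocabulary space, ReLU to keep nonnegative term weights, keep top-k.
    W (dim x vocab) is a hypothetical stand-in for a learned projection."""
    logits = np.maximum(dense @ W, 0.0)     # nonnegative term weights
    keep = np.argsort(logits)[-k:]          # indices of the k largest weights
    sparse = np.zeros_like(logits)
    sparse[keep] = logits[keep]
    return sparse
```

The resulting vectors have at most k nonzeros, so they can be indexed and scored with standard inverted-index machinery, which is the practical appeal of LSR.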
arXiv Detail & Related papers (2024-02-27T14:21:56Z)
- Multi-Task Learning for Sparsity Pattern Heterogeneity: Statistical and Computational Perspectives [10.514866749547558]
We consider a problem in Multi-Task Learning (MTL) where multiple linear models are jointly trained on a collection of datasets.
A key novelty of our framework is that it allows the sparsity pattern of regression coefficients and the values of non-zero coefficients to differ across tasks.
Our methods encourage models to share information across tasks by separately encouraging (1) coefficient supports and/or (2) nonzero coefficient values to be similar.
This allows models to borrow strength during variable selection even when non-zero coefficient values differ across tasks.
arXiv Detail & Related papers (2022-12-16T19:52:25Z)
- Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are imposed through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
- NumS: Scalable Array Programming for the Cloud [82.827921577004]
We present NumS, an array programming library that optimizes NumPy-like expressions on task-based distributed systems.
This is achieved through a novel scheduler called Load Simulated Hierarchical Scheduling (LSHS).
We show that LSHS enhances performance on Ray by decreasing network load by a factor of 2x, requiring 4x less memory, and reducing execution time by 10x on the logistic regression problem.
arXiv Detail & Related papers (2022-06-28T20:13:40Z)
- Stochastic Gradient Descent without Full Data Shuffle [65.97105896033815]
CorgiPile is a hierarchical data shuffling strategy that avoids a full data shuffle while maintaining a convergence rate comparable to that of SGD on fully shuffled data.
Our results show that CorgiPile achieves a convergence rate comparable to full-shuffle SGD for both deep learning and generalized linear models.
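The two-level idea behind such hierarchical shuffles can be sketched concisely: shuffle the order of contiguous blocks, then shuffle tuples only within a small buffer of blocks, so no full random permutation of the dataset is ever materialized. This is an illustrative sketch of that general strategy, not CorgiPile's actual implementation.

```python
import random

def hierarchical_shuffle_order(n_rows, block_size, buffer_blocks, seed=0):
    """Two-level shuffle: randomize block order (level 1), then shuffle
    rows inside each small buffer of consecutive blocks (level 2)."""
    rng = random.Random(seed)
    blocks = [list(range(i, min(i + block_size, n_rows)))
              for i in range(0, n_rows, block_size)]
    rng.shuffle(blocks)                       # level 1: shuffle block order
    order = []
    for i in range(0, len(blocks), buffer_blocks):
        buf = [r for blk in blocks[i:i + buffer_blocks] for r in blk]
        rng.shuffle(buf)                      # level 2: shuffle within the buffer
        order.extend(buf)
    return order
```

Because blocks are read sequentially from storage and only a buffer-sized slice is held in memory, this keeps I/O sequential and memory bounded while still breaking up most of the ordering that harms SGD convergence.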
arXiv Detail & Related papers (2022-06-12T20:04:31Z)
- The flare Package for High Dimensional Linear Regression and Precision Matrix Estimation in R [45.24529956312764]
This paper describes an R package named flare, which implements a family of new high dimensional regression methods.
The package flare is coded in double precision C, and called from R by a user-friendly interface.
Experiments show that flare is efficient and can scale up to large problems.
arXiv Detail & Related papers (2020-06-27T18:01:56Z)
- Picasso: A Sparse Learning Library for High Dimensional Data Analysis in R and Python [77.33905890197269]
We describe a new library which implements a unified pathwise coordinate optimization for a variety of sparse learning problems.
The library is coded in C++ and has user-friendly R and Python wrappers.
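Pathwise coordinate optimization, as used by such libraries (and by the hybrid solver in the SLOPE packages above), fits a sequence of problems along a decreasing regularization grid, warm-starting each fit from the previous solution. The following numpy sketch shows the idea for the lasso; it is a minimal illustration, not the library's implementation.

```python
import numpy as np

def soft_threshold(z, t):
    """Scalar soft-thresholding operator, the prox of the L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_path(X, y, lambdas, n_iter=200):
    """Pathwise coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1:
    sweep a decreasing lambda grid, warm-starting from the previous fit."""
    n, p = X.shape
    col_sq = (X ** 2).sum(axis=0)            # squared column norms
    b = np.zeros(p)                          # warm start carried along the path
    path = []
    for lam in sorted(lambdas, reverse=True):
        for _ in range(n_iter):
            for j in range(p):
                r = y - X @ b + X[:, j] * b[j]   # partial residual excluding j
                b[j] = soft_threshold(X[:, j] @ r, n * lam) / col_sq[j]
        path.append(b.copy())
    return path
```

Warm starts are what make the full path cheap: each solution along the grid is typically only a small perturbation of its predecessor, so the inner coordinate sweeps converge in very few passes.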
arXiv Detail & Related papers (2020-06-27T02:39:24Z)
- Multi-layer Optimizations for End-to-End Data Analytics [71.05611866288196]
We introduce Iterative Functional Aggregate Queries (IFAQ), a framework that realizes an alternative approach.
IFAQ treats the feature extraction query and the learning task as one program given in IFAQ's domain-specific language.
We show that a Scala implementation of IFAQ can outperform mlpack, Scikit, and specialization by several orders of magnitude for linear regression and regression tree models over several relational datasets.
arXiv Detail & Related papers (2020-01-10T16:14:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.