Related papers: Feature Selection and Regularization in Multi-Class Classification: An Empirical Study of One-vs-Rest Logistic Regression with Gradient Descent Optimization and L1 Sparsity Constraints

URL: http://arxiv.org/abs/2510.14449v2
Date: Wed, 22 Oct 2025 23:40:52 GMT
Title: Feature Selection and Regularization in Multi-Class Classification: An Empirical Study of One-vs-Rest Logistic Regression with Gradient Descent Optimization and L1 Sparsity Constraints
Authors: Jahidul Arafat, Fariha Tasmin, Sanjaya Poudel,
Abstract summary: Multi-class wine classification presents fundamental trade-offs between model accuracy, feature dimensionality, and interpretability. This paper presents a comprehensive empirical study of One-vs-Rest logistic regression on the UCI Wine dataset.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multi-class wine classification presents fundamental trade-offs between model accuracy, feature dimensionality, and interpretability, all critical factors for production deployment in analytical chemistry. This paper presents a comprehensive empirical study of One-vs-Rest logistic regression on the UCI Wine dataset (178 samples, 3 cultivars, 13 chemical features), comparing a from-scratch gradient descent implementation against scikit-learn's optimized solvers and quantifying L1 regularization effects on feature sparsity. Manual gradient descent achieves 92.59 percent mean test accuracy with smooth convergence, validating theoretical foundations, though scikit-learn provides a 24x training speedup and 98.15 percent accuracy. Class-specific analysis reveals distinct chemical signatures with heterogeneous patterns where color intensity varies dramatically (0.31 to 16.50) across cultivars. L1 regularization produces 54-69 percent feature reduction with only a 4.63 percent accuracy decrease, demonstrating favorable interpretability-performance trade-offs. We propose an optimal 5-feature subset achieving 62 percent complexity reduction with an estimated 92-94 percent accuracy, enabling cost-effective deployment with savings of 80 dollars per sample and a 56 percent time reduction. Statistical validation confirms robust generalization with sub-2ms prediction latency suitable for real-time quality control. Our findings provide actionable guidelines for practitioners balancing comprehensive chemical analysis against targeted feature measurement in resource-constrained environments.
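
The abstract's core technique, One-vs-Rest logistic regression trained by gradient descent with an L1 sparsity penalty, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the soft-thresholding (proximal) update is one common way to apply L1 and is chosen here as an assumption, the toy data generator stands in for the UCI Wine dataset, and every function name and hyperparameter (`lam`, `lr`, `epochs`) is hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w, t):
    # Proximal operator of t * |w|: shrink toward zero, clip small values to 0.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def fit_binary(X, y, lam=0.05, lr=0.1, epochs=300):
    # Batch gradient descent on the mean log-loss, followed by an L1
    # soft-thresholding step on the weights (the bias is not penalized).
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [soft_threshold(wj - lr * gj / n, lr * lam)
             for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def fit_ovr(X, labels, **kwargs):
    # One-vs-Rest: one binary model per class, "this class" vs. all others.
    return {c: fit_binary(X, [1.0 if l == c else 0.0 for l in labels], **kwargs)
            for c in sorted(set(labels))}


def predict(models, x):
    # Pick the class whose binary model gives the highest linear score.
    def score(c):
        w, b = models[c]
        return sum(wj * xj for wj, xj in zip(w, x)) + b
    return max(models, key=score)


def make_toy_data(seed=0, per_class=30):
    # Hypothetical stand-in data: 3 classes, 2 informative + 2 noise features.
    rng = random.Random(seed)
    centers = {0: (-2.0, 0.0), 1: (2.0, 0.0), 2: (0.0, 2.5)}
    X, y = [], []
    for c, (cx, cy) in centers.items():
        for _ in range(per_class):
            X.append([cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5),
                      rng.gauss(0, 1), rng.gauss(0, 1)])
            y.append(c)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    models = fit_ovr(X, y)
    acc = sum(predict(models, xi) == yi for xi, yi in zip(X, y)) / len(y)
    zeros = sum(wj == 0.0 for w, _ in models.values() for wj in w)
    print("training accuracy:", acc)
    print("exactly-zero weights across the three models:", zeros)
```

Raising `lam` in this sketch drives more weights to exactly zero, which is the mechanism behind the feature reduction the abstract reports. As an arithmetic check on the quoted figures, keeping 5 of 13 features removes 8/13, or about 61.5 percent, consistent with the stated 62 percent complexity reduction.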

Related papers

A Data-Driven Approach to Support Clinical Renal Replacement Therapy [1.7666791716676549]
This study investigates a data-driven machine learning approach to predict membrane fouling in critically ill patients undergoing Continuous Renal Replacement Therapy (CRRT). Using time-series data from an ICU, 16 clinically selected features were identified to train predictive models. Results remained robust across different forecasting horizons.
arXiv Detail & Related papers (2026-02-26T11:47:22Z)
Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models [108.26461635308796]
We introduce Rationale Consistency, a fine-grained metric that quantifies the alignment between the model's reasoning process and human judgment. Our evaluation of frontier models reveals that rationale consistency effectively discriminates among state-of-the-art models. We introduce a hybrid signal that combines rationale consistency with outcome accuracy for GenRM training.
arXiv Detail & Related papers (2026-02-04T15:24:52Z)
Data Distribution as a Lever for Guiding Optimizers Toward Superior Generalization in LLMs [60.68927774057402]
We show, for the first time, that a lower simplicity bias (SB) induces better generalization. Motivated by this insight, we demonstrate that adjusting the training data distribution by upsampling or augmenting examples learned later in training similarly reduces SB and leads to improved generalization. Our strategy improves the performance of multiple language models, including Phi2-2.7B, Llama3.2-1B, Gemma3-1B-PT, and Qwen3-0.6B-Base, achieving relative accuracy gains of up to 18% when fine-tuned with AdamW and Muon.
arXiv Detail & Related papers (2026-01-31T07:40:36Z)
Multi-Method Analysis of Mathematics Placement Assessments: Classical, Machine Learning, and Clustering Approaches [0.0]
This study evaluates a 40-item mathematics placement examination administered to 198 students. It uses a multi-method framework combining Classical Test Theory, machine learning, and unsupervised clustering.
arXiv Detail & Related papers (2025-11-06T18:53:07Z)
Rethinking LLM Evaluation: Can We Evaluate LLMs with 200x Less Data? [82.09573568241724]
EssenceBench is a coarse-to-fine framework utilizing an iterative Genetic Algorithm (GA). Our approach yields superior compression results with lower reconstruction error and markedly higher efficiency. On the HellaSwag benchmark (10K samples), our method preserves the ranking of all models shifting within 5% using 25x fewer samples, and achieves 95% ranking preservation shifting within 5% using only 200x fewer samples.
arXiv Detail & Related papers (2025-10-12T05:38:10Z)
From Noisy Traces to Stable Gradients: Bias-Variance Optimized Preference Optimization for Aligning Large Reasoning Models [90.45197506653341]
Large reasoning models (LRMs) generate intermediate reasoning traces before producing final answers. Aligning LRMs with human preferences, a crucial prerequisite for model deployment, remains underexplored. A common workaround optimizes a single sampled trajectory, which introduces substantial gradient variance from trace sampling.
arXiv Detail & Related papers (2025-10-06T17:58:01Z)
Signal Fidelity Index-Aware Calibration for Dementia Predictions Across Heterogeneous Real-World Data [1.741250583668341]
We develop a Signal Fidelity Index (SFI) that quantifies diagnostic data quality at the patient level in dementia. We test SFI-aware calibration for improving model performance across heterogeneous datasets without outcome labels.
arXiv Detail & Related papers (2025-09-10T15:19:04Z)
Advanced Multi-Architecture Deep Learning Framework for BIRADS-Based Mammographic Image Retrieval: Comprehensive Performance Analysis with Super-Ensemble Optimization [0.0]
Mammographic image retrieval systems require exact BIRADS categorical matching across five distinct classes. Current medical image retrieval studies suffer from methodological limitations.
arXiv Detail & Related papers (2025-08-06T18:05:18Z)
Advancing Tabular Stroke Modelling Through a Novel Hybrid Architecture and Feature-Selection Synergy [0.9999629695552196]
The present work develops and validates a data-driven and interpretable machine-learning framework designed to predict strokes. Ten routinely gathered demographic, lifestyle, and clinical variables were sourced from a public cohort of 4,981 records. The proposed model achieved an accuracy rate of 97.2% and an F1-score of 97.15%, indicating a significant enhancement compared to the leading individual model.
arXiv Detail & Related papers (2025-05-18T21:46:45Z)
Flow-GRPO: Training Flow Matching Models via Online RL [75.70017261794422]
We propose Flow-GRPO, the first method integrating online reinforcement learning (RL) into flow matching models. Our approach uses two key strategies: (1) an ODE-to-SDE conversion that transforms a deterministic Ordinary Differential Equation (ODE) into an equivalent Stochastic Differential Equation (SDE) that matches the original model's marginal distribution at all timesteps; and (2) a Denoising Reduction strategy that reduces training denoising steps while retaining the original inference timestep number.
arXiv Detail & Related papers (2025-05-08T17:58:45Z)
Classifier Enhanced Deep Learning Model for Erythroblast Differentiation with Limited Data [0.08388591755871733]
Hematological disorders, which involve 1% of conditions and genetic diseases, present significant diagnostic challenges. Our approach evaluates various machine learning (ML) settings and compares the efficacy of a variety of ML models. When data is available, the proposed solution offers a way to achieve higher accuracy for small and unique datasets.
arXiv Detail & Related papers (2024-11-23T15:51:15Z)
Accurate and Reliable Predictions with Mutual-Transport Ensemble [46.368395985214875]
We propose a co-trained auxiliary model that adaptively regularizes the cross-entropy loss using Kullback-Leibler (KL) divergence. We show that the Mutual-Transport Ensemble (MTE) can simultaneously enhance both accuracy and uncertainty calibration. For example, on the CIFAR-100 dataset, our MTE method on ResNet34/50 achieved significant improvements compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2024-05-30T03:15:59Z)
(Certified!!) Adversarial Robustness for Free! [116.6052628829344]
We certify 71% accuracy on ImageNet under adversarial perturbations constrained to be within a 2-norm of 0.5. We obtain these results using only pretrained diffusion models and image classifiers, without requiring any fine tuning or retraining of model parameters.
arXiv Detail & Related papers (2022-06-21T17:27:27Z)
Identifying and mitigating bias in algorithms used to manage patients in a pandemic [4.756860520861679]
Logistic regression models were created to predict COVID-19 mortality, ventilator status and inpatient status using a real-world dataset. Models showed a 57% decrease in the number of biased trials. After calibration, the average sensitivity of the predictive models increased from 0.527 to 0.955.
arXiv Detail & Related papers (2021-10-30T21:10:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.