Related papers: A Robust Support Vector Machine Approach for Raman COVID-19 Data Classification

URL: http://arxiv.org/abs/2501.17904v1
Date: Wed, 29 Jan 2025 14:02:45 GMT
Title: A Robust Support Vector Machine Approach for Raman COVID-19 Data Classification
Authors: Marco Piazza, Andrea Spinelli, Francesca Maggioni, Marzia Bedoni, Enza Messina
Abstract summary: In this paper, we investigate the performance of a novel robust formulation for Support Vector Machine (SVM) in classifying COVID-19 samples obtained from Raman spectroscopy. We derive robust counterpart models of deterministic formulations using bounded-by-norm uncertainty sets around each observation. The effectiveness of our approach is validated on real-world COVID-19 datasets provided by Italian hospitals.
Score: 0.7864304771129751
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent advances in healthcare technologies have led to the availability of large amounts of biological samples across several techniques and applications. In particular, in the last few years, Raman spectroscopy analysis of biological samples has been successfully applied for early-stage diagnosis. However, the inherent complexity and variability of spectra make manual analysis challenging, even for domain experts. For the same reason, traditional Statistical and Machine Learning (ML) techniques cannot guarantee accurate and reliable results. ML models, combined with robust optimization techniques, offer the possibility to improve the classification accuracy and enhance the resilience of predictive models. In this paper, we investigate the performance of a novel robust formulation for Support Vector Machine (SVM) in classifying COVID-19 samples obtained from Raman Spectroscopy. Given the noisy and perturbed nature of biological samples, we protect the classification process against uncertainty through the application of robust optimization techniques. Specifically, we derive robust counterpart models of deterministic formulations using bounded-by-norm uncertainty sets around each observation. We explore the cases of both linear and kernel-induced classifiers to address binary and multiclass classification tasks. The effectiveness of our approach is validated on real-world COVID-19 datasets provided by Italian hospitals by comparing the results of our simulations with a state-of-the-art classifier.
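The idea of a robust counterpart with a norm-bounded uncertainty set around each observation can be illustrated with a small sketch. This is not the authors' formulation: it assumes a linear classifier, an l2-ball of radius rho around every sample, and plain subgradient descent on a regularized hinge loss. Under those assumptions the worst-case margin of a sample has the closed form y(w·x + b) - rho·||w||_2, which the code uses directly. The function names and the parameters rho, lam, lr and epochs are hypothetical choices for illustration.

```python
import numpy as np

def robust_hinge_loss(w, b, X, y, rho):
    """Mean worst-case hinge loss when each x_i may move within an l2-ball of radius rho.

    The adversary's best move shifts x_i against the label along w, which
    lowers the margin by exactly rho * ||w||_2.
    """
    worst_margins = y * (X @ w + b) - rho * np.linalg.norm(w)
    return np.maximum(0.0, 1.0 - worst_margins).mean()

def fit_robust_linear_svm(X, y, rho=0.1, lam=0.01, lr=0.1, epochs=500):
    """Subgradient descent on lam/2 * ||w||^2 + robust hinge loss (illustrative only)."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        norm_w = np.linalg.norm(w)
        worst_margins = y * (X @ w + b) - rho * norm_w
        active = worst_margins < 1.0  # samples with non-zero worst-case hinge loss
        # Subgradient of ||w||_2 is w / ||w||_2; use 0 at w = 0.
        unit_w = w / norm_w if norm_w > 0 else np.zeros(d)
        data_term = -(y[active][:, None] * X[active]).sum(axis=0)
        grad_w = lam * w + (data_term + active.sum() * rho * unit_w) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

Setting rho to 0 recovers the ordinary (deterministic) soft-margin hinge loss, so the same function can be used to compare the nominal and the protected objective on a given dataset. Kernel-induced and multiclass variants, which the abstract also mentions, are outside the scope of this sketch.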

Related papers

Clinical NLP with Attention-Based Deep Learning for Multi-Disease Prediction [44.0876796031468]
This paper addresses the challenges posed by the unstructured nature and high-dimensional semantic complexity of electronic health record texts. A deep learning method based on attention mechanisms is proposed to achieve unified modeling for information extraction and multi-label disease prediction.
arXiv Detail & Related papers (2025-07-02T07:45:22Z)
Drug classification based on X-ray spectroscopy combined with machine learning [11.985793625437546]
X-ray absorption spectroscopy offers advantages such as ease of operation, penetrative observation, and strong substance differentiation capabilities. In this study, we constructed a classification model using Convolutional Neural Networks (CNN), Support Vector Machines (SVM), and Particle Swarm Optimization (PSO). The experimental results demonstrate that this model achieved higher classification accuracy compared to two other common methods, with a prediction accuracy of 99.14%.
arXiv Detail & Related papers (2025-05-04T04:49:55Z)
Stabilizing Machine Learning for Reproducible and Explainable Results: A Novel Validation Approach to Subject-Specific Insights [2.7516838144367735]
We propose a novel validation approach that uses a general ML model to ensure reproducible performance and robust feature importance analysis. We tested a single Random Forest (RF) model on nine datasets varying in domain, sample size, and demographics. Our repeated trials approach consistently identified key features at the subject level and improved group-level feature importance analysis.
arXiv Detail & Related papers (2024-12-16T23:14:26Z)
Prediction by Machine Learning Analysis of Genomic Data Phenotypic Frost Tolerance in Perccottus glenii [7.412214379486083]
We will employ machine learning techniques to analyze the gene sequences of Perccottus glenii. We constructed four classification models: Random Forest, LightGBM, XGBoost, and Decision Tree. The dataset used by these classification models was extracted from the National Center for Biotechnology Information database.
arXiv Detail & Related papers (2024-10-11T14:45:47Z)
Multimodal Prototyping for cancer survival prediction [45.61869793509184]
Multimodal survival methods combining gigapixel histology whole-slide images (WSIs) and transcriptomic profiles are particularly promising for patient prognostication and stratification. Current approaches involve tokenizing the WSIs into smaller patches (>10,000 patches) and transcriptomics into gene groups, which are then integrated using a Transformer for predicting outcomes. This process generates many tokens, which leads to high memory requirements for computing attention and complicates post-hoc interpretability analyses. Our framework outperforms state-of-the-art methods with much less computation while unlocking new interpretability analyses.
arXiv Detail & Related papers (2024-06-28T20:37:01Z)
Equipping Computational Pathology Systems with Artifact Processing Pipelines: A Showcase for Computation and Performance Trade-offs [0.7226586370054761]
We propose a mixture of experts (MoE) scheme for detecting five notable artifacts, including damaged tissue, blur, folded tissue, air bubbles, and histologically irrelevant blood. We developed DL pipelines using two MoEs and two multiclass models of state-of-the-art deep convolutional neural networks (DCNNs) and vision transformers (ViTs). The proposed MoE yields 86.15% F1 and 97.93% sensitivity scores on unseen data, at a lower inference cost than the MoE using ViTs.
arXiv Detail & Related papers (2024-03-12T15:22:05Z)
Sparse high-dimensional linear mixed modeling with a partitioned empirical Bayes ECM algorithm [41.25603565852633]
This work presents an efficient and accurate Bayesian framework for high-dimensional LMMs. The novelty of the approach lies in its partitioning and parameter expansion as well as its fast and scalable computation. A real-world example is provided using data from a study of lupus in children, where we identify genes and clinical factors associated with a new lupus biomarker and predict the biomarker over time.
arXiv Detail & Related papers (2023-10-18T19:34:56Z)
Machine Learning Small Molecule Properties in Drug Discovery [44.62264781248437]
We review a wide range of properties, including binding affinities, solubility, and ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity). We discuss existing popular descriptors and embeddings, such as chemical fingerprints and graph-based neural networks. Finally, techniques to provide an understanding of model predictions, especially for critical decision-making in drug discovery, are assessed.
arXiv Detail & Related papers (2023-08-02T22:18:41Z)
Optimizations of Autoencoders for Analysis and Classification of Microscopic In Situ Hybridization Images [68.8204255655161]
We propose a deep-learning framework to detect and classify areas of microscopic images with similar levels of gene expression. The data we analyze requires an unsupervised learning model for which we employ a type of Artificial Neural Network - Deep Learning Autoencoders.
arXiv Detail & Related papers (2023-04-19T13:45:28Z)
Benchmarking Machine Learning Robustness in Covid-19 Genome Sequence Classification [109.81283748940696]
We introduce several ways to perturb SARS-CoV-2 genome sequences to mimic the error profiles of common sequencing platforms such as Illumina and PacBio. We show that, for specific embedding methods, some simulation-based approaches are more robust (and accurate) than others against certain adversarial attacks on the input sequences.
arXiv Detail & Related papers (2022-07-18T19:16:56Z)
Empirical Analysis of Machine Learning Configurations for Prediction of Multiple Organ Failure in Trauma Patients [7.122236250657051]
Multiple organ failure (MOF) is a life-threatening condition. We perform quantitative analysis on early MOF prediction with comprehensive machine learning (ML) configurations.
arXiv Detail & Related papers (2021-03-19T17:49:22Z)
Data-Driven Logistic Regression Ensembles With Applications in Genomics [0.0]
We introduce a novel approach to high-dimensional binary classification that integrates regularization with ensembling techniques. In medical genomics applications, our approach identifies critical biomarkers overlooked by competing methods.
arXiv Detail & Related papers (2021-02-17T05:57:26Z)
Rectified Meta-Learning from Noisy Labels for Robust Image-based Plant Disease Diagnosis [64.82680813427054]
Plant diseases are one of the main threats to food security and crop production. One popular approach is to recast this problem as a leaf image classification task, which can be addressed by powerful convolutional neural networks (CNNs). We propose a novel framework that incorporates a rectified meta-learning module into the common CNN paradigm to train a noise-robust deep network without using extra supervision information.
arXiv Detail & Related papers (2020-03-17T09:51:30Z)
SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier Detection [63.253850875265115]
Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples. We propose a modular acceleration system, called SUOD, to address it.
arXiv Detail & Related papers (2020-03-11T00:22:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.