Benchmarking Deep Learning Models for Raman Spectroscopy Across Open-Source Datasets
- URL: http://arxiv.org/abs/2601.16107v1
- Date: Thu, 22 Jan 2026 16:54:53 GMT
- Title: Benchmarking Deep Learning Models for Raman Spectroscopy Across Open-Source Datasets
- Authors: Adithya Sineesh, Akshita Kamsali,
- Abstract summary: This study presents one of the first systematic benchmarks comparing three or more published Raman-specific deep learning classifiers across multiple open-source Raman datasets. We report classification accuracies and macro-averaged F1 scores to provide a fair and reproducible comparison of deep learning models for Raman spectra based classification.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning classifiers for Raman spectroscopy are increasingly reported to outperform classical chemometric approaches. However, their evaluations are often conducted in isolation or compared against traditional machine learning methods or trivially adapted vision-based architectures that were not originally proposed for Raman spectroscopy. As a result, direct comparisons between existing deep learning models developed specifically for Raman spectral analysis on shared open-source datasets remain scarce. To the best of our knowledge, this study presents one of the first systematic benchmarks comparing three or more published Raman-specific deep learning classifiers across multiple open-source Raman datasets. We evaluate five representative deep learning architectures under a unified training and hyperparameter tuning protocol across three open-source Raman datasets selected to support standard evaluation, fine-tuning, and explicit distribution-shift testing. We report classification accuracies and macro-averaged F1 scores to provide a fair and reproducible comparison of deep learning models for Raman spectra based classification.
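The two metrics named in the abstract, classification accuracy and macro-averaged F1, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the toy labels are hypothetical, and the convention of scoring a class with no true or predicted samples as 0 is an assumption made here.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    per_class = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # Assumption: a class with no true and no predicted samples scores 0.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


if __name__ == "__main__":
    # Hypothetical, imbalanced toy labels (not taken from the paper).
    y_true = [0, 0, 0, 0, 1, 1, 2]
    y_pred = [0, 0, 0, 0, 1, 0, 0]
    print("accuracy:", accuracy(y_true, y_pred))
    print("macro F1:", macro_f1(y_true, y_pred))
```

Because macro averaging weights every class equally, a model that ignores a rare class can keep a fairly high accuracy while its macro F1 drops sharply, which is one reason to report both numbers on imbalanced spectral datasets.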
Related papers
- FlexMS is a flexible framework for benchmarking deep learning-based mass spectrum prediction tools in metabolomics [22.314786276794717]
The identification and property prediction of chemical molecules is of central importance in the advancement of drug discovery and material science. Deep learning models appear promising for predicting molecular structure spectra, but overall assessment remains challenging. Our contribution is the creation of benchmark framework FlexMS for constructing and evaluating diverse model architectures in mass spectrum prediction.
arXiv Detail & Related papers (2026-02-26T10:05:01Z)
- Mitra: Mixed Synthetic Priors for Enhancing Tabular Foundation Models [85.64873567417396]
We introduce Mitra, a TFM trained on a curated mixture of synthetic priors selected for their diversity, distinctiveness, and performance on real-world data. Mitra consistently outperforms state-of-the-art TFMs, such as TabPFNv2 and TabICL, across both classification and regression benchmarks.
arXiv Detail & Related papers (2025-10-24T07:15:06Z)
- Pulse Shape Discrimination Algorithms: Survey and Benchmark [7.302101804475471]
This review presents a comprehensive survey and benchmark of pulse shape discrimination (PSD) algorithms for radiation detection. We implement and evaluate all on two standardized datasets, using metrics including Figure of Merit (FOM), F1-score, ROC-AUC, and inter-method correlations. Deep learning models, particularly Multi-Layer Perceptrons (MLPs) and hybrid approaches combining statistical features with neural regression, often outperform traditional methods.
arXiv Detail & Related papers (2025-08-03T04:41:32Z)
- High-Performance Few-Shot Segmentation with Foundation Models: An Empirical Study [64.06777376676513]
We develop a few-shot segmentation (FSS) framework based on foundation models.
To be specific, we propose a simple approach to extract implicit knowledge from foundation models to construct coarse correspondence.
Experiments on two widely used datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-09-10T08:04:11Z)
- A Graph-Theoretic Framework for Understanding Open-World Semi-Supervised Learning [33.05104609131764]
Open-world semi-supervised learning aims at inferring both known and novel classes in unlabeled data.
This paper formalizes a graph-theoretic framework tailored for the open-world setting.
Our graph-theoretic framework illuminates practical algorithms and provides guarantees.
arXiv Detail & Related papers (2023-11-06T21:15:09Z)
- Tackling Computational Heterogeneity in FL: A Few Theoretical Insights [68.8204255655161]
We introduce and analyse a novel aggregation framework that allows for formalizing and tackling computational heterogeneous data.
The proposed aggregation algorithms are extensively analyzed from both a theoretical and an experimental perspective.
arXiv Detail & Related papers (2023-07-12T16:28:21Z)
- EmbedDistill: A Geometric Knowledge Distillation for Information Retrieval [83.79667141681418]
Large neural models (such as Transformers) achieve state-of-the-art performance for information retrieval (IR).
We propose a novel distillation approach that leverages the relative geometry among queries and documents learned by the large teacher model.
We show that our approach successfully distills from both dual-encoder (DE) and cross-encoder (CE) teacher models to 1/10th size asymmetric students that can retain 95-97% of the teacher performance.
arXiv Detail & Related papers (2023-01-27T22:04:37Z)
- Aligning Logits Generatively for Principled Black-Box Knowledge Distillation [49.43567344782207]
Black-Box Knowledge Distillation (B2KD) is a formulated problem for cloud-to-edge model compression with invisible data and models hosted on the server.
We formalize a two-step workflow consisting of deprivatization and distillation.
We propose a new method Mapping-Emulation KD (MEKD) that distills a black-box cumbersome model into a lightweight one.
arXiv Detail & Related papers (2022-05-21T02:38:16Z)
- Model-Based Deep Learning: On the Intersection of Deep Learning and Optimization [101.32332941117271]
Decision making algorithms are used in a multitude of different applications.
Deep learning approaches that use highly parametric architectures tuned from data without relying on mathematical models are becoming increasingly popular.
Model-based optimization and data-centric deep learning are often considered to be distinct disciplines.
arXiv Detail & Related papers (2022-05-05T13:40:08Z)
- Raman Spectrum Matching with Contrastive Representation Learning [7.070018798821577]
We propose a new machine learning technique for Raman spectrum matching, based on contrastive representation learning.
Our approach significantly improves or is on par with the state of the art in prediction accuracy.
arXiv Detail & Related papers (2022-02-25T08:32:27Z)
- RamanNet: A generalized neural network architecture for Raman Spectrum Analysis [4.670045009583903]
Raman spectroscopy provides a vibrational profile of the molecules and can be used to identify different kinds of materials.
Despite the recent rise in Raman spectra data volume, there has not been any significant effort in developing generalized machine learning methods for Raman spectra analysis.
We examine, experiment and evaluate existing methods and conjecture that neither current sequential models nor traditional machine learning models are satisfactorily sufficient to analyze Raman spectra.
arXiv Detail & Related papers (2022-01-20T23:15:25Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)