Deep learning recognition and analysis of Volatile Organic Compounds based on experimental and synthetic infrared absorption spectra
- URL: http://arxiv.org/abs/2512.06059v1
- Date: Fri, 05 Dec 2025 17:43:11 GMT
- Title: Deep learning recognition and analysis of Volatile Organic Compounds based on experimental and synthetic infrared absorption spectra
- Authors: Andrea Della Valle, Annalisa D'Arco, Tiziana Mancini, Rosanna Mosetti, Maria Chiara Paolozzi, Stefano Lupi, Sebastiano Pilati, Andrea Perali
- Abstract summary: Volatile Organic Compounds (VOCs) pose significant risks to human health. Infrared (IR) spectroscopy enables ultrasensitive detection of VOCs at low concentrations in the atmosphere. Deep neural networks (NNs) are increasingly used for the recognition of complex data structures.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Volatile Organic Compounds (VOCs) are organic molecules that have low boiling points and therefore easily evaporate into the air. They pose significant risks to human health, making their accurate detection the crux of efforts to monitor and minimize exposure. Infrared (IR) spectroscopy enables ultrasensitive detection of VOCs at low concentrations in the atmosphere by measuring their IR absorption spectra. However, the complexity of the IR spectra limits the possibility of implementing VOC recognition and quantification in real time. While deep neural networks (NNs) are increasingly used for the recognition of complex data structures, they typically require massive datasets for the training phase. Here, we create an experimental VOC dataset for nine different classes of compounds at various concentrations, using their IR absorption spectra. To further increase the number of spectra and their diversity in terms of VOC concentration, we augment the experimental dataset with synthetic spectra created via conditional generative NNs. This allows us to train robust discriminative NNs, able to reliably identify the nine VOCs, as well as to precisely predict their concentrations. The trained NN is suitable for incorporation into sensing devices for VOC recognition and analysis.
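The workflow described in the abstract (experimental spectra, augmentation with synthetic spectra conditioned on compound and concentration, then a model that outputs both a class and a concentration) can be illustrated with a toy sketch. Everything below is an assumption made for illustration only: the three classes, the band positions, the noise levels, and the helper names `toy_spectrum`, `build_dataset` and `predict` are hypothetical, and the simple formula and template matching stand in for the paper's conditional generative and discriminative NNs, whose architectures the abstract does not specify.

```python
import math
import random

N_POINTS = 64  # hypothetical number of wavenumber samples per spectrum

# Hypothetical band centers (as sample indices) for three toy classes;
# real VOC band positions are not given in the abstract.
BAND_CENTERS = {"A": [10, 40], "B": [20, 50], "C": [30, 55]}


def toy_spectrum(label, concentration, noise=0.0, rng=None):
    """Toy absorbance: Gaussian bands scaled linearly by concentration."""
    rng = rng or random.Random(0)
    spectrum = []
    for i in range(N_POINTS):
        value = sum(math.exp(-((i - c) ** 2) / 8.0) for c in BAND_CENTERS[label])
        spectrum.append(concentration * value + rng.gauss(0.0, noise))
    return spectrum


def build_dataset(n_experimental, n_synthetic, seed=0):
    """Mix noisier 'experimental' samples with cleaner 'synthetic' ones.

    The synthetic branch is only a stand-in for a conditional generative NN:
    it is conditioned on (label, concentration), but it simply reuses the
    toy formula above with a lower noise level.
    """
    rng = random.Random(seed)
    data = []
    for n, noise, source in [(n_experimental, 0.01, "exp"), (n_synthetic, 0.002, "syn")]:
        for _ in range(n):
            label = rng.choice(sorted(BAND_CENTERS))
            conc = rng.uniform(0.1, 1.0)
            data.append((toy_spectrum(label, conc, noise, rng), label, conc, source))
    return data


def predict(spectrum):
    """Stand-in for the discriminative NN: template match + least-squares scale."""
    best = None
    for label in BAND_CENTERS:
        template = toy_spectrum(label, 1.0)  # noise-free unit-concentration template
        dot = sum(s * t for s, t in zip(spectrum, template))
        norm = sum(t * t for t in template)
        scale = dot / norm  # least-squares concentration estimate for this class
        residual = sum((s - scale * t) ** 2 for s, t in zip(spectrum, template))
        if best is None or residual < best[0]:
            best = (residual, label, scale)
    return best[1], best[2]


dataset = build_dataset(n_experimental=20, n_synthetic=80)
correct = sum(predict(x)[0] == y for x, y, _, _ in dataset)
max_err = max(abs(predict(x)[1] - c) for x, _, c, _ in dataset)
print(f"classified {correct}/{len(dataset)}, max concentration error {max_err:.4f}")
```

The sketch only shows the shape of the data flow: each sample carries a class label and a concentration, synthetic samples are generated conditionally on both, and a single predictor returns both quantities. It makes no claim about the accuracy or design of the networks in the paper.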
Related papers
- How well can off-the-shelf LLMs elucidate molecular structures from mass spectra using chain-of-thought reasoning? [51.286853421822705]
Large language models (LLMs) have shown promise for reasoning-intensive scientific tasks, but their capability for chemical interpretation is still unclear. We introduce a Chain-of-Thought (CoT) prompting framework and benchmark that evaluate how LLMs reason about mass spectral data to predict molecular structures. Our evaluation across metrics of SMILES validity, formula consistency, and structural similarity reveals that while LLMs can produce syntactically valid and partially plausible structures, they fail to achieve chemical accuracy or link reasoning to correct molecular predictions.
arXiv Detail & Related papers (2026-01-09T20:08:42Z)
- SpectrumFM: Redefining Spectrum Cognition via Foundation Modeling [65.65474629224558]
We propose a spectrum foundation model, termed SpectrumFM, which provides a new paradigm for spectrum cognition. A spectrum encoder based on convolutional neural networks is proposed to capture both fine-grained local signal structures and high-level global dependencies in the spectrum data. Two novel self-supervised learning tasks, namely masked reconstruction and next-slot signal prediction, are developed for pre-training SpectrumFM, enabling the model to learn rich and transferable representations.
arXiv Detail & Related papers (2025-08-02T14:40:50Z)
- LUMIR: an LLM-Driven Unified Agent Framework for Multi-task Infrared Spectroscopy Reasoning [12.138903544219724]
This study introduces LUMIR, a framework designed to achieve accurate infrared spectral analysis under low data conditions. LUMIR integrates a structured literature knowledge base, automated preprocessing, feature extraction, and predictive modeling into a unified pipeline. It was validated on diverse datasets, including the publicly available Milk near-infrared dataset, Chinese medicinal herbs, Citri Reticulatae Pericarpium (CRP) with different storage durations, an industrial wastewater COD dataset, Tecator and Corn.
arXiv Detail & Related papers (2025-07-29T03:20:51Z)
- Using machine learning to map simulated noisy and laser-limited multidimensional spectra to molecular electronic couplings [0.0]
We show how factors associated with experimental 2D spectral data influence the ability of NNs to map simulated 2DES spectra onto intermolecular electronic couplings. In stark contrast to human-based analyses of 2DES data, we find that the NN accuracy improves significantly when the data are constrained by the bandwidth and center frequency of the pump pulses.
arXiv Detail & Related papers (2025-03-19T21:40:00Z)
- DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra [60.39311767532607]
We present DiffMS, a formula-restricted encoder-decoder generative network that achieves state-of-the-art performance on this task. To develop a robust decoder that bridges latent embeddings and molecular structures, we pretrain the diffusion decoder with fingerprint-structure pairs. Experiments on established benchmarks show that DiffMS outperforms existing models on de novo molecule generation.
arXiv Detail & Related papers (2025-02-13T18:29:48Z)
- Open-Path Detection of Organic Vapors via Quantum Infrared Spectroscopy [0.0]
QFTIR spectroscopy emerged as an alternative to conventional spectroscopy in the mid-infrared region of the spectrum.
We present the first use of a QFTIR spectrometer for open-path detection of multiple interfering organic gases in ambient air.
arXiv Detail & Related papers (2024-05-21T14:26:51Z)
- ChemVise: Maximizing Out-of-Distribution Chemical Detection with the Novel Application of Zero-Shot Learning [60.02503434201552]
This research proposes learning approximations of complex exposures from training sets of simple ones.
We demonstrate that applying this approach to synthetic sensor responses surprisingly improves the detection of out-of-distribution obscured chemical analytes.
arXiv Detail & Related papers (2023-02-09T20:19:57Z)
- Quantum-enhanced absorption spectroscopy with bright squeezed frequency combs [91.3755431537592]
We propose a strategy combining the advantages of frequency modulation spectroscopy with the reduced noise properties accessible by squeezing the probe state.
A homodyne detection scheme allows the simultaneous measurement of the absorption at multiple frequencies.
We predict a significant enhancement of the signal-to-noise ratio that scales exponentially with the squeezing factor.
arXiv Detail & Related papers (2022-09-30T17:57:05Z)
- Machine learning identification of organic compounds using visible light [0.0]
Laser-based techniques are promising for autonomous compound detection because the optical response of materials encodes enough electronic and vibrational information for remote chemical identification.
We develop a machine learning classifier that can accurately identify organic species based on a single-wavelength dispersive measurement in the visible spectral region, away from absorption resonances.
arXiv Detail & Related papers (2022-04-06T20:55:13Z)
- Unsupervised Spectral Unmixing For Telluric Correction Using A Neural Network Autoencoder [58.720142291102135]
We present a neural network autoencoder approach for extracting a telluric transmission spectrum from a large set of high-precision observed solar spectra from the HARPS-N radial velocity spectrograph.
arXiv Detail & Related papers (2021-11-17T12:54:48Z)
- Deep Neural Networks for the Correction of Mie Scattering in Fourier-Transformed Infrared Spectra of Biological Samples [0.0]
We propose an approach to approximate this complex preprocessing function using deep neural networks.
Our proposed method overcomes the trade-off between time and the corrected spectrum being biased towards an artificial reference spectrum.
arXiv Detail & Related papers (2020-02-18T16:07:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.