Related papers: Interpretable Spectral Features Predict Conductivity in Self-Driving Doped Conjugated Polymer Labs

URL: http://arxiv.org/abs/2509.21330v1
Date: Sat, 06 Sep 2025 18:00:40 GMT
Title: Interpretable Spectral Features Predict Conductivity in Self-Driving Doped Conjugated Polymer Labs
Authors: Ankush Kumar Mishra, Jacob P. Mauthe, Nicholas Luke, Aram Amassian, Baskar Ganapathysubramanian
Abstract summary: Self-driving labs promise faster materials discovery by coupling automation with machine learning. We address this by learning interpretable spectral fingerprints from optical spectroscopy to predict electrical conductivity.
Score: 2.8914750842461583
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Self-driving labs (SDLs) promise faster materials discovery by coupling automation with machine learning, but a central challenge is predicting costly, slow-to-measure properties from inexpensive, automatable readouts. We address this for doped conjugated polymers by learning interpretable spectral fingerprints from optical spectroscopy to predict electrical conductivity. Optical spectra are fast, non-destructive, and sensitive to aggregation and charge generation; we automate their featurization by combining a genetic algorithm (GA) with area-under-the-curve (AUC) computations over adaptively selected spectral windows. These data-driven spectral features, together with processing parameters, are used to train a quantitative structure-property relationship (QSPR) linking optical response and processing to conductivity. To improve accuracy and interpretability in the small-data regime, we add domain-knowledge-based feature expansions and apply SHAP-guided selection to retain a compact, physically meaningful feature set. The pipeline is evaluated under a leak-free train/test protocol, and GA is repeated to assess feature stability. The data-driven model matches the performance of a baseline built from expert-curated descriptors while reducing experimental effort (about 33%) by limiting direct conductivity measurements. Combining data-driven and expert features yields a hybrid QSPR with superior predictive performance, highlighting productive human-ML collaboration. The learned features recover known descriptors in pBTTT (0-0/0-1 vibronic intensity ratio) and reveal a tail-state region correlated with polymer bleaching during successful doping. This approach delivers interpretable, noise-robust, small-data-friendly features that convert rapid measurements into reliable predictions of costly properties and readily extends to other spectral modalities (e.g., XANES, Raman, FTIR).
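The AUC featurization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `window_auc` and `auc_features`, the toy flat spectrum, and the two example windows are all hypothetical, and the windows are fixed by hand here rather than selected by a genetic algorithm. The sketch only shows how a spectrum could be reduced to one trapezoidal area per spectral window, assuming wavelengths are sorted in ascending order.

```python
import numpy as np


def window_auc(wavelengths, absorbance, lo, hi):
    """Trapezoidal area under the spectrum for lo <= wavelength <= hi.

    Assumes `wavelengths` is sorted in ascending order (hypothetical helper).
    """
    wavelengths = np.asarray(wavelengths, dtype=float)
    absorbance = np.asarray(absorbance, dtype=float)
    mask = (wavelengths >= lo) & (wavelengths <= hi)
    x, y = wavelengths[mask], absorbance[mask]
    if x.size < 2:
        # Fewer than two points in the window: no area to integrate.
        return 0.0
    # Explicit trapezoid rule: sum of mean heights times interval widths.
    return float(0.5 * np.sum((y[1:] + y[:-1]) * np.diff(x)))


def auc_features(wavelengths, absorbance, windows):
    """Return one AUC feature per (lo, hi) window."""
    return np.array(
        [window_auc(wavelengths, absorbance, lo, hi) for lo, hi in windows]
    )


# Toy data (illustrative only): a flat spectrum sampled every 1 nm.
wl = np.arange(400.0, 801.0)          # 400 nm to 800 nm inclusive
spectrum = np.ones_like(wl)           # absorbance of 1.0 everywhere

# Hand-picked windows standing in for the ones a GA would propose.
windows = [(500.0, 600.0), (650.0, 700.0)]

features = auc_features(wl, spectrum, windows)
print(features)
```

In a pipeline like the one the abstract outlines, a search procedure such as a GA would propose the window boundaries, and the resulting feature vector would be combined with processing parameters before model fitting; those steps are outside this sketch.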

Related papers

From Static Spectra to Operando Infrared Dynamics: Physics Informed Flow Modeling and a Benchmark [67.29937933325849]
Operando IR Prediction aims to forecast the time-resolved evolution of "spectral fingerprints" from a single static spectrum. OpIRSpec-7K comprises 7,118 high-quality samples across 10 distinct battery systems. ABCC significantly outperforms state-of-the-art static, sequential, and generative baselines.
arXiv Detail & Related papers (2026-02-20T18:58:43Z)
Spectral Gating Networks [65.9496901693099]
We introduce Spectral Gating Networks (SGN) to bring frequency-rich expressivity to feed-forward networks. SGN augments a standard activation pathway with a compact spectral pathway and learnable gates that allow the model to start from a stable base behavior. It consistently improves accuracy-efficiency trade-offs under comparable computational budgets.
arXiv Detail & Related papers (2026-02-07T20:00:49Z)
SE-MLP Model for Predicting Prior Acceleration Features in Penetration Signals [21.0646467947979]
This paper proposes a multi-layer perceptron architecture, termed squeeze and excitation multi-layer perceptron (SE-MLP). It integrates channel attention with residual connections to enable rapid prediction of acceleration feature values. Numerical simulations and range recovery tests show that the discrepancies between predicted and measured acceleration peaks and pulse widths remain within acceptable engineering tolerances.
arXiv Detail & Related papers (2025-12-29T01:18:08Z)
DiSE: A diffusion probabilistic model for automatic structure elucidation of organic compounds [17.43184484460819]
DiSE is an end-to-end diffusion-based generative model that integrates multiple spectroscopic modalities. It achieves superior accuracy, strong generalization across chemically diverse datasets, and robustness to experimental data despite being trained on calculated spectra.
arXiv Detail & Related papers (2025-10-30T08:10:03Z)
OASIS: A Deep Learning Framework for Universal Spectroscopic Analysis Driven by Novel Loss Functions [4.0097349146966925]
We introduce a machine learning (ML) framework for technique-independent, automated spectral analysis. OASIS achieves its versatility through models trained on a strategically designed synthetic dataset. This study underscores the optimization of the loss function as a key resource-efficient strategy to develop high-performance ML models.
arXiv Detail & Related papers (2025-09-15T01:28:51Z)
LUMIR: an LLM-Driven Unified Agent Framework for Multi-task Infrared Spectroscopy Reasoning [12.138903544219724]
This study introduces LUMIR, a framework designed to achieve accurate infrared spectral analysis under low-data conditions. LUMIR integrates a structured literature knowledge base, automated preprocessing, feature extraction, and predictive modeling into a unified pipeline. It was validated on diverse datasets, including the publicly available Milk near-infrared dataset, Chinese medicinal herbs, Citri Reticulatae Pericarpium (CRP) with different storage durations, an industrial wastewater COD dataset, Tecator, and Corn.
arXiv Detail & Related papers (2025-07-29T03:20:51Z)
A Hybrid Artificial Intelligence Method for Estimating Flicker in Power Systems [42.76841620787673]
This paper introduces a novel hybrid AI method combining H∞ filtering and an adaptive linear neuron (ADALINE) network for flicker component estimation in power distribution systems. The proposed method leverages the robustness of the H∞ filter to extract the voltage envelope under uncertain and noisy conditions, followed by the use of ADALINE to accurately identify flicker frequencies embedded in the envelope.
arXiv Detail & Related papers (2025-06-16T15:38:39Z)
Rapid analysis of point-contact Andreev reflection spectra via machine learning with adaptive data augmentation [14.94657556857823]
Point-contact Andreev reflection (PCAR) measurement is a powerful tool for identifying the order parameters. In this study, we employ a convolutional neural network (CNN) algorithm to create models for rapid and automated analysis of PCAR spectra of various superconductors.
arXiv Detail & Related papers (2025-03-13T04:45:38Z)
Adaptive Clustering for Efficient Phenotype Segmentation of UAV Hyperspectral Data [1.6135226672466307]
Unmanned Aerial Vehicles (UAVs) combined with hyperspectral imaging (HSI) offer potential for environmental and agricultural applications. This paper introduces an Online Hyperspectral Simple Linear Iterative Clustering algorithm (OHSLIC) framework for real-time tree phenotype segmentation.
arXiv Detail & Related papers (2025-01-17T13:48:04Z)
Semiparametric inference for impulse response functions using double/debiased machine learning [49.1574468325115]
We introduce a machine learning estimator for the impulse response function (IRF) in settings where a time series of interest is subjected to multiple discrete treatments. The proposed estimator can rely on fully nonparametric relations between treatment and outcome variables, opening up the possibility to use flexible machine learning approaches to estimate IRFs.
arXiv Detail & Related papers (2024-11-15T07:42:02Z)
Holistic Physics Solver: Learning PDEs in a Unified Spectral-Physical Space [54.13671100638092]
Holistic Physics Mixer (HPM) is a framework for integrating spectral and physical information in a unified space. We show that HPM consistently outperforms state-of-the-art methods in both accuracy and computational efficiency.
arXiv Detail & Related papers (2024-10-15T08:19:39Z)
Learning Radio Environments by Differentiable Ray Tracing [56.40113938833999]
We introduce a novel gradient-based calibration method, complemented by differentiable parametrizations of material properties, scattering and antenna patterns. We have validated our method using both synthetic data and real-world indoor channel measurements, employing a distributed multiple-input multiple-output (MIMO) channel sounder.
arXiv Detail & Related papers (2023-11-30T13:50:21Z)
Closing the loop: Autonomous experiments enabled by machine-learning-based online data analysis in synchrotron beamline environments [80.49514665620008]
Machine learning can be used to enhance research involving large or rapidly generated datasets. In this study, we describe the incorporation of ML into a closed-loop workflow for X-ray reflectometry (XRR). We present solutions that provide an elementary data analysis in real time during the experiment without introducing additional software dependencies in the beamline control software environment.
arXiv Detail & Related papers (2023-06-20T21:21:19Z)
Gaussian Process Regression for Absorption Spectra Analysis of Molecular Dimers [68.8204255655161]
We discuss an approach based on a machine learning technique, where the parameters for the numerical calculations are chosen from Gaussian Process Regression (GPR). This approach not only converges quickly to an optimal parameter set but also provides information about the complete parameter space. We find that the GPR gives reliable results that agree with direct calculations of these parameters using quantum chemical methods.
arXiv Detail & Related papers (2021-12-14T17:46:45Z)
