Label-Guided Imputation via Forest-Based Proximities for Improved Time Series Classification
- URL: http://arxiv.org/abs/2509.22919v1
- Date: Fri, 26 Sep 2025 20:47:00 GMT
- Title: Label-Guided Imputation via Forest-Based Proximities for Improved Time Series Classification
- Authors: Jake S. Rhodes, Adam G. Rustad, Sofia Pelagalli Maia, Evan Thacker, Hyunmi Choi, Jose Gutierrez, Tatjana Rundek, Ben Shaw
- Abstract summary: We provide a framework for missing data imputation in the context of time series classification. We define a means of imputing missing values conditional upon labels. We show that imputation using this method generally provides richer information, leading to higher classification accuracies.
- Score: 1.6863755729554886
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Missing data is a common problem in time series data. Most methods for imputation ignore label information pertaining to the time series even if that information exists. In this paper, we provide a framework for missing data imputation in the context of time series classification, where each time series is associated with a categorical label. We define a means of imputing missing values conditional upon labels, the method being guided by powerful, existing supervised models designed for high accuracy in this task. From each model, we extract a tree-based proximity measure from which imputation can be applied. We show that imputation using this method generally provides richer information leading to higher classification accuracies, despite the imputed values differing from the true values.
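The abstract describes extracting a tree-based proximity measure from a supervised forest and imputing from it. A minimal sketch of that idea, not the authors' implementation: train a random forest on the labels so the proximities are label-guided, define proximity as the fraction of trees in which two samples share a leaf, and impute each missing entry as a proximity-weighted average of observed values. The function names and the initial mean-fill step are my assumptions.

```python
# Sketch of label-guided, forest-proximity imputation (illustrative only).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def forest_proximities(forest, X):
    """prox[i, j] = fraction of trees in which samples i and j share a leaf."""
    leaves = forest.apply(X)                      # shape: (n_samples, n_trees)
    n = X.shape[0]
    prox = np.zeros((n, n))
    for t in range(leaves.shape[1]):
        prox += leaves[:, t][:, None] == leaves[None, :, t]
    return prox / leaves.shape[1]

def impute_label_guided(X, y, mask, n_trees=100, seed=0):
    """Impute entries of X flagged True in `mask`, guided by a forest trained on (X, y)."""
    X_filled = X.copy()
    # Mean-fill missing entries so the forest can be fit at all (assumption).
    col_means = np.nanmean(np.where(mask, np.nan, X), axis=0)
    X_filled[mask] = np.take(col_means, np.where(mask)[1])
    # Training on the labels is what makes the proximities label-guided.
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X_filled, y)
    prox = forest_proximities(forest, X_filled)
    np.fill_diagonal(prox, 0.0)
    # Replace each missing value by a proximity-weighted average of observed values.
    for i, j in zip(*np.where(mask)):
        observed = ~mask[:, j]
        w = prox[i, observed]
        if w.sum() > 0:
            X_filled[i, j] = np.average(X[observed, j], weights=w)
    return X_filled
```

Because the forest is supervised, samples with the same label tend to co-occur in leaves, so imputed values are drawn chiefly from same-class neighbors, which is the sense in which the imputation is conditional upon labels.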
Related papers
- Should We Always Train Models on Fine-Grained Classes? [0.0]
We show that training on fine-grained labels does not universally improve classification accuracy. The effectiveness of this strategy depends critically on the geometric structure of the data and its relation to the label hierarchy.
arXiv Detail & Related papers (2025-09-05T14:15:46Z) - An End-to-End Model for Time Series Classification In the Presence of Missing Values [25.129396459385873]
Time series classification with missing data is a prevalent issue in time series analysis.
This study proposes an end-to-end neural network that unifies data imputation and representation learning within a single framework.
arXiv Detail & Related papers (2024-08-11T19:39:12Z) - Probabilistic Imputation for Time-series Classification with Missing Data [17.956329906475084]
We propose a novel framework for classification with time series data with missing values.
Our deep generative model part is trained to impute the missing values in multiple plausible ways.
The classifier part takes the time series data along with the imputed missing values and classifies signals.
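The impute-in-multiple-plausible-ways idea above can be illustrated generically (this is not that paper's deep generative model): draw several random imputations of the missing entries, classify each completed series, and average the predicted class probabilities. The toy data, the logistic-regression stand-in classifier, and the Gaussian sampler are all my assumptions.

```python
# Generic multiple-imputation classification sketch (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy data: 100 series of length 8; label = sign of the series mean.
X = rng.normal(size=(100, 8))
y = (X.mean(axis=1) > 0).astype(int)
mask = rng.random(X.shape) < 0.2            # 20% of entries treated as missing

clf = LogisticRegression().fit(X, y)        # stand-in for the classifier part

def predict_multiple_imputation(x, miss, n_draws=20):
    """Average class probabilities over n_draws random imputations of x[miss]."""
    probs = np.zeros(2)
    for _ in range(n_draws):
        x_imp = x.copy()
        x_imp[miss] = rng.normal(size=miss.sum())  # stand-in generative sampler
        probs += clf.predict_proba(x_imp[None, :])[0]
    return probs / n_draws

p = predict_multiple_imputation(X[0], mask[0])
```

Averaging over draws propagates imputation uncertainty into the prediction, so a series whose missing entries could plausibly belong to either class receives a correspondingly less confident probability.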
arXiv Detail & Related papers (2023-08-13T10:04:13Z) - Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and Uncurated Unlabeled Data [70.25049762295193]
We introduce a novel conditional image generation framework that accepts noisy-labeled and uncurated data during training.
We propose soft curriculum learning, which assigns instance-wise weights for adversarial training while assigning new labels for unlabeled data.
Our experiments show that our approach outperforms existing semi-supervised and label-noise robust methods in terms of both quantitative and qualitative performance.
arXiv Detail & Related papers (2023-07-17T08:31:59Z) - Automated Labeling of German Chest X-Ray Radiology Reports using Deep
Learning [50.591267188664666]
We propose a deep learning-based CheXpert label prediction model, pre-trained on reports labeled by a rule-based German CheXpert model.
Our results demonstrate the effectiveness of our approach, which significantly outperformed the rule-based model on all three tasks.
arXiv Detail & Related papers (2023-06-09T16:08:35Z) - Robust Explainer Recommendation for Time Series Classification [4.817429789586127]
Time series classification is a task common in domains such as human activity recognition, sports analytics and general sensing.
Recently, a great variety of techniques have been proposed and adapted for time series to provide explanation in the form of saliency maps.
This paper provides a novel framework to quantitatively evaluate and rank explanation methods for time series classification.
arXiv Detail & Related papers (2023-06-08T18:49:23Z) - Multi-task Meta Label Correction for Time Series Prediction [10.08574256346388]
We develop a label correction method for time series data using meta-learning within a multi-task framework.
Results show that our method is more effective and accurate than existing label correction techniques.
arXiv Detail & Related papers (2023-03-09T08:20:17Z) - Association Graph Learning for Multi-Task Classification with Category Shifts [68.58829338426712]
We focus on multi-task classification, where related classification tasks share the same label space and are learned simultaneously.
We learn an association graph to transfer knowledge among tasks for missing classes.
Our method consistently performs better than representative baselines.
arXiv Detail & Related papers (2022-10-10T12:37:41Z) - Semi-unsupervised Learning for Time Series Classification [1.8811803364757567]
Time series are ubiquitous and inherently hard to analyze and ultimately to label or cluster.
We present SuSL4TS, a deep generative Gaussian mixture model for semi-unsupervised learning to classify time series data.
arXiv Detail & Related papers (2022-07-07T06:59:38Z) - On Leveraging Unlabeled Data for Concurrent Positive-Unlabeled Classification and Robust Generation [72.062661402124]
We present a novel training framework to jointly target PU classification and conditional generation when exposed to extra data. We prove the optimal condition of CNI-CGAN and conduct extensive experimental evaluations on diverse datasets.
arXiv Detail & Related papers (2020-06-14T08:27:40Z) - Structured Prediction with Partial Labelling through the Infimum Loss [85.4940853372503]
The goal of weak supervision is to enable models to learn using only forms of labelling which are cheaper to collect.
This is a type of incomplete annotation where, for each datapoint, supervision is cast as a set of labels containing the real one.
This paper provides a unified framework based on structured prediction and on the concept of infimum loss to deal with partial labelling.
arXiv Detail & Related papers (2020-03-02T13:59:41Z) - Automatically Discovering and Learning New Visual Categories with Ranking Statistics [145.89790963544314]
We tackle the problem of discovering novel classes in an image collection given labelled examples of other classes.
We learn a general-purpose clustering model and use the latter to identify the new classes in the unlabelled data.
We evaluate our approach on standard classification benchmarks and outperform current methods for novel category discovery by a significant margin.
arXiv Detail & Related papers (2020-02-13T18:53:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.