Related papers: Multivariate Functional Linear Discriminant Analysis for the Classification of Short Time Series with Missing Data

Multivariate Functional Linear Discriminant Analysis for the Classification of Short Time Series with Missing Data

URL: http://arxiv.org/abs/2402.13103v1
Date: Tue, 20 Feb 2024 15:58:45 GMT
Title: Multivariate Functional Linear Discriminant Analysis for the Classification of Short Time Series with Missing Data
Authors: Rahul Bordoloi, Cl\'emence R\'eda, Orell Trautmann, Saptarshi Bej and Olaf Wolkenhauer
Abstract summary: Functional linear discriminant analysis (FLDA) is a powerful tool that extends LDA-mediated multiclass classification. MUDRA allows interpretable classification of data sets with large proportions of missing data.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Functional linear discriminant analysis (FLDA) is a powerful tool that extends LDA-mediated multiclass classification and dimension reduction to univariate time-series functions. However, in the age of large multivariate and incomplete data, statistical dependencies between features must be estimated in a computationally tractable way, while also dealing with missing data. There is a need for a computationally tractable approach that considers the statistical dependencies between features and can handle missing values. We here develop a multivariate version of FLDA (MUDRA) to tackle this issue and describe an efficient expectation/conditional-maximization (ECM) algorithm to infer its parameters. We assess its predictive power on the "Articulary Word Recognition" data set and show its improvement over the state-of-the-art, especially in the case of missing data. MUDRA allows interpretable classification of data sets with large proportions of missing data, which will be particularly useful for medical or psychological data sets.

Related papers

CALF: A Conditionally Adaptive Loss Function to Mitigate Class-Imbalanced Segmentation [0.2902243522110345]
Imbalanced datasets pose a challenge in training deep learning (DL) models for medical diagnostics. We propose a novel, statistically driven, conditionally adaptive loss function (CALF) tailored to accommodate the conditions of imbalanced datasets in DL training.
arXiv Detail & Related papers (2025-04-06T12:03:33Z)
DUPRE: Data Utility Prediction for Efficient Data Valuation [49.60564885180563]
Cooperative game theory-based data valuation, such as Data Shapley, requires evaluating the data utility and retraining the ML model for multiple data subsets. Our framework, textttDUPRE, takes an alternative yet complementary approach that reduces the cost per subset evaluation by predicting data utilities instead of evaluating them by model retraining. Specifically, given the evaluated data utilities of some data subsets, textttDUPRE fits a emphGaussian process (GP) regression model to predict the utility of every other data subset.
arXiv Detail & Related papers (2025-02-22T08:53:39Z)
Directly Handling Missing Data in Linear Discriminant Analysis for Enhancing Classification Accuracy and Interpretability [1.4840867281815378]
We introduce a novel and robust classification method, termed weighted missing Linear Discriminant Analysis (WLDA) WLDA extends Linear Discriminant Analysis (LDA) to handle datasets with missing values without the need for imputation. We conduct an in-depth theoretical analysis to establish the properties of WLDA and thoroughly evaluate its explainability.
arXiv Detail & Related papers (2024-06-30T14:21:32Z)
Analysis and Optimization of Wireless Federated Learning with Data Heterogeneity [72.85248553787538]
This paper focuses on performance analysis and optimization for wireless FL, considering data heterogeneity, combined with wireless resource allocation. We formulate the loss function minimization problem, under constraints on long-term energy consumption and latency, and jointly optimize client scheduling, resource allocation, and the number of local training epochs (CRE) Experiments on real-world datasets demonstrate that the proposed algorithm outperforms other benchmarks in terms of the learning accuracy and energy consumption.
arXiv Detail & Related papers (2023-08-04T04:18:01Z)
Conditional expectation with regularization for missing data imputation [19.254291863337347]
Missing data frequently occurs in datasets across various domains, such as medicine, sports, and finance. We propose a new algorithm named "conditional Distribution-based Imputation of Missing Values with Regularization" (DIMV) DIMV operates by determining the conditional distribution of a feature that has missing entries, using the information from the fully observed features as a basis.
arXiv Detail & Related papers (2023-02-02T06:59:15Z)
Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm. The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources. It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
A Variational Autoencoder for Heterogeneous Temporal and Longitudinal Data [0.3749861135832073]
Recently proposed extensions to VAEs that can handle temporal and longitudinal data have applications in healthcare, behavioural modelling, and predictive maintenance. We propose the heterogeneous longitudinal VAE (HL-VAE) that extends the existing temporal and longitudinal VAEs to heterogeneous data. HL-VAE provides efficient inference for high-dimensional datasets and includes likelihood models for continuous, count, categorical, and ordinal data.
arXiv Detail & Related papers (2022-04-20T10:18:39Z)
TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance. We propose a versatile method that estimates joint distributions using an attention-based decoder. We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z)
RIFLE: Imputation and Robust Inference from Low Order Marginals [10.082738539201804]
We develop a statistical inference framework for regression and classification in the presence of missing data without imputation. Our framework, RIFLE, estimates low-order moments of the underlying data distribution with corresponding confidence intervals to learn a distributionally robust model. Our experiments demonstrate that RIFLE outperforms other benchmark algorithms when the percentage of missing values is high and/or when the number of data points is relatively small.
arXiv Detail & Related papers (2021-09-01T23:17:30Z)
Efficient Multidimensional Functional Data Analysis Using Marginal Product Basis Systems [2.4554686192257424]
We propose a framework for learning continuous representations from a sample of multidimensional functional data. We show that the resulting estimation problem can be solved efficiently by the tensor decomposition. We conclude with a real data application in neuroimaging.
arXiv Detail & Related papers (2021-07-30T16:02:15Z)
Multi-Source Causal Inference Using Control Variates [81.57072928775509]
We propose a general algorithm to estimate causal effects from emphmultiple data sources. We show theoretically that this reduces the variance of the ATE estimate. We apply this framework to inference from observational data under an outcome selection bias.
arXiv Detail & Related papers (2021-03-30T21:20:51Z)
Learning summary features of time series for likelihood free inference [93.08098361687722]
We present a data-driven strategy for automatically learning summary features from time series data. Our results indicate that learning summary features from data can compete and even outperform LFI methods based on hand-crafted values.
arXiv Detail & Related papers (2020-12-04T19:21:37Z)
Evaluating representations by the complexity of learning low-loss predictors [55.94170724668857]
We consider the problem of evaluating representations of data for use in solving a downstream task. We propose to measure the quality of a representation by the complexity of learning a predictor on top of the representation that achieves low loss on a task of interest.
arXiv Detail & Related papers (2020-09-15T22:06:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.