Multi-Target Tobit Models for Completing Water Quality Data
- URL: http://arxiv.org/abs/2302.10648v1
- Date: Tue, 21 Feb 2023 13:06:19 GMT
- Title: Multi-Target Tobit Models for Completing Water Quality Data
- Authors: Yuya Takada and Tsuyoshi Kato
- Abstract summary: The Tobit model is a well-known linear regression model for analyzing censored data.
In this study, we devised a novel extension of the classical Tobit model, called the multi-target Tobit model, to handle multiple censored variables simultaneously.
Experiments conducted using several real-world water quality datasets provided evidence that estimating multiple columns jointly gains a great advantage over estimating them separately.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monitoring microbiological behaviors in water is crucial to manage public
health risk from waterborne pathogens, although quantifying the concentrations
of microbiological organisms in water is still challenging because
concentrations of many pathogens in water samples may often be below the
quantification limit, producing censored data. To enable statistical analysis
based on quantitative values, the true values of non-detected measurements are
required to be estimated with high precision. The Tobit model is a well-known
linear regression model for analyzing censored data. One drawback of the Tobit
model is that only the target variable is allowed to be censored. In this
study, we devised a novel extension of the classical Tobit model, called the
multi-target Tobit model, to handle multiple censored variables
simultaneously by introducing multiple target variables. For fitting the new
model, a numerically stable optimization algorithm was developed based on
elaborate theories. Experiments conducted using several real-world water
quality datasets provided evidence that estimating multiple columns jointly
gains a great advantage over estimating them separately.
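
For orientation, the classical single-target Tobit model that the abstract builds on can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' multi-target method or their optimization algorithm: it assumes left-censoring at a known quantification limit and Gaussian noise, and the names `tobit_neg_log_likelihood` and `impute_censored` are hypothetical helpers introduced here. Detected measurements contribute a normal density term, and non-detected ones contribute the probability of falling below the limit.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal CDF, written with erfc for better accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def tobit_neg_log_likelihood(xs, ys, censored, w, b, sigma):
    """Negative log-likelihood of a left-censored Tobit model (illustrative).

    xs:       list of feature vectors
    ys:       measured value, or the quantification limit when censored
    censored: list of bools, True where the measurement was a non-detect
    w, b:     linear regression weights and intercept
    sigma:    noise standard deviation (must be positive)
    """
    nll = 0.0
    for x, y, is_censored in zip(xs, ys, censored):
        mu = sum(wi * xi for wi, xi in zip(w, x)) + b
        z = (y - mu) / sigma
        if is_censored:
            # Only "true value < limit" is known: use the CDF mass below the limit.
            nll -= math.log(norm_cdf(z))
        else:
            # Fully observed value: use the normal density.
            nll -= math.log(norm_pdf(z) / sigma)
    return nll


def impute_censored(mu, sigma, limit):
    """Mean of N(mu, sigma^2) truncated to values below `limit`.

    One common way to fill in a non-detect once mu and sigma are fitted.
    """
    alpha = (limit - mu) / sigma
    return mu - sigma * norm_pdf(alpha) / norm_cdf(alpha)
```

Evaluating `math.log(norm_cdf(z))` directly can still underflow for very negative `z`, which hints at why a numerically stable fitting procedure matters for this family of models.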
Related papers
- Unlearnable Examples Detection via Iterative Filtering [84.59070204221366]
Deep neural networks are proven to be vulnerable to data poisoning attacks.
It is quite beneficial and challenging to detect poisoned samples from a mixed dataset.
We propose an Iterative Filtering approach for identifying unlearnable examples (UEs).
arXiv Detail & Related papers (2024-08-15T13:26:13Z)
- Mutual Wasserstein Discrepancy Minimization for Sequential Recommendation [82.0801585843835]
We propose a novel self-supervised learning framework based on Mutual WasserStein discrepancy minimization (MStein) for sequential recommendation.
We also propose a novel contrastive learning loss based on Wasserstein Discrepancy Measurement.
arXiv Detail & Related papers (2023-01-28T13:38:48Z)
- Short-term prediction of stream turbidity using surrogate data and a meta-model approach [0.0]
We build and compare the ability of dynamic regression (ARIMA), long short-term memory neural nets (LSTM), and generalized additive models (GAM) to forecast stream turbidity.
We construct a meta-model, trained on time-series features of turbidity, to take advantage of the strengths of each model over different time points.
Our findings indicate that temperature and light-associated variables, for example underwater illuminance, may hold promise as cost-effective surrogates of turbidity.
arXiv Detail & Related papers (2022-10-11T23:05:32Z)
- Building Robust Machine Learning Models for Small Chemical Science Data: The Case of Shear Viscosity [3.4761212729163313]
We train several Machine Learning models to predict the shear viscosity of a Lennard-Jones (LJ) fluid.
Specifically, the issues related to model selection, performance estimation and uncertainty quantification were investigated.
arXiv Detail & Related papers (2022-08-23T07:33:14Z)
- Pseudo value-based Deep Neural Networks for Multi-state Survival Analysis [9.659041001051415]
We propose a new class of pseudo-value-based deep learning models for multi-state survival analysis.
Our proposed models achieve state-of-the-art results under various censoring settings.
arXiv Detail & Related papers (2022-07-12T03:58:05Z)
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how bringing recent results on equivariant representation learning instantiated on structured spaces together with simple use of classical results on causal inference provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- Modeling High-Dimensional Data with Unknown Cut Points: A Fusion Penalized Logistic Threshold Regression [2.520538806201793]
In traditional logistic regression models, the link function is often assumed to be linear and continuous in predictors.
We consider a threshold model in which all continuous features are discretized into ordinal levels, which in turn determine the binary responses.
We find the lasso model is well suited to the problem of early detection and prediction of chronic diseases such as diabetes.
arXiv Detail & Related papers (2022-02-17T04:16:40Z)
- X-model: Improving Data Efficiency in Deep Learning with A Minimax Model [78.55482897452417]
We aim at improving data efficiency for both classification and regression setups in deep learning.
To take the power of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
arXiv Detail & Related papers (2021-10-09T13:56:48Z)
- Flexible Model Aggregation for Quantile Regression [92.63075261170302]
Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions.
We investigate methods for aggregating any number of conditional quantile models.
All of the models we consider in this paper can be fit using modern deep learning toolkits.
arXiv Detail & Related papers (2021-02-26T23:21:16Z)
- A General Framework for Survival Analysis and Multi-State Modelling [70.31153478610229]
We use neural ordinary differential equations as a flexible and general method for estimating multi-state survival models.
We show that our model exhibits state-of-the-art performance on popular survival data sets and demonstrate its efficacy in a multi-state setting.
arXiv Detail & Related papers (2020-06-08T19:24:54Z)
- Reducing complexity and unidentifiability when modelling human atrial cells [0.0]
It is critical to understand the uncertainty hidden in parameter estimates from their calibration to experimental data.
This study applies approximate Bayesian computation to re-calibrate the gating kinetics of four ion channels in two existing human atrial cell models.
arXiv Detail & Related papers (2020-01-29T16:57:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.