Controllable Missingness from Uncontrollable Missingness: Joint Learning
Measurement Policy and Imputation
- URL: http://arxiv.org/abs/2204.03872v1
- Date: Fri, 8 Apr 2022 06:51:37 GMT
- Title: Controllable Missingness from Uncontrollable Missingness: Joint Learning
Measurement Policy and Imputation
- Authors: Seongwook Yoon, Jaehyun Kim, Heejeong Lim, Sanghoon Sull
- Abstract summary: We mainly focus on retrieving complete data, i.e., imputation.
We implement several variations of the proposed algorithm for two different datasets and various missing rates.
- Score: 9.826027427965354
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Because measurement incurs cost or interference, the measurement
system needs to be controlled. Assuming that variables can be measured sequentially,
there exists an optimal policy that chooses the next measurement given the former
observations. Although the optimal measurement policy depends on the
goal of measurement, we mainly focus on retrieving complete data, i.e.,
imputation. We also adapt the imputation method to missingness that varies with
the measurement policy. However, learning the measurement policy and the imputation
requires complete data, which cannot be observed. To
tackle this problem, we propose a data generation method and a joint learning
algorithm. The main idea is that 1) the data generation method is inherited by
the imputation method, and 2) the adaptation of imputation encourages the measurement
policy to learn more than it would through individual learning. We implemented several
variations of the proposed algorithm for two different datasets and various missing rates.
The experimental results demonstrate that our algorithm is generally
applicable and outperforms baseline methods.
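The idea of measuring variables one at a time and then imputing the rest can be illustrated with a toy sketch. This is not the paper's learned policy or imputation network: it swaps in a greedy heuristic under an assumed multivariate Gaussian model with known mean `mu` and covariance `cov`, and the function names `conditional_gaussian` and `measure_then_impute` are hypothetical. Note that in this Gaussian toy the next choice depends only on which variables were already measured, not on their values; a learned policy as described in the abstract could also use the observed values.

```python
import numpy as np

def conditional_gaussian(mu, cov, obs_idx, obs_val):
    """Conditional mean/covariance of the unmeasured variables given measured ones.

    Assumes a joint Gaussian with known mu and cov (an illustrative assumption).
    """
    d = len(mu)
    mis_idx = [i for i in range(d) if i not in obs_idx]
    if not obs_idx:
        # Nothing measured yet: fall back to the prior.
        return mis_idx, mu[mis_idx].copy(), cov[np.ix_(mis_idx, mis_idx)].copy()
    s_oo = cov[np.ix_(obs_idx, obs_idx)]
    s_mo = cov[np.ix_(mis_idx, obs_idx)]
    # gain = s_mo @ inv(s_oo), computed with a linear solve (s_oo is symmetric).
    gain = np.linalg.solve(s_oo, s_mo.T).T
    cond_mu = mu[mis_idx] + gain @ (obs_val - mu[obs_idx])
    cond_cov = cov[np.ix_(mis_idx, mis_idx)] - gain @ s_mo.T
    return mis_idx, cond_mu, cond_cov

def measure_then_impute(x_true, mu, cov, budget):
    """Greedy sequential measurement followed by conditional-mean imputation.

    x_true stands in for the measurement device: reading x_true[i] is "measuring" i.
    budget should be smaller than the number of variables.
    """
    obs_idx = []
    for _ in range(budget):
        mis_idx, _, cond_cov = conditional_gaussian(mu, cov, obs_idx, x_true[obs_idx])
        # Heuristic policy: measure the variable with the largest remaining variance.
        obs_idx.append(mis_idx[int(np.argmax(np.diag(cond_cov)))])
    mis_idx, cond_mu, _ = conditional_gaussian(mu, cov, obs_idx, x_true[obs_idx])
    x_hat = np.empty_like(x_true)
    x_hat[obs_idx] = x_true[obs_idx]  # measured entries are kept as-is
    x_hat[mis_idx] = cond_mu          # unmeasured entries are imputed
    return obs_idx, x_hat

mu = np.zeros(3)
cov = np.array([[1.0, 0.9, 0.0],
                [0.9, 1.0, 0.0],
                [0.0, 0.0, 4.0]])
x_true = np.array([1.0, 5.0, -2.0])
order, x_hat = measure_then_impute(x_true, mu, cov, budget=2)
print(order, x_hat)
```

With a budget of two measurements, the sketch first measures the high-variance third variable, then one of the two correlated variables, and imputes the remaining one from its correlated partner.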
Related papers
- Improving Bias Correction Standards by Quantifying its Effects on Treatment Outcomes [54.18828236350544]
Propensity score matching (PSM) addresses selection biases by selecting comparable populations for analysis.
Different matching methods can produce significantly different Average Treatment Effects (ATE) for the same task, even when meeting all validation criteria.
To address this issue, we introduce a novel metric, A2A, to reduce the number of valid matches.
arXiv Detail & Related papers (2024-07-20T12:42:24Z) - Privacy Preserving Data Imputation via Multi-party Computation for Medical Applications [1.7999333451993955]
This study addresses privacy-preserving imputation methods for sensitive data using secure multi-party computation.
We specifically target the medical and healthcare domains considering the significance of protection of the patient data.
Experiments on the diabetes dataset validated the correctness of our privacy-preserving imputation methods, yielding the largest error around $3 \times 10^{-3}$.
arXiv Detail & Related papers (2024-05-29T08:36:42Z) - An unsupervised learning approach to evaluate questionnaire data -- what
one can learn from violations of measurement invariance [2.4762962548352467]
This paper promotes an unsupervised learning-based approach to such research data.
It works in three phases: data preparation, clustering of questionnaires, and measuring similarity based on the obtained clustering and the properties of each group.
It provides a natural comparison between groups and a natural description of the response patterns of the groups.
arXiv Detail & Related papers (2023-12-11T11:31:41Z) - Measurement-based Admission Control in Sliced Networks: A Best Arm
Identification Approach [68.8204255655161]
In sliced networks, the shared tenancy of slices requires adaptive admission control of data flows.
We devise a joint measurement and decision strategy that returns a correct decision with a certain level of confidence.
arXiv Detail & Related papers (2022-04-14T12:12:34Z) - To Impute or not to Impute? -- Missing Data in Treatment Effect
Estimation [84.76186111434818]
We identify a new missingness mechanism, which we term mixed confounded missingness (MCM), where some missingness determines treatment selection and other missingness is determined by treatment selection.
We show that naively imputing all data leads to poor performing treatment effects models, as the act of imputation effectively removes information necessary to provide unbiased estimates.
Our solution is selective imputation, where we use insights from MCM to inform precisely which variables should be imputed and which should not.
arXiv Detail & Related papers (2022-02-04T12:08:31Z) - An Adaptive Framework for Learning Unsupervised Depth Completion [59.17364202590475]
We present a method to infer a dense depth map from a color image and associated sparse depth measurements.
We show that regularization and co-visibility are related via the fitness of the model to data and can be unified into a single framework.
arXiv Detail & Related papers (2021-06-06T02:27:55Z) - A Novel Random Forest Dissimilarity Measure for Multi-View Learning [8.185807285320553]
Two methods are proposed, which modify the Random Forest proximity measure, to adapt it to the context of High Dimension Low Sample Size (HDLSS) multi-view classification problems.
The second method, based on an Instance Hardness measurement, is significantly more accurate than other state-of-the-art measurements.
arXiv Detail & Related papers (2020-07-06T07:54:52Z) - Provably Robust Metric Learning [98.50580215125142]
We show that existing metric learning algorithms can result in metrics that are less robust than the Euclidean distance.
We propose a novel metric learning algorithm to find a Mahalanobis distance that is robust against adversarial perturbations.
Experimental results show that the proposed metric learning algorithm improves both certified robust errors and empirical robust errors.
arXiv Detail & Related papers (2020-06-12T09:17:08Z) - Neural Methods for Point-wise Dependency Estimation [129.93860669802046]
We focus on estimating point-wise dependency (PD), which quantitatively measures how likely two outcomes co-occur.
We demonstrate the effectiveness of our approaches in 1) MI estimation, 2) self-supervised representation learning, and 3) cross-modal retrieval task.
arXiv Detail & Related papers (2020-06-09T23:26:15Z) - Supervised Categorical Metric Learning with Schatten p-Norms [10.995886294197412]
We propose a method, called CPML for categorical projected metric learning, to address the problem of metric learning in categorical data.
We make use of the Value Distance Metric to represent our data and propose new distances based on this representation.
We then show how to efficiently learn new metrics.
arXiv Detail & Related papers (2020-02-26T01:17:12Z)