HMVI: Unifying Heterogeneous Attributes with Natural Neighbors for Missing Value Inference
- URL: http://arxiv.org/abs/2601.05017v1
- Date: Thu, 08 Jan 2026 15:18:36 GMT
- Title: HMVI: Unifying Heterogeneous Attributes with Natural Neighbors for Missing Value Inference
- Authors: Xiaopeng Luo, Zexi Tan, Zhuowei Wang,
- Abstract summary: Current imputation methods handle numerical and categorical attributes independently, overlooking critical interdependencies among heterogeneous features. We propose a novel imputation approach that explicitly models cross-type feature dependencies within a unified framework.
- Score: 1.0577954299884882
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Missing value imputation is a fundamental challenge in machine intelligence, which depends heavily on data completeness. Current imputation methods often handle numerical and categorical attributes independently, overlooking critical interdependencies among heterogeneous features. To address these limitations, we propose a novel imputation approach that explicitly models cross-type feature dependencies within a unified framework. Our method leverages both complete and incomplete instances to ensure accurate and consistent imputation in tabular data. Extensive experimental results demonstrate that the proposed approach achieves superior performance over existing techniques and significantly enhances downstream machine learning tasks, providing a robust solution for real-world systems with missing data.
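To make the idea of a unified treatment of numerical and categorical attributes concrete, here is a minimal sketch of mixed-type nearest-neighbor imputation. It is not the HMVI algorithm, whose details are not given in the abstract: it assumes a Gower-style distance (range-normalized difference for numeric columns, 0/1 mismatch for categorical columns) and a fixed `k` in place of natural neighbors. The names `gower_distance` and `impute_mixed_knn` and the toy data are hypothetical. Donors may themselves be incomplete, which loosely mirrors the abstract's point about using both complete and incomplete instances.

```python
from collections import Counter


def gower_distance(a, b, is_numeric, ranges):
    """Mean per-attribute dissimilarity over attributes observed in both rows."""
    total, count = 0.0, 0
    for j, (x, y) in enumerate(zip(a, b)):
        if x is None or y is None:
            continue  # skip attributes missing in either row
        if is_numeric[j]:
            # range-normalized absolute difference, in [0, 1]
            total += abs(x - y) / ranges[j] if ranges[j] > 0 else 0.0
        else:
            # simple mismatch for categorical values
            total += 0.0 if x == y else 1.0
        count += 1
    return total / count if count else float("inf")


def impute_mixed_knn(rows, is_numeric, k=2):
    """Fill None entries from the k nearest rows that observe that attribute."""
    n_cols = len(is_numeric)
    ranges = []
    for j in range(n_cols):
        if is_numeric[j]:
            vals = [r[j] for r in rows if r[j] is not None]
            ranges.append(max(vals) - min(vals) if vals else 0.0)
        else:
            ranges.append(None)

    completed = [list(r) for r in rows]  # leave the input untouched
    for i, row in enumerate(rows):
        for j in range(n_cols):
            if row[j] is not None:
                continue
            # donors: any other row (complete or not) that observes column j
            donors = sorted(
                (gower_distance(row, other, is_numeric, ranges), idx)
                for idx, other in enumerate(rows)
                if idx != i and other[j] is not None
            )
            nearest = [rows[idx][j] for d, idx in donors[:k] if d != float("inf")]
            if not nearest:
                continue  # no comparable donor; leave the gap
            if is_numeric[j]:
                completed[i][j] = sum(nearest) / len(nearest)  # neighbor mean
            else:
                completed[i][j] = Counter(nearest).most_common(1)[0][0]  # neighbor mode
    return completed


# Toy table: columns are (numeric, categorical, numeric); None marks a missing value.
rows = [
    [1.0, "a", 10.0],
    [1.2, "a", None],
    [5.0, "b", 50.0],
    [5.1, None, 52.0],
    [0.9, "a", 11.0],
    [4.9, "b", 49.0],
]
is_numeric = [True, False, True]
completed = impute_mixed_knn(rows, is_numeric, k=2)
print(completed[1], completed[3])
```

In this sketch the categorical column helps locate neighbors for the missing numeric value and vice versa, which is the kind of cross-type dependency that per-type imputers ignore.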
Related papers
- Simple Yet Effective Selective Imputation for Incomplete Multi-view Clustering [30.6002437648244]
We propose Informativeness-based Selective imputation Multi-View Clustering (ISMVC). Our method evaluates the imputation-relevant informativeness of each missing position based on intra-view similarity and cross-view consistency. Compared with existing cautious imputation strategies that depend on training dynamics or model feedback, our method is lightweight, data-driven, and model-agnostic.
arXiv Detail & Related papers (2025-12-11T06:22:23Z) - FDRMFL: Multi-modal Federated Feature Extraction Model Based on Information Maximization and Contrastive Learning [4.453671369861554]
This study focuses on the feature extraction problem in multi-modal data regression. It addresses three core challenges in real-world scenarios: limited and non-IID data, effective extraction and fusion of multi-modal information, and susceptibility to catastrophic forgetting in model learning.
arXiv Detail & Related papers (2025-11-30T17:13:35Z) - MIBP-Cert: Certified Training against Data Perturbations with Mixed-Integer Bilinear Programs [50.41998220099097]
Data errors, corruptions, and poisoning attacks during training pose a major threat to the reliability of modern AI systems. We introduce MIBP-Cert, a novel certification method based on mixed-integer bilinear programming (MIBP). By computing the set of parameters reachable through perturbed or manipulated data, we can predict all possible outcomes and guarantee robustness.
arXiv Detail & Related papers (2024-12-13T14:56:39Z) - HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z) - MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms [82.90843777097606]
We propose a causally-aware imputation algorithm (MIRACLE) for missing data.
MIRACLE iteratively refines the imputation of a baseline by simultaneously modeling the missingness generating mechanism.
We conduct extensive experiments on synthetic and a variety of publicly available datasets to show that MIRACLE is able to consistently improve imputation.
arXiv Detail & Related papers (2021-11-04T22:38:18Z) - Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z) - Generative Partial Visual-Tactile Fused Object Clustering [81.17645983141773]
We propose a Generative Partial Visual-Tactile Fused (i.e., GPVTF) framework for object clustering.
A conditional cross-modal clustering generative adversarial network is then developed to synthesize one modality conditioning on the other modality.
To this end, two pseudo-label based KL-divergence losses are employed to update the corresponding modality-specific encoders.
arXiv Detail & Related papers (2020-12-28T02:37:03Z) - Causal Feature Selection for Algorithmic Fairness [61.767399505764736]
We consider fairness in the integration component of data management.
We propose an approach to identify a sub-collection of features that ensure the fairness of the dataset.
arXiv Detail & Related papers (2020-06-10T20:20:10Z)