Score Matching With Missing Data
- URL: http://arxiv.org/abs/2506.00557v1
- Date: Sat, 31 May 2025 13:26:51 GMT
- Title: Score Matching With Missing Data
- Authors: Josh Givens, Song Liu, Henry W J Reeve
- Abstract summary: We adapt score matching to work with missing data in a flexible setting. We provide two separate score matching variations for general use: an importance weighting (IW) approach and a variational approach. We show our variational approach to be strongest in more complex high-dimensional settings.
- Score: 7.9731667982734455
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Score matching is a vital tool for learning the distribution of data with applications across many areas including diffusion processes, energy based modelling, and graphical model estimation. Despite all these applications, little work explores its use when data is incomplete. We address this by adapting score matching (and its major extensions) to work with missing data in a flexible setting where data can be partially missing over any subset of the coordinates. We provide two separate score matching variations for general use, an importance weighting (IW) approach, and a variational approach. We provide finite sample bounds for our IW approach in finite domain settings and show it to have especially strong performance in small sample lower dimensional cases. Complementing this, we show our variational approach to be strongest in more complex high-dimensional settings which we demonstrate on graphical model estimation tasks on both real and simulated data.
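For orientation, here is a minimal sketch of standard score matching on fully observed data, the starting point the abstract says the paper adapts. It is not the paper's importance-weighted or variational method. The 1-D Gaussian model, the function name `score_matching_loss`, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def score_matching_loss(x, mu, sigma2):
    """Empirical Hyvarinen score matching objective for a 1-D Gaussian model.

    Illustrative only: assumes complete data and a Gaussian model with
    mean `mu` and variance `sigma2` (hypothetical names).
    """
    score = -(x - mu) / sigma2   # d/dx log p(x): the model score
    dscore = -1.0 / sigma2       # d^2/dx^2 log p(x), constant in x
    return np.mean(0.5 * score**2 + dscore)

# Synthetic, fully observed sample (an assumption for the demo).
rng = np.random.default_rng(0)
x = rng.normal(loc=1.5, scale=2.0, size=5000)

# For this model the objective is minimized in closed form by the
# sample mean and the (biased) sample variance.
mu_hat, sigma2_hat = x.mean(), x.var()
best = score_matching_loss(x, mu_hat, sigma2_hat)
```

The objective depends on the model only through its score, so the normalizing constant never has to be computed. This is why score matching suits energy-based and graphical models.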
Related papers
- Regression Augmentation With Data-Driven Segmentation [0.0]
Imbalanced regression arises when the target distribution is skewed, causing models to focus on dense regions and struggle with underrepresented (minority) samples. We propose a fully data-driven GAN-based augmentation framework that uses Mahalanobis-Gaussian Mixture Modeling (GMM) to automatically identify minority samples.
arXiv Detail & Related papers (2025-08-02T18:12:11Z)
- SMaRt: Improving GANs with Score Matching Regularity [94.81046452865583]
Generative adversarial networks (GANs) usually struggle in learning from highly diverse data, whose underlying manifold is complex.
We show that score matching serves as a promising solution to this issue thanks to its capability of persistently pushing the generated data points towards the real data manifold.
We propose to improve the optimization of GANs with score matching regularity (SMaRt).
arXiv Detail & Related papers (2023-11-30T03:05:14Z)
- Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z)
- Ensemble Modeling for Multimodal Visual Action Recognition [50.38638300332429]
We propose an ensemble modeling approach for multimodal action recognition.
We independently train individual modality models using a variant of focal loss tailored to handle the long-tailed distribution of the MECCANO [21] dataset.
arXiv Detail & Related papers (2023-08-10T08:43:20Z)
- SALUDA: Surface-based Automotive Lidar Unsupervised Domain Adaptation [62.889835139583965]
We introduce an unsupervised auxiliary task of learning an implicit underlying surface representation simultaneously on source and target data.
As both domains share the same latent representation, the model is forced to accommodate discrepancies between the two sources of data.
Our experiments demonstrate that our method achieves a better performance than the current state of the art, both in real-to-real and synthetic-to-real scenarios.
arXiv Detail & Related papers (2023-04-06T17:36:23Z)
- ScoreMix: A Scalable Augmentation Strategy for Training GANs with Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available.
We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-10-27T02:55:15Z)
- CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
- Con$^{2}$DA: Simplifying Semi-supervised Domain Adaptation by Learning Consistent and Contrastive Feature Representations [1.2891210250935146]
Con$^{2}$DA is a framework that extends recent advances in semi-supervised learning to the semi-supervised domain adaptation problem.
Our framework generates pairs of associated samples by performing data transformations to a given input.
We use different loss functions to enforce consistency between the feature representations of associated data pairs of samples.
arXiv Detail & Related papers (2022-04-04T15:05:45Z)
- Reconstruction of Incomplete Wildfire Data using Deep Generative Models [0.0]
We present a variant of the powerful variational autoencoder models dubbed the Conditional Missing data Importance-Weighted Autoencoder (CMIWAE).
Our deep latent variable generative model requires little to no feature engineering and does not necessarily rely on the specifics of scoring in the Data Challenge.
arXiv Detail & Related papers (2022-01-16T23:27:31Z)
- Distributed Learning via Filtered Hyperinterpolation on Manifolds [2.2046162792653017]
This paper studies the problem of learning real-valued functions on manifolds.
Motivated by the problem of handling large data sets, it presents a parallel data processing approach.
We prove quantitative relations for the approximation quality of the learned function over the entire manifold.
arXiv Detail & Related papers (2020-07-18T10:05:18Z)