Reconstruction of Incomplete Wildfire Data using Deep Generative Models
- URL: http://arxiv.org/abs/2201.06153v1
- Date: Sun, 16 Jan 2022 23:27:31 GMT
- Title: Reconstruction of Incomplete Wildfire Data using Deep Generative Models
- Authors: Tomislav Ivek and Domagoj Vlah
- Abstract summary: We present a variant of the powerful variational autoencoder models dubbed the Conditional Missing data Importance-Weighted Autoencoder (CMIWAE).
Our deep latent variable generative model requires little to no feature engineering and does not necessarily rely on the specifics of scoring in the Data Challenge.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present our submission to the Extreme Value Analysis 2021 Data Challenge
in which teams were asked to accurately predict distributions of wildfire
frequency and size within spatio-temporal regions of missing data. For the
purpose of this competition we developed a variant of the powerful variational
autoencoder models dubbed the Conditional Missing data Importance-Weighted
Autoencoder (CMIWAE). Our deep latent variable generative model requires little
to no feature engineering and does not necessarily rely on the specifics of
scoring in the Data Challenge. It is fully trained on incomplete data, with the
single objective to maximize log-likelihood of the observed wildfire
information. We mitigate the effects of the relatively low number of training
samples by stochastic sampling from a variational latent variable distribution,
as well as by ensembling a set of CMIWAE models trained and validated on
different splits of the provided data. The presented approach is not
domain-specific and is amenable to application in other missing data recovery
tasks with tabular or image-like information conditioned on auxiliary
information.
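The core training idea above, an importance-weighted likelihood bound evaluated only over the observed entries, can be sketched in plain NumPy. This is a toy stand-in, not the authors' implementation: `encode` and `decode` are placeholder callables for the actual conditional encoder and decoder networks, and Gaussian likelihoods are assumed throughout.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_logpdf(x, mean, var):
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def miwae_bound(x, mask, encode, decode, K=50):
    """Importance-weighted lower bound on log p(x_obs).

    x: (D,) data vector; mask: (D,) with 1 where observed, 0 where missing.
    encode(x, mask) -> (mu, var) of the variational posterior q(z | x_obs).
    decode(z) -> (mean, var) of the likelihood p(x | z), each (K, D).
    """
    mu, var = encode(x, mask)
    z = mu + np.sqrt(var) * rng.standard_normal((K, mu.shape[0]))  # z_k ~ q
    log_q = gaussian_logpdf(z, mu, var).sum(axis=1)       # log q(z_k | x_obs)
    log_prior = gaussian_logpdf(z, 0.0, 1.0).sum(axis=1)  # log p(z_k), N(0, I)
    xm, xv = decode(z)
    # Likelihood over observed entries only: this is what lets the model
    # train directly on incomplete data.
    log_lik = (gaussian_logpdf(x, xm, xv) * mask).sum(axis=1)
    log_w = log_lik + log_prior - log_q
    # Numerically stable log-mean-exp over the K importance samples.
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))

# Toy usage with linear placeholder networks (hypothetical, for shape checking).
D, Z = 4, 2
W = rng.standard_normal((Z, D))
enc = lambda x, m: (np.zeros(Z), np.ones(Z))
dec = lambda z: (z @ W, np.full((z.shape[0], D), 0.5))
x = rng.standard_normal(D)
mask = np.array([1.0, 1.0, 0.0, 1.0])  # third feature missing
print(miwae_bound(x, mask, enc, dec))
```

Maximizing this bound over encoder and decoder parameters corresponds to the paper's single objective of maximizing the log-likelihood of observed wildfire information; the stochastic sampling of `z` is the same mechanism the authors exploit to mitigate the small training set.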
Related papers
- Distributional Training Data Attribution [20.18145179467698]
We introduce distributional training data attribution (d-TDA) to predict how the distribution of model outputs depends upon the dataset. We identify training examples that drastically change the distribution of some target measurement without necessarily changing the mean. We also find that influence functions (IFs) emerge naturally from our distributional framework as the limit of unrolled differentiation.
arXiv Detail & Related papers (2025-06-15T21:02:36Z) - Score Matching With Missing Data [7.9731667982734455]
We adapt score matching to work with missing data in a flexible setting. We provide two separate score matching variations for general use: an importance weighting (IW) approach and a variational approach. We show our variational approach to be strongest in more complex, high-dimensional settings.
arXiv Detail & Related papers (2025-05-31T13:26:51Z) - Learning Augmentation Policies from A Model Zoo for Time Series Forecasting [58.66211334969299]
We introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning.
By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance.
arXiv Detail & Related papers (2024-09-10T07:34:19Z) - Few-shot Online Anomaly Detection and Segmentation [29.693357653538474]
This paper focuses on addressing the challenging yet practical few-shot online anomaly detection and segmentation (FOADS) task.
Under the FOADS framework, models are trained on a few-shot normal dataset, and their capabilities are then inspected and improved by leveraging unlabeled streaming data containing both normal and abnormal samples.
In order to achieve improved performance with limited training samples, we employ multi-scale feature embedding extracted from a CNN pre-trained on ImageNet to obtain a robust representation.
arXiv Detail & Related papers (2024-03-27T02:24:00Z) - AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning [98.26836657967162]
AgentOhana aggregates agent trajectories from distinct environments, spanning a wide array of scenarios.
xLAM-v0.1, a large action model tailored for AI agents, demonstrates exceptional performance across various benchmarks.
arXiv Detail & Related papers (2024-02-23T18:56:26Z) - Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z) - Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z) - ProtoVAE: Prototypical Networks for Unsupervised Disentanglement [1.6114012813668934]
We introduce a novel deep generative VAE-based model, ProtoVAE, that leverages a deep metric learning Prototypical network trained using self-supervision.
Our model is completely unsupervised and requires no a priori knowledge of the dataset, including the number of factors.
We evaluate our proposed model on the benchmark dSprites, 3DShapes, and MPI3D disentanglement datasets.
arXiv Detail & Related papers (2023-05-16T01:29:26Z) - Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z) - Leveraging variational autoencoders for multiple data imputation [0.5156484100374059]
We investigate the ability of deep models, namely variational autoencoders (VAEs), to account for uncertainty in missing data through multiple imputation strategies.
We find that VAEs provide poor empirical coverage of missing data, with underestimated uncertainty and overconfident imputations.
To overcome this, we employ β-VAEs which, viewed from a generalized Bayes framework, provide robustness to model misspecification.
arXiv Detail & Related papers (2022-09-30T08:58:43Z) - Neural-Sim: Learning to Generate Training Data with NeRF [31.81496344354997]
We present the first fully differentiable synthetic data pipeline that uses Neural Radiance Fields (NeRFs) in a closed-loop with a target application's loss function.
Our approach generates data on-demand, with no human labor, to maximize accuracy for a target task.
arXiv Detail & Related papers (2022-07-22T22:48:33Z) - Data-SUITE: Data-centric identification of in-distribution incongruous
examples [81.21462458089142]
Data-SUITE is a data-centric framework to identify incongruous regions of in-distribution (ID) data.
We empirically validate Data-SUITE's performance and coverage guarantees.
arXiv Detail & Related papers (2022-02-17T18:58:31Z) - Disentangled Recurrent Wasserstein Autoencoder [17.769077848342334]
The Recurrent Wasserstein Autoencoder (R-WAE) is a new framework for generative modeling of sequential data.
R-WAE disentangles the representation of an input sequence into static and dynamic factors.
Our models outperform other baselines with the same settings in terms of disentanglement and unconditional video generation.
arXiv Detail & Related papers (2021-01-19T07:43:25Z) - BlackBox: Generalizable Reconstruction of Extremal Values from
Incomplete Spatio-Temporal Data [0.0]
We present a framework to reconstruct missing data using convolutional deep neural networks.
In order to mitigate bias introduced by any one particular model, a prediction ensemble is constructed.
Our method does not rely on expert knowledge in order to accurately reproduce dynamic features of a complex oceanographic system.
arXiv Detail & Related papers (2020-04-30T21:33:46Z) - Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.
We scale training in these with a novel loss function and centroid updating scheme and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
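The single-forward-pass rejection idea in the last entry can be illustrated with a simple centroid-distance rule. This is a hypothetical toy, not the paper's actual kernel-based formulation: classify a feature vector by its nearest class centroid, and reject it as out-of-distribution when every centroid is too far away.

```python
import numpy as np

def predict_or_reject(feat, centroids, threshold):
    """Single forward pass: classify by nearest class centroid,
    rejecting as out-of-distribution when all centroids are too far."""
    d = np.linalg.norm(centroids - feat, axis=1)  # distance to each centroid
    k = int(np.argmin(d))
    return (k, False) if d[k] <= threshold else (None, True)

# Toy: two well-separated class centroids in a 2-D feature space.
centroids = np.array([[0.0, 0.0], [5.0, 5.0]])
print(predict_or_reject(np.array([0.2, -0.1]), centroids, threshold=2.0))   # accepted as class 0
print(predict_or_reject(np.array([20.0, -9.0]), centroids, threshold=2.0))  # rejected as OOD
```

A deterministic network would produce `feat`; the distance test then gives accept/reject behavior without an ensemble or multiple forward passes.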
This list is automatically generated from the titles and abstracts of the papers in this site.