SpaCE: The Spatial Confounding Environment
- URL: http://arxiv.org/abs/2312.00710v2
- Date: Wed, 6 Dec 2023 02:00:53 GMT
- Title: SpaCE: The Spatial Confounding Environment
- Authors: Mauricio Tec, Ana Trisovic, Michelle Audirac, Sophie Woodward, Jie
Kate Hu, Naeem Khoshnevis, Francesca Dominici
- Abstract summary: SpaCE provides realistic benchmark datasets and tools for evaluating causal inference methods.
Each dataset includes training data, true counterfactuals, a spatial graph with coordinates, and smoothness and confounding scores.
SpaCE facilitates an automated end-to-end pipeline, simplifying data loading, experimental setup, and evaluating machine learning and causal inference models.
- Score: 2.572906392867547
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spatial confounding poses a significant challenge in scientific studies
involving spatial data, where unobserved spatial variables can influence both
treatment and outcome, possibly leading to spurious associations. To address
this problem, we introduce SpaCE: The Spatial Confounding Environment, the
first toolkit to provide realistic benchmark datasets and tools for
systematically evaluating causal inference methods designed to alleviate
spatial confounding. Each dataset includes training data, true counterfactuals,
a spatial graph with coordinates, and smoothness and confounding scores
characterizing the effect of a missing spatial confounder. It also includes
realistic semi-synthetic outcomes and counterfactuals, generated using
state-of-the-art machine learning ensembles, following best practices for
causal inference benchmarks. The datasets cover real treatment and covariates
from diverse domains, including climate, health and social sciences. SpaCE
facilitates an automated end-to-end pipeline, simplifying data loading,
experimental setup, and evaluating machine learning and causal inference
models. The SpaCE project provides several dozens of datasets of diverse sizes
and spatial complexity. It is publicly available as a Python package,
encouraging community feedback and contributions.
Related papers
- Self-consistent Deep Geometric Learning for Heterogeneous Multi-source Spatial Point Data Prediction [10.646376827353551]
Multi-source spatial point data prediction is crucial in fields like environmental monitoring and natural resource management.
Existing models in this area often fall short due to their domain-specific nature and lack a strategy for integrating information from various sources.
We introduce an innovative multi-source spatial point data prediction framework that adeptly aligns information from varied sources without relying on ground truth labels.
arXiv Detail & Related papers (2024-06-30T16:13:13Z) - Implicitly Guided Design with PropEn: Match your Data to Follow the Gradient [52.2669490431145]
PropEn is inspired by'matching', which enables implicit guidance without training a discriminator.
We show that training with a matched dataset approximates the gradient of the property of interest while remaining within the data distribution.
arXiv Detail & Related papers (2024-05-28T11:30:19Z) - LiveHPS: LiDAR-based Scene-level Human Pose and Shape Estimation in Free
Environment [59.320414108383055]
We present LiveHPS, a novel single-LiDAR-based approach for scene-level human pose and shape estimation.
We propose a huge human motion dataset, named FreeMotion, which is collected in various scenarios with diverse human poses.
arXiv Detail & Related papers (2024-02-27T03:08:44Z) - SSIN: Self-Supervised Learning for Rainfall Spatial Interpolation [37.212272184144]
We propose a data-driven self-supervised learning framework for rainfall spatial analysis.
By mining latent spatial patterns from historical data, SpaFormer can learn informative embeddings for raw data and then adaptively model spatial correlations.
Our method outperforms the state-of-the-art solutions in experiments on two real-world raingauge datasets.
arXiv Detail & Related papers (2023-11-27T04:23:47Z) - SPADES: A Realistic Spacecraft Pose Estimation Dataset using Event
Sensing [9.583223655096077]
Due to limited access to real target datasets, algorithms are often trained using synthetic data and applied in the real domain.
Event sensing has been explored in the past and shown to reduce the domain gap between simulations and real-world scenarios.
We introduce a novel dataset, SPADES, comprising real event data acquired in a controlled laboratory environment and simulated event data using the same camera intrinsics.
arXiv Detail & Related papers (2023-11-09T12:14:47Z) - SpaceNLI: Evaluating the Consistency of Predicting Inferences in Space [0.6778628056950066]
We create an NLI dataset for spatial reasoning, called SpaceNLI.
The data samples are automatically generated from a curated set of reasoning patterns, where the patterns are annotated with inference labels by experts.
We test several SOTA NLI systems on SpaceNLI to gauge the complexity of the dataset and the system's capacity for spatial reasoning.
arXiv Detail & Related papers (2023-07-05T13:08:18Z) - Multimodal Dataset from Harsh Sub-Terranean Environment with Aerosol
Particles for Frontier Exploration [55.41644538483948]
This paper introduces a multimodal dataset from the harsh and unstructured underground environment with aerosol particles.
It contains synchronized raw data measurements from all onboard sensors in Robot Operating System (ROS) format.
The focus of this paper is not only to capture both temporal and spatial data diversities but also to present the impact of harsh conditions on captured data.
arXiv Detail & Related papers (2023-04-27T20:21:18Z) - TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual
Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z) - FedILC: Weighted Geometric Mean and Invariant Gradient Covariance for
Federated Learning on Non-IID Data [69.0785021613868]
Federated learning is a distributed machine learning approach which enables a shared server model to learn by aggregating the locally-computed parameter updates with the training data from spatially-distributed client silos.
We propose the Federated Invariant Learning Consistency (FedILC) approach, which leverages the gradient covariance and the geometric mean of Hessians to capture both inter-silo and intra-silo consistencies.
This is relevant to various fields such as medical healthcare, computer vision, and the Internet of Things (IoT)
arXiv Detail & Related papers (2022-05-19T03:32:03Z) - ESTemd: A Distributed Processing Framework for Environmental Monitoring
based on Apache Kafka Streaming Engine [0.0]
Distributed networks and real-time systems are becoming the most important components for the new computer age, the Internet of Things.
Data generated offers the ability to measure, infer and understand environmental indicators, from delicate ecologies to natural resources to urban environments.
We propose a distributed framework Event STream Processing Engine for Environmental Monitoring Domain (ESTemd) for the application of stream processing on heterogeneous environmental data.
arXiv Detail & Related papers (2021-04-02T15:04:15Z) - Quasi-Global Momentum: Accelerating Decentralized Deep Learning on
Heterogeneous Data [77.88594632644347]
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
arXiv Detail & Related papers (2021-02-09T11:27:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.