Self-consistent Deep Geometric Learning for Heterogeneous Multi-source Spatial Point Data Prediction
- URL: http://arxiv.org/abs/2407.00748v1
- Date: Sun, 30 Jun 2024 16:13:13 GMT
- Title: Self-consistent Deep Geometric Learning for Heterogeneous Multi-source Spatial Point Data Prediction
- Authors: Dazhou Yu, Xiaoyun Gong, Yun Li, Meikang Qiu, Liang Zhao
- Abstract summary: Multi-source spatial point data prediction is crucial in fields like environmental monitoring and natural resource management.
Existing models in this area often fall short due to their domain-specific nature and lack a strategy for integrating information from various sources.
We introduce an innovative multi-source spatial point data prediction framework that adeptly aligns information from varied sources without relying on ground truth labels.
- Score: 10.646376827353551
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-source spatial point data prediction is crucial in fields like environmental monitoring and natural resource management, where integrating data from various sensors is the key to achieving a holistic environmental understanding. Existing models in this area often fall short due to their domain-specific nature and lack a strategy for integrating information from various sources in the absence of ground truth labels. Key challenges include evaluating the quality of different data sources and modeling spatial relationships among them effectively. Addressing these issues, we introduce an innovative multi-source spatial point data prediction framework that adeptly aligns information from varied sources without relying on ground truth labels. A unique aspect of our method is the 'fidelity score,' a quantitative measure for evaluating the reliability of each data source. Furthermore, we develop a geo-location-aware graph neural network tailored to accurately depict spatial relationships between data points. Our framework has been rigorously tested on two real-world datasets and one synthetic dataset. The results consistently demonstrate its superior performance over existing state-of-the-art methods.
Related papers
- On the Power of Source Screening for Learning Shared Feature Extractors [33.10812756558517]
It is well understood that data sources with low relevance or poor quality may hinder representation learning.
We focus on the question of which data sources should be learned jointly by focusing on the traditionally deemed "good" collection of sources.
We find that source screening can play a central role in statistically optimal subspace estimation.
arXiv Detail & Related papers (2026-02-18T01:32:10Z) - Spatial Context Improves the Integration of Text with Remote Sensing for Mapping Environmental Variables [19.670023742796136]
We propose an attention-based approach that combines aerial imagery and geolocated text within a spatial neighbourhood.
Our model is evaluated on the task of predicting 103 environmental variables from the SWECO25 data cube.
arXiv Detail & Related papers (2026-01-13T17:27:16Z) - OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value [74.80873109856563]
OpenDataArena (ODA) is a holistic and open platform designed to benchmark the intrinsic value of post-training data.
ODA establishes a comprehensive ecosystem comprising four key pillars: (i) a unified training-evaluation pipeline that ensures fair, open comparisons across diverse models; (ii) a multi-dimensional scoring framework that profiles data quality along tens of distinct axes; and (iii) an interactive data lineage explorer to visualize dataset genealogy and dissect component sources.
arXiv Detail & Related papers (2025-12-16T03:33:24Z) - Harnessing Rich Multi-Modal Data for Spatial-Temporal Homophily-Embedded Graph Learning Across Domains and Localities [2.5065738436850835]
This research proposes a heterogeneous data pipeline that performs cross-domain data fusion.
We aim to address complex urban problems across multiple domains and localities by harnessing the rich information over 50 data sources.
arXiv Detail & Related papers (2025-12-11T23:51:54Z) - FedMSGL: A Self-Expressive Hypergraph Based Federated Multi-View Learning [12.161006152509655]
We propose a Self-expressive Hypergraph Based Federated Multi-view Learning method (FedMSGL).
The proposed method leverages the self-expressive property in local training to learn a uniform-dimension subspace with latent sample relations.
Experiments on multi-view datasets with different feature dimensions validated the effectiveness of the proposed method.
arXiv Detail & Related papers (2025-03-12T05:13:45Z) - Enhancing Ecological Monitoring with Multi-Objective Optimization: A Novel Dataset and Methodology for Segmentation Algorithms [17.802456388479616]
We introduce a unique semantic segmentation dataset of 6,096 high-resolution aerial images capturing indigenous and invasive grass species in Bega Valley, New South Wales, Australia.
This dataset presents a challenging task due to the overlap and distribution of grass species.
The dataset and code will be made publicly available, aiming to drive research in computer vision, machine learning, and ecological studies.
arXiv Detail & Related papers (2024-07-25T18:27:27Z) - CoRAST: Towards Foundation Model-Powered Correlated Data Analysis in Resource-Constrained CPS and IoT [16.821900475733102]
Foundation models (FMs) can harness distributed and diverse environmental data by leveraging prior knowledge.
We introduce CoRAST, a novel learning framework that utilizes FMs for enhanced analysis of distributed, correlated heterogeneous data.
Our evaluation on a real-world weather dataset demonstrates CoRAST's ability to exploit correlated heterogeneous data.
arXiv Detail & Related papers (2024-03-27T11:11:06Z) - Interpretable Multi-Source Data Fusion Through Latent Variable Gaussian Process [8.207427766052044]
The proposed approach is demonstrated on and analyzed through two mathematical and two materials science case studies.
It is observed that, compared to single-source and source-unaware machine learning models, the proposed multi-source data fusion framework can provide better predictions for sparse-data problems.
arXiv Detail & Related papers (2024-02-06T16:54:59Z) - SpaCE: The Spatial Confounding Environment [2.572906392867547]
SpaCE provides realistic benchmark datasets and tools for evaluating causal inference methods.
Each dataset includes training data, true counterfactuals, a spatial graph with coordinates, and smoothness and confounding scores.
SpaCE facilitates an automated end-to-end pipeline, simplifying data loading, experimental setup, and evaluating machine learning and causal inference models.
arXiv Detail & Related papers (2023-12-01T16:42:57Z) - UnitedHuman: Harnessing Multi-Source Data for High-Resolution Human Generation [59.77275587857252]
A holistic human dataset inevitably has insufficient and low-resolution information on local parts.
We propose to use multi-source datasets with various resolution images to jointly learn a high-resolution human generative model.
arXiv Detail & Related papers (2023-09-25T17:58:46Z) - infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z) - GenSyn: A Multi-stage Framework for Generating Synthetic Microdata using Macro Data Sources [21.32471030724983]
Individual-level data (microdata) that characterizes a population is essential for studying many real-world problems.
In this study, we examine synthetic data generation as a tool to extrapolate difficult-to-obtain high-resolution data.
arXiv Detail & Related papers (2022-12-08T01:22:12Z) - Handling Distribution Shifts on Graphs: An Invariance Perspective [78.31180235269035]
We formulate the OOD problem on graphs and develop a new invariant learning approach, Explore-to-Extrapolate Risk Minimization (EERM).
EERM resorts to multiple context explorers that are adversarially trained to maximize the variance of risks from multiple virtual environments.
We prove the validity of our method by theoretically showing its guarantee of a valid OOD solution.
arXiv Detail & Related papers (2022-02-05T02:31:01Z) - Federated Causal Discovery [74.37739054932733]
This paper develops a gradient-based learning framework named DAG-Shared Federated Causal Discovery (DS-FCD).
It can learn the causal graph without directly touching local data and naturally handle the data heterogeneity.
Extensive experiments on both synthetic and real-world datasets verify the efficacy of the proposed method.
arXiv Detail & Related papers (2021-12-07T08:04:12Z) - Deep Transfer Learning for Multi-source Entity Linkage via Domain Adaptation [63.24594955429465]
Multi-source entity linkage is critical in high-impact applications such as data cleaning and user stitching.
AdaMEL is a deep transfer learning framework that learns generic high-level knowledge to perform multi-source entity linkage.
Our framework achieves state-of-the-art results with 8.21% improvement on average over methods based on supervised learning.
arXiv Detail & Related papers (2021-10-27T15:20:41Z) - Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)