Linked Data on Geo-annotated Events and Use Cases for the Resilience of Ukraine
- URL: http://arxiv.org/abs/2501.14762v1
- Date: Tue, 24 Dec 2024 10:59:38 GMT
- Title: Linked Data on Geo-annotated Events and Use Cases for the Resilience of Ukraine
- Authors: Manar Attar, Shuai Wang, Ronald Siebes, Eirik Kultorp, Zhisheng Huang, Tianyang Lu
- Abstract summary: We focus on datasets about damaging events in Ukraine due to Russia's invasion between February 2022 and the end of April 2023. We convert two selected datasets to Linked Data and enrich them with additional geospatial information. We present an algorithm for the detection of identical events from different datasets.
- Score: 4.3944133124205
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The mission of resilience of Ukrainian cities calls for international collaboration with the scientific community to increase the quality of information by identifying and integrating information from various news and social media sources. Linked Data technology can be used to unify, enrich, and integrate data from multiple sources. In our work, we focus on datasets about damaging events in Ukraine due to Russia's invasion between February 2022 and the end of April 2023. We convert two selected datasets to Linked Data and enrich them with additional geospatial information. Following that, we present an algorithm for the detection of identical events from different datasets. Our pipeline makes it easy to convert and enrich datasets to integrated Linked Data. The resulting dataset consists of 10K reported events covering damage to hospitals, schools, roads, residential buildings, etc. Finally, we demonstrate in use cases how our dataset can be applied to different scenarios for resilience purposes.
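The abstract does not say how identical events are detected, so the following is only a minimal sketch of one plausible heuristic, not the authors' algorithm. It assumes two reports describe the same event when they share a damage category, fall within a small time window, and lie within a small great-circle distance of each other. The `Event` record, its fields, and the `max_km` and `max_days` thresholds are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Event:
    """Hypothetical, simplified event record (not the paper's schema)."""
    source: str
    day: date
    lat: float
    lon: float
    category: str


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


def likely_same_event(a, b, max_km=1.0, max_days=1):
    """Assumed heuristic: same category, close in time, close in space."""
    return (
        a.category == b.category
        and abs((a.day - b.day).days) <= max_days
        and haversine_km(a.lat, a.lon, b.lat, b.lon) <= max_km
    )


def match_events(left, right, max_km=1.0, max_days=1):
    """Return candidate pairs (one event per dataset) that may be identical."""
    return [
        (a, b)
        for a in left
        for b in right
        if likely_same_event(a, b, max_km, max_days)
    ]
```

The pairwise comparison is quadratic in the number of events, which is acceptable for a dataset on the order of 10K events but would need spatial indexing or blocking by date for much larger inputs.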
Related papers
- EarthView: A Large Scale Remote Sensing Dataset for Self-Supervision [72.84868704100595]
This paper presents a dataset specifically designed for self-supervision on remote sensing data, intended to enhance deep learning applications on Earth monitoring tasks. The dataset spans 15 tera pixels of global remote-sensing data, combining imagery from a diverse range of sources, including NEON, Sentinel, and a novel release of 1m spatial resolution data from Satellogic. Accompanying the dataset is EarthMAE, a tailored Masked Autoencoder developed to tackle the distinct challenges of remote sensing data.
arXiv Detail & Related papers (2025-01-14T13:42:22Z)
- Bridging the Data Provenance Gap Across Text, Speech and Video [67.72097952282262]
We conduct the largest and first-of-its-kind longitudinal audit across modalities of popular text, speech, and video datasets.
Our manual analysis covers nearly 4000 public datasets between 1990-2024, spanning 608 languages, 798 sources, 659 organizations, and 67 countries.
We find that multimodal machine learning applications have overwhelmingly turned to web-crawled, synthetic, and social media platforms, such as YouTube, for their training sets.
arXiv Detail & Related papers (2024-12-19T01:30:19Z)
- UniTraj: A Unified Framework for Scalable Vehicle Trajectory Prediction [93.77809355002591]
We introduce UniTraj, a comprehensive framework that unifies various datasets, models, and evaluation criteria.
We conduct extensive experiments and find that model performance significantly drops when transferred to other datasets.
We provide insights into dataset characteristics to explain these findings.
arXiv Detail & Related papers (2024-03-22T10:36:50Z)
- SCTc-TE: A Comprehensive Formulation and Benchmark for Temporal Event Forecasting [63.01035584154509]
We develop a fully automated pipeline and construct a large-scale dataset named MidEast-TE from about 0.6 million news articles.
This dataset focuses on the cooperation and conflict events among countries mainly in the MidEast region from 2015 to 2022.
We propose a novel method LoGo that is able to take advantage of both Local and Global contexts for SCTc-TE forecasting.
arXiv Detail & Related papers (2023-12-02T07:40:21Z)
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
- Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding [137.3719377780593]
A new design (named Detection Hub) is dataset-aware and category-aligned.
It mitigates the dataset inconsistency and provides coherent guidance for the detector to learn across multiple datasets.
The categories across datasets are semantically aligned into a unified space by replacing one-hot category representations with word embedding.
arXiv Detail & Related papers (2022-06-07T17:59:44Z)
- Retiring Adult: New Datasets for Fair Machine Learning [47.27417042497261]
UCI Adult has served as the basis for the development and comparison of many algorithmic fairness interventions.
We reconstruct a superset of the UCI Adult data from available US Census sources and reveal idiosyncrasies of the UCI Adult dataset that limit its external validity.
Our primary contribution is a suite of new datasets that extend the existing data ecosystem for research on fair machine learning.
arXiv Detail & Related papers (2021-08-10T19:19:41Z)
- Data Requests and Scenarios for Data Design of Unobserved Events in Corona-related Confusion Using TEEDA [0.11470070927586014]
In this study, we use the interactive platform called treasuring every encounter of data affairs (TEEDA) to externalize data requests from data users.
We analyze the characteristics of missing data in the corona-related confusion stemming from both the data requests and the providable data obtained in the workshop.
arXiv Detail & Related papers (2020-09-08T23:40:26Z)
- CovidNet: To Bring Data Transparency in the Era of COVID-19 [9.808021836153712]
This paper presents CovidNet, a COVID-19 tracking project associated with a large scale epidemic dataset.
CovidNet is the only platform providing real-time global case information of more than 4,124 sub-divisions from over 27 countries worldwide.
The accuracy and freshness of the dataset are a result of the painstaking efforts from our voluntary teamwork, crowd-sourcing channels, and automated data pipelines.
arXiv Detail & Related papers (2020-05-22T00:05:17Z)
- A Common Operating Picture Framework Leveraging Data Fusion and Deep Learning [0.7348448478819135]
We present a data fusion framework for accelerating solutions for Processing, Exploitation, and Dissemination.
Our platform is a collection of services that extract information from several data sources by leveraging deep learning and other means of processing.
In our first iteration we have focused on visual data (FMV, WAMI, CCTV/PTZ-Cameras, open source video, etc.) and AIS data streams (satellite and terrestrial sources).
arXiv Detail & Related papers (2020-01-16T18:32:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.