Toward the Automated Construction of Probabilistic Knowledge Graphs for
the Maritime Domain
- URL: http://arxiv.org/abs/2305.02471v1
- Date: Thu, 4 May 2023 00:24:30 GMT
- Title: Toward the Automated Construction of Probabilistic Knowledge Graphs for
the Maritime Domain
- Authors: Fatemeh Shiri, Teresa Wang, Shirui Pan, Xiaojun Chang, Yuan-Fang Li,
Reza Haffari, Van Nguyen, Shuang Yu
- Abstract summary: International maritime crime is becoming increasingly sophisticated, often associated with wider criminal networks.
This has led to research and development efforts aimed at combining hard data with other types of data.
We propose Maritime DeepDive, an initial prototype for the automated construction of probabilistic knowledge graphs.
- Score: 60.76554773885988
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: International maritime crime is becoming increasingly sophisticated, often
associated with wider criminal networks. Detecting maritime threats by means of
fusing data purely related to physical movement (i.e., those generated by
physical sensors, or hard data) is not sufficient. This has led to research and
development efforts aimed at combining hard data with other types of data
(especially human-generated or soft data). Existing work often assumes that
input soft data is available in a structured format, or is focused on
extracting certain relevant entities or concepts to accompany or annotate hard
data. Much less attention has been given to extracting the rich knowledge about
the situations of interest implicitly embedded in the large amount of soft data
existing in unstructured formats (such as intelligence reports and news
articles). In order to exploit the potentially useful and rich information from
such sources, it is necessary to extract not only the relevant entities and
concepts but also their semantic relations, together with the uncertainty
associated with the extracted knowledge (i.e., in the form of probabilistic
knowledge graphs). This will increase the accuracy of, and confidence in, the
extracted knowledge and facilitate subsequent reasoning and learning. To this
end, we propose Maritime DeepDive, an initial prototype for the automated
construction of probabilistic knowledge graphs from natural language data for
the maritime domain. In this paper, we report on the current implementation of
Maritime DeepDive, together with preliminary results on extracting
probabilistic events from maritime piracy incidents. This pipeline was
evaluated on a manually crafted gold standard, yielding promising results.
Related papers
- Maximizing Relation Extraction Potential: A Data-Centric Study to Unveil Challenges and Opportunities [3.8087810875611896]
This paper investigates the possible data-centric characteristics that impede neural relation extraction.
It emphasizes pivotal issues, such as contextual ambiguity, correlating relations, long-tail data, and fine-grained relation distributions.
It sets a marker for future directions to alleviate these issues, thereby proving to be a critical resource for novice and advanced researchers.
arXiv Detail & Related papers (2024-09-07T23:40:47Z)
- Beyond Privacy: Navigating the Opportunities and Challenges of Synthetic Data [91.52783572568214]
Synthetic data may become a dominant force in the machine learning world, promising a future where datasets can be tailored to individual needs.
We discuss which fundamental challenges the community needs to overcome for wider relevance and application of synthetic data.
arXiv Detail & Related papers (2023-04-07T16:38:40Z)
- Smart Agriculture : A Novel Multilevel Approach for Agricultural Risk Assessment over Unstructured Data [0.5735035463793008]
Uncertainty refers to a state of not knowing what will happen in the future.
This paper aims to leverage natural language processing and machine learning techniques to model uncertainties and evaluate the risk level in each uncertainty cluster using massive text data.
arXiv Detail & Related papers (2022-11-22T16:47:47Z)
- Neuro-Symbolic Artificial Intelligence (AI) for Intent based Semantic Communication [85.06664206117088]
6G networks must consider semantics and effectiveness (at end-user) of the data transmission.
NeSy AI is proposed as a pillar for learning causal structure behind the observed data.
GFlowNet is leveraged for the first time in a wireless system to learn the probabilistic structure which generates the data.
arXiv Detail & Related papers (2022-05-22T07:11:57Z)
- Audacity of huge: overcoming challenges of data scarcity and data quality for machine learning in computational materials discovery [1.0036312061637764]
Machine learning (ML)-accelerated discovery requires large amounts of high-fidelity data to reveal predictive structure-property relationships.
For many properties of interest in materials discovery, the challenging nature and high cost of data generation have resulted in a data landscape that is scarcely populated and of dubious quality.
In the absence of manual curation, increasingly sophisticated natural language processing and automated image analysis are making it possible to learn structure-property relationships from the literature.
arXiv Detail & Related papers (2021-11-02T21:43:58Z)
- Iterative Rule Extension for Logic Analysis of Data: an MILP-based heuristic to derive interpretable binary classification from large datasets [0.6526824510982799]
This work presents IRELAND, an algorithm that allows for abstracting Boolean phrases in DNF from data with up to 10,000 samples and sample characteristics.
The results show that for large datasets IRELAND outperforms the current state-of-the-art and can find solutions for datasets where current models run out of memory or need excessive runtimes.
arXiv Detail & Related papers (2021-10-25T13:31:30Z)
- A Bayesian Framework for Information-Theoretic Probing [51.98576673620385]
We argue that probing should be seen as approximating a mutual information.
This led to the rather unintuitive conclusion that representations encode exactly the same information about a target task as the original sentences.
This paper proposes a new framework to measure what we term Bayesian mutual information.
arXiv Detail & Related papers (2021-09-08T18:08:36Z)
- Incorporating Causal Graphical Prior Knowledge into Predictive Modeling via Simple Data Augmentation [92.96204497841032]
Causal graphs (CGs) are compact representations of the knowledge of the data generating processes behind the data distributions.
We propose a model-agnostic data augmentation method that allows us to exploit the prior knowledge of the conditional independence (CI) relations.
We experimentally show that the proposed method is effective in improving the prediction accuracy, especially in the small-data regime.
arXiv Detail & Related papers (2021-02-27T06:13:59Z)
- Occam's Razor for Big Data? On Detecting Quality in Large Unstructured Datasets [0.0]
The new trend towards analytic complexity represents a severe challenge for the principle of parsimony, or Occam's Razor, in science.
Computational building block approaches for data clustering can help to deal with large unstructured datasets in minimized computation time.
The review concludes on how cultural differences between East and West are likely to affect the course of big data analytics.
arXiv Detail & Related papers (2020-11-12T16:06:01Z)
- Deep Collaborative Embedding for information cascade prediction [58.90540495232209]
We propose a novel model called Deep Collaborative Embedding (DCE) for information cascade prediction.
We propose an auto-encoder based collaborative embedding framework to learn the node embeddings with cascade collaboration and node collaboration.
The results of extensive experiments conducted on real-world datasets verify the effectiveness of our approach.
arXiv Detail & Related papers (2020-01-18T13:32:18Z)