Semantic Similarity Measure of Natural Language Text through Machine
Learning and a Keyword-Aware Cross-Encoder-Ranking Summarizer -- A Case Study
Using UCGIS GIS&T Body of Knowledge
- URL: http://arxiv.org/abs/2305.09877v1
- Date: Wed, 17 May 2023 01:17:57 GMT
- Authors: Yuanyuan Tian, Wenwen Li, Sizhe Wang, Zhining Gu
- Abstract summary: GIS&T Body of Knowledge (BoK) is a community-driven endeavor to define, develop, and document geospatial topics.
This research evaluates the effectiveness of multiple natural language processing (NLP) techniques in extracting semantics from text.
It also offers a new perspective on the use of machine learning techniques for analyzing scientific publications.
- Score: 2.4909170697740968
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Initiated by the University Consortium of Geographic Information Science
(UCGIS), GIS&T Body of Knowledge (BoK) is a community-driven endeavor to
define, develop, and document geospatial topics related to geographic
information science and technologies (GIS&T). In recent years, GIS&T BoK has
undergone rigorous development in terms of its topic re-organization and
content updating, resulting in a new digital version of the project. While the
BoK topics provide useful materials for researchers and students to learn about
GIS, the semantic relationships among the topics, such as semantic similarity,
should also be identified so that better, automated topic navigation can
be achieved. Currently, related topics are defined manually by
editors or authors, which may result in an incomplete assessment of topic
relationships. To address this challenge, our research evaluates the
effectiveness of multiple natural language processing (NLP) techniques in
extracting semantics from text, including both deep neural networks and
traditional machine learning approaches. In addition, a novel text summarizer,
KACERS (Keyword-Aware Cross-Encoder-Ranking Summarizer), is proposed to
generate a semantic summary of scientific publications. By identifying the
semantic linkages among key topics, this work provides guidance for future
development and content organization of the GIS&T BoK project. It also offers a
new perspective on the use of machine learning techniques for analyzing
scientific publications, and demonstrates the potential of the KACERS summarizer in
semantic understanding of long text documents.
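As a rough illustration of the "traditional machine learning" end of the techniques the abstract mentions, the sketch below scores topic similarity with TF-IDF vectors and cosine similarity. It is not the authors' pipeline and it is not KACERS. The helper names (`tokenize`, `tfidf_vectors`, `cosine`, `similarity_matrix`) and the three topic snippets are hypothetical, invented here for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF keeps every weight positive, even for ubiquitous terms.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(docs):
    """Pairwise cosine similarity between the TF-IDF vectors of all documents."""
    vecs = tfidf_vectors(docs)
    return [[cosine(a, b) for b in vecs] for a in vecs]


# Hypothetical topic snippets, invented for illustration (not taken from the BoK).
topics = {
    "Spatial interpolation": "estimating values at unsampled locations from nearby sample points",
    "Kriging": "geostatistical interpolation estimating values at unsampled locations from sample points",
    "Web mapping": "serving interactive maps to browsers through tile services",
}
names = list(topics)
sim = similarity_matrix(list(topics.values()))

for i, name in enumerate(names):
    print(name, [round(score, 3) for score in sim[i]])
```

A purely lexical measure like this only rewards shared vocabulary, so two topics that describe the same idea in different words would score low; that limitation is one plausible motivation for also evaluating deep neural approaches.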
Related papers
- AHAM: Adapt, Help, Ask, Model -- Harvesting LLMs for literature mining [3.8384235322772864]
We present the AHAM methodology and a metric that guides the domain-specific adaptation of the BERTopic topic modeling framework.
By utilizing the LLaMa2 generative language model, we generate topic definitions via one-shot learning.
For inter-topic similarity evaluation, we leverage metrics from language generation and translation processes.
arXiv Detail & Related papers (2023-12-25T18:23:03Z)
- Knowledge Graphs and Pre-trained Language Models enhanced Representation Learning for Conversational Recommender Systems [58.561904356651276]
We introduce the Knowledge-Enhanced Entity Representation Learning (KERL) framework to improve the semantic understanding of entities for conversational recommender systems.
KERL uses a knowledge graph and a pre-trained language model to improve the semantic understanding of entities.
KERL achieves state-of-the-art results in both recommendation and response generation tasks.
arXiv Detail & Related papers (2023-12-18T06:41:23Z) - Semantic Communications for Artificial Intelligence Generated Content
(AIGC) Toward Effective Content Creation [75.73229320559996]
This paper develops a conceptual model for the integration of AIGC and SemCom.
A novel framework that employs AIGC technology is proposed as an encoder and decoder for semantic information.
The framework can adapt to different types of content generated, the required quality, and the semantic information utilized.
arXiv Detail & Related papers (2023-08-09T13:17:21Z) - SKG: A Versatile Information Retrieval and Analysis Framework for
Academic Papers with Semantic Knowledge Graphs [9.668240269886413]
We propose a Semantic Knowledge Graph that integrates semantic concepts from abstracts and other meta-information to represent the corpus.
The SKG can support various semantic queries in academic literature thanks to the high diversity and rich information content stored within.
arXiv Detail & Related papers (2023-06-07T20:16:08Z) - Knowledge-Aware Bayesian Deep Topic Model [50.58975785318575]
We propose a Bayesian generative model for incorporating prior domain knowledge into hierarchical topic modeling.
Our proposed model efficiently integrates the prior knowledge and improves both hierarchical topic discovery and document representation.
arXiv Detail & Related papers (2022-09-20T09:16:05Z) - TeKo: Text-Rich Graph Neural Networks with External Knowledge [75.91477450060808]
We propose a novel text-rich graph neural network with external knowledge (TeKo)
We first present a flexible heterogeneous semantic network that incorporates high-quality entities.
We then introduce two types of external knowledge, that is, structured triplets and unstructured entity description.
arXiv Detail & Related papers (2022-06-15T02:33:10Z) - Knowledge Graph Augmented Network Towards Multiview Representation
Learning for Aspect-based Sentiment Analysis [96.53859361560505]
We propose a knowledge graph augmented network (KGAN) to incorporate external knowledge with explicitly syntactic and contextual information.
KGAN captures the sentiment feature representations from multiple perspectives, i.e., context-, syntax- and knowledge-based.
Experiments on three popular ABSA benchmarks demonstrate the effectiveness and robustness of our KGAN.
arXiv Detail & Related papers (2022-01-13T08:25:53Z) - Semantic and Relational Spaces in Science of Science: Deep Learning
Models for Article Vectorisation [4.178929174617172]
We focus on document-level embeddings based on the semantic and relational aspects of articles, using Natural Language Processing (NLP) and Graph Neural Networks (GNNs).
Our results show that NLP lets us encode a semantic space of articles, while GNNs let us build a relational space in which the social practices of a research community are also encoded.
arXiv Detail & Related papers (2020-11-05T14:57:41Z) - A New Neural Search and Insights Platform for Navigating and Organizing
AI Research [56.65232007953311]
We introduce a new platform, AI Research Navigator, that combines classical keyword search with neural retrieval to discover and organize relevant literature.
We give an overview of the overall architecture of the system and of the components for document analysis, question answering, search, analytics, expert search, and recommendations.
arXiv Detail & Related papers (2020-10-30T19:12:25Z) - Generating Knowledge Graphs by Employing Natural Language Processing and
Machine Learning Techniques within the Scholarly Domain [1.9004296236396943]
We present a new architecture that takes advantage of Natural Language Processing and Machine Learning methods for extracting entities and relationships from research publications.
Within this research work, we tackle the challenge of knowledge extraction by employing several state-of-the-art Natural Language Processing and Text Mining tools.
We generated a scientific knowledge graph including 109,105 triples, extracted from 26,827 abstracts of papers within the Semantic Web domain.
arXiv Detail & Related papers (2020-10-28T08:31:40Z) - A Sentiment-Controllable Topic-to-Essay Generator with Topic Knowledge
Graph [44.00244549852883]
We propose a novel Sentiment-Controllable topic-to-essay generator with a Topic Knowledge Graph enhanced decoder.
We first inject sentiment information into the generator to control the sentiment of each sentence, which leads to varied generated essays.
Unlike existing models that use knowledge entities separately, our model treats the knowledge graph as a whole and encodes more structured, connected semantic information in the graph to generate a more relevant essay.
arXiv Detail & Related papers (2020-10-12T08:06:12Z)