KGEA: A Knowledge Graph Enhanced Article Quality Identification Dataset
- URL: http://arxiv.org/abs/2206.07556v1
- Date: Wed, 15 Jun 2022 14:15:41 GMT
- Title: KGEA: A Knowledge Graph Enhanced Article Quality Identification Dataset
- Authors: Chunhui Ai and Derui Wang and Yang Xu and Wenrui Xie and Ziqiang Cao
- Abstract summary: We propose a knowledge graph enhanced article quality identification dataset (KGEA) based on Baidu Encyclopedia.
We quantify the articles along seven dimensions and use the co-occurrence of entities between the articles and Baidu Encyclopedia to construct a knowledge graph for each article.
We also compare several text classification baselines and find that, with graph neural networks, external knowledge leads to more competitive article classification.
- Score: 4.811084336809668
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With articles of varying quality being produced at every moment, it
is an urgent task to screen them for high-quality articles and distribute those
to social media. High-quality articles have many characteristics, such as
relevance, text quality, straightforwardness, multi-sidedness, background,
novelty and sentiment. Thus, it would be inadequate to use only the content of
an article to identify its quality. We therefore use external knowledge
interaction to improve performance and propose a knowledge graph enhanced
article quality identification dataset (KGEA) based on Baidu Encyclopedia. We
quantify the articles along seven dimensions and use the co-occurrence of
entities between the articles and Baidu Encyclopedia to construct a knowledge
graph for each article. We also compare several text classification baselines
and find that, with graph neural networks, external knowledge leads to more
competitive article classification.
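The abstract does not spell out how the per-article graph is built, so the following is only a minimal sketch of one plausible reading of "co-occurrence of the entities". It assumes the encyclopedia is available as a mapping from entry title to the set of entities mentioned in that entry, and it links two article entities whenever both appear in the same entry. The function name `build_cooccurrence_graph`, the data layout, and the toy data are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def build_cooccurrence_graph(article_entities, encyclopedia):
    """Return an undirected graph as {entity: set of neighbouring entities}.

    article_entities: entity names found in one article.
    encyclopedia: dict mapping an entry title to the set of entity names
        mentioned in that entry (a hypothetical stand-in for the real data).
    Two article entities are linked when they co-occur in one entry,
    where the entry title itself counts as appearing in its own entry.
    """
    entities = set(article_entities)
    graph = {entity: set() for entity in entities}
    for title, mentioned in encyclopedia.items():
        # Article entities that appear in this entry (title included).
        present = (set(mentioned) | {title}) & entities
        for a, b in combinations(sorted(present), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Toy data, invented purely for illustration.
article = ["Baidu", "search engine", "Beijing", "novelty"]
encyclopedia = {
    "Baidu": {"search engine", "Beijing"},
    "Beijing": {"China"},
}
graph = build_cooccurrence_graph(article, encyclopedia)
```

Entities with no encyclopedia co-occurrence (here `"novelty"`) stay in the graph as isolated nodes, so a downstream graph neural network would still see every article entity.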
Related papers
- Detecting text level intellectual influence with knowledge graph embeddings [0.0]
We collect a corpus of open source journal articles and generate Knowledge Graph representations using the Gemini LLM.
We attempt to predict the existence of citations between sampled pairs of articles using previously published methods and a novel Graph Neural Network based embedding model.
arXiv Detail & Related papers (2024-10-31T15:21:27Z)
- Multi-Facet Counterfactual Learning for Content Quality Evaluation [48.73583736357489]
We propose a framework for efficiently constructing evaluators that perceive multiple facets of content quality evaluation.
We leverage a joint training strategy based on contrastive learning and supervised learning to enable the evaluator to distinguish between different quality facets.
arXiv Detail & Related papers (2024-10-10T08:04:10Z)
- Qualitative Data Analysis in Software Engineering: Techniques and Teaching Insights [10.222207222039048]
Software repositories are rich sources of qualitative artifacts, including source code comments, commit messages, issue descriptions, and documentation.
This chapter shifts the focus towards interpreting these artifacts using various qualitative data analysis techniques.
Various coding methods are discussed along with the strategic design of a coding guide to ensure consistency and accuracy in data interpretation.
arXiv Detail & Related papers (2024-06-12T13:56:55Z)
- Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective [93.56647950778357]
Blind image quality assessment (BIQA) predicts the human perception of image quality without any reference information.
We develop a general and automated multitask learning scheme for BIQA to exploit auxiliary knowledge from other tasks.
arXiv Detail & Related papers (2023-03-27T07:58:09Z)
- Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis [96.53859361560505]
We propose a knowledge graph augmented network (KGAN) to incorporate external knowledge with explicitly syntactic and contextual information.
KGAN captures the sentiment feature representations from multiple perspectives, i.e., context-, syntax- and knowledge-based.
Experiments on three popular ABSA benchmarks demonstrate the effectiveness and robustness of our KGAN.
arXiv Detail & Related papers (2022-01-13T08:25:53Z)
- The Curious Layperson: Fine-Grained Image Recognition without Expert Labels [90.88501867321573]
We consider a new problem: fine-grained image recognition without expert annotations.
We learn a model to describe the visual appearance of objects using non-expert image descriptions.
We then train a fine-grained textual similarity model that matches image descriptions with documents on a sentence-level basis.
arXiv Detail & Related papers (2021-11-05T17:58:37Z)
- Assessing the quality of sources in Wikidata across languages: a hybrid approach [64.05097584373979]
We run a series of microtasks experiments to evaluate a large corpus of references, sampled from Wikidata triples with labels in several languages.
We use a consolidated, curated version of the crowdsourced assessments to train several machine learning models to scale up the analysis to the whole of Wikidata.
The findings help us ascertain the quality of references in Wikidata, and identify common challenges in defining and capturing the quality of user-generated multilingual structured data on the web.
arXiv Detail & Related papers (2021-09-20T10:06:46Z)
- A Sentiment-Controllable Topic-to-Essay Generator with Topic Knowledge Graph [44.00244549852883]
We propose a novel Sentiment-Controllable topic-to-essay generator with a Topic Knowledge Graph enhanced decoder.
We firstly inject the sentiment information into the generator for controlling sentiment for each sentence, which leads to various generated essays.
Unlike existing models that use knowledge entities separately, our model treats the knowledge graph as a whole and encodes more structured, connected semantic information in the graph to generate a more relevant essay.
arXiv Detail & Related papers (2020-10-12T08:06:12Z)
- Cognitive Representation Learning of Self-Media Online Article Quality [24.084727302752377]
Self-media online articles are mainly created by users, which have the appearance characteristics of different text levels and multi-modal hybrid editing.
We establish a joint model CoQAN in combination with the layout organization, writing characteristics and text semantics.
We have also constructed a large scale real-world assessment dataset.
arXiv Detail & Related papers (2020-08-13T02:59:52Z)
- Relational Learning Analysis of Social Politics using Knowledge Graph Embedding [11.978556412301975]
This paper presents a novel credibility domain-based KG Embedding framework.
It involves capturing a fusion of data obtained from heterogeneous resources into a formal KG representation depicted by a domain.
The framework also embodies a credibility module to ensure data quality and trustworthiness.
arXiv Detail & Related papers (2020-06-02T14:10:28Z)
- ENT-DESC: Entity Description Generation by Exploring Knowledge Graph [53.03778194567752]
In practice, the input knowledge could be more than enough, since the output description may only cover the most significant knowledge.
We introduce a large-scale and challenging dataset to facilitate the study of such a practical scenario in KG-to-text.
We propose a multi-graph structure that is able to represent the original graph information more comprehensively.
arXiv Detail & Related papers (2020-04-30T14:16:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.