Benchmarking Recommendation, Classification, and Tracing Based on Hugging Face Knowledge Graph
- URL: http://arxiv.org/abs/2505.17507v1
- Date: Fri, 23 May 2025 06:00:20 GMT
- Title: Benchmarking Recommendation, Classification, and Tracing Based on Hugging Face Knowledge Graph
- Authors: Qiaosheng Chen, Kaijia Huang, Xiao Zhou, Weiqing Luo, Yuanning Cui, Gong Cheng,
- Abstract summary: HuggingKG is the first large-scale knowledge graph built from the Hugging Face community for ML resource management.<n>With 2.6 million nodes and 6.2 million edges, HuggingKG captures domain-specific relations and rich textual attributes.<n>HuggingBench is a multi-task benchmark with three novel test collections for IR tasks including resource recommendation, classification, and tracing.
- Score: 17.251910893293626
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid growth of open source machine learning (ML) resources, such as models and datasets, has accelerated IR research. However, existing platforms like Hugging Face do not explicitly utilize structured representations, limiting advanced queries and analyses such as tracing model evolution and recommending relevant datasets. To fill the gap, we construct HuggingKG, the first large-scale knowledge graph built from the Hugging Face community for ML resource management. With 2.6 million nodes and 6.2 million edges, HuggingKG captures domain-specific relations and rich textual attributes. It enables us to further present HuggingBench, a multi-task benchmark with three novel test collections for IR tasks including resource recommendation, classification, and tracing. Our experiments reveal unique characteristics of HuggingKG and the derived tasks. Both resources are publicly available, expected to advance research in open source resource sharing and management.
Related papers
- MMGraphRAG: Bridging Vision and Language with Interpretable Multimodal Knowledge Graphs [6.165053219836395]
We propose MMGraphRAG, which refines visual content through scene graphs and constructs a multimodal knowledge graph.<n>It employs spectral clustering to achieve cross-modal entity linking and retrieves context along reasoning paths to guide the generative process.<n> Experimental results show that MMGraphRAG achieves state-of-the-art performance on the DocBench and MMLongBench datasets.
arXiv Detail & Related papers (2025-07-28T13:16:23Z) - Learning Efficient and Generalizable Graph Retriever for Knowledge-Graph Question Answering [75.12322966980003]
Large Language Models (LLMs) have shown strong inductive reasoning ability across various domains.<n>Most existing RAG pipelines rely on unstructured text, limiting interpretability and structured reasoning.<n>Recent studies have explored integrating knowledge graphs with LLMs for knowledge graph question answering.<n>We propose RAPL, a novel framework for efficient and effective graph retrieval in KGQA.
arXiv Detail & Related papers (2025-06-11T12:03:52Z) - Explain What You Mean: Intent Augmented Knowledge Graph Recommender Built With An LLM [8.40367600570591]
Intent Knowledge Recommender (IKGR) is a novel framework that leverages retrieval-augmented generation and an encoding approach to construct and densify a knowledge graph.<n>IKGR overcomes knowledge gaps and achieves substantial gains over state-of-the-art baselines on both publicly available and our internal recommendation datasets.
arXiv Detail & Related papers (2025-05-16T06:07:19Z) - G-OSR: A Comprehensive Benchmark for Graph Open-Set Recognition [54.45837774534411]
We introduce textbfG-OSR, a benchmark for evaluating Graph Open-Set Recognition (GOSR) methods at both the node and graph levels.<n>Results offer critical insights into the generalizability and limitations of current GOSR methods.
arXiv Detail & Related papers (2025-03-01T13:02:47Z) - ArchRAG: Attributed Community-based Hierarchical Retrieval-Augmented Generation [16.204046295248546]
Retrieval-Augmented Generation (RAG) has proven effective in integrating external knowledge into large language models (LLMs)<n>We introduce a novel graph-based RAG approach, called Attributed Community-based Hierarchical RAG (ArchRAG)<n>We build a novel hierarchical index structure for the attributed communities and develop an effective online retrieval method.<n>ArchRAG has been successfully applied to domain knowledge QA in Huawei Cloud Computing.
arXiv Detail & Related papers (2025-02-14T03:28:36Z) - GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation [84.41557981816077]
We introduce GFM-RAG, a novel graph foundation model (GFM) for retrieval augmented generation.<n>GFM-RAG is powered by an innovative graph neural network that reasons over graph structure to capture complex query-knowledge relationships.<n>It achieves state-of-the-art performance while maintaining efficiency and alignment with neural scaling laws.
arXiv Detail & Related papers (2025-02-03T07:04:29Z) - Retrieval-Augmented Generation with Graphs (GraphRAG) [84.29507404866257]
Retrieval-augmented generation (RAG) is a powerful technique that enhances downstream task execution by retrieving additional information.<n>Graph, by its intrinsic "nodes connected by edges" nature, encodes massive heterogeneous and relational information.<n>Unlike conventional RAG, the uniqueness of graph-structured data, such as diverse-formatted and domain-specific relational knowledge, poses unique and significant challenges when designing GraphRAG for different domains.
arXiv Detail & Related papers (2024-12-31T06:59:35Z) - Multi-Source Knowledge Pruning for Retrieval-Augmented Generation: A Benchmark and Empirical Study [46.55831783809377]
Retrieval-augmented generation (RAG) is increasingly recognized as an effective approach to mitigating the hallucination of large language models (LLMs)<n>We develop PruningRAG, a plug-and-play RAG framework that uses multi-granularity pruning strategies to more effectively incorporate relevant context and mitigate the negative impact of misleading information.
arXiv Detail & Related papers (2024-09-03T03:31:37Z) - Learning To Rank Resources with GNN [7.337247167823921]
We propose a graph neural network (GNN) based approach to learning-to-rank that is capable of modeling resource-query and resource-resource relationships.
Our method outperforms the state-of-the-art by 6.4% to 42% on various performance metrics.
arXiv Detail & Related papers (2023-04-17T02:01:45Z) - A Survey of Knowledge Graph Reasoning on Graph Types: Static, Dynamic,
and Multimodal [57.8455911689554]
Knowledge graph reasoning (KGR) aims to deduce new facts from existing facts based on mined logic rules underlying knowledge graphs (KGs)
It has been proven to significantly benefit the usage of KGs in many AI applications, such as question answering, recommendation systems, and etc.
arXiv Detail & Related papers (2022-12-12T08:40:04Z) - KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT)
All tasks in KILT are grounded in the same snapshot of Wikipedia.
We find that a shared dense vector index coupled with a seq2seq model is a strong baseline.
arXiv Detail & Related papers (2020-09-04T15:32:19Z) - Deep Learning on Knowledge Graph for Recommender System: A Survey [36.41255991011155]
A knowledge graph is capable of encoding high-order relations that connect two objects with one or multiple related attributes.
With the help of the emerging Graph Neural Networks (GNN), it is possible to extract both object characteristics and relations from KG.
arXiv Detail & Related papers (2020-03-25T22:53:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.