Learning To Rank Resources with GNN
- URL: http://arxiv.org/abs/2304.07946v1
- Date: Mon, 17 Apr 2023 02:01:45 GMT
- Title: Learning To Rank Resources with GNN
- Authors: Ulugbek Ergashev, Eduard C. Dragut, Weiyi Meng
- Abstract summary: We propose a graph neural network (GNN) based approach to learning-to-rank that is capable of modeling resource-query and resource-resource relationships.
Our method outperforms the state-of-the-art by 6.4% to 42% on various performance metrics.
- Score: 7.337247167823921
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: As the content on the Internet continues to grow, many new dynamically
changing and heterogeneous sources of data constantly emerge. A conventional
search engine cannot crawl and index at the same pace as the expansion of the
Internet. Moreover, a large portion of the data on the Internet is not
accessible to traditional search engines. Distributed Information Retrieval (DIR) is a viable solution to this problem, as it integrates multiple shards (resources) and provides unified access to them. Resource selection is a key component of
DIR systems. There is a rich body of literature on resource selection
approaches for DIR. A key limitation of the existing approaches is that they
primarily use term-based statistical features and do not generally model
resource-query and resource-resource relationships. In this paper, we propose a
graph neural network (GNN) based approach to learning-to-rank that is capable
of modeling resource-query and resource-resource relationships. Specifically,
we utilize a pre-trained language model (PTLM) to obtain semantic information
from queries and resources. Then, we explicitly build a heterogeneous graph to preserve the structure of query-resource relationships and employ a GNN to extract this structural information. In addition, the heterogeneous graph is enriched with resource-resource edges to further enhance ranking accuracy. Extensive experiments on benchmark datasets show that our proposed
approach is highly effective in resource selection. Our method outperforms the
state-of-the-art by 6.4% to 42% on various performance metrics.
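To make the abstract's pipeline concrete, here is a minimal PyTorch sketch of the three steps it describes: PTLM embeddings for queries and resources, a heterogeneous graph with query-resource and resource-resource edges, and GNN message passing followed by a ranking head. This is not the authors' implementation; the class name, layer sizes, adjacency construction, and bilinear scoring head are all illustrative assumptions, and random vectors stand in for the PTLM embeddings.

```python
# Minimal sketch (assumed, not the paper's code) of the described pipeline:
# PTLM embeddings feed a heterogeneous graph with query-resource (qr) and
# resource-resource (rr) edges; one round of message passing updates the
# resource nodes, and a bilinear head scores each (query, resource) pair.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HeteroGNNRanker(nn.Module):  # hypothetical name
    def __init__(self, dim=768, hidden=256):
        super().__init__()
        self.q_proj = nn.Linear(dim, hidden)     # project PTLM query embeddings
        self.r_proj = nn.Linear(dim, hidden)     # project PTLM resource embeddings
        self.qr_msg = nn.Linear(hidden, hidden)  # query -> resource messages
        self.rr_msg = nn.Linear(hidden, hidden)  # resource -> resource messages
        self.score = nn.Bilinear(hidden, hidden, 1)  # (query, resource) relevance

    def forward(self, q_emb, r_emb, qr_adj, rr_adj):
        # q_emb: [Q, dim], r_emb: [R, dim] embeddings from a PTLM.
        # qr_adj: [R, Q], rr_adj: [R, R] row-normalized adjacency matrices.
        q = F.relu(self.q_proj(q_emb))
        r = F.relu(self.r_proj(r_emb))
        # Relation-specific aggregation: each resource pools messages from its
        # connected queries and from related resources, with a residual term.
        r = r + qr_adj @ self.qr_msg(q) + rr_adj @ self.rr_msg(r)
        # Score all query-resource pairs for learning-to-rank.
        Q, R = q.size(0), r.size(0)
        qi = q.unsqueeze(1).expand(Q, R, -1).reshape(Q * R, -1)
        ri = r.unsqueeze(0).expand(Q, R, -1).reshape(Q * R, -1)
        return self.score(qi, ri).view(Q, R)  # [Q, R] relevance scores

# Toy usage: random tensors stand in for PTLM embeddings and graph edges.
torch.manual_seed(0)
Q, R, dim = 4, 10, 768
qr_adj = torch.rand(R, Q); qr_adj /= qr_adj.sum(1, keepdim=True)
rr_adj = torch.rand(R, R); rr_adj /= rr_adj.sum(1, keepdim=True)
scores = HeteroGNNRanker(dim)(torch.randn(Q, dim), torch.randn(R, dim), qr_adj, rr_adj)
print(scores.argsort(dim=1, descending=True)[0])  # ranked resources for query 0
```

A pairwise or listwise ranking loss over these scores, trained against relevance labels, would complete the learning-to-rank setup; the paper's exact loss and graph construction are not reproduced here.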
Related papers
- ReSLLM: Large Language Models are Strong Resource Selectors for Federated Search [35.44746116088232]
Federated search will become increasingly pivotal in the context of Retrieval-Augmented Generation pipelines.
Current SOTA resource selection methodologies rely on feature-based learning approaches.
We propose ReSLLM to drive the selection of resources in federated search in a zero-shot setting.
arXiv Detail & Related papers (2024-01-31T07:58:54Z)
- Query of CC: Unearthing Large Scale Domain-Specific Knowledge from Public Corpora [104.16648246740543]
We propose an efficient data collection method based on large language models.
The method bootstraps seed information through a large language model and retrieves related data from public corpora.
It not only collects knowledge-related data for specific domains but also unearths data with potential reasoning procedures.
arXiv Detail & Related papers (2024-01-26T03:38:23Z)
- Towards a Gateway for Knowledge Graph Schemas Collection, Analysis, and Embedding [10.19939896927137]
This paper describes the Live Semantic Web initiative, a first version of a gateway whose main goal is to leverage the gold mine of relational data collected by many existing knowledge graphs.
arXiv Detail & Related papers (2023-11-21T09:22:02Z)
- Low Resource Summarization using Pre-trained Language Models [1.26404863283601]
We propose a methodology for adapting self-attentive transformer-based architecture models (mBERT, mT5) for low-resource summarization.
Our adapted summarization model urT5 effectively captures the contextual information of a low-resource language, with evaluation scores (up to 46.35 ROUGE-1, 77 BERTScore) on par with state-of-the-art models in the high-resource language English.
arXiv Detail & Related papers (2023-10-04T13:09:39Z)
- Synergistic Interplay between Search and Large Language Models for Information Retrieval [141.18083677333848]
InteR allows retrieval models (RMs) to expand the knowledge in queries using LLM-generated knowledge collections.
InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T11:58:15Z)
- Towards Realistic Low-resource Relation Extraction: A Benchmark with Empirical Baseline Study [51.33182775762785]
This paper presents an empirical study to build relation extraction systems in low-resource settings.
We investigate three schemes to evaluate the performance in low-resource settings: (i) different types of prompt-based methods with few-shot labeled data; (ii) diverse balancing methods to address the long-tailed distribution issue; and (iii) data augmentation technologies and self-training to generate more labeled in-domain data.
arXiv Detail & Related papers (2022-10-19T15:46:37Z)
- A Transfer Learning Pipeline for Educational Resource Discovery with Application in Leading Paragraph Generation [71.92338855383238]
We propose a pipeline that automates web resource discovery for novel domains.
The pipeline achieves F1 scores of 0.94 and 0.82 when evaluated on two similar but novel target domains.
This is the first study that considers various web resources for survey generation.
arXiv Detail & Related papers (2022-01-07T03:35:40Z)
- Deep Transfer Learning for Multi-source Entity Linkage via Domain Adaptation [63.24594955429465]
Multi-source entity linkage is critical in high-impact applications such as data cleaning and user stitching.
AdaMEL is a deep transfer learning framework that learns generic high-level knowledge to perform multi-source entity linkage.
Our framework achieves state-of-the-art results, with an 8.21% improvement on average over methods based on supervised learning.
arXiv Detail & Related papers (2021-10-27T15:20:41Z)
- HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression [53.90578309960526]
Large pre-trained language models (PLMs) have shown overwhelming performance compared with traditional neural network methods.
We propose a hierarchical relational knowledge distillation (HRKD) method to capture both hierarchical and domain relational information.
arXiv Detail & Related papers (2021-10-16T11:23:02Z)