Effective Distributed Representations for Academic Expert Search
- URL: http://arxiv.org/abs/2010.08269v1
- Date: Fri, 16 Oct 2020 09:43:18 GMT
- Title: Effective Distributed Representations for Academic Expert Search
- Authors: Mark Berger, Jakub Zavrel, Paul Groth
- Abstract summary: We study how different distributed representations of academic papers (i.e. embeddings) impact academic expert retrieval.
In particular, we explore the impact of the use of contextualized embeddings on search performance.
We observe that using contextual embeddings produced by a transformer model trained for sentence similarity tasks produces the most effective paper representations.
- Score: 1.9815631757151737
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Expert search aims to find and rank experts based on a user's query. In
academia, retrieving experts is an efficient way to navigate through a large
amount of academic knowledge. Here, we study how different distributed
representations of academic papers (i.e. embeddings) impact academic expert
retrieval. We use the Microsoft Academic Graph dataset and experiment with
different configurations of a document-centric voting model for retrieval. In
particular, we explore the impact of the use of contextualized embeddings on
search performance. We also present results for paper embeddings that
incorporate citation information through retrofitting. Additionally,
experiments are conducted using different techniques for assigning author
weights based on author order. We observe that using contextual embeddings
produced by a transformer model trained for sentence similarity tasks produces
the most effective paper representations for document-centric expert retrieval.
However, retrofitting the paper embeddings and using elaborate author
contribution weighting strategies did not improve retrieval performance.
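The document-centric voting idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The toy 2-dimensional vectors stand in for real paper and query embeddings. The function names, the `top_k` cutoff, the sum-of-similarities aggregation, and the "uniform" and "harmonic" author-order weighting schemes are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def author_weights(n_authors, scheme="uniform"):
    """Hypothetical weighting schemes based on byline position."""
    if scheme == "uniform":
        # Every author receives the full vote of the paper.
        return [1.0] * n_authors
    if scheme == "harmonic":
        # Earlier authors receive more credit: 1, 1/2, 1/3, ...
        return [1.0 / (i + 1) for i in range(n_authors)]
    raise ValueError(f"unknown scheme: {scheme}")


def rank_experts(query_vec, papers, top_k=2, scheme="uniform"):
    """Rank authors by the summed, weighted similarity of their top-k papers.

    `papers` is a list of (embedding, [authors in byline order]) pairs.
    """
    # Step 1: retrieve the top-k papers most similar to the query.
    scored = [(cosine(query_vec, emb), authors) for emb, authors in papers]
    scored.sort(key=lambda pair: pair[0], reverse=True)

    # Step 2: each retrieved paper votes for its authors.
    votes = defaultdict(float)
    for sim, authors in scored[:top_k]:
        weights = author_weights(len(authors), scheme)
        for weight, author in zip(weights, authors):
            votes[author] += weight * sim

    # Step 3: sort authors by accumulated votes, highest first.
    return sorted(votes.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: three papers with made-up embeddings and author lists.
papers = [
    ([1.0, 0.0], ["alice", "bob"]),
    ([0.8, 0.6], ["bob"]),
    ([0.0, 1.0], ["carol"]),
]
query = [1.0, 0.0]

ranking_uniform = rank_experts(query, papers, top_k=2, scheme="uniform")
ranking_harmonic = rank_experts(query, papers, top_k=2, scheme="harmonic")
```

In this toy setup, "bob" ranks first under both schemes because two of the retrieved papers vote for him, and "carol" receives no votes because her paper falls outside the top-k cutoff. In practice the embeddings would come from a trained model rather than hand-written vectors.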
Related papers
- Taxonomy-guided Semantic Indexing for Academic Paper Search [51.07749719327668]
TaxoIndex is a semantic index framework for academic paper search.
It organizes key concepts from papers as a semantic index guided by an academic taxonomy.
It can be flexibly employed to enhance existing dense retrievers.
arXiv Detail & Related papers (2024-10-25T00:00:17Z)
- Conversational Exploratory Search of Scholarly Publications Using Knowledge Graphs [3.3916160303055567]
We develop a conversational search system for exploring scholarly publications using a knowledge graph.
To assess the system's effectiveness, we employed various performance metrics and conducted a human evaluation with 40 participants.
arXiv Detail & Related papers (2024-10-01T06:16:07Z)
- Improving Retrieval in Theme-specific Applications using a Corpus Topical Taxonomy [52.426623750562335]
We introduce the ToTER (Topical taxonomy Enhanced Retrieval) framework.
ToTER identifies the central topics of queries and documents with the guidance of the taxonomy, and exploits their topical relatedness to supplement missing contexts.
As a plug-and-play framework, ToTER can be flexibly employed to enhance various PLM-based retrievers.
arXiv Detail & Related papers (2024-03-07T02:34:54Z)
- Chain-of-Factors Paper-Reviewer Matching [32.86512592730291]
We propose a unified model for paper-reviewer matching that jointly considers semantic, topic, and citation factors.
We demonstrate the effectiveness of our proposed Chain-of-Factors model in comparison with state-of-the-art paper-reviewer matching methods and scientific pre-trained language models.
arXiv Detail & Related papers (2023-10-23T01:29:18Z)
- DiscoverPath: A Knowledge Refinement and Retrieval System for Interdisciplinarity on Biomedical Research [96.10765714077208]
Traditional keyword-based search engines fall short in assisting users who may not be familiar with specific terminologies.
We present a knowledge graph-based paper search engine for biomedical research to enhance the user experience.
The system, dubbed DiscoverPath, employs Named Entity Recognition (NER) and part-of-speech (POS) tagging to extract terminologies and relationships from article abstracts to create a KG.
arXiv Detail & Related papers (2023-09-04T20:52:33Z)
- Retrieval Augmentation for Commonsense Reasoning: A Unified Approach [64.63071051375289]
We propose a unified framework of retrieval-augmented commonsense reasoning (called RACo).
Our proposed RACo can significantly outperform other knowledge-enhanced methods.
arXiv Detail & Related papers (2022-10-23T23:49:08Z)
- CitationIE: Leveraging the Citation Graph for Scientific Information Extraction [89.33938657493765]
We use the citation graph of referential links between citing and cited papers.
We observe a sizable improvement in end-to-end information extraction over the state-of-the-art.
arXiv Detail & Related papers (2021-06-03T03:00:12Z)
- An Explanatory Query-Based Framework for Exploring Academic Expertise [10.887008988767061]
Finding potential collaborators in institutions is a time-consuming manual search task prone to bias.
We propose a novel query-based framework for searching, scoring, and exploring research expertise automatically.
We show that our simple method is effective in identifying matches, while satisfying desirable properties and being efficient.
arXiv Detail & Related papers (2021-05-28T10:48:08Z)
- Enhancing Reading Strategies by Exploring A Theme-based Approach to Literature Surveys [5.004814662623872]
We have designed a methodology that allows users to visually and thematically explore corpora, while developing personalised holistic reading strategies.
Using in-depth semi-structured interviews and stimulated recall, we found that users: (i) selected papers that they otherwise would not have read, (ii) developed a more coherent reading strategy, and (iii) understood the thematic structure and relationships between papers more effectively.
arXiv Detail & Related papers (2021-02-10T10:36:45Z)
- Machine Identification of High Impact Research through Text and Image Analysis [0.4737991126491218]
We present a system to automatically separate papers with a high from those with a low likelihood of gaining citations.
Our system uses both a visual classifier, useful for surmising a document's overall appearance, and a text classifier, for making content-informed decisions.
arXiv Detail & Related papers (2020-05-20T19:12:24Z)
- Explaining Relationships Between Scientific Documents [55.23390424044378]
We address the task of explaining relationships between two scientific documents using natural language text.
In this paper we establish a dataset of 622K examples from 154K documents.
arXiv Detail & Related papers (2020-02-02T03:54:47Z)