A Bayesian Learning, Greedy agglomerative clustering approach and
evaluation techniques for Author Name Disambiguation Problem
- URL: http://arxiv.org/abs/2211.01303v1
- Date: Tue, 1 Nov 2022 08:22:53 GMT
- Title: A Bayesian Learning, Greedy agglomerative clustering approach and
evaluation techniques for Author Name Disambiguation Problem
- Authors: Shashwat Sourav
- Abstract summary: Author names often suffer from ambiguity owing to the same author appearing under different names and multiple authors possessing similar names.
I try to focus on the research efforts targeted to disambiguate author names.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Author names often suffer from ambiguity owing to the same author appearing
under different names and multiple authors possessing similar names. This ambiguity makes it
difficult to associate a scholarly work with the person who wrote it,
thereby introducing inaccuracy in credit attribution, bibliometric analysis,
search-by-author in a digital library, and expert discovery. A plethora of
techniques for disambiguation of author names has been proposed in the
literature. This review focuses on research efforts to disambiguate
author names. I first go through the conventional methods, then discuss
evaluation techniques and the clustering model, which finally leads to the
Bayesian learning and greedy agglomerative approach. I believe this
concentrated review will be useful for the research community because it
discusses techniques applied to a very large real database that is actively
used worldwide. The Bayesian learning and greedy agglomerative approaches will
help to tackle author name disambiguation (AND) problems more effectively. Finally, I try to outline a few
directions for future work.
Related papers
- A Bayesian Approach to Harnessing the Power of LLMs in Authorship Attribution [57.309390098903]
Authorship attribution aims to identify the origin or author of a document.
Large Language Models (LLMs) with their deep reasoning capabilities and ability to maintain long-range textual associations offer a promising alternative.
Our results on the IMDb and blog datasets show an impressive 85% accuracy in one-shot authorship classification across ten authors.
arXiv Detail & Related papers (2024-10-29T04:14:23Z)
- A Gold Standard Dataset for the Reviewer Assignment Problem [117.59690218507565]
"Similarity score" is a numerical estimate of the expertise of a reviewer in reviewing a paper.
Our dataset consists of 477 self-reported expertise scores provided by 58 researchers.
For the task of ordering two papers in terms of their relevance for a reviewer, the error rates range from 12%-30% in easy cases to 36%-43% in hard cases.
arXiv Detail & Related papers (2023-03-23T16:15:03Z)
- Deep Author Name Disambiguation using DBLP Data [7.081604594416337]
Author Name Ambiguity (ANA) is considered a critical open problem in digital libraries.
This paper proposes an Author Name Disambiguation (AND) approach that links author names to their real-world entities.
arXiv Detail & Related papers (2023-03-17T15:50:00Z)
- Author Name Disambiguation via Heterogeneous Network Embedding from Structural and Semantic Perspectives [13.266320447769564]
Name ambiguity is common in academic digital libraries, such as multiple authors having the same name.
The proposed method is mainly based on representation learning for heterogeneous networks and clustering.
The semantic representation is generated using NLP tools.
arXiv Detail & Related papers (2022-12-24T11:22:34Z)
- Cracking Double-Blind Review: Authorship Attribution with Deep Learning [43.483063713471935]
We propose a transformer-based, neural-network architecture to attribute an anonymous manuscript to an author.
We leverage all research papers publicly available on arXiv amounting to over 2 million manuscripts.
Our method achieves an unprecedented authorship attribution accuracy, where up to 73% of papers are attributed correctly.
arXiv Detail & Related papers (2022-11-14T15:50:24Z)
- Tag-Aware Document Representation for Research Paper Recommendation [68.8204255655161]
We propose a hybrid approach that leverages deep semantic representation of research papers based on social tags assigned by users.
The proposed model is effective in recommending research papers even when the rating data is very sparse.
arXiv Detail & Related papers (2022-09-08T09:13:07Z)
- Whois? Deep Author Name Disambiguation using Bibliographic Data [7.081604594416337]
Author Name Ambiguity (ANA) is considered a critical open problem in digital libraries.
This paper proposes an Author Name Disambiguation (AND) approach that links author names to their real-world entities.
arXiv Detail & Related papers (2022-07-11T11:03:39Z)
- Bib2Auth: Deep Learning Approach for Author Disambiguation using Bibliographic Data [4.817368273632451]
We propose a novel approach to link author names to their real-world entities by relying on their co-authorship pattern and area of research.
Our supervised deep learning model identifies an author by capturing his/her relationship with his/her co-authors and area of research.
Bib2Auth has shown good performance on a relatively large dataset.
arXiv Detail & Related papers (2021-07-09T12:25:11Z)
- DeepStyle: User Style Embedding for Authorship Attribution of Short Texts [57.503904346336384]
Authorship attribution (AA) is an important and widely studied research topic with many applications.
Recent works have shown that deep learning methods could achieve significant accuracy improvement for the AA task.
We propose DeepStyle, a novel embedding-based framework that learns the representations of users' salient writing styles.
arXiv Detail & Related papers (2021-03-14T15:56:37Z)
- Pairwise Learning for Name Disambiguation in Large-Scale Heterogeneous Academic Networks [81.00481125272098]
We introduce Multi-view Attention-based Pairwise Recurrent Neural Network (MA-PairRNN) to solve the name disambiguation problem.
MA-PairRNN combines heterogeneous graph embedding learning and pairwise similarity learning into a framework.
Results on two real-world datasets demonstrate that our framework has a significant and consistent improvement of performance on the name disambiguation task.
arXiv Detail & Related papers (2020-08-30T06:08:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.