A Decade of In-text Citation Analysis based on Natural Language
Processing and Machine Learning Techniques: An overview of empirical studies
- URL: http://arxiv.org/abs/2008.13020v1
- Date: Sat, 29 Aug 2020 17:27:08 GMT
- Authors: Sehrish Iqbal, Saeed-Ul Hassan, Naif Radi Aljohani, Salem Alelyani,
Raheel Nawaz and Lutz Bornmann
- Abstract summary: Information scientists have gone far beyond traditional bibliometrics by tapping into advancements in full-text data processing techniques.
This article aims to narratively review the studies on these developments.
Its primary focus is on publications that have used natural language processing and machine learning techniques to analyse citations.
- Score: 3.474275085556876
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Citation analysis is one of the most frequently used methods in research
evaluation. We are seeing significant growth in citation analysis through
bibliometric metadata, primarily due to the availability of citation databases
such as the Web of Science, Scopus, Google Scholar, Microsoft Academic, and
Dimensions. Due to better access to full-text publication corpora in recent
years, information scientists have gone far beyond traditional bibliometrics by
tapping into advancements in full-text data processing techniques to measure
the impact of scientific publications in contextual terms. This has led to
technical developments in citation context and content analysis, citation
classifications, citation sentiment analysis, citation summarisation, and
citation-based recommendation. This article aims to narratively review the
studies on these developments. Its primary focus is on publications that have
used natural language processing and machine learning techniques to analyse
citations.
Related papers
- Sentiment Analysis of Citations in Scientific Articles Using ChatGPT: Identifying Potential Biases and Conflicts of Interest [4.13365552362244]
This article introduces the innovative use of large language models, particularly ChatGPT, for comprehensive sentiment analysis of citations within scientific articles.
ChatGPT can discern the nuanced positivity or negativity of citations, offering insights into the reception and impact of cited works.
(2024-04-02T09:59:49Z)
- How do software citation formats evolve over time? A longitudinal analysis of R programming language packages [12.082972614614413]
This study compares and analyzes a longitudinal dataset of citation formats of all R packages collected in 2021 and 2022.
We investigate the different document types underlying the citations and what metadata elements in the citation formats changed over time.
(2023-07-17T09:18:57Z)
- The Semantic Scholar Open Data Platform [79.4493235243312]
Semantic Scholar (S2) is an open data platform and website aimed at accelerating science by helping scholars discover and understand scientific literature.
We combine public and proprietary data sources using state-of-the-art techniques for scholarly PDF content extraction and automatic knowledge graph construction.
The graph includes advanced semantic features such as structurally parsed text, natural language summaries, and vector embeddings.
(2023-01-24T17:13:08Z)
- CiteBench: A benchmark for Scientific Citation Text Generation [69.37571393032026]
CiteBench is a benchmark for citation text generation.
We make the code for CiteBench publicly available at https://github.com/UKPLab/citebench.
(2022-12-19T16:10:56Z)
- Citation Trajectory Prediction via Publication Influence Representation Using Temporal Knowledge Graph [52.07771598974385]
Existing approaches mainly rely on mining temporal and graph data from academic articles.
Our framework is composed of three modules: difference-preserved graph embedding, fine-grained influence representation, and learning-based trajectory calculation.
Experiments are conducted on both the APS academic dataset and our contributed AIPatent dataset.
(2022-10-02T07:43:26Z)
- The Impact of Social Media in Learning and Teaching: A Bibliometric-based Citation Analysis [0.4297070083645049]
The study explored the overall theoretical foundation of research on social media in learning and studying.
International Journal of Management Education is the leading journal for research on social media in learning and teaching.
(2022-09-22T19:32:31Z)
- Cross-Lingual Citations in English Papers: A Large-Scale Analysis of Prevalence, Usage, and Impact [0.0]
We present an analysis of cross-lingual citations based on over one million English papers.
Among our findings is an increasing rate of citations to publications written in Chinese.
To facilitate further research, we make our collected data and source code publicly available.
(2021-11-07T15:34:02Z)
- CitationIE: Leveraging the Citation Graph for Scientific Information Extraction [89.33938657493765]
We use the citation graph of referential links between citing and cited papers.
We observe a sizable improvement in end-to-end information extraction over the state-of-the-art.
(2021-06-03T03:00:12Z)
- Semantic Analysis for Automated Evaluation of the Potential Impact of Research Articles [62.997667081978825]
This paper presents a novel method for vector representation of text meaning based on information theory.
We show how this informational semantics is used for text classification on the basis of the Leicester Scientific Corpus.
We show that an informational approach to representing the meaning of a text has offered a way to effectively predict the scientific impact of research papers.
(2021-04-26T20:37:13Z)
- Enhancing Scientific Papers Summarization with Citation Graph [78.65955304229863]
We redefine the task of scientific papers summarization by utilizing their citation graph.
We construct a novel dataset for scientific paper summarization, Semantic Scholar Network (SSN), which contains 141K research papers in different domains.
Our model can achieve competitive performance when compared with the pretrained models.
(2021-04-07T11:13:35Z)
- Citation Recommendation: Approaches and Datasets [20.47628019708079]
Citation recommendation describes the task of recommending citations for a given text.
In recent years, several approaches and evaluation data sets have been presented.
No literature survey has been conducted explicitly on citation recommendation.
(2020-02-17T13:59:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.