Related papers: Predicting the clinical citation count of biomedical papers using multilayer perceptron neural network

URL: http://arxiv.org/abs/2210.06346v3
Date: Fri, 21 Oct 2022 07:15:53 GMT
Title: Predicting the clinical citation count of biomedical papers using multilayer perceptron neural network
Authors: Xin Li, Xuli Tang, Qikai Cheng
Abstract summary: The early prediction of the clinical citation count of biomedical papers is critical to scientific activities in biomedicine. We designed a four-layer multilayer perceptron neural network (MPNN) model to predict the clinical citation count of biomedical papers in the future.
Score: 4.64065792373245
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The number of clinical citations received from clinical guidelines or clinical trials has been considered as one of the most appropriate indicators for quantifying the clinical impact of biomedical papers. Therefore, the early prediction of the clinical citation count of biomedical papers is critical to scientific activities in biomedicine, such as research evaluation, resource allocation, and clinical translation. In this study, we designed a four-layer multilayer perceptron neural network (MPNN) model to predict the clinical citation count of biomedical papers in the future by using 9,822,620 biomedical papers published from 1985 to 2005. We extracted ninety-one paper features from three dimensions as the input of the model, including twenty-one features in the paper dimension, thirty-five in the reference dimension, and thirty-five in the citing paper dimension. In each dimension, the features can be classified into three categories, i.e., the citation-related features, the clinical translation-related features, and the topic-related features. Besides, in the paper dimension, we also considered the features that have previously been demonstrated to be related to the citation counts of research papers. The results showed that the proposed MPNN model outperformed the other five baseline models, and the features in the reference dimension were the most important.
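The abstract describes the model only at a high level, so the following is a minimal sketch of what a four-layer perceptron over the ninety-one features might look like, not the authors' implementation. The feature counts (21 + 35 + 35) come from the abstract. Everything else is an assumption made for illustration: reading "four-layer" as an input layer, two hidden layers, and an output layer; the hidden widths 64 and 32; the ReLU activations; the random untrained weights; and the placeholder input vector.

```python
import random

# Feature counts per dimension, as stated in the abstract.
N_PAPER, N_REFERENCE, N_CITING = 21, 35, 35
N_FEATURES = N_PAPER + N_REFERENCE + N_CITING  # 91

# Hypothetical layer widths (assumption): input, two hidden layers, one output.
LAYER_SIZES = [N_FEATURES, 64, 32, 1]


def init_layers(sizes, seed=0):
    """Create one (weights, biases) pair per connection between layers."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, features):
    """Forward pass: ReLU on hidden layers, linear output (assumed design)."""
    activations = list(features)
    for index, (weights, biases) in enumerate(layers):
        outputs = [sum(w * a for w, a in zip(row, activations)) + b
                   for row, b in zip(weights, biases)]
        if index < len(layers) - 1:
            outputs = [max(0.0, value) for value in outputs]
        activations = outputs
    return activations[0]


layers = init_layers(LAYER_SIZES)
example = [0.5] * N_FEATURES  # placeholder feature vector, not real data
prediction = forward(layers, example)  # untrained, so the value is meaningless
```

In practice the weights would be fitted to observed clinical citation counts; the sketch only shows how the three feature dimensions combine into a single 91-element input and flow through the layers.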

Related papers

Biomed-Enriched: A Biomedical Dataset Enriched with LLMs for Pretraining and Extracting Rare and Hidden Content [0.10241134756773229]
We introduce Biomed-Enriched, a biomedical text dataset constructed from PubMed via a two-stage annotation process. In the first stage, a large language model annotates 400K paragraphs from PubMed scientific articles, assigning scores for their type (review, study, clinical case, other), domain (clinical, biomedical, other), and educational quality. The resulting metadata allows us to extract refined subsets, including 2M clinical case paragraphs with over 450K high-quality ones from articles with commercial-use licenses.
arXiv Detail & Related papers (2025-06-25T11:30:25Z)
A Review on Generative AI Models for Synthetic Medical Text, Time Series, and Longitudinal Data [0.3374875022248865]
This paper presents the results of a novel scoping review on the practical models for generating three different types of synthetic health records (SHRs). In total, 52 publications met the eligibility criteria for generating medical time series (22), longitudinal data (17), and medical text (13). Privacy preservation was found to be the main research objective of the studied papers, along with class imbalance, data scarcity, and data imputation as the other objectives.
arXiv Detail & Related papers (2024-11-19T06:53:54Z)
AI Insights: A Case Study on Utilizing ChatGPT Intelligence for Research Paper Analysis [0.0]
The study selected the Application of Artificial Intelligence in Breast Cancer Treatment as the research topic. Research papers related to this topic were collected from three major publication databases: Google Scholar, PubMed, and Scopus. ChatGPT models were used to identify the category, scope, and relevant information from the research papers.
arXiv Detail & Related papers (2024-03-05T19:47:57Z)
Hierarchical Pretraining for Biomedical Term Embeddings [4.69793648771741]
We propose HiPrBERT, a novel biomedical term representation model trained on hierarchical data. We show that HiPrBERT effectively learns the pair-wise distance from hierarchical information, resulting in substantially more informative embeddings for further biomedical applications.
arXiv Detail & Related papers (2023-07-01T08:16:00Z)
Development and validation of a natural language processing algorithm to pseudonymize documents in the context of a clinical data warehouse [53.797797404164946]
The study highlights the difficulties faced in sharing tools and resources in this domain. We annotated a corpus of clinical documents according to 12 types of identifying entities. We build a hybrid system, merging the results of a deep learning model as well as manual rules.
arXiv Detail & Related papers (2023-03-23T17:17:46Z)
This Patient Looks Like That Patient: Prototypical Networks for Interpretable Diagnosis Prediction from Clinical Text [56.32427751440426]
In clinical practice such models must not only be accurate, but provide doctors with interpretable and helpful results. We introduce ProtoPatient, a novel method based on prototypical networks and label-wise attention. We evaluate the model on two publicly available clinical datasets and show that it outperforms existing baselines.
arXiv Detail & Related papers (2022-10-16T10:12:07Z)
Cross-Lingual Knowledge Transfer for Clinical Phenotyping [55.92262310716537]
We investigate cross-lingual knowledge transfer strategies to execute this task for clinics that do not use the English language. We evaluate these strategies for a Greek and a Spanish clinic leveraging clinical notes from different clinical domains. Our results show that using multilingual data overall improves clinical phenotyping models and can compensate for data sparseness.
arXiv Detail & Related papers (2022-08-03T08:33:21Z)
Deep forecasting of translational impact in medical research [1.8130872753848115]
We develop a suite of representational and discriminative mathematical models of multi-scale publication data. We show that citations are only moderately predictive of translational impact as judged by inclusion in patents, guidelines, or policy documents. We argue that content-based models of impact are superior in performance to conventional, citation-based measures.
arXiv Detail & Related papers (2021-10-17T19:29:41Z)
Self-supervised Answer Retrieval on Clinical Notes [68.87777592015402]
We introduce CAPR, a rule-based self-supervision objective for training Transformer language models for domain-specific passage matching. We apply our objective in four Transformer-based architectures: Contextual Document Vectors, Bi-, Poly- and Cross-encoders. We report that CAPR outperforms strong baselines in the retrieval of domain-specific passages and effectively generalizes across rule-based and human-labeled passages.
arXiv Detail & Related papers (2021-08-02T10:42:52Z)
What is the State of the Art of Computer Vision-Assisted Cytology? A Systematic Literature Review [47.42354724922676]
We conducted a Systematic Literature Review to identify the state of the art of computer vision techniques currently applied to cytology. The most used methods in the analyzed works are deep learning-based (70 papers), while fewer works employ classic computer vision only (101 papers). We conclude that there still is a lack of high-quality datasets for many types of stains and most of the works are not mature enough to be applied in a daily clinical diagnostic routine.
arXiv Detail & Related papers (2021-05-24T13:50:45Z)
CREATe: Clinical Report Extraction and Annotation Technology [53.731999072534876]
Clinical case reports are written descriptions of the unique aspects of a particular clinical case. There has been no attempt to develop an end-to-end system to annotate, index, or otherwise curate these reports. We propose a novel computational resource platform, CREATe, for extracting, indexing, and querying the contents of clinical case reports.
arXiv Detail & Related papers (2021-02-28T16:50:14Z)
Multi-Ontology Refined Embeddings (MORE): A Hybrid Multi-Ontology and Corpus-based Semantic Representation for Biomedical Concepts [0.5812284760539712]
This paper introduces Multi-Ontology Embeddings (MORE), a framework for incorporating domain knowledge from multiple ontologies into a distributional semantic model. We use the RadCore and MIMIC-III free-text datasets for the corpus-based component of MORE. For the corpus-based part, we use the Medical Subject Headings (MeSH) and three state-of-the-art-based similarity measures.
arXiv Detail & Related papers (2020-04-14T14:38:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.