Journal Impact Factor and Peer Review Thoroughness and Helpfulness: A
Supervised Machine Learning Study
- URL: http://arxiv.org/abs/2207.09821v1
- Date: Wed, 20 Jul 2022 11:14:15 GMT
- Title: Journal Impact Factor and Peer Review Thoroughness and Helpfulness: A
Supervised Machine Learning Study
- Authors: Anna Severin, Michaela Strinzel, Matthias Egger, Tiago Barros,
Alexander Sokolov, Julia Vilstrup Mouatt, Stefan Müller
- Abstract summary: The journal impact factor (JIF) is often equated with journal quality and the quality of the peer review of the papers submitted to the journal.
We examined the association between the content of peer review and JIF by analysing 10,000 peer review reports submitted to 1,644 medical and life sciences journals.
- Score: 52.77024349608834
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The journal impact factor (JIF) is often equated with journal quality and the
quality of the peer review of the papers submitted to the journal. We examined
the association between the content of peer review and JIF by analysing 10,000
peer review reports submitted to 1,644 medical and life sciences journals. Two
researchers hand-coded a random sample of 2,000 sentences. We then trained
machine learning models to classify all 187,240 sentences as contributing or
not contributing to content categories. We examined the association between ten
groups of journals defined by JIF deciles and the content of peer reviews using
linear mixed-effects models, adjusting for the length of the review. The JIF
ranged from 0.21 to 74.70. The length of peer reviews increased from the lowest
(median number of words 185) to the highest JIF group (387 words). The proportion of
sentences allocated to different content categories varied widely, even within
JIF groups. For thoroughness, sentences on 'Materials and Methods' were more
common in the highest JIF journals than in the lowest JIF group (difference of
7.8 percentage points; 95% CI 4.9 to 10.7%). The trend for 'Presentation and
Reporting' went in the opposite direction, with the highest JIF journals giving
less emphasis to such content (difference of -8.9 percentage points; 95% CI -11.3 to -6.5%). For
helpfulness, reviews for higher JIF journals devoted less attention to
'Suggestion and Solution' and provided fewer 'Examples' than lower impact factor
journals. No or only small differences were evident for the other content
categories. In conclusion, peer review in journals with higher JIF tends to be
more thorough in discussing the methods used but less helpful in terms of
suggesting solutions and providing examples. Differences were modest and
variability high, indicating that the JIF is a poor predictor of the quality of
peer review of an individual manuscript.
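
A minimal sketch of the supervised sentence-classification step described in the abstract: a hand-coded sample of sentences trains a model that then labels every remaining sentence as contributing or not to a content category. The TF-IDF plus logistic-regression pipeline, the example sentences, and the labels below are illustrative assumptions, not the authors' actual setup.

```python
# Sketch only: classify review sentences as contributing (1) or not (0) to a
# content category such as 'Materials and Methods'. The study hand-coded
# 2,000 sentences and then classified all 187,240 sentences; the model and
# features here are assumptions for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical hand-coded training sentences.
sentences = [
    "The sample size calculation is not reported.",
    "Please correct the typo in Table 2.",
    "The statistical analysis should adjust for confounding.",
    "The introduction is clearly written.",
]
labels = [1, 0, 1, 0]  # 1 = contributes to 'Materials and Methods'

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),   # word and bigram features
    LogisticRegression(max_iter=1000),
)
clf.fit(sentences, labels)

# Apply the trained classifier to unlabelled sentences from the corpus.
unlabelled = ["Describe how participants were randomised."]
print(clf.predict(unlabelled))        # predicted label, 0 or 1
print(clf.predict_proba(unlabelled))  # class probabilities
```

One binary classifier per content category, applied sentence by sentence, matches the abstract's framing of each sentence as contributing or not contributing to that category.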
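The association step can be sketched the same way: journals are grouped into JIF deciles and the share of sentences per content category is modelled with a linear mixed-effects regression, adjusting for review length. The synthetic data, column names, and the random-intercept-per-journal structure are assumptions for illustration, not the study's actual model specification.

```python
# Sketch only: group journals into JIF deciles and fit a linear mixed-effects
# model for the share of 'Materials and Methods' sentences per review,
# adjusting for review length, with a random intercept per journal.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 400
journal = rng.integers(0, 80, size=n)                 # 80 hypothetical journals
jif = rng.uniform(0.21, 74.70, size=80)[journal]      # one JIF per journal
review_words = rng.integers(100, 600, size=n)         # review length in words
prop_methods = np.clip(                               # synthetic outcome
    0.10 + 0.001 * jif + 0.0002 * review_words + rng.normal(0, 0.05, size=n),
    0, 1,
)

reviews = pd.DataFrame({
    "prop_methods": prop_methods,
    "jif_decile": pd.qcut(jif, 10, labels=False) + 1,  # JIF deciles 1..10
    "review_words": review_words,
    "journal": journal,
})

model = smf.mixedlm(
    "prop_methods ~ C(jif_decile) + review_words",     # fixed effects
    data=reviews,
    groups=reviews["journal"],                         # random intercept per journal
)
print(model.fit().summary())
```

The coefficients on the decile dummies play the role of the between-group differences the abstract reports (there expressed in percentage points).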
Related papers
- Evaluating the Predictive Capacity of ChatGPT for Academic Peer Review Outcomes Across Multiple Platforms [3.3543455244780223]
This paper introduces two new contexts and employs a more robust method: averaging multiple ChatGPT scores.
Findings show that averaging 30 ChatGPT predictions, based on reviewer guidelines and using only submitted titles and abstracts, failed to predict peer review outcomes for F1000Research.
arXiv Detail & Related papers (2024-11-14T19:20:33Z)
- Ta'keed: The First Generative Fact-Checking System for Arabic Claims [0.0]
This paper introduces Ta'keed, an explainable Arabic automatic fact-checking system.
Ta'keed generates explanations for claim credibility, particularly in Arabic.
The system achieved a promising F1 score of 0.72 in the classification task.
arXiv Detail & Related papers (2024-01-25T10:43:00Z)
- Position: AI/ML Influencers Have a Place in the Academic Process [82.2069685579588]
We investigate the role of social media influencers in enhancing the visibility of machine learning research.
We have compiled a comprehensive dataset of over 8,000 papers, spanning tweets from December 2018 to October 2023.
Our statistical and causal inference analysis reveals a significant increase in citations for papers endorsed by these influencers.
arXiv Detail & Related papers (2024-01-24T20:05:49Z)
- CausalCite: A Causal Formulation of Paper Citations [80.82622421055734]
CausalCite is a new way to measure the significance of a paper by assessing the causal impact of the paper on its follow-up papers.
It is based on a novel causal inference method, TextMatch, which adapts the traditional matching framework to high-dimensional text embeddings.
We demonstrate the effectiveness of CausalCite on various criteria, such as high correlation with paper impact as reported by scientific experts.
arXiv Detail & Related papers (2023-11-05T23:09:39Z)
- Neural Media Bias Detection Using Distant Supervision With BABE -- Bias Annotations By Experts [24.51774048437496]
This paper presents BABE, a robust and diverse data set for media bias research.
It consists of 3,700 sentences balanced among topics and outlets, containing media bias labels on the word and sentence level.
Based on our data, we also introduce a way to detect bias-inducing sentences in news articles automatically.
arXiv Detail & Related papers (2022-09-29T05:32:55Z)
- Document-Level Relation Extraction with Sentences Importance Estimation and Focusing [52.069206266557266]
Document-level relation extraction (DocRE) aims to determine the relation between two entities from a document of multiple sentences.
We propose a Sentence Estimation and Focusing (SIEF) framework for DocRE, where we design a sentence importance score and a sentence focusing loss.
Experimental results on two domains show that our SIEF not only improves overall performance, but also makes DocRE models more robust.
arXiv Detail & Related papers (2022-04-27T03:20:07Z)
- A new baseline for retinal vessel segmentation: Numerical identification and correction of methodological inconsistencies affecting 100+ papers [0.0]
We performed a detailed numerical analysis of the coherence of the published performance scores.
We found inconsistencies in the reported scores related to the use of the field of view.
The highest accuracy score achieved to date is 0.9582 in the FoV region, which is 1% higher than that of human annotators.
arXiv Detail & Related papers (2021-11-06T11:09:11Z)
- Inconsistency in Conference Peer Review: Revisiting the 2014 NeurIPS Experiment [26.30237757653724]
We revisit the 2014 NeurIPS experiment that examined inconsistency in conference peer review.
We find that for accepted papers, there is no correlation between quality scores and the impact of the paper.
arXiv Detail & Related papers (2021-09-20T18:06:22Z)
- Ranking Scientific Papers Using Preference Learning [48.78161994501516]
We cast the peer-review decision process as a paper ranking problem based on peer review texts and reviewer scores.
We introduce a novel, multi-faceted generic evaluation framework for making final decisions based on peer reviews.
arXiv Detail & Related papers (2021-09-02T19:41:47Z)
- Dynamic Semantic Matching and Aggregation Network for Few-shot Intent Detection [69.2370349274216]
Few-shot Intent Detection is challenging due to the scarcity of available annotated utterances.
Semantic components are distilled from utterances via multi-head self-attention.
Our method provides a comprehensive matching measure to enhance representations of both labeled and unlabeled instances.
arXiv Detail & Related papers (2020-10-06T05:16:38Z)
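
For the few-shot intent detection entry above, a minimal, illustrative sketch of extracting contextualised "semantic components" from an utterance with multi-head self-attention; the dimensions and naming are assumptions, not the paper's actual architecture.

```python
# Sketch only: multi-head self-attention over the token embeddings of one
# utterance; queries, keys, and values are all the same embeddings.
import torch
import torch.nn as nn

embed_dim, num_heads, seq_len = 64, 4, 12
token_embeddings = torch.randn(1, seq_len, embed_dim)  # one utterance, 12 tokens

attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
components, weights = attn(token_embeddings, token_embeddings, token_embeddings)

print(components.shape)  # (1, 12, 64): contextualised token representations
print(weights.shape)     # (1, 12, 12): attention weights averaged over heads
```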
This list is automatically generated from the titles and abstracts of the papers on this site.