NLP Reproducibility For All: Understanding Experiences of Beginners
- URL: http://arxiv.org/abs/2305.16579v3
- Date: Sat, 3 Jun 2023 14:01:24 GMT
- Title: NLP Reproducibility For All: Understanding Experiences of Beginners
- Authors: Shane Storks, Keunwoo Peter Yu, Ziqiao Ma, Joyce Chai
- Abstract summary: We conducted a study with 93 students in an introductory NLP course, where students reproduced the results of recent NLP papers.
We find that students' programming skill and comprehension of research papers have limited impact on the effort they spend completing the exercise.
We recommend that NLP researchers pay close attention to these simple aspects of open-sourcing their work, and use insights from beginners' feedback to provide actionable ideas on how to better support them.
- Score: 6.190897257068862
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As natural language processing (NLP) has recently seen an unprecedented level
of excitement, and more people are eager to enter the field, it is unclear
whether current research reproducibility efforts are sufficient for this group
of beginners to apply the latest developments. To understand their needs, we
conducted a study with 93 students in an introductory NLP course, where
students reproduced the results of recent NLP papers. Surprisingly, we find
that their programming skill and comprehension of research papers have a
limited impact on their effort spent completing the exercise. Instead, we find
accessibility efforts by research authors to be the key to success, including
complete documentation, better coding practice, and easier access to data
files. Going forward, we recommend that NLP researchers pay close attention to
these simple aspects of open-sourcing their work, and use insights from
beginners' feedback to provide actionable ideas on how to better support them.
Related papers
- Beyond Text: Characterizing Domain Expert Needs in Document Research [10.98467955215441]
We ask sixteen domain experts across two domains about their processes of document research.
We find that our participants' processes are idiosyncratic, iterative, and rely extensively on the social context of a document.
We call on the NLP community to more carefully consider the role of the document in building useful tools.
arXiv Detail & Related papers (2025-04-16T21:24:41Z) - Have LLMs Made Active Learning Obsolete? Surveying the NLP Community [7.99984266570379]
Supervised learning relies on annotated data, which is expensive to obtain. Large language models (LLMs) have pushed the effectiveness of active learning. We conduct an online survey in the NLP community to collect insights on the perceived relevance of data annotation.
arXiv Detail & Related papers (2025-03-12T18:00:04Z) - O1 Replication Journey: A Strategic Progress Report -- Part 1 [52.062216849476776]
This paper introduces a pioneering approach to artificial intelligence research, embodied in our O1 Replication Journey.
Our methodology addresses critical challenges in modern AI research, including the insularity of prolonged team-based projects.
We propose the journey learning paradigm, which encourages models to learn not just shortcuts, but the complete exploration process.
arXiv Detail & Related papers (2024-10-08T15:13:01Z) - The Nature of NLP: Analyzing Contributions in NLP Papers [77.31665252336157]
We quantitatively investigate what constitutes NLP research by examining research papers.
Our findings reveal a rising involvement of machine learning in NLP since the early nineties.
Since 2020, there has been a resurgence of focus on language and people.
arXiv Detail & Related papers (2024-09-29T01:29:28Z) - What Can Natural Language Processing Do for Peer Review? [173.8912784451817]
In modern science, peer review is widely used, yet it is hard, time-consuming, and prone to error.
Since the artifacts involved in peer review are largely text-based, Natural Language Processing has great potential to improve reviewing.
We detail each step of the process from manuscript submission to camera-ready revision, and discuss the associated challenges and opportunities for NLP assistance.
arXiv Detail & Related papers (2024-05-10T16:06:43Z) - A Survey on Deep Active Learning: Recent Advances and New Frontiers [27.07154361976248]
This work aims to serve as a useful and quick guide for researchers in overcoming difficulties in deep learning-based active learning (DAL).
DAL has gained increasing popularity due to its broad applicability, yet survey papers on it remain scarce.
arXiv Detail & Related papers (2024-05-01T05:54:33Z) - Beyond Good Intentions: Reporting the Research Landscape of NLP for Social Good [115.1507728564964]
We introduce NLP4SG Papers, a scientific dataset with three associated tasks.
These tasks help identify NLP4SG papers and characterize the NLP4SG landscape.
We use state-of-the-art NLP models to address each of these tasks and use them on the entire ACL Anthology.
arXiv Detail & Related papers (2023-05-09T14:16:25Z) - NLPeer: A Unified Resource for the Computational Study of Peer Review [58.71736531356398]
We introduce NLPeer -- the first ethically sourced multidomain corpus of more than 5k papers and 11k review reports from five different venues.
We augment previous peer review datasets to include parsed and structured paper representations, rich metadata and versioning information.
Our work paves the path towards systematic, multi-faceted, evidence-based study of peer review in NLP and beyond.
arXiv Detail & Related papers (2022-11-12T12:29:38Z) - Reproducibility Beyond the Research Community: Experience from NLP Beginners [6.957948096979098]
We conducted a study with 93 students in an introductory NLP course, where students reproduced results of recent NLP papers.
Surprisingly, our results suggest that their technical skill (i.e., programming experience) has limited impact on their effort spent completing the exercise.
We find accessibility efforts by research authors to be key to a successful experience, including thorough documentation and easy access to required models and datasets.
arXiv Detail & Related papers (2022-05-04T16:54:00Z) - Meta Learning for Natural Language Processing: A Survey [88.58260839196019]
Deep learning has been the mainstream technique in the natural language processing (NLP) area.
However, deep learning requires large amounts of labeled data and is less generalizable across domains.
Meta-learning is an emerging field in machine learning that studies approaches to learning better algorithms.
arXiv Detail & Related papers (2022-05-03T13:58:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.