Reproducibility Beyond the Research Community: Experience from NLP Beginners
- URL: http://arxiv.org/abs/2205.02182v2
- Date: Thu, 5 May 2022 23:25:40 GMT
- Title: Reproducibility Beyond the Research Community: Experience from NLP Beginners
- Authors: Shane Storks, Keunwoo Peter Yu, Joyce Chai
- Abstract summary: We conducted a study with 93 students in an introductory NLP course, where students reproduced results of recent NLP papers.
Surprisingly, our results suggest that their technical skill (i.e., programming experience) has limited impact on the effort they spend completing the exercise.
We find accessibility efforts by research authors to be key to a successful experience, including thorough documentation and easy access to required models and datasets.
- Score: 6.957948096979098
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As NLP research attracts public attention and excitement, it becomes
increasingly important for it to be accessible to a broad audience. As the
research community works to democratize NLP, it remains unclear whether
beginners to the field can easily apply the latest developments. To understand
their needs, we conducted a study with 93 students in an introductory NLP
course, where students reproduced results of recent NLP papers. Surprisingly,
our results suggest that their technical skill (i.e., programming experience)
has limited impact on the effort they spend completing the exercise. Instead, we
find accessibility efforts by research authors to be key to a successful
experience, including thorough documentation and easy access to required models
and datasets.
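The abstract's closing point, thorough documentation plus easy access to models and datasets, is concrete enough to sketch. Below is a minimal one-command setup script of the kind such accessibility efforts might include: it downloads pinned artifacts and verifies their checksums before any experiment runs. The file name, URL, and checksum are hypothetical placeholders, not artifacts from the paper.

    # fetch_artifacts.py: a sketch of a one-command setup script.
    # The artifact name, URL, and checksum are hypothetical placeholders.
    import hashlib
    import urllib.request
    from pathlib import Path

    ARTIFACTS = {
        # filename: (pinned download URL, expected SHA-256 digest)
        "model.bin": (
            "https://example.org/releases/v1.0/model.bin",
            "0" * 64,  # placeholder digest; pin the real one in practice
        ),
    }

    def fetch_and_verify(dest_dir="artifacts"):
        dest = Path(dest_dir)
        dest.mkdir(exist_ok=True)
        for name, (url, expected) in ARTIFACTS.items():
            path = dest / name
            if not path.exists():
                urllib.request.urlretrieve(url, path)  # fetch pinned artifact
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            if digest != expected:
                raise RuntimeError(f"{name}: checksum mismatch, see README")

    if __name__ == "__main__":
        fetch_and_verify()

The specifics matter less than the shape: a beginner should need one documented command to obtain exactly the artifacts the reported results depend on.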
Related papers
- Have LLMs Made Active Learning Obsolete? Surveying the NLP Community [7.99984266570379]
Supervised learning relies on annotated data, which is expensive to obtain.
Large language models have pushed the effectiveness of active learning, but have also improved competing methods such as few- and zero-shot learning.
This raises the question: has active learning become obsolete?
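To make the question concrete, here is a minimal pool-based active learning loop with least-confidence uncertainty sampling, the classic setup whose continued relevance is at issue; the synthetic dataset and classifier are illustrative assumptions, not choices from the paper.

    # Pool-based active learning with least-confidence sampling (a sketch).
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    labeled = list(range(10))                  # small seed set of annotated points
    pool = [i for i in range(500) if i not in labeled]

    for _ in range(5):                         # five acquisition rounds
        clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
        probs = clf.predict_proba(X[pool])
        uncertainty = 1 - probs.max(axis=1)    # least confident = most informative
        query = pool[int(np.argmax(uncertainty))]
        labeled.append(query)                  # spend one annotation on it
        pool.remove(query)

Each round spends one annotation where the model is least certain; the survey asks whether few- or zero-shot use of large language models now makes this loop unnecessary.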
arXiv Detail & Related papers (2025-03-12T18:00:04Z)
- O1 Replication Journey: A Strategic Progress Report -- Part 1 [52.062216849476776]
This paper introduces a pioneering approach to artificial intelligence research, embodied in our O1 Replication Journey.
Our methodology addresses critical challenges in modern AI research, including the insularity of prolonged team-based projects.
We propose the journey learning paradigm, which encourages models to learn not just shortcuts, but the complete exploration process.
arXiv Detail & Related papers (2024-10-08T15:13:01Z)
- The Nature of NLP: Analyzing Contributions in NLP Papers [77.31665252336157]
We quantitatively investigate what constitutes NLP research by examining research papers.
Our findings reveal a rising involvement of machine learning in NLP since the early nineties.
Since 2020, there has been a resurgence of focus on language and people.
arXiv Detail & Related papers (2024-09-29T01:29:28Z)
- What Can Natural Language Processing Do for Peer Review? [173.8912784451817]
In modern science, peer review is widely used, yet it is hard, time-consuming, and prone to error.
Since the artifacts involved in peer review are largely text-based, Natural Language Processing has great potential to improve reviewing.
We detail each step of the process from manuscript submission to camera-ready revision, and discuss the associated challenges and opportunities for NLP assistance.
arXiv Detail & Related papers (2024-05-10T16:06:43Z)
- NLP Reproducibility For All: Understanding Experiences of Beginners [6.190897257068862]
We conducted a study with 93 students in an introductory NLP course, where students reproduced the results of recent NLP papers.
We find that programming skill and comprehension of research papers have limited impact on the effort students spend completing the exercise.
We recommend that NLP researchers pay close attention to these simple aspects of open-sourcing their work, and use insights from beginners' feedback to provide actionable ideas on how to better support them.
arXiv Detail & Related papers (2023-05-26T02:08:54Z)
- Beyond Good Intentions: Reporting the Research Landscape of NLP for Social Good [115.1507728564964]
We introduce NLP4SG Papers, a scientific dataset with three associated tasks.
These tasks help identify NLP4SG papers and characterize the NLP4SG landscape.
We use state-of-the-art NLP models to address each of these tasks and apply them to the entire ACL Anthology.
arXiv Detail & Related papers (2023-05-09T14:16:25Z)
- NLPeer: A Unified Resource for the Computational Study of Peer Review [58.71736531356398]
We introduce NLPeer -- the first ethically sourced multidomain corpus of more than 5k papers and 11k review reports from five different venues.
We augment previous peer review datasets to include parsed and structured paper representations, rich metadata and versioning information.
Our work paves the path towards systematic, multi-faceted, evidence-based study of peer review in NLP and beyond.
arXiv Detail & Related papers (2022-11-12T12:29:38Z)
- Meta Learning for Natural Language Processing: A Survey [88.58260839196019]
Deep learning has been the mainstream technique in natural language processing (NLP).
However, it requires large amounts of labeled data and generalizes less well across domains.
Meta-learning is an emerging field of machine learning that studies approaches to learning better learning algorithms.
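As a minimal illustration of the learning-to-learn idea, the sketch below implements Reptile, a simple first-order meta-learning algorithm, on a toy family of 1-D linear regression tasks; the task family and hyperparameters are illustrative assumptions, not methods from the survey.

    # Reptile on toy 1-D linear tasks: w is a shared initialization that
    # a few gradient steps can adapt to any task in the family.
    import numpy as np

    rng = np.random.default_rng(0)

    def sample_task():
        a = rng.uniform(-2, 2)                 # task-specific slope
        x = rng.uniform(-1, 1, size=20)
        return x, a * x                        # targets y = a * x

    w = 0.0                                    # meta-learned initialization
    meta_lr, inner_lr = 0.1, 0.02

    for _ in range(1000):
        x, y = sample_task()
        w_task = w
        for _ in range(5):                     # inner loop: adapt to one task
            grad = 2 * np.mean((w_task * x - y) * x)   # d/dw of mean squared error
            w_task -= inner_lr * grad
        w += meta_lr * (w_task - w)            # outer loop: nudge the init

The inner loop is ordinary gradient descent on a single task; the outer loop moves the shared initialization toward the adapted weights so that a few inner steps suffice on new tasks.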
arXiv Detail & Related papers (2022-05-03T13:58:38Z)
- We Need to Talk About Data: The Importance of Data Readiness in Natural Language Processing [3.096615629099618]
We argue that there is a gap between academic research in NLP and its application to problems outside academia.
We propose a method for improving the communication between researchers and external stakeholders regarding the accessibility, validity, and utility of data.
arXiv Detail & Related papers (2021-10-11T17:55:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.