Shoulders of Giants: A Look at the Degree and Utility of Openness in NLP Research
- URL: http://arxiv.org/abs/2406.06021v1
- Date: Mon, 10 Jun 2024 04:47:27 GMT
- Title: Shoulders of Giants: A Look at the Degree and Utility of Openness in NLP Research
- Authors: Surangika Ranathunga, Nisansa de Silva, Dilith Jayakody, Aloka Fernando,
- Abstract summary: We observe that papers published in different NLP venues show different patterns related to artefact reuse.
More than 30% of the papers we analysed do not release their artefacts publicly, despite promising to do so.
We observe a wide language-wise disparity in publicly available NLP-related artefacts.
- Score: 1.1650821883155187
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We analysed a sample of NLP research papers archived in ACL Anthology as an attempt to quantify the degree of openness and the benefit of such an open culture in the NLP community. We observe that papers published in different NLP venues show different patterns related to artefact reuse. We also note that more than 30% of the papers we analysed do not release their artefacts publicly, despite promising to do so. Further, we observe a wide language-wise disparity in publicly available NLP-related artefacts.
Related papers
- The Nature of NLP: Analyzing Contributions in NLP Papers [77.31665252336157]
We quantitatively investigate what constitutes NLP research by examining research papers.
Our findings reveal a rising involvement of machine learning in NLP since the early nineties.
In post-2020, there has been a resurgence of focus on language and people.
arXiv Detail & Related papers (2024-09-29T01:29:28Z) - From Insights to Actions: The Impact of Interpretability and Analysis Research on NLP [28.942812379900673]
Interpretability and analysis (IA) research is a growing subfield within NLP.
We seek to quantify the impact of IA research on the broader field of NLP.
arXiv Detail & Related papers (2024-06-18T13:45:07Z) - Collaboration or Corporate Capture? Quantifying NLP's Reliance on Industry Artifacts and Contributions [2.6746207141044582]
We surveyed 100 papers published at EMNLP 2022 to determine the degree to which researchers rely on industry models.
Our work serves as a scaffold to enable future researchers to more accurately address whether collaboration with industry is still collaboration in the absence of an alternative.
arXiv Detail & Related papers (2023-12-06T21:12:22Z) - We are Who We Cite: Bridges of Influence Between Natural Language Processing and Other Academic Fields [30.550895983110806]
Cross-field engagement of Natural Language Processing has declined.
Less than 8% of NLP citations are to linguistics.
Less than 3% of NLP citations are to math and psychology.
arXiv Detail & Related papers (2023-10-23T12:42:06Z) - Beyond Good Intentions: Reporting the Research Landscape of NLP for Social Good [115.1507728564964]
We introduce NLP4SG Papers, a scientific dataset with three associated tasks.
These tasks help identify NLP4SG papers and characterize the NLP4SG landscape.
We use state-of-the-art NLP models to address each of these tasks and use them on the entire ACL Anthology.
arXiv Detail & Related papers (2023-05-09T14:16:25Z) - NLPeer: A Unified Resource for the Computational Study of Peer Review [58.71736531356398]
We introduce NLPeer -- the first ethically sourced multidomain corpus of more than 5k papers and 11k review reports from five different venues.
We augment previous peer review datasets to include parsed and structured paper representations, rich metadata and versioning information.
Our work paves the path towards systematic, multi-faceted, evidence-based study of peer review in NLP and beyond.
arXiv Detail & Related papers (2022-11-12T12:29:38Z) - Geographic Citation Gaps in NLP Research [63.13508571014673]
This work asks a series of questions on the relationship between geographical location and publication success.
We first created a dataset of 70,000 papers from the ACL Anthology, extracted their meta-information, and generated their citation network.
We show that not only are there substantial geographical disparities in paper acceptance and citation but also that these disparities persist even when controlling for a number of variables such as venue of publication and sub-field of NLP.
arXiv Detail & Related papers (2022-10-26T02:25:23Z) - State-of-the-art generalisation research in NLP: A taxonomy and review [87.1541712509283]
We present a taxonomy for characterising and understanding generalisation research in NLP.
Our taxonomy is based on an extensive literature review of generalisation research.
We use our taxonomy to classify over 400 papers that test generalisation.
arXiv Detail & Related papers (2022-10-06T16:53:33Z) - Square One Bias in NLP: Towards a Multi-Dimensional Exploration of the Research Manifold [88.83876819883653]
Through a manual classification of recent NLP research papers, we show that this is indeed the case.
We observe that NLP research often goes beyond the square one setup, focusing not only on accuracy but also on fairness or interpretability, yet typically only along a single dimension.
arXiv Detail & Related papers (2022-06-20T13:04:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.