Acronym Identification and Disambiguation Shared Tasks for Scientific
Document Understanding
- URL: http://arxiv.org/abs/2012.11760v4
- Date: Wed, 6 Jan 2021 04:34:52 GMT
- Title: Acronym Identification and Disambiguation Shared Tasks for Scientific
Document Understanding
- Authors: Amir Pouran Ben Veyseh, Franck Dernoncourt, Thien Huu Nguyen, Walter
Chang, Leo Anthony Celi
- Abstract summary: Acronyms are short forms of longer phrases frequently used in writing.
Every text understanding tool should be capable of recognizing acronyms in text.
To push forward research in this direction, we have organized two shared tasks for acronym identification and acronym disambiguation in scientific documents.
- Score: 41.63345823743157
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Acronyms are the short forms of longer phrases and they are frequently used
in writing, especially scholarly writing, to save space and facilitate the
communication of information. As such, every text understanding tool should be
capable of recognizing acronyms in text (i.e., acronym identification) and also
finding their correct meaning (i.e., acronym disambiguation). As most of the
prior works on these tasks are restricted to the biomedical domain and use
unsupervised methods or models trained on limited datasets, they fail to
perform well for scientific document understanding. To push forward research in
this direction, we have organized two shared tasks for acronym identification
and acronym disambiguation in scientific documents, named AI@SDU and AD@SDU,
respectively. The two shared tasks have attracted 52 and 43 participants,
respectively. While the submitted systems make substantial improvements
compared to the existing baselines, they are still far from human-level
performance. This paper reviews the two shared tasks and the prominent
participating systems for each of them.
Related papers
- Bridging Research and Readers: A Multi-Modal Automated Academic Papers
Interpretation System [47.13932723910289]
We introduce an open-source multi-modal automated academic paper interpretation system (MMAPIS) with a three-stage process.
It employs the hybrid modality preprocessing and alignment module to extract plain text, and tables or figures from documents separately.
It then aligns this information based on the section names they belong to, ensuring that data with identical section names are categorized under the same section.
It utilizes the extracted section names to divide the article into shorter text segments, facilitating specific summarizations both within and between sections via LLMs.
arXiv Detail & Related papers (2024-01-17T11:50:53Z)
- Pre-training Multi-task Contrastive Learning Models for Scientific
Literature Understanding [52.723297744257536]
Pre-trained language models (LMs) have shown effectiveness in scientific literature understanding tasks.
We propose a multi-task contrastive learning framework, SciMult, to facilitate common knowledge sharing across different literature understanding tasks.
arXiv Detail & Related papers (2023-05-23T16:47:22Z)
- LDKP: A Dataset for Identifying Keyphrases from Long Scientific
Documents [48.84086818702328]
Identifying keyphrases (KPs) from text documents is a fundamental task in natural language processing and information retrieval.
The vast majority of benchmark datasets for this task come from the scientific domain and contain only the document title and abstract.
This presents three challenges for real-world applications: human-written summaries are unavailable for most documents, the documents are almost always long, and a high percentage of KPs are directly found beyond the limited context of title and abstract.
arXiv Detail & Related papers (2022-03-29T08:44:57Z)
- Leveraging Domain Agnostic and Specific Knowledge for Acronym
Disambiguation [5.766754189548904]
Acronym disambiguation aims to find the correct meaning of an ambiguous acronym in a text.
We propose a Hierarchical Dual-path BERT method, named hdBERT, to capture general fine-grained and high-level specific representations.
Using the widely adopted SciAD dataset, which contains 62,441 sentences, we investigate the effectiveness of hdBERT.
arXiv Detail & Related papers (2021-07-01T09:10:00Z)
- BERT-based Acronym Disambiguation with Multiple Training Strategies [8.82012912690778]
The acronym disambiguation (AD) task aims to find the correct expansion of an ambiguous acronym in a given sentence.
We propose a binary classification model incorporating BERT and several training strategies including dynamic negative sample selection.
Experiments on SciAD show the effectiveness of our proposed model and our score ranks 1st in SDU@AAAI-21 shared task 2: Acronym Disambiguation.
arXiv Detail & Related papers (2021-02-25T05:40:21Z)
- MadDog: A Web-based System for Acronym Identification and Disambiguation [44.33455510438843]
Acronyms and abbreviations are short forms of longer phrases and are ubiquitously employed in various types of writing.
Despite their usefulness, they also pose challenges for understanding a text, especially if the acronym is not defined in it.
We provide the first web-based acronym identification and disambiguation system which can process acronyms from various domains.
arXiv Detail & Related papers (2021-01-25T04:49:25Z)
- Primer AI's Systems for Acronym Identification and Disambiguation [0.0]
We introduce new methods for acronym identification and disambiguation.
Our systems achieve significant performance gains over previously suggested methods.
Both of our systems perform competitively on the SDU@AAAI-21 shared task leaderboard.
arXiv Detail & Related papers (2020-12-14T23:59:05Z)
- What Does This Acronym Mean? Introducing a New Dataset for Acronym
Identification and Disambiguation [74.42107665213909]
Acronyms are the short forms of phrases that facilitate conveying lengthy sentences in documents and serve as one of the mainstays of writing.
Due to their importance, identifying acronyms and corresponding phrases (AI) and finding the correct meaning of each acronym (i.e., acronym disambiguation (AD)) are crucial for text understanding.
Despite the recent progress on this task, there are some limitations in the existing datasets which hinder further improvement.
arXiv Detail & Related papers (2020-10-28T00:12:36Z)