MadDog: A Web-based System for Acronym Identification and Disambiguation
- URL: http://arxiv.org/abs/2101.09893v1
- Date: Mon, 25 Jan 2021 04:49:25 GMT
- Title: MadDog: A Web-based System for Acronym Identification and Disambiguation
- Authors: Amir Pouran Ben Veyseh, Franck Dernoncourt, Walter Chang, Thien Huu Nguyen
- Abstract summary: Acronyms and abbreviations are short forms of longer phrases and are ubiquitously employed in various types of writing.
Despite their usefulness, they also pose challenges for understanding the text, especially if the acronym is not defined in the text.
We provide the first web-based acronym identification and disambiguation system which can process acronyms from various domains.
- Score: 44.33455510438843
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Acronyms and abbreviations are short forms of longer phrases and are
ubiquitously employed in various types of writing. Despite their usefulness in
saving space in writing and the reader's time in reading, they also pose
challenges for understanding the text, especially if the acronym is not defined
in the text or if it is used far from its definition in long texts. To
alleviate this issue, there have been considerable efforts from both the research
community and software developers to build systems for identifying acronyms and
finding their correct meanings in the text. However, none of the existing works
provides a unified, publicly available solution capable of processing acronyms
in various domains. Thus, we provide the first web-based acronym
identification and disambiguation system which can process acronyms from
various domains including scientific, biomedical, and general domains. The
web-based system is publicly available at http://iq.cs.uoregon.edu:5000 and a
demo video is available at https://youtu.be/IkSh7LqI42M. The system source code
is also available at https://github.com/amirveyseh/MadDog.
Related papers
- MACRONYM: A Large-Scale Dataset for Multilingual and Multi-Domain Acronym Extraction [66.60031336330547]
Acronyms and their expanded forms are necessary for various NLP applications.
One limitation of existing acronym extraction (AE) research is that it is limited to the English language and certain domains.
The lack of annotated datasets in multiple languages and domains has been a major obstacle to research in this area.
arXiv Detail & Related papers (2022-02-19T23:08:38Z)
- CABACE: Injecting Character Sequence Information and Domain Knowledge for Enhanced Acronym and Long-Form Extraction [0.0]
We propose a novel framework CABACE: Character-Aware BERT for ACronym Extraction.
It takes into account character sequences in text and is adapted to scientific and legal domains by masked language modelling.
We show that the proposed framework is better suited than baseline models for zero-shot generalization to non-English languages.
arXiv Detail & Related papers (2021-12-25T14:03:09Z)
- SimCLAD: A Simple Framework for Contrastive Learning of Acronym Disambiguation [26.896811663334162]
We propose a Contrastive Learning of Acronym Disambiguation (SimCLAD) method to better understand the acronym meanings.
Results on English scientific-domain acronym disambiguation show that the proposed method outperforms all other competitive state-of-the-art (SOTA) methods.
arXiv Detail & Related papers (2021-11-29T02:39:59Z)
- CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition [87.3894423816705]
We propose a novel module called Multi-Domain Character Distance Perception (MDCDP) to establish a visually and semantically related position embedding.
MDCDP uses the position embedding to query both visual and semantic features following the cross-attention mechanism.
We develop CDistNet, which stacks multiple MDCDPs to guide progressively more precise distance modeling.
arXiv Detail & Related papers (2021-11-22T06:27:29Z)
- Acronym Identification and Disambiguation Shared Tasks for Scientific Document Understanding [41.63345823743157]
Acronyms are short forms of longer phrases frequently used in writing.
Every text understanding tool should be capable of recognizing acronyms in text.
To push forward research in this direction, we have organized two shared tasks for acronym identification and acronym disambiguation in scientific documents.
arXiv Detail & Related papers (2020-12-22T00:29:15Z)
- Primer AI's Systems for Acronym Identification and Disambiguation [0.0]
We introduce new methods for acronym identification and disambiguation.
Our systems achieve significant performance gains over previously suggested methods.
Both of our systems perform competitively on the SDU@AAAI-21 shared task leaderboard.
arXiv Detail & Related papers (2020-12-14T23:59:05Z)
- What Does This Acronym Mean? Introducing a New Dataset for Acronym Identification and Disambiguation [74.42107665213909]
Acronyms are the short forms of phrases that facilitate conveying lengthy sentences in documents and serve as one of the mainstays of writing.
Due to their importance, identifying acronyms and their corresponding phrases (i.e., acronym identification (AI)) and finding the correct meaning of each acronym (i.e., acronym disambiguation (AD)) are crucial for text understanding.
Despite the recent progress on this task, there are some limitations in the existing datasets which hinder further improvement.
arXiv Detail & Related papers (2020-10-28T00:12:36Z)
- Techniques for Vocabulary Expansion in Hybrid Speech Recognition Systems [54.49880724137688]
The problem of out-of-vocabulary (OOV) words is typical for any speech recognition system.
One popular approach to covering OOVs is to use subword units rather than words.
In this paper we explore different existing methods for this approach at both the graph construction and search method levels.
arXiv Detail & Related papers (2020-03-19T21:24:45Z)