MadDog: A Web-based System for Acronym Identification and Disambiguation
- URL: http://arxiv.org/abs/2101.09893v1
- Date: Mon, 25 Jan 2021 04:49:25 GMT
- Title: MadDog: A Web-based System for Acronym Identification and Disambiguation
- Authors: Amir Pouran Ben Veyseh, Franck Dernoncourt, Walter Chang, Thien Huu Nguyen
- Abstract summary: Acronyms and abbreviations are short forms of longer phrases and are ubiquitous in many types of writing.
Despite their usefulness, they make text harder to understand, especially when the acronym is not defined in the text.
We provide the first web-based acronym identification and disambiguation system which can process acronyms from various domains.
- Score: 44.33455510438843
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Acronyms and abbreviations are short forms of longer phrases and are
ubiquitously employed in various types of writing. Although they save space in
writing and the reader's time in reading, they also make the text harder to
understand, especially if the acronym is not defined in the text or is used far
from its definition in a long text. To alleviate this issue, both the research
community and software developers have made considerable efforts to build
systems for identifying acronyms and finding their correct meanings in the
text. However, no existing work provides a unified, publicly available solution
capable of processing acronyms in various domains. Thus, we provide the first
web-based acronym
identification and disambiguation system which can process acronyms from
various domains including scientific, biomedical, and general domains. The
web-based system is publicly available at http://iq.cs.uoregon.edu:5000 and a
demo video is available at https://youtu.be/IkSh7LqI42M. The system source code
is also available at https://github.com/amirveyseh/MadDog.
Related papers
- On Translating Technical Terminology: A Translation Workflow for
Machine-Translated Acronyms [3.053989095162017]
We find that an important step is being missed: the translation of technical terms, specifically acronyms.
Some publicly available state-of-the-art machine translation systems, such as Google Translate, can be erroneous when dealing with acronyms.
We propose an additional step to the SL-TL (FR-EN) translation workflow where we first offer a new acronym corpus for public consumption and then experiment with a search-based thresholding algorithm.
(2024-09-26T15:18:34Z)
- MACRONYM: A Large-Scale Dataset for Multilingual and Multi-Domain
Acronym Extraction [66.60031336330547]
Acronyms and their expanded forms are necessary for various NLP applications.
One limitation of existing acronym extraction (AE) research is that it is limited to English and to certain domains.
The lack of annotated datasets in multiple languages and domains has been a major obstacle to research in this area.
(2022-02-19T23:08:38Z)
- SimCLAD: A Simple Framework for Contrastive Learning of Acronym
Disambiguation [26.896811663334162]
We propose a Contrastive Learning of Acronym Disambiguation (SimCLAD) method to better understand the acronym meanings.
The results on the acronym disambiguation of the scientific domain in English show that the proposed method outperforms all other competitive state-of-the-art (SOTA) methods.
(2021-11-29T02:39:59Z)
- CDistNet: Perceiving Multi-Domain Character Distance for Robust Text
Recognition [87.3894423816705]
We propose a novel module called Multi-Domain Character Distance Perception (MDCDP) to establish a visually and semantically related position embedding.
MDCDP uses the position embedding to query both visual and semantic features following the cross-attention mechanism.
We develop CDistNet, which stacks multiple MDCDPs to guide progressively more precise distance modeling.
(2021-11-22T06:27:29Z)
- Acronym Identification and Disambiguation Shared Tasks for Scientific
Document Understanding [41.63345823743157]
Acronyms are short forms of longer phrases frequently used in writing.
Every text understanding tool should be capable of recognizing acronyms in text.
To push forward research in this direction, we have organized two shared tasks for acronym identification and acronym disambiguation in scientific documents.
(2020-12-22T00:29:15Z)
- Primer AI's Systems for Acronym Identification and Disambiguation [0.0]
We introduce new methods for acronym identification and disambiguation.
Our systems achieve significant performance gains over previously suggested methods.
Both of our systems perform competitively on the SDU@AAAI-21 shared task leaderboard.
(2020-12-14T23:59:05Z)
- What Does This Acronym Mean? Introducing a New Dataset for Acronym
Identification and Disambiguation [74.42107665213909]
Acronyms are the short forms of phrases that facilitate conveying lengthy sentences in documents and serve as one of the mainstays of writing.
Due to their importance, identifying acronyms and corresponding phrases (AI) and finding the correct meaning of each acronym (i.e., acronym disambiguation (AD)) are crucial for text understanding.
Despite the recent progress on this task, there are some limitations in the existing datasets which hinder further improvement.
(2020-10-28T00:12:36Z)
- Techniques for Vocabulary Expansion in Hybrid Speech Recognition Systems [54.49880724137688]
The problem of out of vocabulary words (OOV) is typical for any speech recognition system.
One of the popular approaches to cover OOVs is to use subword units rather than words.
In this paper we explore different existing methods of this solution on both graph construction and search method levels.
(2020-03-19T21:24:45Z)