Medical Text Simplification: Optimizing for Readability with
Unlikelihood Training and Reranked Beam Search Decoding
- URL: http://arxiv.org/abs/2310.11191v2
- Date: Thu, 26 Oct 2023 02:54:41 GMT
- Title: Medical Text Simplification: Optimizing for Readability with
Unlikelihood Training and Reranked Beam Search Decoding
- Authors: Lorenzo Jaime Yu Flores, Heyuan Huang, Kejian Shi, Sophie Chheang,
Arman Cohan
- Abstract summary: Text simplification has emerged as an increasingly useful application of AI for bridging the communication gap in specialized fields such as medicine.
Despite notable progress, methods in medical simplification sometimes result in the generated text having lower quality and diversity.
We propose a new unlikelihood loss that encourages generation of simpler terms and a reranked beam search decoding method that optimizes for simplicity.
- Score: 18.06012822620814
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text simplification has emerged as an increasingly useful application of AI
for bridging the communication gap in specialized fields such as medicine,
where the lexicon is often dominated by technical jargon and complex
constructs. Despite notable progress, methods in medical simplification
sometimes result in the generated text having lower quality and diversity. In
this work, we explore ways to further improve the readability of text
simplification in the medical domain. We propose (1) a new unlikelihood loss
that encourages generation of simpler terms and (2) a reranked beam search
decoding method that optimizes for simplicity, which achieve better performance
on readability metrics on three datasets. This study's findings offer promising
avenues for improving text simplification in the medical field.
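The abstract describes two mechanisms but gives no formulas or code. As a purely illustrative sketch (not the authors' implementation), the two ideas can be stated concretely: an unlikelihood term adds -log(1 - p) penalties on tokens flagged as complex, and reranking re-scores beam candidates by mixing the model score with a simplicity proxy. The complex-token list, the mean-word-length proxy, and the weight `alpha` are all assumptions for the sketch, not details from the paper.

```python
import math

def unlikelihood_loss(probs, complex_token_ids):
    # Sum of -log(1 - p) over tokens flagged as "complex": minimizing this
    # pushes the model to assign those tokens low probability. In practice it
    # would be added to the usual cross-entropy (likelihood) objective.
    loss = 0.0
    for tok in complex_token_ids:
        p = probs.get(tok, 0.0)
        loss += -math.log(max(1.0 - p, 1e-9))  # clamp to avoid log(0)
    return loss

def rerank_beams(candidates, alpha=0.5):
    # candidates: list of (text, model_logprob) pairs from beam search.
    # Re-score each candidate by a weighted mix of the model score and a
    # crude simplicity proxy (negative mean word length), then sort
    # best-first. A real system would use a proper readability metric.
    def simplicity(text):
        words = text.split() or [""]
        return -sum(len(w) for w in words) / len(words)
    return sorted(
        candidates,
        key=lambda c: alpha * c[1] + (1 - alpha) * simplicity(c[0]),
        reverse=True,
    )
```

With a low `alpha`, the reranker prefers a plainer paraphrase (e.g. "a heart attack happened") over a jargon-heavy candidate ("myocardial infarction occurred") even when the latter has a slightly better model score.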
Related papers
- Society of Medical Simplifiers [7.4751114996742]
We introduce the Society of Medical Simplifiers, a novel framework inspired by the "Society of Mind" (SOM) philosophy.
Our approach leverages the strengths of LLMs by assigning five distinct roles, i.e., Layperson, Simplifier, Medical Expert, Language Clarifier, and Redundancy Checker.
Our framework is on par with or outperforms state-of-the-art methods, achieving superior readability and content preservation.
arXiv Detail & Related papers (2024-10-12T19:52:56Z)
- Self-Verification Improves Few-Shot Clinical Information Extraction [73.6905567014859]
Large language models (LLMs) have shown the potential to accelerate clinical curation via few-shot in-context learning.
They still struggle with issues regarding accuracy and interpretability, especially in mission-critical domains such as health.
Here, we explore a general mitigation framework using self-verification, which leverages the LLM to provide provenance for its own extraction and check its own outputs.
arXiv Detail & Related papers (2023-05-30T22:05:11Z)
- Multilingual Simplification of Medical Texts [49.469685530201716]
We introduce MultiCochrane, the first sentence-aligned multilingual text simplification dataset for the medical domain in four languages.
We evaluate fine-tuned and zero-shot models across these languages, with extensive human assessments and analyses.
Although models can now generate viable simplified texts, we identify outstanding challenges that this dataset might be used to address.
arXiv Detail & Related papers (2023-05-21T18:25:07Z)
- NapSS: Paragraph-level Medical Text Simplification via Narrative Prompting and Sentence-matching Summarization [46.772517928718216]
We propose a summarize-then-simplify two-stage strategy, which we call NapSS.
NapSS identifies the relevant content to simplify while ensuring that the original narrative flow is preserved.
Our model performs significantly better than the seq2seq baseline on an English medical corpus.
arXiv Detail & Related papers (2023-02-11T02:20:25Z)
- Towards more patient friendly clinical notes through language models and ontologies [57.51898902864543]
We present a novel approach to the automated simplification of medical text, based on word simplification and language modelling.
We use a new dataset of pairs of publicly available medical sentences and a version of them simplified by clinicians.
Our method based on a language model trained on medical forum data generates simpler sentences while preserving both grammar and the original meaning.
arXiv Detail & Related papers (2021-12-23T16:11:19Z)
- Paragraph-level Simplification of Medical Texts [35.650619024498425]
Manual simplification does not scale to the rapidly growing body of biomedical literature.
We introduce a new corpus of parallel texts in English comprising technical and lay summaries of all published evidence pertaining to different clinical topics.
We propose a new metric based on likelihood scores from a masked language model pretrained on scientific texts.
arXiv Detail & Related papers (2021-04-12T18:56:05Z)
- SDA: Improving Text Generation with Self Data Augmentation [88.24594090105899]
We propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation.
Unlike most existing sentence-level augmentation strategies, our method is more general and could be easily adapted to any MLE-based training procedure.
arXiv Detail & Related papers (2021-01-02T01:15:57Z)
- Benchmarking Automated Clinical Language Simplification: Dataset, Algorithm, and Evaluation [48.87254340298189]
We construct a new dataset named MedLane to support the development and evaluation of automated clinical language simplification approaches.
We propose a new model called DECLARE that follows the human annotation procedure and achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-12-04T06:09:02Z)
- Elaborative Simplification: Content Addition and Explanation Generation in Text Simplification [33.08519864889526]
We present the first data-driven study of content addition in text simplification.
We analyze how entities, ideas, and concepts are elaborated through the lens of contextual specificity.
Our results illustrate the complexities of elaborative simplification, suggesting many interesting directions for future work.
arXiv Detail & Related papers (2020-10-20T05:06:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.