Automatic Speech Recognition for Greek Medical Dictation
- URL: http://arxiv.org/abs/2509.23550v1
- Date: Sun, 28 Sep 2025 01:15:47 GMT
- Title: Automatic Speech Recognition for Greek Medical Dictation
- Authors: Vardis Georgilas, Themos Stafylakis
- Abstract summary: The main objective of this paper is to create a domain-specific system for Greek medical speech transcriptions. We develop a system that combines automatic speech recognition techniques with a text correction model.
- Score: 5.543902482518564
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Medical dictation systems are essential tools in modern healthcare, enabling accurate and efficient conversion of speech into written medical documentation. The main objective of this paper is to create a domain-specific system for Greek medical speech transcriptions. The ultimate goal is to assist healthcare professionals by reducing the overload of manual documentation and improving workflow efficiency. Towards this goal, we develop a system that combines automatic speech recognition techniques with a text correction model, allowing better handling of domain-specific terminology and linguistic variations in Greek. Our approach leverages both acoustic and textual modeling to create more realistic and reliable transcriptions. We focused on adapting existing language and speech technologies to the Greek medical context, addressing challenges such as complex medical terminology and linguistic inconsistencies. Through domain-specific fine-tuning, our system achieves more accurate and coherent transcriptions, contributing to the development of practical language technologies for the Greek healthcare sector.
Related papers
- Plasticine: A Traceable Diffusion Model for Medical Image Translation [79.39689106440389]
We propose Plasticine, to the best of our knowledge, the first end-to-end image-to-image translation framework explicitly designed with traceability as a core objective. Our method combines intensity translation and spatial transformation within a denoising diffusion framework. This design enables the generation of synthetic images with interpretable intensity transitions and spatially coherent deformations, supporting pixel-wise traceability throughout the translation process.
arXiv Detail & Related papers (2025-12-20T18:01:57Z) - Bridging the Gap with Retrieval-Augmented Generation: Making Prosthetic Device User Manuals Available in Marginalised Languages [1.7218681244575125]
This work presents an AI-powered framework designed to process and translate medical documents, e.g., user manuals for prosthetic devices, into marginalised languages. The system enables users -- such as healthcare workers or patients -- to upload English-language medical equipment manuals, pose questions in their native language, and receive accurate, localised answers in real time.
arXiv Detail & Related papers (2025-06-30T15:25:58Z) - Searching for Best Practices in Medical Transcription with Large Language Model [1.0855602842179624]
This paper introduces a novel approach leveraging a Large Language Model (LLM) to generate highly accurate medical transcripts.
Our methodology integrates advanced language modeling techniques to lower the Word Error Rate (WER) and ensure the precise recognition of critical medical terms.
arXiv Detail & Related papers (2024-10-04T03:41:16Z) - Uncertainty-aware Medical Diagnostic Phrase Identification and Grounding [72.18719355481052]
We introduce a novel task called Medical Report Grounding (MRG). MRG aims to directly identify diagnostic phrases and their corresponding grounding boxes from medical reports in an end-to-end manner. We propose uMedGround, a robust and reliable framework that leverages a multimodal large language model to predict diagnostic phrases.
arXiv Detail & Related papers (2024-04-10T07:41:35Z) - The Sound of Healthcare: Improving Medical Transcription ASR Accuracy with Large Language Models [0.0]
Large Language Models (LLMs) can enhance the accuracy of Automatic Speech Recognition (ASR) systems in medical transcription.
Our research focuses on improvements in Word Error Rate (WER), Medical Concept WER (MC-WER) for the accurate transcription of essential medical terms, and speaker diarization accuracy.
arXiv Detail & Related papers (2024-02-12T14:01:12Z) - Prompt Engineering for Healthcare: Methodologies and Applications [93.63832575498844]
This review will introduce the latest advances in prompt engineering in the field of natural language processing for the medical field.
We will outline the development of prompt engineering and emphasize its significant contributions to healthcare natural language processing applications.
arXiv Detail & Related papers (2023-04-28T08:03:42Z) - Terminology-aware Medical Dialogue Generation [23.54754465832362]
Medical dialogue generation aims to generate responses according to a history of dialogue turns between doctors and patients.
Unlike open-domain dialogue generation, this requires background knowledge specific to the medical domain.
We propose a novel framework to improve medical dialogue generation by considering features centered on domain-specific terminology.
arXiv Detail & Related papers (2022-10-27T15:41:46Z) - Towards more patient friendly clinical notes through language models and ontologies [57.51898902864543]
We present a novel approach to automated medical text based on word simplification and language modelling.
We use a new dataset of pairs of publicly available medical sentences and a version of them simplified by clinicians.
Our method based on a language model trained on medical forum data generates simpler sentences while preserving both grammar and the original meaning.
arXiv Detail & Related papers (2021-12-23T16:11:19Z) - Self-Supervised Knowledge Assimilation for Expert-Layman Text Style Transfer [63.72621204057025]
Expert-layman text style transfer technologies have the potential to improve communication between scientific communities and the general public.
High-quality information produced by experts is often filled with difficult jargon laypeople struggle to understand.
This is a particularly notable issue in the medical domain, where laypeople are often confused by medical text online.
arXiv Detail & Related papers (2021-10-06T17:57:22Z) - Paragraph-level Simplification of Medical Texts [35.650619024498425]
Manual simplification does not scale to the rapidly growing body of biomedical literature.
We introduce a new corpus of parallel texts in English comprising technical and lay summaries of all published evidence pertaining to different clinical topics.
We propose a new metric based on likelihood scores from a masked language model pretrained on scientific texts.
arXiv Detail & Related papers (2021-04-12T18:56:05Z) - Bridging the Modality Gap for Speech-to-Text Translation [57.47099674461832]
End-to-end speech translation aims to translate speech in one language into text in another language in an end-to-end way.
Most existing methods employ an encoder-decoder structure with a single encoder to learn acoustic representation and semantic information simultaneously.
We propose a Speech-to-Text Adaptation for Speech Translation model which aims to improve the end-to-end model performance by bridging the modality gap between speech and text.
arXiv Detail & Related papers (2020-10-28T12:33:04Z)
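Several entries in this list report Word Error Rate (WER). Under the conventional definition, WER is the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the ASR output, divided by the number of reference words. A minimal Python sketch under that standard definition is given here; the function name `word_error_rate` and the example sentences are illustrative assumptions, not taken from any of the listed papers, and no paper-specific text normalization (casing, punctuation) is reproduced.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); whitespace tokenization only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up example: "acute" is misrecognized as "a cute"
    # (one substitution plus one insertion against 5 reference words).
    print(word_error_rate("the patient has acute bronchitis",
                          "the patient has a cute bronchitis"))
```

Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.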
This list is automatically generated from the titles and abstracts of the papers on this site.