Medical Informatics
Research Article
Automatic Speech Recognition coupled LLM
Yurij Gennadievich Sidorov1, Vladimir Leonidovich Malykh2, Aleksey Nikolayevich Kalinin3, Olga Sergeevna Yelistratova4
1,3 Interin Group of Companies, Moscow, Russia
2,4 Ailamazyan Program Systems Institute of RAS, Ves'kovo, Russia
Abstract. One barrier to the widespread use of speech-based medical data entry in HIS is the insufficient quality of the transcribed text from the user's point of view. Not all medical terms and general-vocabulary words are recognized correctly, the agreement of words in gender, number, and case is broken, and the text is grammatically poorly formed. All of this requires further revision of the text. Another difficult problem is fitting the text into the structure of the medical document in the HIS. The document structure can be quite complex, contain many elements, and impose requirements on the type and format of the data in each element. Speech input may supply only part of the document, with the missing data taken from a custom template.
To solve these problems, we propose using an LLM as a corrector of speech transcription results, an integrator of the transcribed text with text data from a template, and a structurer of the resulting data. The paper proposes a solution architecture for speech-based medical data entry built as a composition of a transcription system and an LLM. A methodology for testing the solution is proposed, including the preparation of a dataset and a metric for measuring solution quality. An implementation of the solution based on free and proprietary components is described.
The results can be used in the development and evaluation of AI systems for speech data input, in medicine and beyond. (In Russian).
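The composition of a transcription system and an LLM described in the abstract might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `speech_to_document`, `build_prompt`, `fake_transcribe`, and `fake_llm`, the JSON reply format, and the prompt wording are all assumptions, and the stub functions stand in for a real ASR engine and a real LLM.

```python
import json
from typing import Callable, Dict, Optional

# All names here are hypothetical; the abstract does not specify an API.
Template = Dict[str, Optional[str]]  # field name -> template default (None if absent)


def build_prompt(transcript: str, template: Template) -> str:
    """Ask the LLM to correct the transcript and map it onto the template fields."""
    return (
        "Correct recognition and grammar errors in the dictated text, "
        "then fill the document fields. Reply with a JSON object only.\n"
        f"Fields and template defaults: {json.dumps(template, ensure_ascii=False)}\n"
        f"Dictated text: {transcript}"
    )


def speech_to_document(
    audio: bytes,
    template: Template,
    transcribe: Callable[[bytes], str],
    llm: Callable[[str], str],
) -> Template:
    """Transcribe, let the LLM correct and structure, then merge with the template."""
    transcript = transcribe(audio)
    reply = json.loads(llm(build_prompt(transcript, template)))
    document = dict(template)  # start from the template defaults
    for field in template:
        value = reply.get(field)
        if value:  # keep the template default when the LLM returns nothing
            document[field] = value
    return document


# Stub components (assumptions) used only to make the sketch runnable.
def fake_transcribe(audio: bytes) -> str:
    return "complaints headache for three days"


def fake_llm(prompt: str) -> str:
    return json.dumps({"complaints": "Headache for three days.", "diagnosis": None})


result = speech_to_document(
    b"",
    {"complaints": None, "diagnosis": "To be specified"},
    fake_transcribe,
    fake_llm,
)
print(result)
```

Passing the ASR and LLM components in as plain callables keeps the sketch independent of any particular free or proprietary product; fields the dictation does not cover simply retain their template values.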
Keywords: medical informatics, medical information systems, speech recognition, medical voice recognition, LLM
MSC-2020
94A05; 92C50, 93Bxx
For citation: Yurij G. Sidorov, Vladimir L. Malykh, Aleksey N. Kalinin, Olga S. Yelistratova. Automatic Speech Recognition coupled LLM. Program Systems: Theory and Applications, 2025, 16:6, pp. 197–219. (In Russ.). https://psta.psiras.ru/2025/6_197-219.
Full text of article (PDF): https://psta.psiras.ru/read/psta2025_6_197-219.pdf.
The article was submitted 22.10.2025; approved after reviewing 30.10.2025; accepted for publication 17.11.2025; published online 15.12.2025.