Program Systems: Theory and Applications
ISSN 2079-3316. Bilingual online scientific journal of the Ailamazyan Program Systems Institute of the Russian Academy of Sciences. 12+
Volume 16 (2025), Issue 6 (71), Paper No. 5 (452)

Medical Informatics

Research Article

Automatic Speech Recognition coupled LLM

Yurij Gennadievich Sidorov¹ (corresponding author), Vladimir Leonidovich Malykh², Aleksey Nikolayevich Kalinin³, Olga Sergeevna Yelistratova⁴

¹,³ Interin Group of Companies, Moscow, Russia
²,⁴ Ailamazyan Program Systems Institute of RAS, Ves'kovo, Russia
¹ Yurij Gennadievich Sidorov (corresponding author): sidorov@interin.ru

Abstract. One barrier to the widespread use of voice entry of medical data in HIS is that the text obtained after transcription is not of sufficient quality for end users. Not all medical terms and general-vocabulary words are recognized correctly, agreement between words in gender, number, and case is broken, and the text is grammatically poorly formed. All of this means the text requires further revision. Another difficult problem is bringing the text into the structure of the medical document in the HIS. The document structure can be quite complex, contain many elements, and impose requirements on the type and format of the data in its elements. Speech input may supply only part of the document, with the missing data taken from a custom template.

To solve these problems, we propose using an LLM as a corrector of speech transcription results, an integrator of the transcribed text with text data from a template, and a structurer of the resulting data. The paper proposes a solution architecture for voice entry of medical data based on composing a transcription system with an LLM. A methodology for testing the solution is proposed, including the preparation of a dataset and a metric for assessing the quality of the solution. An implementation of the solution based on free and proprietary components is described.

The results can be used in the development and evaluation of AI systems for speech data input, not only in medicine. (In Russian).
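For illustration only, the composition described in the abstract (transcript, then LLM correction, merging with a template, and structuring) might be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the field names in `SCHEMA`, the prompt wording, and the helpers `build_prompt`, `structure_document`, and `fake_llm` are all hypothetical, and the LLM is replaced by a stub so the example runs without any external service.

```python
import json
from typing import Callable, Dict

# Hypothetical document structure (an assumption): field name -> required type.
SCHEMA = {"complaints": str, "diagnosis": str, "temperature_c": float}


def build_prompt(transcript: str, template: Dict[str, object]) -> str:
    """Build one instruction asking the LLM to correct, merge, and structure."""
    return (
        "Correct recognition and grammar errors in the dictated text, "
        "fill the document fields from it, and keep the template values "
        "for fields that were not dictated. Answer with one JSON object.\n"
        f"Fields: {', '.join(SCHEMA)}\n"
        f"Template: {json.dumps(template, ensure_ascii=False)}\n"
        f"Dictated text: {transcript}"
    )


def structure_document(
    transcript: str,
    template: Dict[str, object],
    llm: Callable[[str], str],
) -> Dict[str, object]:
    """Run the LLM step, then enforce the structure and field types."""
    data = json.loads(llm(build_prompt(transcript, template)))
    document = {}
    for field, expected_type in SCHEMA.items():
        # Fall back to the template when the LLM did not return the field.
        value = data.get(field, template.get(field))
        if value is None:
            raise ValueError(f"missing field: {field}")
        document[field] = expected_type(value)  # coerce, e.g. "37.2" -> 37.2
    return document


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call (an assumption); returns a canned answer."""
    return json.dumps(
        {"complaints": "Headache for two days.", "temperature_c": "37.2"}
    )


template = {"complaints": "", "diagnosis": "Not established", "temperature_c": 36.6}
raw_transcript = "head ache for too days temperature thirty seven point two"
doc = structure_document(raw_transcript, template, fake_llm)
print(doc)
```

In this sketch the type check after the LLM call is what enforces the "type and format" requirements of the document elements; in a real system the stub would be replaced by the chosen transcription and LLM components.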

Keywords: medical informatics, medical information systems, speech recognition, medical voice recognition, LLM

MSC-2020 (2020 Mathematics Subject Classification): 94A05; 92C50, 93Bxx
MSC-2020 94-XX: Information and communication theory, circuits
MSC-2020 94Axx: Communication, information
MSC-2020 94A05: Communication theory
MSC-2020 92-XX: Biology and other natural sciences
MSC-2020 92Cxx: Physiological, cellular and medical topics
MSC-2020 92C50: Medical applications (general)
MSC-2020 93-XX: Systems theory; control
MSC-2020 93Bxx: Controllability, observability, and system structure

For citation: Yurij G. Sidorov, Vladimir L. Malykh, Aleksey N. Kalinin, Olga S. Yelistratova. Automatic Speech Recognition coupled LLM. Program Systems: Theory and Applications, 2025, 16:6, pp. 197–219. (In Russ.). https://psta.psiras.ru/2025/6_197-219.

Full text of article (PDF): https://psta.psiras.ru/read/psta2025_6_197-219.pdf.

The article was submitted 22.10.2025; approved after reviewing 30.10.2025; accepted for publication 17.11.2025; published online 15.12.2025.

© Sidorov Y. G., Malykh V. L., Kalinin A. N., Yelistratova O. S., 2025
Editorial address: Ailamazyan Program Systems Institute of the Russian Academy of Sciences, Peter the First Street 4«a», Veskovo village, Pereslavl area, Yaroslavl region, 152021 Russia; Website: http://psta.psiras.ru; Phone: +7(4852) 695-228; License: CC-BY-4.0