Automatic Speech Recognition coupled LLM

Yurij G. Sidorov; Vladimir L. Malykh; Olga S. Yelistratova; Aleksey N. Kalinin

Program Systems: Theory and Applications

ISSN 2079-3316

Bilingual online scientific journal of the Ailamazyan Program Systems Institute of the Russian Academy of Sciences

12+

Volume 16 (2025). Issue 6 (71). Paper No. 5 (452)

Medical Informatics

Research Article

DOI: 10.25209/2079-3316-2025-16-6-197-219

Automatic Speech Recognition coupled LLM

Yurij Gennadievich Sidorov¹, Vladimir Leonidovich Malykh², Aleksey Nikolayevich Kalinin³, Olga Sergeevna Yelistratova⁴

¹,³ Interin Group of Companies, Moscow, Russia
²,⁴ Ailamazyan Program Systems Institute of RAS, Ves'kovo, Russia
² mvl@interin.ru

Abstract. One barrier to the widespread use of speech input of medical data in HIS is the insufficient quality of the transcribed text from the user's point of view. Not all medical terms and general-vocabulary words are recognized correctly, agreement of words in gender, number, and case is broken, and the text is grammatically ill-formed. All of this means the text has to be revised further. Another difficult problem is fitting the text to the structure of the medical document in the HIS. The document structure can be quite complex, contain many elements, and impose requirements on the type and format of the data in each element. Speech input may supply only part of the document, with the missing data taken from a custom template.

To solve these problems, we propose using an LLM as a corrector of speech transcription results, an integrator of the transcribed text with text from a template, and a structurer of the resulting data. The paper proposes a solution architecture for speech input of medical data based on composing a transcription system with an LLM. A testing methodology for the solution is proposed, including preparation of a dataset and a metric for assessing solution quality. An implementation of the solution based on free and proprietary components is described.

The results can be applied to the development and evaluation of AI systems for speech data input, in medicine and in other fields. (In Russian).
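
The composition described in the abstract (a transcription system followed by an LLM that corrects the text, merges it with template data, and fits it to the document structure) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the class `SpeechEntryPipeline`, the prompt wording, the JSON reply format, and the stub functions `fake_transcriber` and `fake_llm` are all assumptions introduced here, and a real system would wrap an actual ASR engine and LLM client behind the same two callables.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical interfaces: any ASR engine and any LLM client could be wrapped this way.
Transcriber = Callable[[bytes], str]
LanguageModel = Callable[[str], str]


@dataclass
class SpeechEntryPipeline:
    transcribe: Transcriber   # audio bytes -> raw transcription
    complete: LanguageModel   # prompt -> model reply (assumed to be JSON text)

    def build_prompt(self, raw_text: str, template: Dict[str, str]) -> str:
        # Assumed prompt wording; the article does not specify one here.
        return (
            "Correct the transcription, merge it with the template defaults, "
            f"and return a JSON object with exactly these fields: {sorted(template)}.\n"
            f"Template defaults: {json.dumps(template, ensure_ascii=False)}\n"
            f"Transcription: {raw_text}"
        )

    def run(self, audio: bytes, template: Dict[str, str]) -> Dict[str, str]:
        raw_text = self.transcribe(audio)
        reply = self.complete(self.build_prompt(raw_text, template))
        fields = json.loads(reply)
        # Keep only fields the document structure defines;
        # fall back to the template value when the model supplies nothing.
        return {key: fields.get(key) or default for key, default in template.items()}


def fake_transcriber(audio: bytes) -> str:
    # Stub standing in for a real ASR system (assumption for the demo).
    return "patient complain of head ache"


def fake_llm(prompt: str) -> str:
    # Stub standing in for a real LLM call (assumption for the demo).
    return json.dumps({"complaints": "Patient complains of headache.", "extra": "ignored"})


pipeline = SpeechEntryPipeline(fake_transcriber, fake_llm)
document = pipeline.run(b"", {"complaints": "", "diagnosis": "Not specified"})
print(document)
```

In this sketch the template plays two roles: its keys define the target document structure, and its values supply defaults for anything the speech input did not cover. Fields returned by the model that the template does not define are dropped.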

Keywords: medical information systems, MIS, artificial intelligence, AI, speech input, transcription system, large language models, LLM

MSC-2020: 94A05; 92C50, 93Bxx

MSC-2020 94-XX: Information and communication theory, circuits
MSC-2020 94Axx: Communication, information
MSC-2020 94A05: Communication theory
MSC-2020 92-XX: Biology and other natural sciences
MSC-2020 92Cxx: Physiological, cellular and medical topics
MSC-2020 92C50: Medical applications (general)
MSC-2020 93-XX: Systems theory; control
MSC-2020 93Bxx: Controllability, observability, and system structure

For citation: Yurij G. Sidorov, Vladimir L. Malykh, Aleksey N. Kalinin, Olga S. Yelistratova. Automatic Speech Recognition coupled LLM. Program Systems: Theory and Applications, 2025, 16:6, pp. 197–219. (In Russ.). https://psta.psiras.ru/2025/6_197-219.

Full text of article (PDF): https://psta.psiras.ru/read/psta2025_6_197-219.pdf.

The article was submitted 22.10.2025; approved after reviewing 30.10.2025; accepted for publication 17.11.2025; published online 15.12.2025.


Editorial address: Ailamazyan Program Systems Institute of the Russian Academy of Sciences, Peter the First Street 4«a», Veskovo village, Pereslavl area, Yaroslavl region, 152021 Russia; Website: http://psta.psiras.ru

Phone: +7(4852) 695-228; License: CC-BY-4.0
