Creating Resources for Marking Diagnoses in Electronic Health Reports in Serbian

Authors

  • Ulfeta Marovac Department of Technical Sciences, State University of Novi Pazar, Novi Pazar, Serbia
  • Aldina Avdić Department of Technical Sciences, State University of Novi Pazar, Novi Pazar, Serbia
  • Dragan Janković Faculty of Electronic Engineering, University of Nis, Nis, Serbia
  • Sead Marovac Department of General Surgery, General Hospital of Novi Pazar, Novi Pazar, Serbia

DOI:

https://doi.org/10.7251/IJEEC2001018M

Abstract

Thanks to medical information systems, many medical reports are collected in electronic form every day. Apart from the fields with a fixed set of allowed input values (the structured part), part of each report consists of free, unstructured text. This text contains a more detailed description of the patient's condition that cannot be expressed in the structured part; symptoms, laboratory results, accompanying diagnoses, and similar details are often found in it. Due to a lack of time, doctors often write these descriptions in non-standard ways, using their own abbreviations and synonyms, and the descriptions often contain typos. All of this makes it difficult to extract information from documents specific to the medical domain. This paper presents the creation of medical lexical resources for the automatic labeling of diagnosis terms in medical reports. Automatic marking of free text requires natural language processing methods as well as appropriate lexical resources. As there are neither publicly available medical lexical resources for the Serbian language nor a publicly available corpus of medical reports, the contribution of this paper is the construction of such resources for the automatic marking of diagnoses. Using the proposed resources, diagnosis codes, Latin terms, and Serbian terms specific to certain ICD-10 diagnoses can be mapped with a precision of 83.47%, 86.86%, and 78.29%, respectively.
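
To make the idea of lexicon-based marking concrete, the following is a minimal sketch and not the authors' method. It assumes a toy lexicon that maps ICD-10 codes to Latin and Serbian terms, folds Serbian diacritics before lookup, and computes precision as the share of predicted marks that also appear in a gold annotation. The lexicon entries and the names `LEXICON`, `normalize`, `mark_diagnoses`, and `precision` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicon: ICD-10 code -> illustrative Latin and Serbian terms.
# These entries are assumptions for demonstration, not the paper's resources.
LEXICON = {
    "J18": {"latin": ["pneumonia"], "serbian": ["zapaljenje pluca"]},
    "I10": {"latin": ["hypertensio arterialis"], "serbian": ["povisen krvni pritisak"]},
}

# Fold Serbian Latin-script diacritics so "pluća" and "pluca" compare equal.
DIACRITICS = str.maketrans({"č": "c", "ć": "c", "š": "s", "ž": "z", "đ": "dj"})

# An ICD-10-like code: one capital letter, two digits, optional ".digit".
CODE_RE = re.compile(r"\b[A-Z]\d{2}(?:\.\d)?\b")


def normalize(text):
    """Lowercase, fold diacritics, and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.lower().translate(DIACRITICS)).strip()


def mark_diagnoses(text):
    """Return (kind, matched_text, icd10_code) tuples found in free text."""
    marks = []
    # 1) Explicit diagnosis codes, reduced to their three-character category.
    for match in CODE_RE.finditer(text):
        category = match.group(0).split(".")[0]
        if category in LEXICON:
            marks.append(("code", match.group(0), category))
    # 2) Latin and Serbian terms, found by plain substring lookup.
    normalized = normalize(text)
    for code, terms in LEXICON.items():
        for kind in ("latin", "serbian"):
            for term in terms[kind]:
                if term in normalized:
                    marks.append((kind, term, code))
    return marks


def precision(predicted, gold):
    """Fraction of predicted marks that also appear in the gold set."""
    predicted = set(predicted)
    if not predicted:
        return 0.0
    return len(predicted & set(gold)) / len(predicted)


report = "Dg: J18.9 Pneumonia. Pacijent ima povišen krvni pritisak."
marks = mark_diagnoses(report)
```

Plain substring lookup is a deliberate simplification: it ignores inflected word forms, abbreviations, and typos, which the abstract names as the main obstacles in real reports.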

Published

2021-10-05