Selected Publication:
Napravnik, M; Hržić, F; Tschauner, S; Štajduhar, I.
Building RadiologyNET: an unsupervised approach to annotating a large-scale multimodal medical database.
BioData Min. 2024; 17(1): 22
Doi: 10.1186/s13040-024-00373-1
[OPEN ACCESS]
- Co-authors Med Uni Graz: Tschauner Sebastian
- Abstract:
BACKGROUND: The use of machine learning in medical diagnosis and treatment has grown significantly in recent years with the development of computer-aided diagnosis systems, often based on annotated medical radiology images. However, the lack of large annotated image datasets remains a major obstacle, as the annotation process is time-consuming and costly. This study aims to overcome this challenge by proposing an automated method for annotating a large database of medical radiology images based on their semantic similarity.
RESULTS: An automated, unsupervised approach is used to create a large annotated dataset of medical radiology images originating from the Clinical Hospital Centre Rijeka, Croatia. The pipeline is built by data-mining three different types of medical data: images, DICOM metadata and narrative diagnoses. The optimal feature extractors are then integrated into a multimodal representation, which is then clustered to create an automated pipeline for labelling a precursor dataset of 1,337,926 medical images into 50 clusters of visually similar images. The quality of the clusters is assessed by examining their homogeneity and mutual information, taking into account the anatomical region and modality representation.
CONCLUSIONS: The results indicate that fusing the embeddings of all three data sources together provides the best results for the task of unsupervised clustering of large-scale medical data and leads to the most concise clusters. Hence, this work marks the initial step towards building a much larger and more fine-grained annotated dataset of medical radiology images.
- Keywords:
- Medical data annotation
- Data mining
- Big data
- Feature extraction
- Multimodal representation
- Unsupervised machine learning
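The abstract states that cluster quality is assessed through homogeneity and mutual information with respect to anatomical region and modality. As a rough illustration only, the sketch below computes both quantities in plain Python for a cluster assignment against a reference label. The function names, the toy modality labels and the cluster IDs are hypothetical and are not taken from the study; the abstract does not say whether the authors use raw or normalised mutual information, so the unnormalised form (in nats) is assumed here.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(classes, clusters):
    """Mutual information I(C; K) in nats between reference classes and clusters."""
    n = len(classes)
    joint = Counter(zip(classes, clusters))
    class_counts = Counter(classes)
    cluster_counts = Counter(clusters)
    mi = 0.0
    for (c, k), n_ck in joint.items():
        # p(c, k) * log( p(c, k) / (p(c) * p(k)) )
        mi += (n_ck / n) * log(n * n_ck / (class_counts[c] * cluster_counts[k]))
    return mi


def homogeneity(classes, clusters):
    """Homogeneity = 1 - H(C|K) / H(C), which equals I(C; K) / H(C)."""
    h_c = entropy(classes)
    if h_c == 0.0:
        # A single reference class is trivially homogeneous.
        return 1.0
    return mutual_information(classes, clusters) / h_c


# Hypothetical toy data: imaging modality per image (assumed labels).
modalities = ["CT", "CT", "MR", "MR", "XR", "XR"]
pure_clusters = [0, 0, 1, 1, 2, 2]   # every cluster holds one modality
mixed_clusters = [0, 0, 0, 0, 1, 1]  # cluster 0 mixes CT and MR

print(homogeneity(modalities, pure_clusters))
print(homogeneity(modalities, mixed_clusters))
```

A homogeneity close to 1 means each cluster contains images of essentially one reference class, while a lower value indicates that clusters mix several modalities or anatomical regions.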