Selected Publication:
Hahn, U; Honeck, M; Piotrowski, M; Schulz, S.
Subword segmentation - Leveling out morphological variations for medical document retrieval
J AMER MED INFORM ASSOC (Journal of the American Medical Informatics Association); 229-233. (Presented at: Annual Symposium of the American Medical Informatics Association (AMIA 2001), Washington, D.C., Nov 03-07, 2001)
[OPEN ACCESS]
- Co-authors Med Uni Graz: Schulz Stefan
- Abstract:
- Many lexical items from medical sublanguages exhibit a complex morphological structure that is hard to account for by simple string matching (e.g., truncation). While inflection is usually easy to deal with, productive morphological processes in terms of derivation and (single-word) composition constitute a real challenge. We here propose an approach in which morphologically complex word forms are segmented into medically significant subwords. After segmentation, both query terms and document terms are submitted to the matching procedure. This way, problems arising from morphologically motivated word form alterations can be eliminated from the retrieval procedure. We provide empirical data which reveals that subword-based indexing and retrieval performs significantly better than conventional string matching approaches.
- NLM MeSH Indexing:
- Information Storage and Retrieval - methods
- Language
- Terminology as Topic
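
The idea described in the abstract, segmenting both query and document terms into subwords before matching, can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon `SUBWORDS`, the functions `segment`, `index_terms`, and `matches`, and the longest-match strategy with backtracking are all assumptions made for illustration, and a real system would rely on a curated medical subword dictionary.

```python
# Toy subword lexicon (hypothetical; not taken from the paper).
SUBWORDS = {"gastr", "enter", "itis", "o", "hepat"}


def segment(word, lexicon=SUBWORDS):
    """Split a word into lexicon subwords; return None if no full split exists."""
    word = word.lower()
    if not word:
        return []
    # Try the longest prefix first, backtracking if the remainder cannot be split.
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head in lexicon:
            rest = segment(word[end:], lexicon)
            if rest is not None:
                return [head] + rest
    return None


def index_terms(text, lexicon=SUBWORDS):
    """Return the set of subwords in a text; unsegmentable tokens are kept whole."""
    terms = set()
    for token in text.lower().split():
        parts = segment(token, lexicon)
        terms.update(parts if parts is not None else [token])
    return terms


def matches(query, document, lexicon=SUBWORDS):
    """True if every subword of the query also occurs in the document."""
    return index_terms(query, lexicon) <= index_terms(document, lexicon)


# "gastritis" is not a substring of the document, yet its subwords all occur there.
print(segment("gastroenteritis"))
print(matches("gastritis", "acute gastroenteritis"))
print(matches("hepatitis", "acute gastroenteritis"))
```

In this sketch the query "gastritis" matches a document containing "gastroenteritis" because both reduce to shared subwords, whereas plain substring matching would miss it. Treating the subwords as an unordered set is a simplification chosen here; it ignores subword order and adjacency.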