Selected Publication:
Marko, K; Schulz, S; Hahn, U.
Automatic lexeme acquisition for a multilingual medical subword thesaurus.
Int J Med Inform. 2006; 76(2-3):184-189
Doi: 10.1016/j.ijmedinf.2006.05.032
- Co-authors Med Uni Graz: Schulz Stefan
- Abstract:
- Purpose: We present a method for the automated acquisition of a multilingual medical lexicon (for Spanish, French and Swedish) to be used within the framework of a medical cross-language text retrieval system. Methods: For the lexical acquisition process, we incorporate seed lexicons and lists of trusted term translations derived from the UMLS Metathesaurus. The seed lexicons for Spanish, French and Swedish are automatically generated from (previously manually constructed) Portuguese, German and English sources by simple string transformations. Lexical and semantic hypotheses are then validated by processing pairs of term translations. In a last step, we use the cleaned list of "approved" translations in order to augment, step by step, the target dictionaries by processing the parallel corpora in terms of co-occurrence patterns of hypothesized translation equivalents which cannot be derived by simple character substitutions. Results: An existing multilingual lexicon for the medical domain with about 60,000 entries for English, German, and Portuguese was automatically augmented by more than 17,000 new lexemes for Spanish, French, and Swedish. Conclusions: Our approach constitutes a promising method for the automated creation of new lexicon entries and their linkage to semantic identifiers. (c) 2006 Elsevier Ireland Ltd. All rights reserved.
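The two ideas named in the abstract (generating target-language candidates by simple string transformations, then validating them against trusted term translations) can be illustrated with a minimal sketch. This is not the authors' implementation: the suffix rules, the function names `candidate_cognates` and `validate`, and the toy Portuguese/Spanish terms are all assumptions made for illustration, and the sketch works on whole words rather than on the subwords the paper deals with.

```python
# Hypothetical suffix rules (source suffix -> target suffix); NOT taken from the paper.
PT_TO_ES_RULES = [("ite", "itis"), ("dade", "dad")]


def candidate_cognates(lexeme, rules):
    """Return target-language candidates obtained by applying each suffix rule."""
    candidates = {lexeme}  # an identical spelling is also a candidate
    for src_suffix, tgt_suffix in rules:
        if lexeme.endswith(src_suffix):
            candidates.add(lexeme[: -len(src_suffix)] + tgt_suffix)
    return candidates


def validate(seed_lexemes, trusted_pairs, rules):
    """Keep a candidate only if it occurs on the target side of a trusted
    translation pair whose source side contains the seed lexeme."""
    approved = {}
    for src_term, tgt_term in trusted_pairs:
        src_tokens = set(src_term.split())
        tgt_tokens = set(tgt_term.split())
        for lexeme in seed_lexemes:
            if lexeme in src_tokens:
                hits = candidate_cognates(lexeme, rules) & tgt_tokens
                if hits:
                    approved.setdefault(lexeme, set()).update(hits)
    return approved


# Toy data, assumed for illustration only.
seed = {"hepatite", "toxicidade"}
pairs = [("hepatite cronica", "hepatitis cronica"), ("toxicidade", "toxicidad")]
approved = validate(seed, pairs, PT_TO_ES_RULES)
```

Candidates that never appear in a trusted pair are simply dropped here; the co-occurrence step the abstract describes for translations that cannot be derived by character substitution is outside the scope of this sketch.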
- Find related publications in this database (using NLM MeSH Indexing)
Automatic Data Processing; Humans; Information Storage and Retrieval; Language; Medical Informatics; Multilingualism; Natural Language Processing; Semantics; Terminology as Topic; Unified Medical Language System; Vocabulary, Controlled
- Find related publications in this database (Keywords)
medical informatics; information storage and retrieval; multilingualism; vocabulary, controlled