Selected publication:
Modersohn, L; Schulz, S; Lohr, C; Hahn, U.
GRASCCO - The First Publicly Shareable, Multiply-Alienated German Clinical Text Corpus.
Stud Health Technol Inform. 2022; 296: 66-72.
DOI: 10.3233/SHTI220805
- Leading authors from Med Uni Graz:
- Schulz Stefan
- Abstract:
- We describe the creation of GRASCCO, a novel German-language corpus composed of some 60 clinical documents with more than 43,000 tokens. GRASCCO is a synthetic corpus resulting from a series of alienation steps to obfuscate privacy-sensitive information contained in real clinical documents, the true origin of all GRASCCO texts. Therefore, it is publicly shareable without any legal restrictions. We also explore whether this corpus still represents common clinical language use by comparison with a real (non-shareable) clinical corpus we developed as a contribution to the Medical Informatics Initiative in Germany (MII) within the SMITH consortium. We find evidence that such a claim can indeed be made.
- Find related publications in this database (using NLM MeSH Indexing)
- Germany - administration & dosage
- Language - administration & dosage
- Natural Language Processing - administration & dosage