Selected publication:
Pokorny, FB; Linke, J; Seddiki, N; Lohrmann, S; Gerstenberger, C; Haspl, K; Feiner, M; Eyben, F; Hagmüller, M; Schuppler, B; Kubin, G; Gugatschka, M.
VocDoc, what happened to my voice? Towards automatically capturing vocal fatigue in the wild
BIOMED SIGNAL PROCES. 2024; 88: 105595
DOI: 10.1016/j.bspc.2023.105595
- Lead authors from Med Uni Graz: Pokorny Florian
- Co-authors from Med Uni Graz: Feiner Marlies; Gerstenberger Claus; Gugatschka Markus; Haspl Katja
- Abstract:
- Objective: Voice problems that arise during everyday vocal use can hardly be captured by standard outpatient voice assessments. In preparation for a digital health application to automatically assess longitudinal voice data 'in the wild' (the VocDoc), the aim of this paper was to study vocal fatigue from the speaker's perspective, the healthcare professional's perspective, and the 'machine's' perspective. Methods: We collected data from four vocally healthy speakers completing a 90-min reading task. Every 10 min, the speakers were asked about subjective voice characteristics. Then, we elaborated on the task of elapsed speaking time recognition: We carried out listening experiments with speech and language therapists and employed random forests on the basis of extracted acoustic features. We validated our models speaker-dependently and speaker-independently and analysed underlying feature importances. For an additional, clinical application-oriented scenario, we extended our dataset with lecture recordings from another two speakers. Results: Self- and expert assessments were not consistent. With mean F1 scores up to 0.78, automatic elapsed speaking time recognition worked reliably in the speaker-dependent scenario only. A small set of acoustic features, other than features previously reported to reflect vocal fatigue, was found to universally describe long-term variations of the voice. Conclusion: Vocal fatigue seems to have individual effects across different speakers. Machine learning has the potential to automatically detect and characterise vocal changes over time. Significance: Our study provides technical underpinnings for a future mobile solution to objectively capture pathological long-term voice variations in everyday life settings and make them clinically accessible.
- Keywords: Vocal fatigue; Voice features; Voice assessment; Speech-language pathology; Machine learning; Mobile application; Digital health