Medizinische Universität Graz - Research portal


Selected Publication:


Zöhrer, M; Peharz, R; Pernkopf, F.
Representation Learning for Single-Channel Source Separation and Bandwidth Extension.
IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2015; 23(12): 2398-2409. doi: 10.1109/TASLP.2015.2470560

Co-authors Med Uni Graz
Peharz Robert

Abstract:
In this paper, we use deep representation learning for model-based single-channel source separation (SCSS) and artificial bandwidth extension (ABE). Both tasks are ill-posed, and source-specific prior knowledge is required. In addition to well-known generative models such as restricted Boltzmann machines and higher-order contractive autoencoders, two recently introduced deep models, namely generative stochastic networks (GSNs) and sum-product networks (SPNs), are used for learning spectrogram representations. For SCSS we evaluate the deep architectures on data of the 2nd CHiME speech separation challenge and provide results for a speaker-dependent, a speaker-independent, a matched-noise-condition, and an unmatched-noise-condition task. GSNs obtain the best PESQ and overall perceptual score on average in all four tasks. Similarly, frame-wise GSNs reconstruct the missing frequency bands in ABE best, measured in frequency-domain segmental SNR, and significantly outperform SPNs embedded in hidden Markov models as well as the other representation models.
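The abstract reports ABE reconstruction quality as frequency-domain segmental SNR. The following is a minimal NumPy sketch of such a metric, not the authors' exact implementation; the function name fd_segmental_snr, the [-10, 35] dB clamping range, and the magnitude-spectrogram input format are assumptions made for illustration only.

    import numpy as np

    def fd_segmental_snr(ref_spec, est_spec, snr_floor=-10.0, snr_ceil=35.0):
        """Frequency-domain segmental SNR (dB) between two magnitude spectrograms.

        ref_spec, est_spec: arrays of shape (frames, bins) holding the magnitude
        spectra of the reference signal and of the reconstruction. Per-frame SNRs
        are clamped to [snr_floor, snr_ceil] dB before averaging, a common
        convention for segmental SNR measures (assumed here, not taken from the paper).
        """
        ref = np.asarray(ref_spec, dtype=float)
        est = np.asarray(est_spec, dtype=float)
        err = ref - est
        eps = np.finfo(float).eps  # avoid log of zero for silent frames
        frame_snr = 10.0 * np.log10(
            (np.sum(ref ** 2, axis=1) + eps) / (np.sum(err ** 2, axis=1) + eps)
        )
        return float(np.mean(np.clip(frame_snr, snr_floor, snr_ceil)))

    # Toy usage: score a noisy reconstruction of a random "spectrogram".
    rng = np.random.default_rng(0)
    clean = np.abs(rng.standard_normal((100, 257)))
    noisy = clean + 0.1 * rng.standard_normal((100, 257))
    print(f"segmental SNR: {fd_segmental_snr(clean, noisy):.1f} dB")

Averaging per-frame SNRs (rather than computing one global SNR) weights quiet and loud frames more evenly, which is why segmental variants are commonly preferred for speech quality assessment.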

Find related publications in this database (Keywords)
Bandwidth extension
deep neural networks (DNNs)
generative stochastic networks
representation learning
single-channel source separation (SCSS)
sum-product networks