Abstract: In this work, we propose CleanMel, a single-channel Mel-spectrogram denoising and dereverberation network for improving both speech quality and automatic speech recognition (ASR) performance ...
Abstract: This study proposes an innovative speech translation method based on Pix2PixGAN, which maps the Mel spectrograms of speech produced by deaf individuals to those of normal-hearing individuals ...
All the datasets must be located in the datasets folder. This folder should contain the following subfolders after downloading the datasets: GTZAN Speech_Music: Contains the GTZAN Speech Music dataset ...
CATS is a signal processing technique and framework for detecting and denoising sparse signals in the time-frequency domain. Particularly, very useful for processing earthquakes. This work is still in ...