Abstract: Environmental Sound Recognition (ESR) is an essential task in audio analysis, involving the identification and classification of sounds from various environmental contexts. This study ...
Abstract: In this work, we propose CleanMel, a single-channel Mel-spectrogram denoising and dereverberation network for improving both speech quality and automatic speech recognition (ASR) performance ...
Checkout our new project: Unsupervised Speech Decomposition for Rhythm, Pitch, and Timbre Conversion https://github.com/auspicious3000/SpeechSplit This repository ...
All the datasets must be located in the datasets folder. This folder should contain the following subfolders after downloading the datasets: GTZAN Speech_Music: Contains the GTZAN Speech Music dataset ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results