Librosa Spectrogram Python

Environmental Sound Recognition (ESR) with Python

Abstract: Environmental Sound Recognition (ESR) is an essential task in audio analysis, involving the identification and classification of sounds from various environmental contexts. This study ...

IEEE

CleanMel: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR

Abstract: In this work, we propose CleanMel, a single-channel Mel-spectrogram denoising and dereverberation network for improving both speech quality and automatic speech recognition (ASR) performance ...

GitHub

AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

Check out our new project: Unsupervised Speech Decomposition for Rhythm, Pitch, and Timbre Conversion (https://github.com/auspicious3000/SpeechSplit). This repository ...

GitHub

Beyond Spectrograms: Rethinking Audio Classification from EnCodec’s Latent Space

All the datasets must be located in the datasets folder. This folder should contain the following subfolders after downloading the datasets: GTZAN Speech_Music: Contains the GTZAN Speech Music dataset ...
