An automatic pipeline for generating high-quality datasets for TTS and ASR systems.
open-source tts dataset speech-recognition persian dataset-generation data-pipeline audio-processing asr forced-alignment text-cleaning speech-dataset low-resource-languages speech-corpus dataset-preparation mana-tts speech-data-collection manatts dataset-processing
Updated May 22, 2025 - Jupyter Notebook